On 19 October 2015 at 16:43, Bill Fischofer <[email protected]>
wrote:

> We could do that in linux-generic, which has a fairly small number of
> threads supported.  I'd be concerned about how that would scale to systems
> that can support many more threads, especially when NUMA considerations
> come into play.  Is it simply unacceptable to have some sort of "finished"
> API call?  That would seem to solve the problem in a clean and scalable
> manner.
>
Isn't this conceptually similar to the stop scheduling call, which lets me
drain the prescheduling queue and then stop participating in event
processing? To allow for "non-ideal" implementations (because instant
sharing of all resources isn't always very performant), we create APIs
that tell ODP that this thread wishes to withdraw from processing that
uses shared resources.
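
To make the "withdraw" idea concrete, here is a minimal sketch in C. It is
not ODP or linux-generic code: all toy_* names are made up for illustration,
buffers are only counted rather than stored, and toy_pool_finished() stands
in for the odp_pool_finished() call proposed further down the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define POOL_BUFS 8   /* buffers owned by the toy pool (arbitrary) */
#define STASH_MAX 4   /* capacity of one thread's private stash (arbitrary) */

/* Hypothetical pool: only counts buffers, it does not hold real memory. */
typedef struct {
    atomic_int free_count;  /* buffers currently in the shared free list */
    int total;              /* buffers the pool was created with */
} toy_pool_t;

/* Hypothetical per-thread stash; a real one would live in TLS. */
typedef struct {
    toy_pool_t *pool;
    int cached;             /* buffers parked privately by this thread */
} toy_stash_t;

void toy_pool_init(toy_pool_t *p, int total)
{
    atomic_init(&p->free_count, total);
    p->total = total;
}

/* Allocate from the private stash first, then from the shared list. */
bool toy_alloc(toy_stash_t *s)
{
    if (s->cached > 0) {
        s->cached--;
        return true;
    }
    int n = atomic_load(&s->pool->free_count);
    while (n > 0) {
        /* On failure, n is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&s->pool->free_count, &n, n - 1))
            return true;
    }
    return false;
}

/* Free into the private stash; spill to the shared list when it is full. */
void toy_free(toy_stash_t *s)
{
    if (s->cached < STASH_MAX)
        s->cached++;
    else
        atomic_fetch_add(&s->pool->free_count, 1);
}

/* "I am done with this pool": flush the private stash to the shared list. */
void toy_pool_finished(toy_stash_t *s)
{
    atomic_fetch_add(&s->pool->free_count, s->cached);
    s->cached = 0;
}

/* Succeeds (0) only if every buffer is back in the shared list. */
int toy_pool_destroy(toy_pool_t *p)
{
    return atomic_load(&p->free_count) == p->total ? 0 : -1;
}
```

In this sketch, two threads that each alloc and free one buffer leave the
shared count at 6 of 8, so toy_pool_destroy() fails until both have called
toy_pool_finished(). That is the behaviour the bug report complains about,
and the extra call is the price of keeping the stash private to the thread.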


>
> On Mon, Oct 19, 2015 at 9:15 AM, Savolainen, Petri (Nokia - FI/Espoo) <
> [email protected]> wrote:
>
>> A SW implementation can place the per-thread stash into shared memory,
>> where the thread calling destroy() can see the stashes of all other
>> threads. Since the application must synchronize the destroy call (so that
>> it happens only after all free() calls have returned), the implementation
>> only has to ensure that the destroy call reads fresh stash status data
>> (i.e., it has the correct memory read/write barriers in place).
>> Performance should still be good: it's a matter of moving the per-thread
>> stash from TLS to shared memory (no additional synchronization per
>> alloc/free).
>>
>> -Petri
>>
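
A rough sketch of the shared-memory stash described above, again in C and
again not linux-generic code. The shm_* names, the fixed thread count and
the buffer counting are assumptions made for illustration. Each thread
touches only its own stash slot on alloc/free, and destroy walks all slots.

```c
#include <assert.h>
#include <stdatomic.h>

#define SHM_MAX_THREADS 4   /* arbitrary limit for the sketch */
#define SHM_STASH_MAX   4   /* capacity of one per-thread stash */

/* Hypothetical pool placed in shared memory, stashes included. */
typedef struct {
    atomic_int shared_free;       /* buffers in the global free list */
    int total;                    /* buffers the pool was created with */
    /* One slot per thread, written only by its owner. A real layout
     * would pad each slot to a cache line to avoid false sharing. */
    int stash[SHM_MAX_THREADS];
} shm_pool_t;

void shm_pool_init(shm_pool_t *p, int total)
{
    atomic_init(&p->shared_free, total);
    p->total = total;
    for (int i = 0; i < SHM_MAX_THREADS; i++)
        p->stash[i] = 0;
}

/* Fast path uses the caller's own slot with no synchronization. */
int shm_alloc(shm_pool_t *p, int tid)
{
    if (p->stash[tid] > 0) {
        p->stash[tid]--;
        return 1;
    }
    int n = atomic_load(&p->shared_free);
    while (n > 0) {
        if (atomic_compare_exchange_weak(&p->shared_free, &n, n - 1))
            return 1;
    }
    return 0;
}

void shm_free(shm_pool_t *p, int tid)
{
    if (p->stash[tid] < SHM_STASH_MAX)
        p->stash[tid]++;
    else
        atomic_fetch_add(&p->shared_free, 1);
}

/* Any thread may call this once the application has synchronized, so
 * that all free() calls have returned. Returns 0 on success, -1 if
 * buffers are still outstanding. */
int shm_pool_destroy(shm_pool_t *p)
{
    /* Stands in for the read barrier mentioned above; the ordering
     * itself comes from the application's own synchronization. */
    atomic_thread_fence(memory_order_acquire);

    int seen = atomic_load(&p->shared_free);
    for (int i = 0; i < SHM_MAX_THREADS; i++)
        seen += p->stash[i];
    if (seen != p->total)
        return -1;

    /* Drain every stash back into the global list. */
    for (int i = 0; i < SHM_MAX_THREADS; i++)
        p->stash[i] = 0;
    atomic_store(&p->shared_free, p->total);
    return 0;
}
```

Here destroy succeeds even though freed buffers are still sitting in other
threads' stashes, and no thread has to announce that it is finished. The
open question from the thread remains how the scan over all stashes scales
when many threads and NUMA nodes are involved.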
>>
>>
>>
>>
>> *From:* EXT Bill Fischofer [mailto:[email protected]]
>> *Sent:* Monday, October 19, 2015 2:26 PM
>> *To:* Savolainen, Petri (Nokia - FI/Espoo)
>> *Cc:* LNG ODP Mailman List
>> *Subject:* Re: [lng-odp] Bug 1851 - odp_pool_destroy() failure
>>
>>
>>
>> This is an important discussion, especially as we look to
>> high-performance SW implementations of ODP. Obviously we can stipulate any
>> functional behavior we want. The question is how much overhead is
>> acceptable to achieve such stipulated functionality? One of the reasons
>> DPDK does not support mempool destroys is this issue of distributed cache
>> management. If we don't want the application to take any responsibility in
>> this area, then the implementation needs to impose additional bookkeeping
>> overhead that will likely impact the performance of normal operation.
>>
>>
>>
>> What's needed is some sort of indication that a thread is not just
>> freeing a buffer, but is done with operations on a pool. One way of doing
>> this is to add an odp_pool_finished() API that tells the implementation
>> that this thread is done with the pool (e.g., asserts that no further
>> alloc() calls will be made by this thread on it).  My suggestion in the
>> response to the bug was that odp_pool_destroy() can serve this purpose;
>> however, I'd have no problem with adding another API that serves the same
>> notification purpose.
>>
>>
>>
>> Without such an API, it's not clear how we can achieve the desired
>> functionality without a lot of additional overhead or removing any sort of
>> safety checks.  If the latter is acceptable, we could say that
>> odp_pool_destroy() always succeeds and if the application had any
>> outstanding buffers or tries to use the pool handle following a destroy()
>> call then the result is undefined.
>>
>>
>>
>>
>>
>>
>>
>> On Mon, Oct 19, 2015 at 5:48 AM, Savolainen, Petri (Nokia - FI/Espoo) <
>> [email protected]> wrote:
>>
>> Hi,
>>
>> The linux-generic pool implementation has a bug (
>> https://bugs.linaro.org/show_bug.cgi?id=1851 ) that prevents dynamic
>> pool destroy. From the API point of view, any resource (e.g. a pool) is
>> created once (the xxx_create call returns a handle) and destroyed once
>> (the handle is passed to xxx_destroy). Any thread can create a resource
>> and any thread can destroy it. Application threads must synchronize
>> resource usage and the destroy call, but not implementation specifics such
>> as the potential use of per-thread stashes or the flushing of those.
>>
>> For example, this is valid usage of the pool API:
>>
>> Thread 1            Thread 2              Thread 3
>> --------------------------------------------------
>>
>> init_global()
>> init_local()        init_local()          init_local()
>>
>>                     pool = pool_create()
>>
>> barrier()           barrier()             barrier()
>> buf = alloc(pool)   buf = alloc(pool)     buf = alloc(pool)
>> free(buf)           free(buf)             free(buf)
>> barrier()           barrier()             barrier()
>>
>> pool_destroy(pool)
>>
>> barrier()           barrier()             barrier()
>> do_something()      do_something()        do_something()
>> term_local()        term_local()          term_local()
>>                                           term_global()
>>
>>
>> So, e.g., pool_destroy must succeed when all buffers have been freed
>> before the call, no matter:
>> * which thread calls it
>> * whether the calling thread itself has called alloc or free
>> * whether other threads have already called term_local
>>
>>
>> -Petri
>>
>>
>> _______________________________________________
>> lng-odp mailing list
>> [email protected]
>> https://lists.linaro.org/mailman/listinfo/lng-odp
>>
>>
>>
>
>