Re: [lng-odp] Bug 1851 - odp_pool_destroy() failure

Bill Fischofer Mon, 19 Oct 2015 08:25:18 -0700

On Mon, Oct 19, 2015 at 10:07 AM, Ola Liljedahl <[email protected]>
wrote:


> On 19 October 2015 at 16:43, Bill Fischofer <[email protected]>
> wrote:
>
>> We could do that in linux-generic, which has a fairly small number of
>> threads supported.  I'd be concerned about how that would scale to systems
>> that can support many more threads, especially when NUMA considerations
>> come into play.  Is it simply unacceptable to have some sort of "finished"
>> API call?  That would seem to solve the problem in a clean and scalable
>> manner.
>>
> Isn't this conceptually similar to the stop scheduling call so that I can
> drain the prescheduling queue and then stop participating in event
> processing? In order to allow for "non-ideal" implementations (because
> instant sharing of all resources isn't always very performant), we create
> APIs that tell ODP that this thread wishes to withdraw from processing
> using shared resources.
>

I think that's a useful analogy.  We've recently added stop/start APIs to
pktio for similar reasons, and of course we have odp_schedule_pause() that
serves the same advisory function.  We don't need a "start" API for pools
(though if you wanted one for symmetry I don't see any harm there) but you
really do want a "stop" API.
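
To make the "stop" idea concrete, here is a minimal sketch of the
bookkeeping, not a proposal for the actual API. Every toy_* name and
constant is hypothetical and none of it is existing ODP code; locking is
left out, so it models the accounting for one thread at a time only. The
point is that buffers parked in a thread's stash stay invisible to
destroy until that thread says it is finished with the pool.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_POOL_BUFS   8   /* total buffers in the pool (assumed) */
#define TOY_MAX_THREADS 4   /* threads that may hold a stash (assumed) */
#define TOY_STASH_MAX   2   /* buffers a thread may cache locally (assumed) */

/* Hypothetical toy pool: counts buffers instead of storing them. */
typedef struct {
    int free_global;             /* buffers on the shared free list */
    int stash[TOY_MAX_THREADS];  /* buffers cached by each thread */
} toy_pool_t;

static void toy_pool_init(toy_pool_t *p)
{
    p->free_global = TOY_POOL_BUFS;
    for (int i = 0; i < TOY_MAX_THREADS; i++)
        p->stash[i] = 0;
}

/* Allocate one buffer: prefer the caller's stash, then the global list. */
static bool toy_alloc(toy_pool_t *p, int thr)
{
    assert(thr >= 0 && thr < TOY_MAX_THREADS);
    if (p->stash[thr] > 0) {
        p->stash[thr]--;
        return true;
    }
    if (p->free_global > 0) {
        p->free_global--;
        return true;
    }
    return false;
}

/* Free one buffer: cache it locally if there is room, else return it. */
static void toy_free(toy_pool_t *p, int thr)
{
    assert(thr >= 0 && thr < TOY_MAX_THREADS);
    if (p->stash[thr] < TOY_STASH_MAX)
        p->stash[thr]++;
    else
        p->free_global++;
}

/* Hypothetical "stop" call: the thread flushes its stash to the pool. */
static void toy_pool_finished(toy_pool_t *p, int thr)
{
    assert(thr >= 0 && thr < TOY_MAX_THREADS);
    p->free_global += p->stash[thr];
    p->stash[thr] = 0;
}

/* Destroy succeeds only if every buffer is back on the global list. */
static bool toy_pool_destroy(const toy_pool_t *p)
{
    return p->free_global == TOY_POOL_BUFS;
}
```

In this model, two threads that each alloc and free one buffer leave two
buffers sitting in their stashes, so toy_pool_destroy() reports failure
until both have called toy_pool_finished().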


>
>
>>
>> On Mon, Oct 19, 2015 at 9:15 AM, Savolainen, Petri (Nokia - FI/Espoo) <
>> [email protected]> wrote:
>>
>>> A SW implementation can place the per-thread stash into shared memory,
>>> where the thread calling destroy() can see the stashes of all other threads.
>>> Since the application must synchronize the destroy call (to happen only after
>>> all free() calls have returned), the implementation must just ensure that the
>>> destroy call reads fresh stash status data (== it has correct memory
>>> read/write barriers in place). Performance should still be good – it's
>>> a matter of moving the per-thread stash from TLS to shared memory (no
>>> additional synchronization per alloc/free).
>>>
>>> -Petri
>>>
>>>
>>>
>>>
>>>
>>> *From:* EXT Bill Fischofer [mailto:[email protected]]
>>> *Sent:* Monday, October 19, 2015 2:26 PM
>>> *To:* Savolainen, Petri (Nokia - FI/Espoo)
>>> *Cc:* LNG ODP Mailman List
>>> *Subject:* Re: [lng-odp] Bug 1851 - odp_pool_destroy() failure
>>>
>>>
>>>
>>> This is an important discussion, especially as we look to
>>> high-performance SW implementations of ODP. Obviously we can stipulate any
>>> functional behavior we want. The question is how much overhead is
>>> acceptable to achieve such stipulated functionality? One of the reasons
>>> DPDK does not support mempool destroys is this issue of distributed cache
>>> management. If we don't want the application to take any responsibility in
>>> this area, then the implementation needs to impose additional bookkeeping
>>> overhead that will likely impact the performance of normal operation.
>>>
>>>
>>>
>>> What's needed is some sort of indication that a thread is not just
>>> freeing a buffer, but is done with operations on a pool. One way of doing
>>> this is to add an odp_pool_finished() API that tells the implementation
>>> that this thread is done with the pool (e.g., asserts that no further
>>> alloc() calls will be made by this thread on it).  My suggestion in the
>>> response to the bug was that odp_pool_destroy() can serve this purpose,
>>> however I'd have no problem with adding another API that serves the same
>>> notification purpose.
>>>
>>>
>>>
>>> Without such an API, it's not clear how we can achieve the desired
>>> functionality without a lot of additional overhead or removing any sort of
>>> safety checks.  If the latter is acceptable, we could say that
>>> odp_pool_destroy() always succeeds and if the application had any
>>> outstanding buffers or tries to use the pool handle following a destroy()
>>> call then the result is undefined.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Oct 19, 2015 at 5:48 AM, Savolainen, Petri (Nokia - FI/Espoo) <
>>> [email protected]> wrote:
>>>
>>> Hi,
>>>
>>> The linux-generic pool implementation has a bug (
>>> https://bugs.linaro.org/show_bug.cgi?id=1851 ) that prevents dynamic
>>> pool destroy. From the API point of view, any resource (e.g. a pool) is created
>>> once (the xxx_create call returns a handle) and destroyed once (pass the
>>> handle to xxx_destroy). Any thread can create a resource and any thread can
>>> destroy it. Application threads must synchronize resource usage with the
>>> destroy call, but not implementation specifics such as the potential use of
>>> per-thread stashes or the flushing of those.
>>>
>>> For example, this valid usage of the pool API:
>>>
>>> Thread 1            Thread 2              Thread 3
>>> --------------------------------------------------
>>>
>>> init_global()
>>> init_local()        init_local()          init_local()
>>>
>>>                     pool = pool_create()
>>>
>>> barrier()           barrier()             barrier()
>>> buf = alloc(pool)   buf = alloc(pool)     buf = alloc(pool)
>>> free(buf)           free(buf)             free(buf)
>>> barrier()           barrier()             barrier()
>>>
>>> pool_destroy(pool)
>>>
>>> barrier()           barrier()             barrier()
>>> do_something()      do_something()        do_something()
>>> term_local()        term_local()          term_local()
>>>                                           term_global()
>>>
>>>
>>> So, e.g. pool_destroy must succeed when all buffers have been freed
>>> before the call, no matter:
>>> * which thread calls it
>>> * whether the calling thread itself has called alloc or free
>>> * whether other threads have already called term_local
>>>
>>>
>>> -Petri
>>>
>>>
>>> _______________________________________________
>>> lng-odp mailing list
>>> [email protected]
>>> https://lists.linaro.org/mailman/listinfo/lng-odp
>>>
>>>
>>>
>>
>
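
For comparison, the shared-memory stash alternative Petri describes in
the quoted text could be sketched roughly as follows. This is only an
illustration under assumptions: the sh_* names and constants are
hypothetical, C11 atomics stand in for whatever barriers a real
implementation would use, and it relies on the application having
already ordered destroy after every free(), as the quoted text requires.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SH_POOL_BUFS   8   /* total buffers in the pool (assumed) */
#define SH_MAX_THREADS 4   /* number of per-thread stashes (assumed) */
#define SH_STASH_MAX   2   /* stash capacity per thread (assumed) */

/* Hypothetical pool whose stashes live in shared memory, not TLS. */
typedef struct {
    atomic_int free_global;            /* shared free list (count only) */
    atomic_int stash[SH_MAX_THREADS];  /* written only by the owner thread */
} sh_pool_t;

static void sh_pool_init(sh_pool_t *p)
{
    atomic_init(&p->free_global, SH_POOL_BUFS);
    for (int i = 0; i < SH_MAX_THREADS; i++)
        atomic_init(&p->stash[i], 0);
}

/* Fast path touches only the caller's own stash slot. */
static bool sh_alloc(sh_pool_t *p, int thr)
{
    assert(thr >= 0 && thr < SH_MAX_THREADS);
    int n = atomic_load_explicit(&p->stash[thr], memory_order_relaxed);
    if (n > 0) {
        atomic_store_explicit(&p->stash[thr], n - 1, memory_order_relaxed);
        return true;
    }
    /* Slow path: take one buffer from the shared list if any remain. */
    int g = atomic_load_explicit(&p->free_global, memory_order_relaxed);
    while (g > 0) {
        if (atomic_compare_exchange_weak_explicit(&p->free_global, &g, g - 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;
    }
    return false;
}

/* Release stores make the stash status visible to a later destroy. */
static void sh_free(sh_pool_t *p, int thr)
{
    assert(thr >= 0 && thr < SH_MAX_THREADS);
    int n = atomic_load_explicit(&p->stash[thr], memory_order_relaxed);
    if (n < SH_STASH_MAX)
        atomic_store_explicit(&p->stash[thr], n + 1, memory_order_release);
    else
        atomic_fetch_add_explicit(&p->free_global, 1, memory_order_release);
}

/* Any thread may call this: it reads every stash with acquire loads. */
static bool sh_pool_destroy(sh_pool_t *p)
{
    int total = atomic_load_explicit(&p->free_global, memory_order_acquire);
    for (int i = 0; i < SH_MAX_THREADS; i++)
        total += atomic_load_explicit(&p->stash[i], memory_order_acquire);
    return total == SH_POOL_BUFS;  /* true: all buffers accounted for */
}
```

Here destroy can succeed from a thread that never touched the pool,
because it can count buffers still sitting in other threads' stashes,
and no "finished" call is needed. The cost is that stash state has to
live in shared memory rather than TLS.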
