We could do that in linux-generic, which supports a fairly small number of threads. I'd be concerned about how that would scale to systems that can support many more threads, especially when NUMA considerations come into play. Is it simply unacceptable to have some sort of "finished" API call? That would seem to solve the problem in a clean and scalable manner.
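For illustration only, the shared-memory stash approach from the quoted message below might look roughly like the following. This is a minimal sketch, not the linux-generic code: the demo_* names, MAX_THREADS and the counter-only "stash" are all hypothetical, and it assumes (as the thread states) that the application has already synchronized so that every free() has returned before destroy is called.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_THREADS 4   /* hypothetical thread limit, not an ODP constant */
#define CACHE_LINE  64  /* assumed cache line size */

/* Per-thread stash kept in shared memory instead of TLS.
 * Only a buffer count is modelled; padded to limit false sharing. */
typedef struct {
    _Alignas(CACHE_LINE) atomic_uint cached;
} demo_stash_t;

typedef struct {
    unsigned     total;              /* buffers the pool was created with */
    atomic_uint  global_free;        /* buffers on the shared free list */
    demo_stash_t stash[MAX_THREADS]; /* visible to every thread */
} demo_pool_t;

static void demo_init(demo_pool_t *p, unsigned total)
{
    p->total = total;
    atomic_init(&p->global_free, total);
    for (size_t i = 0; i < MAX_THREADS; i++)
        atomic_init(&p->stash[i].cached, 0);
}

/* Take one buffer: own stash first, then the shared free list. */
static bool demo_alloc(demo_pool_t *p, unsigned tid)
{
    assert(tid < MAX_THREADS);
    unsigned n = atomic_load_explicit(&p->stash[tid].cached,
                                      memory_order_relaxed);
    if (n > 0) {
        /* Only the owner writes its stash, so a plain store suffices. */
        atomic_store_explicit(&p->stash[tid].cached, n - 1,
                              memory_order_relaxed);
        return true;
    }
    unsigned g = atomic_load_explicit(&p->global_free, memory_order_relaxed);
    while (g > 0) {
        if (atomic_compare_exchange_weak_explicit(&p->global_free, &g, g - 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;
    }
    return false; /* pool exhausted */
}

/* Return one buffer to the caller's own stash. */
static void demo_free(demo_pool_t *p, unsigned tid)
{
    assert(tid < MAX_THREADS);
    unsigned n = atomic_load_explicit(&p->stash[tid].cached,
                                      memory_order_relaxed);
    atomic_store_explicit(&p->stash[tid].cached, n + 1,
                          memory_order_release);
}

/* Any thread may call this. Returns 0 if every buffer is accounted for,
 * -1 if some are still outstanding. */
static int demo_destroy(demo_pool_t *p)
{
    unsigned free_count = atomic_load_explicit(&p->global_free,
                                               memory_order_acquire);
    for (size_t i = 0; i < MAX_THREADS; i++)
        free_count += atomic_load_explicit(&p->stash[i].cached,
                                           memory_order_acquire);
    return free_count == p->total ? 0 : -1;
}
```

In this sketch alloc and free touch only the caller's own stash on the fast path, while destroy walks all MAX_THREADS stashes, so its cost grows with the thread limit. That walk is where the scaling and NUMA concern above would show up.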

On Mon, Oct 19, 2015 at 9:15 AM, Savolainen, Petri (Nokia - FI/Espoo) <[email protected]> wrote:

> A SW implementation can place the per thread stash into shared memory
> where the thread calling destroy() can see stashes of all other threads.
> Since application must synchronize the destroy call (to happen only after
> all free() calls have returned), implementation must just ensure that the
> destroy call reads fresh stash status data (== it has correct memory
> read/write barriers in place). Performance should be still good – it’s
> matter of moving the per thread stash from TLS to shared memory (no
> additional synchronization per alloc/free).
>
> -Petri
>
> *From:* EXT Bill Fischofer [mailto:[email protected]]
> *Sent:* Monday, October 19, 2015 2:26 PM
> *To:* Savolainen, Petri (Nokia - FI/Espoo)
> *Cc:* LNG ODP Mailman List
> *Subject:* Re: [lng-odp] Bug 1851 - odp_pool_destroy() failure
>
> This is an important discussion, especially as we look to high-performance
> SW implementations of ODP. Obviously we can stipulate any functional
> behavior we want. The question is how much overhead is acceptable to
> achieve such stipulated functionality? One of the reasons DPDK does not
> support mempool destroys is this issue of distributed cache management. If
> we don't want the application to take any responsibility in this area, then
> the implementation needs to impose additional bookkeeping overhead that
> will likely impact the performance of normal operation.
>
> What's needed is some sort of indication that a thread is not just freeing
> a buffer, but is done with operations on a pool. One way of doing this is
> to add an odp_pool_finished() API that tells the implementation that this
> thread is done with the pool (e.g., asserts that no further alloc() calls
> will be made by this thread on it). My suggestion in the response to the
> bug was that odp_pool_destroy() can serve this purpose, however I'd have no
> problem with adding another API that serves the same notification purpose.
>
> Without such an API, it's not clear how we can achieve the desired
> functionality without a lot of additional overhead or removing any sort of
> safety checks. If the latter is acceptable, we could say that
> odp_pool_destroy() always succeeds and if the application had any
> outstanding buffers or tries to use the pool handle following a destroy()
> call then the result is undefined.
>
> On Mon, Oct 19, 2015 at 5:48 AM, Savolainen, Petri (Nokia - FI/Espoo) <
> [email protected]> wrote:
>
> Hi,
>
> Linux-generic pool implementation has a bug (
> https://bugs.linaro.org/show_bug.cgi?id=1851 ) that prevents dynamic pool
> destroy. From API point of view, any resource (e.g. pool) is created once (
> xxx_create call returns a handle) and destroyed once (pass the handle to
> xxx_destroy). Any thread can create a resource and any thread can destroy
> it. Application threads must synchronize resource usage and destroy call,
> but not implementation specifics like potential usage of per thread stashes
> or flush of those.
>
> For example, this valid usage of the pool API:
>
> Thread 1              Thread 2              Thread 3
> --------------------------------------------------
>
> init_global()
> init_local()          init_local()          init_local()
>
> pool = pool_create()
>
> barrier()             barrier()             barrier()
> buf = alloc(pool)     buf = alloc(pool)     buf = alloc(pool)
> free(buf)             free(buf)             free(buf)
> barrier()             barrier()             barrier()
>
> pool_destroy(pool)
>
> barrier()             barrier()             barrier()
> do_something()        do_something()        do_something()
> term_local()          term_local()          term_local()
> term_global()
>
> So, e.g. pool_destroy must succeed when all buffers have been freed before
> the call - no matter:
> * which thread calls it
> * has the calling thread itself called alloc or free
> * have other threads called already term_local
>
> -Petri
>
> _______________________________________________
> lng-odp mailing list
> [email protected]
> https://lists.linaro.org/mailman/listinfo/lng-odp
