A SW implementation can place the per-thread stash in shared memory, where the thread calling destroy() can see the stashes of all other threads. Since the application must synchronize the destroy call (so that it happens only after all free() calls have returned), the implementation only has to ensure that the destroy call reads fresh stash status data (i.e. it has the correct memory read/write barriers in place). Performance should still be good: it's a matter of moving the per-thread stash from TLS to shared memory (no additional synchronization per alloc/free).

-Petri
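A minimal sketch of the shared-memory stash idea above, in C11. This is not the linux-generic code: the names pool_t, stash_t, pool_init(), pool_alloc(), pool_free(), MAX_THREADS and POOL_BUFS are hypothetical, the pool only counts buffers instead of tracking real ones, and the thread index is passed in explicitly. It assumes, as stated above, that the application has already synchronized so that destroy runs only after every free() has returned.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_THREADS 4  /* hypothetical thread limit */
#define POOL_BUFS   8  /* hypothetical pool size */

/* Per-thread stash kept in shared memory instead of TLS.
 * Only the owning thread writes it; destroy only reads it. */
typedef struct {
    atomic_int count;  /* buffers currently cached in this stash */
} stash_t;

typedef struct {
    atomic_int global_free;      /* buffers on the shared free list */
    stash_t stash[MAX_THREADS];  /* visible to every thread */
} pool_t;

void pool_init(pool_t *p)
{
    atomic_init(&p->global_free, POOL_BUFS);
    for (int i = 0; i < MAX_THREADS; i++)
        atomic_init(&p->stash[i].count, 0);
}

/* Returns true if a buffer was obtained. */
bool pool_alloc(pool_t *p, int thr)
{
    assert(thr >= 0 && thr < MAX_THREADS);
    stash_t *s = &p->stash[thr];

    /* Fast path: own stash, plain load/store, no lock and no atomic RMW. */
    int n = atomic_load_explicit(&s->count, memory_order_relaxed);
    if (n > 0) {
        atomic_store_explicit(&s->count, n - 1, memory_order_relaxed);
        return true;
    }

    /* Slow path: stash is empty, take one buffer from the shared list. */
    int g = atomic_load_explicit(&p->global_free, memory_order_relaxed);
    while (g > 0) {
        if (atomic_compare_exchange_weak_explicit(&p->global_free, &g, g - 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;
    }
    return false;
}

/* Freed buffers go to the caller's own stash. */
void pool_free(pool_t *p, int thr)
{
    assert(thr >= 0 && thr < MAX_THREADS);
    stash_t *s = &p->stash[thr];
    int n = atomic_load_explicit(&s->count, memory_order_relaxed);
    /* Release store so that a later destroy on another thread sees it. */
    atomic_store_explicit(&s->count, n + 1, memory_order_release);
}

/* Any thread may call this. Returns 0 if all buffers are back, else -1. */
int pool_destroy(pool_t *p)
{
    /* Acquire loads: read fresh stash status written by other threads. */
    int free_bufs = atomic_load_explicit(&p->global_free, memory_order_acquire);
    for (int i = 0; i < MAX_THREADS; i++)
        free_bufs += atomic_load_explicit(&p->stash[i].count,
                                          memory_order_acquire);
    return free_bufs == POOL_BUFS ? 0 : -1;
}
```

In this sketch the alloc/free fast path touches only the caller's own stash entry, so the only extra cost compared with TLS is where the counter lives. The destroy call is the one place that walks all stashes, and it can be issued by a thread that never called alloc or free itself.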
From: EXT Bill Fischofer [mailto:[email protected]]
Sent: Monday, October 19, 2015 2:26 PM
To: Savolainen, Petri (Nokia - FI/Espoo)
Cc: LNG ODP Mailman List
Subject: Re: [lng-odp] Bug 1851 - odp_pool_destroy() failure

This is an important discussion, especially as we look to high-performance SW implementations of ODP. Obviously we can stipulate any functional behavior we want. The question is how much overhead is acceptable to achieve such stipulated functionality?

One of the reasons DPDK does not support mempool destroys is this issue of distributed cache management. If we don't want the application to take any responsibility in this area, then the implementation needs to impose additional bookkeeping overhead that will likely impact the performance of normal operation.

What's needed is some sort of indication that a thread is not just freeing a buffer, but is done with operations on a pool. One way of doing this is to add an odp_pool_finished() API that tells the implementation that this thread is done with the pool (e.g., asserts that no further alloc() calls will be made by this thread on it). My suggestion in the response to the bug was that odp_pool_destroy() can serve this purpose; however, I'd have no problem with adding another API that serves the same notification purpose.

Without such an API, it's not clear how we can achieve the desired functionality without either adding a lot of overhead or removing any sort of safety checks. If the latter is acceptable, we could say that odp_pool_destroy() always succeeds, and that if the application has any outstanding buffers or tries to use the pool handle following a destroy() call, then the result is undefined.

On Mon, Oct 19, 2015 at 5:48 AM, Savolainen, Petri (Nokia - FI/Espoo) <[email protected]> wrote:

Hi,

Linux-generic pool implementation has a bug ( https://bugs.linaro.org/show_bug.cgi?id=1851 ) that prevents dynamic pool destroy.

From the API point of view, any resource (e.g. a pool) is created once (the xxx_create call returns a handle) and destroyed once (the handle is passed to xxx_destroy). Any thread can create a resource and any thread can destroy it. Application threads must synchronize resource usage and the destroy call, but not implementation specifics such as the potential use of per-thread stashes or the flushing of those.

For example, this is valid usage of the pool API:

Thread 1              Thread 2              Thread 3
--------------------------------------------------------------
init_global()
init_local()          init_local()          init_local()
pool = pool_create()
barrier()             barrier()             barrier()
buf = alloc(pool)     buf = alloc(pool)     buf = alloc(pool)
free(buf)             free(buf)             free(buf)
barrier()             barrier()             barrier()
pool_destroy(pool)
barrier()             barrier()             barrier()
do_something()        do_something()        do_something()
term_local()          term_local()          term_local()
term_global()

So, e.g. pool_destroy must succeed when all buffers have been freed before the call, no matter:
* which thread calls it
* whether the calling thread has itself called alloc or free
* whether other threads have already called term_local

-Petri

_______________________________________________
lng-odp mailing list
[email protected]
https://lists.linaro.org/mailman/listinfo/lng-odp
