On 19 October 2015 at 16:43, Bill Fischofer <[email protected]> wrote:
> We could do that in linux-generic, which has a fairly small number of
> threads supported. I'd be concerned about how that would scale to systems
> that can support many more threads, especially when NUMA considerations
> come into play. Is it simply unacceptable to have some sort of "finished"
> API call? That would seem to solve the problem in a clean and scalable
> manner.
>

Isn't this conceptually similar to the stop scheduling call, so that I can
drain the prescheduling queue and then stop participating in event
processing? In order to allow for "non-ideal" implementations (because
instant sharing of all resources isn't always very performant), we create
APIs that tell ODP that this thread wishes to withdraw from processing
using shared resources.

>
> On Mon, Oct 19, 2015 at 9:15 AM, Savolainen, Petri (Nokia - FI/Espoo) <
> [email protected]> wrote:
>
>> A SW implementation can place the per thread stash into shared memory
>> where the thread calling destroy() can see stashes of all other threads.
>> Since the application must synchronize the destroy call (to happen only
>> after all free() calls have returned), the implementation must just ensure
>> that the destroy call reads fresh stash status data (== it has correct
>> memory read/write barriers in place). Performance should still be good –
>> it’s a matter of moving the per thread stash from TLS to shared memory (no
>> additional synchronization per alloc/free).
>>
>> -Petri
>>
>> *From:* EXT Bill Fischofer [mailto:[email protected]]
>> *Sent:* Monday, October 19, 2015 2:26 PM
>> *To:* Savolainen, Petri (Nokia - FI/Espoo)
>> *Cc:* LNG ODP Mailman List
>> *Subject:* Re: [lng-odp] Bug 1851 - odp_pool_destroy() failure
>>
>> This is an important discussion, especially as we look to
>> high-performance SW implementations of ODP. Obviously we can stipulate any
>> functional behavior we want. The question is how much overhead is
>> acceptable to achieve such stipulated functionality? One of the reasons
>> DPDK does not support mempool destroys is this issue of distributed cache
>> management. If we don't want the application to take any responsibility in
>> this area, then the implementation needs to impose additional bookkeeping
>> overhead that will likely impact the performance of normal operation.
>>
>> What's needed is some sort of indication that a thread is not just
>> freeing a buffer, but is done with operations on a pool. One way of doing
>> this is to add an odp_pool_finished() API that tells the implementation
>> that this thread is done with the pool (e.g., asserts that no further
>> alloc() calls will be made by this thread on it). My suggestion in the
>> response to the bug was that odp_pool_destroy() can serve this purpose,
>> however I'd have no problem with adding another API that serves the same
>> notification purpose.
>>
>> Without such an API, it's not clear how we can achieve the desired
>> functionality without a lot of additional overhead or removing any sort of
>> safety checks. If the latter is acceptable, we could say that
>> odp_pool_destroy() always succeeds and if the application had any
>> outstanding buffers or tries to use the pool handle following a destroy()
>> call then the result is undefined.
>>
>> On Mon, Oct 19, 2015 at 5:48 AM, Savolainen, Petri (Nokia - FI/Espoo) <
>> [email protected]> wrote:
>>
>> Hi,
>>
>> The linux-generic pool implementation has a bug (
>> https://bugs.linaro.org/show_bug.cgi?id=1851 ) that prevents dynamic
>> pool destroy. From the API point of view, any resource (e.g. pool) is
>> created once (xxx_create call returns a handle) and destroyed once (pass
>> the handle to xxx_destroy). Any thread can create a resource and any
>> thread can destroy it. Application threads must synchronize resource usage
>> and the destroy call, but not implementation specifics like potential
>> usage of per thread stashes or flush of those.
>>
>> For example, this valid usage of the pool API:
>>
>> Thread 1             Thread 2             Thread 3
>> --------------------------------------------------
>>
>> init_global()
>> init_local()         init_local()         init_local()
>>
>> pool = pool_create()
>>
>> barrier()            barrier()            barrier()
>> buf = alloc(pool)    buf = alloc(pool)    buf = alloc(pool)
>> free(buf)            free(buf)            free(buf)
>> barrier()            barrier()            barrier()
>>
>> pool_destroy(pool)
>>
>> barrier()            barrier()            barrier()
>> do_something()       do_something()       do_something()
>> term_local()         term_local()         term_local()
>> term_global()
>>
>> So, e.g. pool_destroy must succeed when all buffers have been freed
>> before the call - no matter:
>> * which thread calls it
>> * whether the calling thread itself has called alloc or free
>> * whether other threads have already called term_local
>>
>> -Petri
>>
>> _______________________________________________
>> lng-odp mailing list
>> [email protected]
>> https://lists.linaro.org/mailman/listinfo/lng-odp
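
For illustration, the shared-memory stash idea Petri describes, together with an optional "finished" call along the lines Bill suggests, could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the linux-generic implementation: every name (sketch_pool_t, sketch_alloc(), sketch_pool_finished() and so on), the fixed sizes, and the spinlock on the global free list are hypothetical, and thread IDs are passed in explicitly rather than obtained from ODP.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_MAX_THREADS 4  /* hypothetical thread limit */
#define SKETCH_NUM_BUFS    8  /* buffers owned by the pool */
#define SKETCH_STASH_MAX   2  /* per-thread stash capacity */

typedef struct {
    int buf[SKETCH_STASH_MAX];
    int count;
} sketch_stash_t;

typedef struct {
    atomic_flag lock;                /* guards the global free list only */
    int free_list[SKETCH_NUM_BUFS];
    int free_count;
    /* Stashes live in the pool itself (shared memory), not in TLS. */
    sketch_stash_t stash[SKETCH_MAX_THREADS];
} sketch_pool_t;

static void sketch_lock(sketch_pool_t *p)
{
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
        ; /* spin */
}

static void sketch_unlock(sketch_pool_t *p)
{
    atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

void sketch_pool_init(sketch_pool_t *p)
{
    atomic_flag_clear(&p->lock);
    p->free_count = SKETCH_NUM_BUFS;
    for (int i = 0; i < SKETCH_NUM_BUFS; i++)
        p->free_list[i] = i;
    for (int t = 0; t < SKETCH_MAX_THREADS; t++)
        p->stash[t].count = 0;
}

/* Returns a buffer index, or -1 if no buffer is available to this thread. */
int sketch_alloc(sketch_pool_t *p, int tid)
{
    assert(tid >= 0 && tid < SKETCH_MAX_THREADS);
    sketch_stash_t *s = &p->stash[tid];
    int buf = -1;

    if (s->count > 0)
        return s->buf[--s->count];  /* fast path: own stash, no locking */

    sketch_lock(p);
    if (p->free_count > 0)
        buf = p->free_list[--p->free_count];
    sketch_unlock(p);
    return buf;
}

void sketch_free(sketch_pool_t *p, int tid, int buf)
{
    assert(tid >= 0 && tid < SKETCH_MAX_THREADS);
    sketch_stash_t *s = &p->stash[tid];

    if (s->count < SKETCH_STASH_MAX) {
        s->buf[s->count++] = buf;   /* fast path: own stash, no locking */
        return;
    }
    sketch_lock(p);
    p->free_list[p->free_count++] = buf;
    sketch_unlock(p);
}

/* Hypothetical "finished" call: this thread flushes its own stash. */
void sketch_pool_finished(sketch_pool_t *p, int tid)
{
    assert(tid >= 0 && tid < SKETCH_MAX_THREADS);
    sketch_stash_t *s = &p->stash[tid];

    sketch_lock(p);
    while (s->count > 0)
        p->free_list[p->free_count++] = s->buf[--s->count];
    sketch_unlock(p);
}

/* Callable from any thread. Returns 0 on success, -1 if buffers are
 * still outstanding. Assumes the application has already synchronized
 * so that all free() calls have returned; the fence stands in for
 * whatever barrier a real platform would need to read fresh stash data. */
int sketch_pool_destroy(sketch_pool_t *p)
{
    int total;

    atomic_thread_fence(memory_order_seq_cst);
    sketch_lock(p);
    total = p->free_count;
    sketch_unlock(p);
    for (int t = 0; t < SKETCH_MAX_THREADS; t++)
        total += p->stash[t].count; /* visible because stashes are shared */

    return total == SKETCH_NUM_BUFS ? 0 : -1;
}
```

In this sketch the destroy call succeeds once every buffer is back, whether it sits in the global list or in some other thread's stash, so the "finished" call only returns stashed buffers to the global list early and is not needed for destroy to work.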
