On Mon, Oct 19, 2015 at 10:07 AM, Ola Liljedahl <[email protected]> wrote:
> On 19 October 2015 at 16:43, Bill Fischofer <[email protected]>
> wrote:
>
>> We could do that in linux-generic, which has a fairly small number of
>> threads supported. I'd be concerned about how that would scale to systems
>> that can support many more threads, especially when NUMA considerations
>> come into play. Is it simply unacceptable to have some sort of "finished"
>> API call? That would seem to solve the problem in a clean and scalable
>> manner.
>>
> Isn't this conceptually similar to the stop scheduling call so that I can
> drain the prescheduling queue and then stop participating in event
> processing? In order to allow for "non-ideal" implementations (because
> instant sharing of all resources isn't always very performant), we create
> APIs that tell ODP that this thread wishes to withdraw from processing
> using shared resources.
>

I think that's a useful analogy. We've recently added stop/start APIs to
pktio for similar reasons, and of course we have odp_schedule_pause() that
serves the same advisory function. We don't need a "start" API for pools
(though if you wanted one for symmetry I don't see any harm there) but you
really do want a "stop" API.

>
>>
>> On Mon, Oct 19, 2015 at 9:15 AM, Savolainen, Petri (Nokia - FI/Espoo) <
>> [email protected]> wrote:
>>
>>> A SW implementation can place the per thread stash into shared memory
>>> where the thread calling destroy() can see stashes of all other threads.
>>> Since application must synchronize the destroy call (to happen only after
>>> all free() calls have returned), implementation must just ensure that the
>>> destroy call reads fresh stash status data (== it has correct memory
>>> read/write barriers in place). Performance should be still good – it’s
>>> a matter of moving the per thread stash from TLS to shared memory (no
>>> additional synchronization per alloc/free).
>>>
>>> -Petri
>>>
>>>
>>> *From:* EXT Bill Fischofer [mailto:[email protected]]
>>> *Sent:* Monday, October 19, 2015 2:26 PM
>>> *To:* Savolainen, Petri (Nokia - FI/Espoo)
>>> *Cc:* LNG ODP Mailman List
>>> *Subject:* Re: [lng-odp] Bug 1851 - odp_pool_destroy() failure
>>>
>>>
>>> This is an important discussion, especially as we look to
>>> high-performance SW implementations of ODP. Obviously we can stipulate any
>>> functional behavior we want. The question is how much overhead is
>>> acceptable to achieve such stipulated functionality? One of the reasons
>>> DPDK does not support mempool destroys is this issue of distributed cache
>>> management. If we don't want the application to take any responsibility in
>>> this area, then the implementation needs to impose additional bookkeeping
>>> overhead that will likely impact the performance of normal operation.
>>>
>>> What's needed is some sort of indication that a thread is not just
>>> freeing a buffer, but is done with operations on a pool. One way of doing
>>> this is to add an odp_pool_finished() API that tells the implementation
>>> that this thread is done with the pool (e.g., asserts that no further
>>> alloc() calls will be made by this thread on it). My suggestion in the
>>> response to the bug was that odp_pool_destroy() can serve this purpose,
>>> however I'd have no problem with adding another API that serves the same
>>> notification purpose.
>>>
>>> Without such an API, it's not clear how we can achieve the desired
>>> functionality without a lot of additional overhead or removing any sort of
>>> safety checks. If the latter is acceptable, we could say that
>>> odp_pool_destroy() always succeeds and if the application had any
>>> outstanding buffers or tries to use the pool handle following a destroy()
>>> call then the result is undefined.
>>>
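
To make the "finished" idea quoted above a bit more concrete, here is a
minimal single-threaded model in C of the bookkeeping it implies. It is only
a sketch under assumptions: model_finished() stands in for the proposed
odp_pool_finished() call, which is a suggestion in this thread and not an
existing ODP API, and every name, size, and the count-only representation of
buffers is made up for illustration. Locking is left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_THREADS 4  /* hypothetical thread limit */
#define NUM_BUFS    8  /* hypothetical pool size */

/* Count-only model: tracks where buffers are, not the buffers themselves. */
typedef struct {
    int  global_free;           /* buffers on the shared free list */
    int  stash[MAX_THREADS];    /* buffers cached by each thread */
    bool finished[MAX_THREADS]; /* thread has declared it is done */
} pool_model_t;

void model_init(pool_model_t *p)
{
    *p = (pool_model_t){ .global_free = NUM_BUFS };
}

/* Returns 0 on success, -1 if nothing is available or the thread is done. */
int model_alloc(pool_model_t *p, int thr)
{
    assert(thr >= 0 && thr < MAX_THREADS);
    if (p->finished[thr])
        return -1;                      /* no alloc after "finished" */
    if (p->stash[thr] > 0) {
        p->stash[thr]--;                /* served from the local stash */
        return 0;
    }
    if (p->global_free > 0) {
        p->global_free--;               /* served from the shared list */
        return 0;
    }
    return -1;
}

void model_free(pool_model_t *p, int thr)
{
    assert(thr >= 0 && thr < MAX_THREADS);
    if (p->finished[thr])
        p->global_free++;               /* bypass the stash once done */
    else
        p->stash[thr]++;                /* normal case: cache locally */
}

/* Stand-in for the proposed "finished" call: flush and mark the thread. */
void model_finished(pool_model_t *p, int thr)
{
    assert(thr >= 0 && thr < MAX_THREADS);
    p->global_free += p->stash[thr];
    p->stash[thr] = 0;
    p->finished[thr] = true;
}

/* Destroy looks only at the shared list; returns 0 on success, -1 if not. */
int model_destroy(const pool_model_t *p)
{
    return p->global_free == NUM_BUFS ? 0 : -1;
}
```

In this model a destroy that checks only the shared list fails while freed
buffers still sit in some thread's stash, and succeeds once each thread that
used the pool has called model_finished().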
>>>
>>> On Mon, Oct 19, 2015 at 5:48 AM, Savolainen, Petri (Nokia - FI/Espoo) <
>>> [email protected]> wrote:
>>>
>>> Hi,
>>>
>>> Linux-generic pool implementation has a bug (
>>> https://bugs.linaro.org/show_bug.cgi?id=1851 ) that prevents dynamic
>>> pool destroy. From the API point of view, any resource (e.g. pool) is
>>> created once (xxx_create call returns a handle) and destroyed once (pass
>>> the handle to xxx_destroy). Any thread can create a resource and any
>>> thread can destroy it. Application threads must synchronize resource
>>> usage and the destroy call, but not implementation specifics like
>>> potential usage of per thread stashes or flush of those.
>>>
>>> For example, this is valid usage of the pool API:
>>>
>>> Thread 1              Thread 2              Thread 3
>>> --------------------------------------------------
>>>
>>> init_global()
>>> init_local()          init_local()          init_local()
>>>
>>> pool = pool_create()
>>>
>>> barrier()             barrier()             barrier()
>>> buf = alloc(pool)     buf = alloc(pool)     buf = alloc(pool)
>>> free(buf)             free(buf)             free(buf)
>>> barrier()             barrier()             barrier()
>>>
>>> pool_destroy(pool)
>>>
>>> barrier()             barrier()             barrier()
>>> do_something()        do_something()        do_something()
>>> term_local()          term_local()          term_local()
>>> term_global()
>>>
>>>
>>> So, e.g. pool_destroy must succeed when all buffers have been freed
>>> before the call - no matter:
>>> * which thread calls it
>>> * whether the calling thread itself has called alloc or free
>>> * whether other threads have already called term_local
>>>
>>>
>>> -Petri
>>>
>>>
>>> _______________________________________________
>>> lng-odp mailing list
>>> [email protected]
>>> https://lists.linaro.org/mailman/listinfo/lng-odp
>>>
>>
>
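
For comparison, the alternative quoted earlier (per thread stashes kept in
shared memory rather than TLS) could look roughly like the C sketch below.
This is not the linux-generic code: pool_t, the function names, the sizes,
and the integer thread index are all hypothetical, and the spinlock on the
global free list is just a placeholder. The stash fast path takes no lock,
and destroy walks every thread's stash, relying on the application having
synchronized so that all free() calls returned before destroy is called.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_THREADS 4   /* hypothetical thread limit */
#define STASH_SIZE  4   /* buffers one thread may cache */
#define NUM_BUFS    16  /* buffers in the pool */

typedef struct {
    int buf[STASH_SIZE];
    int count;
} stash_t;

typedef struct {
    atomic_flag lock;            /* guards the global free list only */
    int free_list[NUM_BUFS];
    int free_count;
    stash_t stash[MAX_THREADS];  /* lives in shared memory, not TLS */
} pool_t;

static void pool_lock(pool_t *p)
{
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
        ; /* spin */
}

static void pool_unlock(pool_t *p)
{
    atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

void pool_init(pool_t *p)
{
    atomic_flag_clear(&p->lock);
    for (int i = 0; i < NUM_BUFS; i++)
        p->free_list[i] = i;
    p->free_count = NUM_BUFS;
    for (int t = 0; t < MAX_THREADS; t++)
        p->stash[t].count = 0;
}

/* Returns a buffer index, or -1 if nothing is available to this thread. */
int pool_alloc(pool_t *p, int thr)
{
    assert(thr >= 0 && thr < MAX_THREADS);
    stash_t *s = &p->stash[thr];
    int buf = -1;

    if (s->count > 0)
        return s->buf[--s->count];   /* fast path: own stash, no lock */

    pool_lock(p);
    if (p->free_count > 0)
        buf = p->free_list[--p->free_count];
    pool_unlock(p);
    return buf;
}

void pool_free(pool_t *p, int thr, int buf)
{
    assert(thr >= 0 && thr < MAX_THREADS);
    stash_t *s = &p->stash[thr];

    if (s->count < STASH_SIZE) {
        s->buf[s->count++] = buf;    /* fast path: own stash, no lock */
        return;
    }
    pool_lock(p);
    p->free_list[p->free_count++] = buf;
    pool_unlock(p);
}

/* Any thread may call this. Returns 0 if every buffer is accounted for. */
int pool_destroy(pool_t *p)
{
    /* Placeholder barrier; ordering against earlier free() calls is
     * assumed to come from the application's own synchronization. */
    atomic_thread_fence(memory_order_seq_cst);

    int total = p->free_count;
    for (int t = 0; t < MAX_THREADS; t++)
        total += p->stash[t].count;  /* other threads' stashes are visible */
    return total == NUM_BUFS ? 0 : -1;
}
```

The cost is that the thread limit sizes the stash array up front, which is
where the scaling and NUMA concern raised above comes in.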
