On Tue, Sep 29, 2015 at 09:01:33 +0200, Jakub Jelinek wrote:
> On Mon, Sep 28, 2015 at 05:53:42PM +0300, Ilya Verbin wrote:
> > Currently the COI emulator is single-threaded, i.e. it is able to run only
> > one target function at a time, e.g. the following testcase:
> >
> >   #pragma omp parallel sections num_threads(2)
> >     {
> >       #pragma omp section
> >       #pragma omp target
> >       while (1)
> >         putchar ('.');
> >
> >       #pragma omp section
> >       #pragma omp target
> >       while (1)
> >         putchar ('o');
> >     }
> >
> > prints only dots using emul, while using real libcoi it prints:
> >   ...o.ooooo.o.o...o...o....oooo.oo.o.....o.ooo.oooooo...o.ooooooooo.o...o.ooooooo
> > Of course, it's not possible to test new OpenMP 4.1's async features using
> > such an emulator.
> >
> > The patch below makes it asynchronous: it creates an auxiliary thread for
> > each COIPipeline in host and in target processes.  In general, a new
> > COIPipeline is created by liboffloadmic for each host thread with offload,
> > i.e. the example above has:
> > 4 threads in the host process (2 OpenMP threads + 2 auxiliary threads) and
> > 3 threads in the target process (1 main thread + 2 auxiliary threads).
> > An auxiliary host thread runs a target function in the new thread in target
> > process and waits for its completion.  When the function is finished, the
> > host thread signals an event and can run a callback, if it is registered.
> > liboffloadmic waits for signalled events by calling COIEventWait.
> > This is identical to how real libcoi works.
> >
> > make check-target-libgomp and some internal tests did not show any
> > regression.  TSan report is clean.  Is it OK for trunk?
>
> For now ok.  Though, I'd say I'd prefer if there were no auxiliary threads
> on the host side, just whatever thread is asked to send something to/from
> the device, wait for something and/or poll for something just polling the
> pipes.  Are there auxiliary host threads also for the case when using
> the real COI, offloading to hw?
Yes.

  -- Ilya
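
For readers following the thread, the threading model described in the quoted patch description could be sketched roughly as below. This is only an illustration in plain C with pthreads, not code from the patch or from libcoi: every name in it (sketch_task, sketch_run_async, sketch_event_wait, sketch_finish, sketch_demo) is hypothetical. It is also simplified: it starts one helper thread per submitted function instead of one per COIPipeline, and the "target function" runs in the same process. The helper thread runs the function, signals an event, and then runs an optional callback, while the submitting thread blocks in a wait call, loosely in the spirit of COIEventWait.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical event: set once the submitted function has finished. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             signalled;
} sketch_event;

/* Hypothetical task: function to run, optional callback, its event. */
typedef struct {
    void (*fn)(void *);
    void  *fn_arg;
    void (*callback)(void *);   /* may be NULL */
    void  *callback_arg;
    sketch_event event;
    pthread_t    thread;
} sketch_task;

/* Body of the auxiliary thread: run, signal the event, then call back. */
static void *sketch_aux_thread(void *p)
{
    sketch_task *t = p;
    t->fn(t->fn_arg);
    pthread_mutex_lock(&t->event.lock);
    t->event.signalled = 1;
    pthread_cond_broadcast(&t->event.cond);
    pthread_mutex_unlock(&t->event.lock);
    if (t->callback)
        t->callback(t->callback_arg);
    return NULL;
}

/* Start the function asynchronously; returns 0 on success. */
int sketch_run_async(sketch_task *t)
{
    pthread_mutex_init(&t->event.lock, NULL);
    pthread_cond_init(&t->event.cond, NULL);
    t->event.signalled = 0;
    return pthread_create(&t->thread, NULL, sketch_aux_thread, t);
}

/* Block until the event of this task has been signalled. */
void sketch_event_wait(sketch_task *t)
{
    pthread_mutex_lock(&t->event.lock);
    while (!t->event.signalled)
        pthread_cond_wait(&t->event.cond, &t->event.lock);
    pthread_mutex_unlock(&t->event.lock);
}

/* Join the auxiliary thread (so the callback has run) and clean up. */
void sketch_finish(sketch_task *t)
{
    pthread_join(t->thread, NULL);
    pthread_cond_destroy(&t->event.cond);
    pthread_mutex_destroy(&t->event.lock);
}

static void set_one(void *arg)   { *(int *)arg = 1; }
static void set_two(void *arg)   { *(int *)arg = 2; }
static void mark_done(void *arg) { *(int *)arg = 1; }

/* Two functions in flight at once, as in the two-section testcase. */
int sketch_demo(void)
{
    int a = 0, b = 0, b_done = 0;
    sketch_task ta = { set_one, &a, NULL, NULL };
    sketch_task tb = { set_two, &b, mark_done, &b_done };
    int rc_a = sketch_run_async(&ta);
    int rc_b = sketch_run_async(&tb);
    assert(rc_a == 0 && rc_b == 0);
    sketch_event_wait(&ta);
    sketch_event_wait(&tb);
    sketch_finish(&ta);
    sketch_finish(&tb);
    return a + b + b_done;
}
```

The polling alternative mentioned in the review would drop the helper thread on the host side and have the waiting thread read the pipes itself; that variant is not shown here.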