> From: Jed Brown [mailto:j...@jedbrown.org]
> Sent: Monday, May 6, 2019 16:35
> Nice paper, thanks! Did you investigate latency impact from the IPC counting
> semaphore? Is your test code available?

Not in much depth. Basically, I only looked at whether its positive effect is enough to outweigh the impact of oversubscription. It is, but not in all cases. It is also hard to separate one effect from another. For example, some parallel regions request all the threads but use only a few, which results in undersubscription when parallel regions are serialized in OpenMP.

IPC for coordinating TBB processes solves the resource exhaustion problem and gives additional performance in some cases. However, Linux is usually good enough at scheduling multiple multithreaded processes. I guess that is because it sees how threads are grouped, which is not the case for multiple concurrent parallel regions with OpenMP threads in the same process.

All the results from the blog, paper, talks, and demo are available at https://github.com/IntelPython/composability_bench

Regards
// Anton