> 
> Yes, getting different threading libraries to agree can be tricky.  Does your 
> application overlap heavy compute with graphics rendering?  If not, the 
> oversubscription point might be moot.  One bit of advice we give to TBB 
> library users is to initialize the TBB library before creating an OpenGL/SWR 
> context.  This allows TBB to size its thread pool to the entire machine, and 
> then SWR will come in and create all its threads.  Done the other way round, 
> SWR binds its threads to cores, which TBB interprets as unavailable 
> resources, resulting in a thread pool size of one.
> 
> If your concern is multiple SWR contexts running simultaneously and 
> oversubscribing, it’s true that SWR thread pool creation is per-context, 
> and as Bruce says the only way to prevent that currently is to set the 
> environment variable that limits the number of worker threads.  This number 
> should be greater than 1 for good performance, though.
> 
> -Tim

Well, we have libraries that use OpenMP for their internal multi-threaded 
processing, such as CImg (https://github.com/dtschump/CImg).

Our application implements a data-flow computation graph in which nodes can 
potentially compute concurrently. Each node can also use threading on its 
own, either manually by splitting the image into tiles processed in parallel, 
or via OpenMP. The graph may also contain nodes that do GPU processing 
(mainly OpenGL). This is challenging because we have to schedule GPU/CPU 
threads and memory, and we have the additional constraint of always providing 
a CPU fallback for rendering on a server that has no GPU.
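
To give an idea of the scheduling involved, here is a minimal sketch of 
concurrent node execution with TBB's flow graph. The node names and trivial 
bodies are hypothetical; in the real application each body runs an image 
operator (CPU tiles, OpenMP, or an OpenGL/OSMesa pass).

// Minimal sketch: two independent operators feeding a downstream merge.
#include <tbb/flow_graph.h>
#include <cstdio>

int main() {
    tbb::flow::graph g;

    // Two independent operators that may run concurrently on the TBB pool.
    tbb::flow::continue_node<tbb::flow::continue_msg> blurNode(
        g, [](const tbb::flow::continue_msg &) { std::puts("blur"); });
    tbb::flow::continue_node<tbb::flow::continue_msg> gradeNode(
        g, [](const tbb::flow::continue_msg &) { std::puts("grade"); });

    // The merge runs once both of its predecessors have completed.
    tbb::flow::continue_node<tbb::flow::continue_msg> mergeNode(
        g, [](const tbb::flow::continue_msg &) { std::puts("merge"); });

    tbb::flow::make_edge(blurNode, mergeNode);
    tbb::flow::make_edge(gradeNode, mergeNode);

    blurNode.try_put(tbb::flow::continue_msg());
    gradeNode.try_put(tbb::flow::continue_msg());
    g.wait_for_all();
    return 0;
}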

OSMesa is a really good fit: it lets us use (almost) the same OpenGL 
implementation for our nodes on both paths.
I guess that to achieve the best performance we should port as many of our 
node operators as possible to OpenGL, so that we don’t need to rely on other 
threading technologies in the middle.
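
For reference, each CPU node renders offscreen roughly like this (a sketch of 
the OSMesa calls we rely on; the buffer size and the clear are only 
placeholders):

// Sketch of offscreen rendering through OSMesa; each rendering thread in
// our application owns one such context. The 512x512 size is arbitrary.
#include <GL/osmesa.h>
#include <GL/gl.h>
#include <cstdio>
#include <vector>

int main() {
    const int width = 512, height = 512;
    std::vector<unsigned char> buffer(width * height * 4);

    // 32-bit RGBA color buffer, 24-bit depth, 8-bit stencil, no accum.
    OSMesaContext ctx = OSMesaCreateContextExt(OSMESA_RGBA, 24, 8, 0, nullptr);
    if (!ctx || !OSMesaMakeCurrent(ctx, buffer.data(), GL_UNSIGNED_BYTE,
                                   width, height)) {
        std::fprintf(stderr, "OSMesa context creation failed\n");
        return 1;
    }

    // Regular OpenGL calls now render into 'buffer' on the CPU.
    glClearColor(0.f, 0.f, 0.f, 1.f);
    glClear(GL_COLOR_BUFFER_BIT);
    glFinish();

    OSMesaDestroyContext(ctx);
    return 0;
}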

TBB + SWR + OpenMP will produce over-threading no matter what.
We had the same kind of issue in the past with, for example, the OpenEXR 
library, which internally uses its own thread pool; however, there is a way 
to run the library’s work on the client threads as a workaround, which is why 
I was suggesting the same thing for OpenSWR.
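
For OpenEXR the workaround looks roughly like this (a sketch; the helper 
function is hypothetical, but disabling the global pool makes the work run on 
the calling thread, so our own scheduler keeps control of the cores):

// Sketch: with a global thread count of 0, OpenEXR creates no internal
// worker threads and reads/writes execute on the calling thread.
#include <ImfThreading.h>
#include <ImfRgbaFile.h>
#include <ImfArray.h>

void readExrOnCallerThread(const char *fileName)
{
    Imf::setGlobalThreadCount(0);   // no internal OpenEXR worker pool

    Imf::RgbaInputFile file(fileName);
    Imath::Box2i dw = file.dataWindow();
    int width  = dw.max.x - dw.min.x + 1;
    int height = dw.max.y - dw.min.y + 1;

    Imf::Array2D<Imf::Rgba> pixels(height, width);
    file.setFrameBuffer(&pixels[0][0] - dw.min.x - dw.min.y * width, 1, width);
    file.readPixels(dw.min.y, dw.max.y);  // runs on this (caller) thread
}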

We don’t have many options technology-wise:
- OSMesa
- Halide (also uses its own thread pool), see 
https://github.com/halide/Halide/issues/129
- AMD HIP or NVIDIA CUDA (the latter only runs on NVIDIA hardware, which we 
don’t want to be tied to)

In the meantime, I will try playing around with the number of worker threads 
(KNOB_MAX_WORKER_THREADS) while waiting for a better solution.
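
Concretely, I’ll start with something like the sketch below before any 
OSMesa/SWR context is created; the helper name and the value 2 are only 
placeholders to tune from.

// Sketch: cap the per-context SWR worker pool before the first OSMesa/SWR
// context spins up its thread pool. Per Bruce's reply, setting
// KNOB_MAX_WORKER_THREADS also prevents SWR from binding workers to cores.
#include <stdlib.h>      // setenv (POSIX)
#include <GL/osmesa.h>

OSMesaContext createNodeContext()
{
    // The value "2" is only a starting point to experiment with (> 1).
    setenv("KNOB_MAX_WORKER_THREADS", "2", /*overwrite=*/1);

    return OSMesaCreateContextExt(OSMESA_RGBA, 24, 8, 0, nullptr);
}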

Thank you 

Alexandre 


> 
>> An alternative solution would be to have a callback mechanism in OpenSWR to 
>> launch its tasks on the application side.
>> 
>> Cheers
>> 
>> Alex
>> 
>> 
>>> On 16 May 2018, at 14:34, Cherniak, Bruce <bruce.chern...@intel.com> wrote:
>>> 
>>>> 
>>>> On May 14, 2018, at 8:59 AM, Alexandre 
>>>> <alexandre.gauthier-foic...@inria.fr> wrote:
>>>> 
>>>> Hello,
>>>> 
>>>> Apologies if this message is not appropriate for this mailing list.
>>>> 
>>>> The following is a question for the developers of the gallium swr driver.
>>>> 
>>>> I am the main developer of a motion graphics application. 
>>>> Our application internally has a dependency graph where each node may run 
>>>> concurrently.
>>>> We use OpenGL extensively in the implementation of the nodes (for example 
>>>> with Shadertoy).
>>>> 
>>>> Our application has 2 main requirements: 
>>>> - A GPU backend, mainly for user interaction and fast results
>>>> - A CPU backend for batch rendering
>>>> 
>>>> Internally we use OSMesa for the CPU backend so that our code is mostly 
>>>> identical for both the GPU and CPU paths.
>>>> However, when it comes to the CPU, our application is heavily 
>>>> multi-threaded: each processing node can potentially run in parallel with 
>>>> the others in the dependency graph.
>>>> We use Intel TBB to schedule the CPU threads.
>>>> 
>>>> For each actual hardware thread (not task) we allocate a new OSMesa 
>>>> context so that we can freely multi-thread operator rendering. It works 
>>>> fine with llvmpipe and also SWR so far (with a patch to fix some static 
>>>> variables inside state_trackers/osmesa.c).
>>>> 
>>>> However, with SWR using its own thread pool, I’m afraid of over-threading 
>>>> introducing a bottleneck in thread scheduling,
>>>> e.g. on a 32-core processor we may already have, let’s say, 24 threads 
>>>> busy on a TBB task, one per core, each with one OSMesa context. 
>>>> I looked at the code, and all those concurrent OSMesa contexts will create 
>>>> a SWR context, and each will try to initialise its own thread pool in 
>>>> CreateThreadPool in swr/rasterizer/core/api.cpp.
>>>> 
>>>> Is there a way to have a single “static” thread-pool shared across all 
>>>> contexts?
>>> 
>>> There is not currently a way to create a single thread-pool shared across 
>>> all contexts.  Each context creates unique worker threads.
>>> 
>>> However, OpenSWR provides an environment variable, KNOB_MAX_WORKER_THREADS, 
>>> that overrides the default thread allocation.
>>> Setting this will limit the number of threads created by an OpenSWR context 
>>> *and* prevent the threads from being bound to physical cores.
>>> 
>>> Please give this a try.  By adjusting it, you may find the optimal value 
>>> for your situation.
>>> 
>>> Cheers,
>>> Bruce
>>> 
>>>> Thank you
>>>> 
>>>> Alexandre
>>>> 
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
