Hi Wolfgang,

Thanks for your post. I hope your trip was good.

Your summary of TBB matches what I have seen in the various pieces
of documentation. Yes, I placed the snippet of code in the main
function, but this was overly cautious: it simply has to run before
the first time a task is used in the code. Currently that seems to be
in managing the DoFs (the ConstraintMatrix class), although this may
be different for different parts of the library and may change as the
library grows. It doesn't sound like changing the number of threads
later is an option, but initializing the threads could wait. Perhaps a
deal.II method that hides a call to initialize the right number of TBB
threads, and uses an Assert statement to throw an exception if the TBB
threads are already initialized, could work?
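
To make the idea concrete, here is a rough sketch of the "initialize
once, complain on a second attempt" logic. The class ThreadPoolGuard
and its member names are made up for illustration; they are not part
of deal.II or TBB. I have left the actual TBB call as a comment and
used a plain C++ exception in place of deal.II's Assert, so this only
shows the guard itself:

```cpp
#include <atomic>
#include <stdexcept>

// Hypothetical helper (not a deal.II or TBB class): remembers whether the
// number of worker threads has already been fixed for this process.
class ThreadPoolGuard
{
public:
  // Fix the thread count exactly once. A second call throws instead of
  // silently being ignored.
  static void initialize(const unsigned int n_threads)
  {
    if (n_threads == 0)
      throw std::invalid_argument("n_threads must be positive");

    bool expected = false;
    if (!initialized.compare_exchange_strong(expected, true))
      throw std::logic_error("thread pool is already initialized");

    thread_count = n_threads;
    // Assumption: the real version would create the TBB scheduler here,
    // for example by constructing a tbb::task_scheduler_init object with
    // n_threads and keeping it alive for the rest of the program.
  }

  static bool is_initialized() { return initialized.load(); }

  static unsigned int n_threads() { return thread_count; }

private:
  static std::atomic<bool> initialized;
  static unsigned int      thread_count;
};

std::atomic<bool> ThreadPoolGuard::initialized{false};
unsigned int      ThreadPoolGuard::thread_count = 0;
```

A user program would call ThreadPoolGuard::initialize(4) once, at any
point before the first task is created. Note that this guard cannot
detect a scheduler that TBB has already set up on its own, so it only
helps if every explicit initialization goes through it.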

One odd thing is that on SMP machines you tend to request resources
in terms of CPUs but use resources in terms of threads. If you request
4 CPUs and spawn only 4, possibly 5, threads, then the 4 CPUs are
likely to be underutilized, especially if some of the threads are
writing to disk. If you spawn more threads, presumably they will be
run on more CPUs at some stage. I suppose this is where spawning a
regular thread to handle heavy disk usage, leaving the TBB threads to
handle computations, comes in.
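
As a minimal sketch of that split, assuming plain std::thread for the
writer and a trivial sum standing in for the real computation: the
function names write_in_background and compute_sum are made up for
this example and are not deal.II functions.

```cpp
#include <numeric>
#include <ostream>
#include <sstream>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: write a snapshot of the data to `out` on a separate,
// regular thread. The snapshot is taken by value so the writer owns its own
// copy and the caller can keep modifying the original. `out` must stay alive
// until the returned thread has been joined.
std::thread write_in_background(std::vector<double> snapshot, std::ostream &out)
{
  return std::thread([data = std::move(snapshot), &out]() {
    for (const double value : data)
      out << value << '\n';
  });
}

// Stand-in for the computational work that the task threads would do.
double compute_sum(const std::vector<double> &values)
{
  return std::accumulate(values.begin(), values.end(), 0.0);
}
```

The caller starts the writer, carries on with compute_sum (or real
task-based work), and joins the writer thread before touching the
output stream again.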

Do you have any idea why the SparseDirectUMFPACK solver might somehow
run outside of the scope of the main function? (N.B. scope is possibly
another reason to initialize the TBB threads in main.) I believe that
it is a serial solver, but somewhere in the solver a stray call to TBB
task_scheduler_init seems to be occurring.

Cheers,
Michael

On Tue, Jun 1, 2010 at 1:21 AM, Wolfgang Bangerth
<[email protected]> wrote:
>
> Michael & others,
> thanks for working on these things -- indeed starting 240 threads does not
> seem like a good idea :-)
>
> I admit knowing little about the internals of TBB. My understanding is the
> following:
> - the first time TBB facilities are used, the TBB sees if a task scheduler
>  has already been set up and if not does so automatically. By default,
>  a task scheduler is set up that uses n_cpus+1 threads. This is
>  overridden by the code you post. This number is unrelated to
>  the setting of MultithreadInfo::n_default_threads
> - after that, the TBB sets up a thread pool -- i.e. it starts as many
>  threads as necessary but immediately blocks them, i.e. they are
>  available for work, but don't do work yet
> - it then schedules work to available threads as necessary. Not all threads
>  do work at all time -- in fact, most do nothing most of the time.
>
> Here are a couple questions to find longer-term solutions:
> - I assume you put the code snippet you posted into main(), right? If so
>  we could document this in the appropriate places as the solution to use.
> - Did you ever figure out a way to change the number of threads a scheduler
>  uses *later on*? If there was one, we could have a function that sets
>  MultithreadInfo::n_default_threads *and* changes the scheduler.
>
> By and large, I think the TBB does the right thing most of the time, but
> there should be ways to work around these issues on large NUMA systems...
>
> Best
>  W.
>
> -------------------------------------------------------------------------
> Wolfgang Bangerth                email:            [email protected]
>                                 www: http://www.math.tamu.edu/~bangerth/
>
>
_______________________________________________
dealii mailing list http://poisson.dealii.org/mailman/listinfo/dealii
