On Wed, Nov 10, 2010 at 08:06:00PM +0000, Garth N. Wells wrote:
>
>
> On 10/11/10 15:47, Anders Logg wrote:
> >On Wed, Nov 10, 2010 at 02:47:30PM +0000, Garth N. Wells wrote:
> >>Nice to see multi-thread assembly being added. We should look at
> >>adding support for the multi-threaded version of SuperLU. What other
> >>multi-thread solvers are out there?
> >
> >Yes, that would be good, but I don't know which solvers are available.
> >
> >>I haven't looked at the code in great detail, but are element
> >>tensors being added to the global tensor in a thread-safe fashion?
> >>Both PETSc and Trilinos are not thread-safe.
> >
> >Yes, they should. That's the main point. It's a very simple algorithm
> >which just partitions the matrix row by row and makes each process
> >responsible for a chunk of rows. During assembly, all processes
> >iterate over the entire mesh and on each cell do one of three things:
> >
> >  1. all_in_range:  tabulate_tensor as usual and add
> >  2. none_in_range: skip tabulate_tensor (continue)
> >  3. some_in_range: tabulate_tensor and insert only rows in range
> >
> >Didem Unat (PhD student at UCLA/Simula) tried this in a simple
> >prototype code and got very good speedups (up to a factor 7 on an
> >eight-core machine) so it's just a matter of doing the same thing as
> >part of DOLFIN (which is a bit trickier since some of the data access
> >is hidden). The current implementation in DOLFIN seems to work and
> >give some small speedup but I need to do some more testing.
> >
> >>Rather than having two assembly classes, would it be worth using
> >>OpenMP instead? I experimented with OpenMP some time ago, but never
> >>added it since at the time it required a very recent version of gcc.
> >>This shouldn't be a problem now.
> >
> >I don't think this would work with OpenMP since we need to control how
> >the rows are inserted.
> >
>
> The thread number (id) can be accessed in OpenMP. Is there another
> barrier to using OpenMP?
>
> Garth

I haven't used OpenMP much, so I can't say for sure this can't be done
with OpenMP, but I don't see the point in using it. Using boost
threads is very simple and I have complete control over what each
thread does, which data is local and which data is shared. And the
code is just standard C++, no ugly pragmas that obfuscate the code.

--
Anders
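
For readers following the thread, the row-partitioned assembly described
in the quoted text can be sketched roughly as below. This is only an
illustration under stated assumptions, not the DOLFIN implementation: it
uses std::thread as a stand-in for boost threads, a dense matrix instead
of a PETSc/Trilinos tensor, and a hypothetical 1D mesh where cell c
touches dofs c and c+1. The names row_range, assemble_owned_rows and
assemble are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the global tensor (not a DOLFIN type).
using DenseMatrix = std::vector<std::vector<double>>;

// Half-open row range [first, second) owned by thread `id`.
std::pair<std::size_t, std::size_t>
row_range(std::size_t id, std::size_t num_threads, std::size_t num_rows)
{
  assert(num_threads > 0 && id < num_threads);
  return {id * num_rows / num_threads, (id + 1) * num_rows / num_threads};
}

// Work done by one thread: visit every cell, insert only owned rows.
void assemble_owned_rows(DenseMatrix& A, std::size_t num_cells,
                         std::size_t first, std::size_t last)
{
  for (std::size_t c = 0; c < num_cells; ++c)
  {
    // Assumed 1D mesh: cell c has dofs c and c + 1.
    const std::size_t dofs[2] = {c, c + 1};
    const bool in0 = first <= dofs[0] && dofs[0] < last;
    const bool in1 = first <= dofs[1] && dofs[1] < last;
    if (!in0 && !in1)
      continue; // none_in_range: skip tabulate_tensor

    // Stand-in for tabulate_tensor: stiffness matrix of a unit cell.
    const double Ae[2][2] = {{1.0, -1.0}, {-1.0, 1.0}};

    // all_in_range adds both rows; some_in_range adds only the owned one.
    for (int i = 0; i < 2; ++i)
    {
      if (dofs[i] < first || dofs[i] >= last)
        continue;
      for (int j = 0; j < 2; ++j)
        A[dofs[i]][dofs[j]] += Ae[i][j];
    }
  }
}

// Driver: one thread per chunk of rows. Threads share A but never write
// to the same row, so no locking is used in this sketch.
DenseMatrix assemble(std::size_t num_cells, std::size_t num_threads)
{
  const std::size_t n = num_cells + 1;
  DenseMatrix A(n, std::vector<double>(n, 0.0));
  std::vector<std::thread> threads;
  for (std::size_t t = 0; t < num_threads; ++t)
  {
    const auto range = row_range(t, num_threads, n);
    threads.emplace_back([&A, num_cells, range] {
      assemble_owned_rows(A, num_cells, range.first, range.second);
    });
  }
  for (auto& th : threads)
    th.join();
  return A;
}
```

Calling assemble(8, 4) should give the same matrix as assemble(8, 1).
The price of avoiding locks this way is that every thread walks the
whole mesh, and cells straddling two row ranges are tabulated twice.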

