Thanks, Aron. That would work, but at the cost of leaving cores idle once the threads join and the rest of the MPI-based code is running, correct?
On Fri, Apr 20, 2012 at 11:44 AM, Aron Ahmadia <aron.ahmadia at kaust.edu.sa> wrote:

>> If I use, say Np = 16 processes on one node, MPI is running 16 versions of
>> the code on a single node (which has 16 cores). How does OpenMP figure out
>> how to fork? Does it fork a total of 16 threads/MPI process = 256 threads
>> or is it smart to just fork a total of 16 threads/node = 1 thread/core = 16
>> threads? I'm a bit confused here how the job is scheduled when MPI and
>> OpenMP are mixed?
>>
>
> This is one important use for OMP_NUM_THREADS. If you're trying to
> increase the amount of memory per process, you should map one process per
> node and set OMP_NUM_THREADS to the number of OpenMP threads you'd like.
> There are lots of tutorials and even textbooks now that discuss hybrid
> programming techniques that you should look to for more information (or you
> could try scicomp.stackexchange.com).
>
> Cheers,
> Aron
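
The thread-count arithmetic discussed in the quoted exchange can be sketched as follows. This is only an illustration, assuming each MPI process forks its own independent OpenMP team of OMP_NUM_THREADS threads and that the processes do not coordinate with one another. The helper names (threads_on_node, oversubscription) are hypothetical and not part of MPI, OpenMP, or PETSc, and the default team size when OMP_NUM_THREADS is unset is implementation-defined.

```python
def threads_on_node(ranks_per_node: int, omp_num_threads: int) -> int:
    """Total OpenMP threads on one node.

    Assumption: every MPI rank forks its own team of `omp_num_threads`
    threads, so the totals simply multiply.
    """
    return ranks_per_node * omp_num_threads


def oversubscription(ranks_per_node: int, omp_num_threads: int,
                     cores_per_node: int) -> float:
    """Threads per core; values above 1.0 mean the cores are oversubscribed."""
    return threads_on_node(ranks_per_node, omp_num_threads) / cores_per_node


if __name__ == "__main__":
    cores = 16  # the 16-core node from the question
    # (MPI ranks per node, OMP_NUM_THREADS) combinations to compare
    layouts = [(16, 16), (16, 1), (1, 16)]
    for ranks, threads in layouts:
        total = threads_on_node(ranks, threads)
        ratio = oversubscription(ranks, threads, cores)
        print(f"{ranks:2d} ranks x {threads:2d} threads = {total:3d} threads "
              f"({ratio:.1f} per core)")
```

Under these assumptions, 16 ranks with 16 threads each would put 256 threads on 16 cores, while one rank per node with OMP_NUM_THREADS=16 gives one thread per core. In that second layout the other 15 cores have nothing to do whenever the code is outside a threaded region, which is the trade-off raised in the reply above.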