Yes, that's right. Though if you're memory-bound you may not notice as great an effect.
On Fri, Apr 20, 2012 at 9:49 PM, Mohammad Mirzadeh <mirzadeh at gmail.com> wrote:

> Thanks Aron. That would work, but at the cost of wasting idle cores when
> the threads join and the rest of the MPI-based code is running, correct?
>
> On Fri, Apr 20, 2012 at 11:44 AM, Aron Ahmadia <aron.ahmadia at kaust.edu.sa> wrote:
>
>>> If I use, say, Np = 16 processes on one node, MPI is running 16 copies
>>> of the code on a single node (which has 16 cores). How does OpenMP figure
>>> out how to fork? Does it fork a total of 16 threads/MPI process = 256
>>> threads, or is it smart enough to fork a total of 16 threads/node = 1
>>> thread/core = 16 threads? I'm a bit confused about how the job is
>>> scheduled when MPI and OpenMP are mixed.
>>
>> This is one important use for OMP_NUM_THREADS. If you're trying to
>> increase the amount of memory per process, you should map one process per
>> node and set OMP_NUM_THREADS to the number of OpenMP threads you'd like.
>> There are lots of tutorials and even textbooks now that discuss hybrid
>> programming techniques that you should look to for more information (or you
>> could try scicomp.stackexchange.com).
>>
>> Cheers,
>> Aron
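To make the bookkeeping concrete: the OpenMP runtime in each MPI rank knows nothing about the other ranks, so every rank forks its own team of OMP_NUM_THREADS threads and the totals multiply. A minimal sketch of the arithmetic (the function name and the 16-core node are illustrative assumptions, not from any library):

```python
# Illustrative sketch: OpenMP does not coordinate across MPI ranks.
# Each rank forks OMP_NUM_THREADS threads, so per-node totals multiply.

def total_threads_per_node(ranks_per_node: int, omp_num_threads: int) -> int:
    """Total OpenMP threads a single node ends up running."""
    return ranks_per_node * omp_num_threads

CORES_PER_NODE = 16  # assumed node size from the thread above

# Naive launch: 16 ranks per node, each forking 16 threads -> oversubscribed.
print(total_threads_per_node(16, 16))  # 256 threads on 16 cores

# Hybrid launch suggested above: 1 rank per node with OMP_NUM_THREADS=16.
print(total_threads_per_node(1, 16))   # 16 threads = 1 per core
```

This is why setting OMP_NUM_THREADS to cores-per-node divided by ranks-per-node (here, 16/1) avoids oversubscription in a hybrid run.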