Re: [Cython] prange CEP updated

Dag Sverre Seljebotn Thu, 14 Apr 2011 11:57:12 -0700

On 04/14/2011 08:39 PM, mark florisson wrote:

On 14 April 2011 20:29, Dag Sverre Seljebotn<d.s.seljeb...@astro.uio.no>  wrote:

On 04/13/2011 11:13 PM, mark florisson wrote:


Although there is omp_get_max_threads():

"The omp_get_max_threads routine returns an upper bound on the number
of threads that could be used to form a new team if a parallel region
without a num_threads clause were encountered after execution returns
from this routine."

So we could have threadsavailable() evaluate to that if encountered
outside a parallel region. Inside, it would evaluate to
omp_get_num_threads(). At worst, people would over-allocate a bit.
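
A minimal sketch of the behaviour proposed above, written in plain C against the OpenMP runtime API. The name threadsavailable() is the one used in this thread and is not an existing OpenMP or Cython function; the serial fallbacks are an assumption added only so the sketch also builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: lets the sketch build without OpenMP). */
static int omp_in_parallel(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
static int omp_get_max_threads(void) { return 1; }
#endif

/* Hypothetical helper named after the function discussed in this thread. */
static int threadsavailable(void)
{
    /* omp_in_parallel() is true only inside an *active* parallel region. */
    if (omp_in_parallel())
        return omp_get_num_threads();  /* size of the current team */

    /* Otherwise: upper bound on the team a new parallel region
     * without a num_threads clause could get. */
    return omp_get_max_threads();
}
```

Note that inside an inactive (serialised) parallel region this sketch falls through to omp_get_max_threads(), which may or may not be what the CEP wants.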


Well, over-allocating could well mean 1 GB, which could well mean getting an
unnecessary MemoryError (or, like in my case, if I'm not careful to set
ulimit, getting a SIGKILL sent to you 2 minutes after the fact by the
cluster patrol process...)


The upper bound is not "however many threads you think you can start",
but rather "how many threads are considered useful for your machine".
So if you use omp_set_num_threads(), it will return the value you set
there. Otherwise, if you have e.g. a quadcore, it will return 4. The
spec says:

"Note – The return value of the omp_get_max_threads routine can be
used to dynamically allocate sufficient storage for all threads in the
team formed at the subsequent active parallel region."

So this sounds like a viable option.
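
As an illustration of the note quoted from the spec, here is a rough sketch of sizing per-thread scratch storage from omp_get_max_threads() before the parallel region starts. The helper name alloc_per_thread and its parameters are made up for this example, and the serial fallback is an assumption so that it builds without OpenMP.

```c
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback (assumption: lets the sketch build without OpenMP). */
static int omp_get_max_threads(void) { return 1; }
#endif

/* Hypothetical helper: allocate `per_thread` doubles for every thread
 * that the next parallel region could use. Stores the slot count in
 * *nslots and returns NULL on allocation failure. */
static double *alloc_per_thread(size_t per_thread, int *nslots)
{
    int n = omp_get_max_threads();  /* upper bound, not a guarantee */
    *nslots = n;
    return calloc((size_t)n * per_thread, sizeof(double));
}
```

Inside the region, thread i would then use the slice starting at buf + i * per_thread, with i taken from omp_get_thread_num(). If the team turns out smaller than the bound, the extra slots are simply unused, which is the over-allocation being discussed.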

What would happen here: We have 8 cores. Some code has an OpenMP parallel section with maxthreads=2, and inside the section another function is called.

That called function uses threadsavailable(), and has a parallel block that wants as many threads as it can get.

I don't know the details as well as you do, but my uninformed guess is that in this case it'd be quite possible to get a race where omp_get_max_threads would return 7 in each case, then the first one to reach the parallel block would get the 7 threads. The remaining thread then has allocated storage for 7 threads but only has 1 thread running.

BTW, I'm not sure what the difference is between the original idea and omp_get_max_threads -- in the absence of such races as above, my original idea with entering a parallel section (with the same scheduling parameters) just to see how many threads we got, would work as well?
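
For concreteness, the "enter a parallel section just to see how many threads we got" idea could look roughly like the following in C. The helper name probe_team_size is invented for this sketch, and the serial fallback is an assumption so that it builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback (assumption: lets the sketch build without OpenMP). */
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: open a throwaway parallel region and report
 * the size of the team that was actually formed. */
static int probe_team_size(void)
{
    int n = 1;
    #pragma omp parallel
    {
        /* One thread records the team size; the implicit barriers at
         * the end of `single` and `parallel` make it visible after. */
        #pragma omp single
        n = omp_get_num_threads();
    }
    return n;
}
```

The probe only reports what one region received at one moment. As far as I can tell nothing obliges a later region to get the same team size, so it is exposed to the same kind of race as described above.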


DS
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
