On 04/13/2011 11:13 PM, mark florisson wrote:

Although there is omp_get_max_threads():

"The omp_get_max_threads routine returns an upper bound on the number
of threads that could be used to form a new team if a parallel region
without a num_threads clause were encountered after execution returns
from this routine."

So we could have threadsavailable() evaluate to that if encountered
outside a parallel region. Inside, it would evaluate to
omp_get_num_threads(). At worst, people would over-allocate a bit.

Well, over-allocating could well mean 1 GB, which could well mean getting an unnecessary MemoryError (or, as in my case, if I'm not careful to set ulimit, getting a SIGKILL from the cluster patrol process two minutes after the fact...)
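
To put a rough number on that, here is a back-of-the-envelope sketch in plain Python. The per-thread buffer size and both thread counts are made up for illustration; they are not taken from any real machine or from the proposal itself:

```python
# Hypothetical numbers, chosen only to illustrate the arithmetic.
MIB = 1024 ** 2

def scratch_bytes(n_threads: int, per_thread_bytes: int) -> int:
    """Total size when one scratch buffer is allocated per thread."""
    return n_threads * per_thread_bytes

per_thread = 64 * MIB   # assumed size of each thread's work buffer
upper_bound = 24        # what omp_get_max_threads() might report (assumed)
actual_team = 8         # threads the parallel region really gets (assumed)

# Memory allocated for threads that never show up in the team.
wasted = scratch_bytes(upper_bound, per_thread) - scratch_bytes(actual_team, per_thread)
print(f"over-allocated: {wasted / MIB:.0f} MiB")
```

With these assumed figures, the buffers sized from the upper bound take 1 GiB more than the team actually needs.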

But even ignoring this, we also have to plan for people misusing the feature. If we put it in there, somebody somewhere *will* write code like this:

nthreads = threadsavailable()
with parallel:
    for i in prange(nthreads):
        for j in range(100*i, 100*(i+1)): [...]

(Yes, they shouldn't. Yes, they will.)

Combined with a race condition that will only very rarely trigger, this starts to sound like a very bad idea indeed.

So I agree with you that we should just leave it for now, and do single/barrier later.

DS
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
