On 04/13/2011 11:13 PM, mark florisson wrote:
> Although there is omp_get_max_threads():
>
> "The omp_get_max_threads routine returns an upper bound on the number
> of threads that could be used to form a new team if a parallel region
> without a num_threads clause were encountered after execution returns
> from this routine."
>
> So we could have threadsavailable() evaluate to that if encountered
> outside a parallel region. Inside, it would evaluate to
> omp_get_num_threads(). At worst, people would over-allocate a bit.
Well, over-allocating could easily mean 1 GB, which could in turn mean
getting an unnecessary MemoryError (or, as in my case, if I'm not
careful to set ulimit, getting a SIGKILL sent to me 2 minutes after the
fact by the cluster patrol process...)
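To put rough numbers on that, here is a small plain-Python sketch. The 128 MiB per-thread buffer and the thread counts are made up for illustration, and overallocation_bytes is a hypothetical helper, not anything in Cython or OpenMP:

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def overallocation_bytes(per_thread_bytes, reported_threads, actual_threads):
    """Bytes allocated but never used when buffers are sized from an upper bound.

    reported_threads: the upper bound obtained before the parallel region.
    actual_threads:   the team size the region really ran with.
    """
    unused_threads = max(reported_threads - actual_threads, 0)
    return per_thread_bytes * unused_threads


# Hypothetical numbers: 128 MiB of scratch space per thread, an upper
# bound of 16 threads, but only 8 threads in the actual team.
wasted = overallocation_bytes(128 * MIB, reported_threads=16, actual_threads=8)
print(wasted / GIB, "GiB allocated for threads that never ran")
```

With these assumed figures the unused buffers alone add up to 1 GiB.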
But even ignoring this, we also have to plan for people misusing the
feature. If we put it in there, somebody somewhere *will* write code
like this:
    nthreads = threadsavailable()
    with parallel:
        for i in prange(nthreads):
            for j in range(100*i, 100*(i+1)): [...]
(Yes, they shouldn't. Yes, they will.)
Combined with a race condition that will only very rarely trigger,
this starts to sound like a very bad idea indeed.
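To spell out the misuse in plain Python (no real threads here; misused_items, chunk_bounds and chunked_items are hypothetical helpers for illustration only): in the loop above, the set of indices that gets processed depends on whatever thread count was reported, whereas splitting a fixed problem size into chunks gives the same coverage for any count.

```python
def misused_items(nthreads):
    """Indices touched by the loop above if the reported count is nthreads."""
    return [j for i in range(nthreads) for j in range(100 * i, 100 * (i + 1))]


def chunk_bounds(n_items, n_chunks):
    """Split range(n_items) into n_chunks contiguous half-open (start, stop) ranges."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    base, extra = divmod(n_items, n_chunks)
    bounds = []
    start = 0
    for k in range(n_chunks):
        # The first `extra` chunks each take one additional item.
        stop = start + base + (1 if k < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def chunked_items(n_items, n_chunks):
    """Indices touched when a fixed problem size is split into n_chunks chunks."""
    return [j for start, stop in chunk_bounds(n_items, n_chunks)
            for j in range(start, stop)]


# The misuse does a different amount of work for different reported counts...
print(len(misused_items(4)), len(misused_items(8)))
# ...while chunking a fixed size covers the same indices either way.
print(chunked_items(1000, 4) == chunked_items(1000, 7))
```

This is only a sketch of the partitioning arithmetic; it says nothing about how prange itself schedules iterations.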
So I agree with you that we should just leave it for now, and do
single/barrier later.
DS
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel