On 04/14/2011 09:08 PM, mark florisson wrote:
On 14 April 2011 20:58, Dag Sverre Seljebotn<d.s.seljeb...@astro.uio.no>  wrote:
On 04/14/2011 08:42 PM, mark florisson wrote:

On 14 April 2011 20:29, Dag Sverre Seljebotn<d.s.seljeb...@astro.uio.no> wrote:

On 04/13/2011 11:13 PM, mark florisson wrote:

Although there is omp_get_max_threads():

"The omp_get_max_threads routine returns an upper bound on the number
of threads that could be used to form a new team if a parallel region
without a num_threads clause were encountered after execution returns
from this routine."

So we could have threadsavailable() evaluate to that if encountered
outside a parallel region. Inside, it would evaluate to
omp_get_num_threads(). At worst, people would over-allocate a bit.

Well, over-allocating could well mean 1 GB, which could well mean getting
an unnecessary MemoryError (or, as in my case, if I'm not careful to set
ulimit, getting a SIGKILL sent to you 2 minutes after the fact by the
cluster patrol process...)

But even ignoring this, we also have to plan for people misusing the
feature. If we put it in there, somebody somewhere *will* write code like
this:

nthreads = threadsavailable()
with parallel:
    for i in prange(nthreads):
        for j in range(100*i, 100*(i+1)): [...]

(Yes, they shouldn't. Yes, they will.)

Combined with a race condition that will only very seldom trigger, this
starts to sound like a very bad idea indeed.

So I agree with you that we should just leave it for now, and do
single/barrier later.

omp_get_max_threads() doesn't have a race, as it returns the upper
bound. So e.g. if between your call and your parallel section fewer
OpenMP threads become available, then you might get fewer threads, but
never more.
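
To make the "fewer, but never more" point concrete, here is a minimal C sketch. The helper name team_within_bound is made up for illustration and is not part of any proposal in this thread. It only assumes the standard OpenMP runtime calls omp_get_max_threads() and omp_get_num_threads(), and it falls back to a team of one when built without OpenMP.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: returns 1 if the team that actually formed was
 * no larger than the bound reported just before the parallel region. */
static int team_within_bound(void)
{
    int upper = 1;   /* serial build: a "team" of one */
    int actual = 1;
#ifdef _OPENMP
    /* Upper bound for the next parallel region without num_threads. */
    upper = omp_get_max_threads();
    #pragma omp parallel
    {
        /* One thread records the real team size. */
        #pragma omp single
        actual = omp_get_num_threads();
    }
#endif
    assert(upper >= 1);
    printf("upper bound %d, actual team size %d\n", upper, actual);
    return actual <= upper;
}
```

The two numbers can differ, which is exactly why anything sized from the first one may be larger than needed.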

Oh, now I'm following you.

Well, my argument was that I think erring in that direction is pretty bad
as well.

Also, even if we're not making it available in cython.parallel, we're not
stopping people from calling omp_get_max_threads directly themselves, which
should be OK for the people who know enough to do this safely...

True, but it wouldn't be as easy to wrap in a #ifdef _OPENMP. In any
event, we could just put a warning in the docs stating that using
threadsavailable outside parallel sections returns an upper bound on
the actual number of threads in a subsequent parallel section.
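
For readers calling the OpenMP routine directly, the guard in question might look roughly like the C sketch below. The names max_threads and scratch_bytes are hypothetical and exist only for this illustration; the only real API used is omp_get_max_threads(). The second helper shows where the over-allocation comes from: the buffer is sized for the upper bound, not for the team that eventually runs.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper, not part of cython.parallel: upper bound on the
 * size of the next parallel team, or 1 when built without OpenMP. */
static int max_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Bytes needed to give every possible thread its own scratch buffer.
 * Sized for the upper bound, so it may over-allocate. */
static size_t scratch_bytes(size_t per_thread)
{
    int n = max_threads();
    assert(n >= 1);
    return (size_t)n * per_thread;
}
```

Compiled without OpenMP support, the same source still builds and simply reports a single thread.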

I don't think outside or within makes a difference -- what about nested parallel sections? At least my intention in the CEP was that threadsavailable was always for the next section (so often it would be 1 after entering the section).

Perhaps just calling it "maxthreads" instead solves the issue.

(Still, I favour just dropping threadsavailable/maxthreads for the time being. It is much simpler to add something later, when we've had some time to use it and reflect on it, than to remove something that shouldn't have been added.)

Dag Sverre
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
