Thanks, Nick,

> >Though Sylvain's original mail (*1) was sent 4 months ago and nobody
> >replied to it, I'm interested in this issue and strongly agree with
> >Sylvain.
> >
> > *1 http://www.open-mpi.org/community/lists/devel/2010/01/7275.php
> >
> >As explained by Sylvain, current Open MPI implementation always returns
> >MPI_THREAD_SINGLE as provided thread level if neither --enable-mpi-threads
> >nor --enable-progress-threads was specified at configure (v1.4).
> 
> I can explain that, from an outside viewpoint.  I can't tell you why
> OpenMPI took that decision, but I can guess.
> 
> That is definitely the correct action.  Unless an application or library
> has been built with thread support, or can guaranteed to be called only
> from a single thread, using threads is catastrophic.  And, regrettably,
> given modern approaches to building software and the **** configure
> design, configure is where the test has to go.

What "with thread support" means?
It means configure --enable-mpi-threads ?

As long as the MPI library returns MPI_THREAD_FUNNELED from MPI_Init_thread
and the MPI application follows it, MPI functions are guaranteed to be
called only from a single thread. I think that's enough for
MPI_THREAD_FUNNELED. Of course, it's not enough for MPI_THREAD_MULTIPLE.
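
For example, an application that follows this contract might look like
the code below (a minimal sketch I wrote for illustration, not code from
any of the mails):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int provided;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        if (provided < MPI_THREAD_FUNNELED) {
            fprintf(stderr, "MPI_THREAD_FUNNELED not provided\n");
            MPI_Abort(MPI_COMM_WORLD, 1);
        }
        /* From here on, only the thread that called MPI_Init_thread
         * (the main thread) makes MPI calls. */
        MPI_Finalize();
        return 0;
    }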

Ah, "library" in your mail means libc or something other than MPI library?
If so, it makes sense.
Because MPI_THREAD_FUNNELED/SERIALIZED doesn't restrict other threads to
call functions other than those of MPI library, code bellow are not
thread safe if malloc is not thread safe and MPI_Allreduce calls malloc.

    int is_master;
    #pragma omp parallel private(is_master)
    {
        MPI_Is_thread_main(&is_master);  /* nonzero on the main thread */
        if (is_master) {        /* main thread */
            MPI_Allreduce(...);
        } else {                /* other threads */
            /* work that calls malloc */
        }
    }

> On some systems, there are certain actions that require thread affinity
> (sometimes including I/O, and often undocumented).  zOS is one, but I
> have seen it under a few Unices, too.
> 
> On others, they use a completely different (and seriously incompatible,
> at both the syntactic and semantic levels) set of libraries.  E.g. AIX.

Sorry, I don't know these issues well.
Do you mean something like the malloc case I wrote above?

> >If we use OpenMP with MPI, we need at least MPI_THREAD_FUNNELED even
> >if MPI functions are called only outside of omp parallel region,
> >like below.
> >
> >    #pragma omp parallel for
> >    for (...) {
> >        /* computation */
> >    }
> >    MPI_Allreduce(...);
> >    #pragma omp parallel for
> >    for (...) {
> >        /* computation */
> >    }
> 
> I don't think that's correct.  That would call MPI_Allreduce once for
> each thread, in parallel, on the same process - which wouldn't work.
> I think that what you need is a primitive that OpenMP doesn't have (in
> general), which is a GLOBAL_MASTER construct.  What you have to do is:
> 
>     Each process finds its initial (system) thread id on entry.
>     You test the system thread and call MPI only if on that one.

In C, the omp parallel for region ends with the for loop's block.
So I think the code above would call MPI_Allreduce once per process,
not once per thread.
# In Fortran, the omp end parallel directive may be required to end the
# parallel region. But I don't know Fortran well, sorry.

> >This means Open MPI users must specify --enable-mpi-threads or
> >--enable-progress-threads to use OpenMP. Is it true?
> >But this two configure options, i.e. OMPI_HAVE_THREAD_SUPPORT macro,
> >lead to performance penalty by mutex lock/unlock.
> 
> That's unavoidable, in general, with one niggle.  If the programmer
> guarantees BOTH to call MPI on the global master thread AND to ensure
> that all memory is synchronised before it does so, there is no need
> for mutexes.  The MPI specification lacks some of the necessary
> paranoia in this respect.
> 
> >I believe OMPI_HAVE_THREADS (not OMPI_HAVE_THREAD_SUPPORT !) is sufficient
> >to support MPI_THREAD_FUNNELED and MPI_THREAD_SERIALIZED, and therefore
> >OMPI_HAVE_THREAD_SUPPORT should be OMPI_HAVE_THREADS at following
> >part in ompi_mpi_init function, as suggested by Sylvain.
> 
> I can't comment on that, though I doubt it's quite that simple.  There's
> a big difference between MPI_THREAD_FUNNELED and MPI_THREAD_SERIALIZED
> in implementation impact.

I can't imagine a difference between those two, unless the MPI library
uses something thread-local. Ah, there may be something on OSes that I
don't know....
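
To illustrate what I mean by "something thread local" (a hypothetical
sketch of library-internal code, NOT what Open MPI actually does):

    #include <stdlib.h>

    /* One copy per thread (GCC __thread extension). */
    static __thread char *scratch = NULL;

    void lib_do_something(void)
    {
        if (scratch == NULL)
            scratch = malloc(4096);  /* set up per calling thread */
        /* ... use scratch ... */
    }

Under MPI_THREAD_FUNNELED every call comes from the main thread, so only
one copy is ever touched. Under MPI_THREAD_SERIALIZED consecutive calls
may come from different threads, so state cached in one thread's copy is
invisible to the next caller.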

Anyway, thanks for your comment!

Regards,
Kawashima
