On May 10 2010, Sylvain Jeaugey wrote:

That is definitely the correct action. Unless an application or library has been built with thread support, or can be guaranteed to be called only from a single thread, using threads is catastrophic.

I personally see that as a bug, but I certainly lack some knowledge on non-Linux OSes.

It's not a bug, except possibly in Linux.  Threading is optional in
POSIX and not all MPI hosts are Unices, anyway.  It would be
reasonable for OpenMPI to demand a certain minimum level of threading
support, given that non-threaded systems are more-or-less dead.

From my point of view, any normal library should be THREAD_SERIALIZED, and a thread-safe library should be THREAD_MULTIPLE.

I am not disagreeing, but that's a matter of the system designer's
choice.

I don't see other libraries which claim to be "totally incompatible with the use of threads". They may not be thread-safe, in which case the programmer must ensure locking and memory coherency to use them in conjunction with threads, but that's what THREAD_SERIALIZED is about, IMO.

No, that's not what I meant.  There really are systems out there where
you must compile with a threading option to ensure that threads can be
supported.  AIX is (or, at least, was) one - and a right pain it was,
too!  I had to edit the compile scripts to get them to work with IBM's
own MPI - and I didn't support the MPI+OpenMP mixture, either.

I don't think that's correct.  That would call MPI_Allreduce once for
each thread, in parallel, on the same process - which wouldn't work.

I think the idea is precisely _not_ to call MPI_Allreduce within parallel sections, i.e. only have the master thread call MPI.

Then you need the extra code I mentioned, but that doesn't affect
your main point.  Let's ignore this one.

In my understanding of MPI_THREAD_SERIALIZED, the memory coherency was guaranteed. If not, the programmer has to ensure it.

It can't guarantee it when running under a POSIX-like system; the
programmer has to ensure it.  There is no one-sided synchronisation
mechanism in POSIX.
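
To make the point about programmer-ensured coherency concrete, here is a minimal sketch assuming POSIX threads. POSIX lists pthread_mutex_lock and pthread_mutex_unlock among the functions that synchronise memory, so taking the same lock around every call gives both mutual exclusion and visibility, but only because every thread takes part. The names lib_add, lib_counter, lib_lock and run_serialized are hypothetical stand-ins for a library that is not thread-safe; they are not part of MPI or OpenMPI.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a library that is NOT thread-safe:
   it updates shared state with no internal locking. */
static long lib_counter = 0;
static void lib_add(long n) { lib_counter += n; }

/* One lock guards every call into the library. */
static pthread_mutex_t lib_lock = PTHREAD_MUTEX_INITIALIZER;

struct work { int calls; };

static void *worker(void *arg)
{
    const struct work *w = arg;
    for (int i = 0; i < w->calls; i++) {
        pthread_mutex_lock(&lib_lock);   /* excludes other callers and synchronises memory */
        lib_add(1);
        pthread_mutex_unlock(&lib_lock); /* makes the update visible to the next lock holder */
    }
    return NULL;
}

/* Runs nthreads workers (1 to 16), each making calls_each serialized calls.
   Returns the final counter, or -1 on error. */
long run_serialized(int nthreads, int calls_each)
{
    pthread_t tid[16];
    struct work w = { calls_each };
    int started = 0;

    if (nthreads < 1 || nthreads > 16)
        return -1;
    lib_counter = 0;
    for (; started < nthreads; started++)
        if (pthread_create(&tid[started], NULL, worker, &w) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);      /* join also synchronises memory */
    return started == nthreads ? lib_counter : -1;
}
```

Calling run_serialized(4, 1000) should return 4000 with the lock in place; without it, the unlocked increments are a data race and the result is undefined.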

I can't comment on that, though I doubt it's quite that simple.  There's
a big difference between MPI_THREAD_FUNNELED and MPI_THREAD_SERIALIZED
in implementation impact.

I don't see the relationship between THREAD_SERIALIZED/FUNNELED and OMPI_HAVE_THREAD_SUPPORT. Actually, OMPI_HAVE_THREAD_SUPPORT seems to have no relationship with how the OS supports threads (that's why I think it is misleading).

Only someone who knows the history of OpenMPI's configure mechanism
could answer that.  I can't.

But I don't see a big difference between THREAD_SERIALIZED and THREAD_FUNNELED anyway. Do you have more information on systems where the caller thread id makes a difference in MPI?

MPI isn't the issue - it's the underlying facilities.  For example,
some uses of sockets might fail if they weren't on the same thread
as the one that opened the socket.  That was a documented feature of
MVS (now z/OS) and I have seen it on several Unices.  POSIX doesn't
specify this properly, and says several things that add unnecessary
confusion.  There is also a potential problem with InfiniBand and
similar transports, but that's tricky to explain.

I don't know how serious this problem is, today, on OpenMPI's target
systems, but I know that it's a real problem and truly evil when it
occurs.  That is, after all, why MPI distinguishes those cases.
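
The distinction can be sketched in a few lines, assuming POSIX threads. With MPI_THREAD_FUNNELED only the thread that initialised MPI may make calls, whereas MPI_THREAD_SERIALIZED allows any thread provided only one calls at a time. A facility with thread affinity needs the first, stricter rule. The owner_guard type and its functions below are hypothetical, purely illustrative, and not taken from any MPI implementation.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical guard: remembers which thread initialised a resource
   (for example, the thread that opened a socket). */
struct owner_guard {
    pthread_t owner;
};

/* Record the calling thread as the owner. */
void guard_init(struct owner_guard *g)
{
    g->owner = pthread_self();
}

/* True only when called on the thread that ran guard_init(). */
bool guard_on_owner(const struct owner_guard *g)
{
    return pthread_equal(g->owner, pthread_self()) != 0;
}

struct probe {
    const struct owner_guard *guard;
    bool passed;
};

static void *probe_thread(void *arg)
{
    struct probe *p = arg;
    p->passed = guard_on_owner(p->guard); /* evaluated on the new thread */
    return NULL;
}

/* Returns 1 if a freshly created thread passes the owner check,
   0 if it does not, and -1 if the thread could not be run. */
int check_from_other_thread(const struct owner_guard *g)
{
    struct probe p = { g, false };
    pthread_t t;

    if (pthread_create(&t, NULL, probe_thread, &p) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return p.passed ? 1 : 0;
}
```

A funneled design would check guard_on_owner() before touching the resource and refuse the call otherwise; a mutex alone (the serialized model) would let the other thread through, which is exactly the case that fails on systems with thread affinity.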

Just for the record, we (at Bull) patched our MPI library and had no problem so far with applications using MPI + Threads or MPI + OpenMP, given that they don't call MPI within parallel sections. But of course, we only use Linux, so your mileage may vary.

That doesn't surprise me.  Linux is usually free from such gotchas.


Regards,
Nick Maclaren.
