Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-12-02 Thread Gilles Gouaillardet
+1. If I remember correctly, all the interfaces are scanned, so there should be some room to display a user-friendly message (on Linux and impacted architectures) such as "there is no loopback interface, you will likely run into some trouble". Gilles On 2014/12/03 13:50, Paul Hargrove wrote: >

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-12-02 Thread Paul Hargrove
IMHO the lack of a loopback interface should be a very uncommon occurrence. So, I believe that improving the error message to mention that possibility would help a great deal. -Paul On Tue, Dec 2, 2014 at 8:28 PM, Ralph Castain wrote: > We talked about this on the weekly

Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at execution on an embedded ARM Linux kernel version 3.15.0

2014-12-02 Thread Ralph Castain
We talked about this on the weekly conference call, and adding the usock component to 1.8 is just not within our procedures. It would involve bringing over much more of the OOB revisions (we’d have to handle the transfer of messages between components, if nothing else), and that involves a lot

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Ralph Castain
It is working for me, but I’m not sure if that is because of these changes or if it always worked for me. I haven’t tested the slurm integration in a while. > On Dec 2, 2014, at 7:59 PM, Artem Polyakov wrote: > > Howard, does current master fix your problems? > > Wednesday, 3

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Artem Polyakov
Howard, does current master fix your problems? On Wednesday, December 3, 2014, Artem Polyakov wrote: > > 2014-12-03 8:30 GMT+06:00 Jeff Squyres (jsquyres): > >> On Dec 2, 2014, at 8:43 PM, Artem Polyakov

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Artem Polyakov
2014-12-03 8:30 GMT+06:00 Jeff Squyres (jsquyres): > On Dec 2, 2014, at 8:43 PM, Artem Polyakov wrote: > > > Jeff, your fix breaks my system again. Actually you just reverted my > changes. > > No, I didn't just revert them -- I made changes. I did forget

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Jeff Squyres (jsquyres)
On Dec 2, 2014, at 8:43 PM, Artem Polyakov wrote: > Jeff, your fix breaks my system again. Actually you just reverted my changes. No, I didn't just revert them -- I made changes. I did forget about the second -I, though (to be fair, the 2nd -I was the *only* -I in there

[hwloc-devel] Create success (hwloc git 1.10.0-15-g6f484fd)

2014-12-02 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc 1.10.0-15-g6f484fd Start time: Tue Dec 2 21:04:24 EST 2014 End time: Tue Dec 2 21:05:55 EST 2014 Your friendly daemon, Cyrador

[hwloc-devel] Create success (hwloc git 1.9.1-17-g71da0f1)

2014-12-02 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc 1.9.1-17-g71da0f1 Start time: Tue Dec 2 21:03:01 EST 2014 End time: Tue Dec 2 21:04:24 EST 2014 Your friendly daemon, Cyrador

[hwloc-devel] Create success (hwloc git dev-288-ga606d35)

2014-12-02 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc dev-288-ga606d35 Start time: Tue Dec 2 21:01:01 EST 2014 End time: Tue Dec 2 21:02:52 EST 2014 Your friendly daemon, Cyrador

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Artem Polyakov
Hello, Jeff, your fix breaks my system again. Actually you just reverted my changes. Here is what I have: configure:5441: *** GNU libltdl setup configure:296939: checking location of libltdl configure:296952: result: internal copy configure:297028: OPAL configuring in opal/libltdl

Re: [OMPI devel] RFC: update opal lifo class and add fifo class

2014-12-02 Thread Nathan Hjelm
On Tue, Dec 02, 2014 at 05:54:04PM -0500, George Bosilca wrote: > The FIFO implementation doesn't look right to me. I don't have time to > look at it right now, but just looking at the push, it will not correctly > succeed if two threads are pushing items at the same time. > A FIFO is a

Re: [OMPI devel] RFC: update opal lifo class and add fifo class

2014-12-02 Thread George Bosilca
The FIFO implementation doesn't look right to me. I don't have time to look at it right now, but just looking at the push, it will not correctly succeed if two threads are pushing items at the same time. A FIFO is a very sensitive algorithm, and should be treated accordingly. Moreover, there is no

Re: [OMPI devel] Introducing memkind + Adding component in mpool framework

2014-12-02 Thread Jeff Squyres (jsquyres)
Vish -- In general, this sounds like a great idea. We talked about this on the call today, and it looks like it's going to take a bit of thought into how to integrate this into OMPI. I.e., we might have to adjust the mpool and/or allocator frameworks a bit first. Is there any chance that you

Re: [OMPI devel] Introducing memkind + Adding component in mpool framework

2014-12-02 Thread Ralph Castain
Hi Vish, we talked about this on today’s telecon and people are generally supportive of the concept. However, the feeling was that this will take some thought and a fair amount of work to modify mpool and the allocators properly to do this the “right way”. So people asked if you could come to

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Jeff Squyres (jsquyres)
I'm able to replicate Edgar's problem. I'm investigating... On Dec 2, 2014, at 10:39 AM, Edgar Gabriel wrote: > the mailing list refused to let me add the config.log file, since it is too > large, I can forward the output to you directly as well (as I did to Jeff). > > I

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Edgar Gabriel
The mailing list refused to let me add the config.log file, since it is too large. I can forward the output to you directly as well (as I did to Jeff). I honestly have not looked into the configure logic; I can just tell that OPAL_HAVE_LTDL_ADVISE is not set on my Linux system for master, but

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Artem Polyakov
2014-12-02 20:59 GMT+06:00 Edgar Gabriel : > didn't want to interfere with this thread, although I have a similar > issue, since I have the solution nearly fully cooked up. But anyway, this > last email gave the hint on why we have suddenly the problem in ompio: > > it looks

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Edgar Gabriel
I checked with the debugger, that it did skip the entire section On 12/2/2014 9:04 AM, Jeff Squyres (jsquyres) wrote: Oy -- I thought we fixed that. :-( Are you saying that configure output says that ltdladvise is not found? On Dec 2, 2014, at 9:59 AM, Edgar Gabriel

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Jeff Squyres (jsquyres)
Oy -- I thought we fixed that. :-( Are you saying that configure output says that ltdladvise is not found? On Dec 2, 2014, at 9:59 AM, Edgar Gabriel wrote: > didn't want to interfere with this thread, although I have a similar issue, > since I have the solution nearly

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Edgar Gabriel
didn't want to interfere with this thread, although I have a similar issue, since I have the solution nearly fully cooked up. But anyway, this last email gave the hint on why we have suddenly the problem in ompio: it looks like OPAL_HAVE_LTDL_ADVISE (at least on my systems) is not set

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Jeff Squyres (jsquyres)
Looks like I was totally lying in http://www.open-mpi.org/community/lists/devel/2014/12/16381.php (where I said we should not use RTLD_GLOBAL). We *do* use RTLD_GLOBAL: https://github.com/open-mpi/ompi/blob/master/opal/mca/base/mca_base_component_repository.c#L124 This ltdl advice object is

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Artem Polyakov
Agreed. The first thing to check is what value OPAL_HAVE_LTDL_ADVISE is set to. If it is zero, this is very probably the same bug as mine. 2014-12-02 17:33 GMT+06:00 Ralph Castain : > It does look similar - question is: why didn’t this fix the problem? Will > have to

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Artem Polyakov
2014-12-02 17:13 GMT+06:00 Ralph Castain : > Hmmm…if that is true, then it didn’t fix this problem as it is being > reported in the master. > I had this problem on my laptop installation. You can check my report, which was detailed enough, and see if you are hitting the same issue. My

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-02 Thread Artem Polyakov
I think this might be related to the configuration problem I was fixing with Jeff a few months ago. Refer here: https://github.com/open-mpi/ompi/pull/240 2014-12-02 10:15 GMT+06:00 Ralph Castain : > If it isn’t too much trouble, it would be good to confirm that it remains >