Re: [OMPI devel] reduce_scatter bug with hierarch

2009-01-13 Thread Tim Mattox
George, I suggest that you file a CMR for r20267 and we can go from there. If it makes 1.3 it makes it, otherwise we have it ready for 1.3.1 At this point the earliest 1.3 will go out is Wednesday late morning (presuming I'm the one moving the bits), and is more likely to hit the website in the

Re: [OMPI devel] reduce_scatter bug with hierarch

2009-01-13 Thread Jeff Squyres
Let's debate tomorrow when people are around, but first you have to file a CMR... :-) On Jan 13, 2009, at 10:28 PM, George Bosilca wrote: Unfortunately, this pinpoints the fact that we didn't test the collective module mixing enough. I went over the tuned collective functions and

Re: [OMPI devel] reduce_scatter bug with hierarch

2009-01-13 Thread George Bosilca
Unfortunately, this pinpoints the fact that we didn't test the collective module mixing enough. I went over the tuned collective functions and changed all instances to use the correct module information. It is now on the trunk, revision 20267. Simultaneously, I checked that all other

Re: [OMPI devel] Open MPI v1.2.9rc2 has been posted

2009-01-13 Thread Jeff Squyres
I ran the 1.2.9rc2 tarball through a full gamut of MTT and give it a thumbs up. On Jan 13, 2009, at 1:35 PM, Tim Mattox wrote: Hi All, The second release candidate of Open MPI v1.2.9 is now available: http://www.open-mpi.org/software/ompi/v1.2/ Please run it through its paces as best you

Re: [OMPI devel] 1.3 PML default choice

2009-01-13 Thread George Bosilca
That's pretty weird, but you're right. Here is the code that does exactly what you state. } else if(strcmp(ompi_mtl_base_selected_component->mtl_version.mca_component_name, "psm") != 0) { /* if mtl is not PSM then back down priority, and require the user to */ /* specify
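The check George quotes keys the cm PML's priority off the name of the MTL component that won selection: only "psm" keeps cm elevated; any other MTL backs the priority down so ob1 stays the default. A minimal standalone sketch of that decision (the helper name and priority handling are illustrative, not taken from the ob1/cm sources):

    #include <string.h>

    /* Hypothetical helper, not from the OMPI tree: given the name of the MTL
     * component that won selection, pick the cm PML's priority.  Mirrors the
     * strcmp check quoted above -- anything other than "psm" backs down. */
    static int cm_priority_for_mtl(const char *mtl_name, int high, int low)
    {
        return (0 == strcmp(mtl_name, "psm")) ? high : low;
    }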

Re: [OMPI devel] 1.3 PML default choice

2009-01-13 Thread Brian Barrett
George - I don't care what we end up doing, but what you state is wrong. We do not use the CM for all other MTLs by default. PSM is the *ONLY* MTL that will cause CM to be used by default. Portals still falls back to OB1 by default. Again, don't care, don't want to change, just want

Re: [OMPI devel] 1.3 PML default choice

2009-01-13 Thread George Bosilca
This topic was raised on the mailing list quite a few times. There is a major difference between the PSM and the MX support. For PSM there is just an MTL, which makes everything a lot simpler. The problem with MX is that we have an MTL and a BTL. In order to figure out which one to use, we

Re: [OMPI devel] 1.3 PML default choice

2009-01-13 Thread Jeff Squyres
Thanks Brian -- I updated the README here and will CMR it over to v1.3: https://svn.open-mpi.org/trac/ompi/changeset/20265 On Jan 13, 2009, at 8:18 PM, Brian Barrett wrote: On Jan 13, 2009, at 5:48 PM, Patrick Geoffray wrote: Jeff Squyres wrote: Gaah! I specifically asked Patrick and

Re: [OMPI devel] 1.3 PML default choice

2009-01-13 Thread Brian Barrett
On Jan 13, 2009, at 5:48 PM, Patrick Geoffray wrote: Jeff Squyres wrote: Gaah! I specifically asked Patrick and George about this and they said that the README text was fine. Grr... When I looked at that time, I vaguely remember that _both_ PMLs were initialized but CM was eventually

Re: [OMPI devel] autosizing the shared memory backing file

2009-01-13 Thread George Bosilca
The simple answer is you can't. The mpool is loaded before the BTLs and on Linux the loader uses the RTLD_NOW flag (i.e. all symbols have to be defined or the dlopen call will fail). Moreover, there is no way in Open MPI to exchange information between components except a global variable or
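The RTLD_NOW point is the crux: every symbol a component references must resolve at dlopen time, so a component loaded early (mpool_sm) cannot directly reference a symbol that only a later-loaded component (btl_sm) would provide. A small standalone illustration of that loader behavior (plugin.so and its missing symbol are placeholders, not OMPI components):

    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        /* Assume ./plugin.so references a symbol nothing loaded so far provides. */
        void *now = dlopen("./plugin.so", RTLD_NOW);    /* fails: eager resolution */
        if (NULL == now) {
            printf("RTLD_NOW: %s\n", dlerror());
        }
        void *lazy = dlopen("./plugin.so", RTLD_LAZY);  /* may load; function symbols
                                                           are resolved on first call */
        if (NULL != lazy) {
            dlclose(lazy);
        }
        return 0;
    }

(Compile with -ldl on Linux.)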

Re: [OMPI devel] reduce_scatter bug with hierarch

2009-01-13 Thread Jeff Squyres
Thanks for digging into this. Can you file a bug? Let's mark it for v1.3.1. I say 1.3.1 instead of 1.3.0 because this *only* affects hierarch, and since hierarch isn't currently selected by default (you must specifically elevate hierarch's priority to get it to run), there's no danger

Re: [OMPI devel] FLOSS Weekly and comment about Mercurial

2009-01-13 Thread Jeff Squyres
It looks like the TracMercurial plugin does most of what is on my wish list -- the IU sysadmins graciously installed a sandbox version of it today that is feeding off the official HG mirror of OMPI's SVN (so it updates about once an hour): https://svn.open-mpi.org/trac/ompi_hg/

Re: [OMPI devel] 1.3 PML default choice

2009-01-13 Thread Patrick Geoffray
Jeff Squyres wrote: Gaah! I specifically asked Patrick and George about this and they said that the README text was fine. Grr... When I looked at that time, I vaguely remember that _both_ PMLs were initialized but CM was eventually used because it was the last one. It looked broken, but it

[OMPI devel] autosizing the shared memory backing file

2009-01-13 Thread Eugene Loh
With the sm BTL, there is a file that each process mmaps in for shared memory. I'm trying to get mpool_sm to size the file appropriately. So, I would like mpool_sm to call some mca_btl_sm function that provides a good guess of the size. (mpool_sm creates and mmaps the file, but the size
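The kind of guess such a hook would return is roughly "per-pair FIFO and fragment space, times the square of the local process count, plus a fixed control region." A hypothetical sketch of that arithmetic (this is not an existing mca_btl_sm function; every name, term, and scaling assumption is illustrative only):

    #include <stddef.h>

    /* Hypothetical sizing guess for the sm backing file. */
    static size_t sm_backing_file_size_guess(size_t local_procs,
                                             size_t fifo_bytes_per_pair,
                                             size_t eager_frag_bytes,
                                             size_t frags_per_peer,
                                             size_t control_bytes)
    {
        size_t per_pair = fifo_bytes_per_pair + frags_per_peer * eager_frag_bytes;
        /* assume each local process keeps a FIFO and fragment pool per local peer */
        return control_bytes + local_procs * local_procs * per_pair;
    }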

Re: [OMPI devel] 1.3 PML default choice

2009-01-13 Thread Jeff Squyres
Gaah! I specifically asked Patrick and George about this and they said that the README text was fine. Grr... I'll update. On Jan 13, 2009, at 3:38 PM, Bogdan Costescu wrote: On Tue, 13 Jan 2009, Brian W. Barrett wrote: The bottom line, however, is that the OB1 PML will be the default

Re: [OMPI devel] 1.3rc4 AUTHORS nit

2009-01-13 Thread Paul H. Hargrove
OK, fair enough answer. -Paul Jeff Squyres wrote: He hasn't committed anything while he has been at Myricom. On Jan 13, 2009, at 5:57 PM, Paul H. Hargrove wrote: AUTHORS in 1.3rc4 doesn't list csbell's current affiliation (Myricom). -Paul -- Paul H. Hargrove

Re: [OMPI devel] 1.3rc4 AUTHORS nit

2009-01-13 Thread Jeff Squyres
He hasn't committed anything while he has been at Myricom. On Jan 13, 2009, at 5:57 PM, Paul H. Hargrove wrote: AUTHORS in 1.3rc4 doesn't list csbell's current affiliation (Myricom). -Paul -- Paul H. Hargrove phhargr...@lbl.gov Future Technologies Group

[OMPI devel] 1.3rc4 AUTHORS nit

2009-01-13 Thread Paul H. Hargrove
AUTHORS in 1.3rc4 doesn't list csbell's current affiliation (Myricom). -Paul -- Paul H. Hargrove phhargr...@lbl.gov Future Technologies Group Tel: +1-510-495-2352 HPC Research Department Fax: +1-510-486-6900 Lawrence Berkeley National

Re: [OMPI devel] 1.3rc4 README "nit"

2009-01-13 Thread Jeff Squyres
Thanks Paul! https://svn.open-mpi.org/trac/ompi/changeset/20260 On Jan 13, 2009, at 4:44 PM, Paul H. Hargrove wrote: Again, sorry for the last-minute input. I noticed the following in README: - Open MPI includes support for a wide variety of supplemental hardware and software package.

Re: [OMPI devel] openmpi-1.3rc4 build failure with qsnet4.30

2009-01-13 Thread George Bosilca
Paul, Thanks for noticing the Elan problem. It appears we missed one patch in the 1.3 branch (https://svn.open-mpi.org/trac/ompi/changeset/20122). I'll file a CMR asap. Thanks, george. On Jan 13, 2009, at 16:31, Paul H. Hargrove wrote: Since it looks like you guys are very close to

[OMPI devel] 1.3rc4 README "nit"

2009-01-13 Thread Paul H. Hargrove
Again, sorry for the last-minute input. I noticed the following in README: - Open MPI includes support for a wide variety of supplemental hardware and software package. When configuring Open MPI, you may need to supply additional flags to the "configure" script in order to tell Open MPI

[OMPI devel] openmpi-1.3rc4 build failure with qsnet4.30

2009-01-13 Thread Paul H. Hargrove
Since it looks like you guys are very close to release, I just grabbed the 1.3rc4 tarball to give it a spin. Unfortunately, the elan BTL is not building: $ ../configure --prefix= CC= CXX=g++-4.3.2 FC= ... $ make ... Making all in mca/btl/elan make[2]: Entering directory

Re: [OMPI devel] 1.3 PML default choice

2009-01-13 Thread Bogdan Costescu
On Tue, 13 Jan 2009, Brian W. Barrett wrote: The bottom line, however, is that the OB1 PML will be the default *UNLESS* the PSM (PathScale/Qlogic) MTL can be chosen, in which case the CM PML is used by default. OK, then the README file is wrong or I'm not reading it properly... In the section

Re: [OMPI devel] -display-map

2009-01-13 Thread Ralph Castain
Hmmm...well, I can't do either for 1.3.0 as it is departing this afternoon. The first option would be very hard to do. I would have to expose the display-map option across the code base and check it prior to printing anything about resolving node names. I guess I should ask: do you only

Re: [OMPI devel] 1.3 PML default choice

2009-01-13 Thread Brian W. Barrett
The selection logic for the PML is very confusing and doesn't follow the standard priority selection. The reasons for this are convoluted and not worth discussing here. The bottom line, however, is that the OB1 PML will be the default *UNLESS* the PSM (PathScale/Qlogic) MTL can be chosen, in

Re: [OMPI devel] 1.3 PML default choice

2009-01-13 Thread Bogdan Costescu
On Tue, 13 Jan 2009, Tim Mattox wrote: The cm PML does not use BTLs..., only MTLs, so ... the BTL selection is ignored. OK, thanks for clarifying this bit, but... The README for 1.3b2 specifies that CM is now chosen if possible; in my trials, when I specify CM+BTL, it doesn't complain and

Re: [OMPI devel] OpenMPI question

2009-01-13 Thread Jeff Squyres
On Jan 13, 2009, at 7:37 AM, Alex A. Granovsky wrote: Am I correct assuming that OpenMPI memory registration/cache module is completely broken by design on any 32-bit system allowing physical address space larger than 4 GB, and especially when compiled for 32-bit under 64-bit OS (e.g., Linux)?

Re: [OMPI devel] size of shared-memory backing file + maffinity

2009-01-13 Thread Eugene Loh
Lenny Verkhovsky wrote: Actually the size is supposed to be the same, Yes, I would think that that is how it is supposed to work. It is just supposed to bind the process to its closer memory node, instead of leaving it to the OS. mpool_sm_module.c:82: opal_maffinity_base_bind(, 1,

Re: [OMPI devel] -display-map

2009-01-13 Thread Greg Watson
Ralph, The XML is looking better now, but there is still one problem. To be valid, there needs to be only one root element, but currently you don't have any (or many). So rather than:
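For reference, a well-formed XML document needs exactly one enclosing element, so the separate map and node-resolution fragments would all sit under a single wrapper. The shape below is illustrative only; the element names are not claimed to be mpirun's actual output tags:

    <mpirun-output>
      <map> ... </map>
      <noderesolve> ... </noderesolve>
    </mpirun-output>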

[OMPI devel] reduce_scatter bug with hierarch

2009-01-13 Thread Edgar Gabriel
I just debugged the Reduce_scatter bug mentioned previously. The bug is unfortunately not in hierarch, but in tuned. Here is the code snippet causing the problems: int reduce_scatter (, mca_coll_base_module_t *module) { ... err = comm->c_coll.coll_reduce (, module) ... } but
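The mistake is that tuned forwards its own module handle to whatever component actually owns comm->c_coll.coll_reduce, which may be a different module (hierarch, here). A toy, self-contained model of the bug and of the shape of the fix that went in as r20267 -- pass the module registered alongside the function pointer, not the caller's own; none of this is the actual tuned or hierarch code:

    #include <stdio.h>

    /* Toy model: each collective entry point on a communicator is stored
     * together with the module that registered it. */
    typedef struct module { const char *name; } module_t;

    typedef struct comm_coll {
        void (*coll_reduce)(module_t *module);
        module_t *coll_reduce_module;        /* module that owns coll_reduce */
    } comm_coll_t;

    static void hierarch_reduce(module_t *m) { printf("reduce run with %s state\n", m->name); }

    int main(void)
    {
        module_t tuned = { "tuned" }, hierarch = { "hierarch" };
        comm_coll_t c = { hierarch_reduce, &hierarch };

        c.coll_reduce(&tuned);                  /* the bug: caller passes its own module */
        c.coll_reduce(c.coll_reduce_module);    /* the fix: pass the paired module */
        return 0;
    }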

Re: [OMPI devel] Open MPI v1.3rc4 has been posted

2009-01-13 Thread Jeff Squyres
Per the teleconf this morning: 1. Cisco sanity checks on 1.3rc4 look good 2. Cisco MTT failures that I saw were mostly due to: - coll hierarch is failing with intel test Reduce_scatter_user_c - intel tests failing when OMPI configured --without-mpi-param-check The first of which is

[OMPI devel] Open MPI v1.2.9rc2 has been posted

2009-01-13 Thread Tim Mattox
Hi All, The second release candidate of Open MPI v1.2.9 is now available: http://www.open-mpi.org/software/ompi/v1.2/ Please run it through its paces as best you can, if you care. -- Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/ tmat...@gmail.com || timat...@open-mpi.org

Re: [OMPI devel] 1.3 PML default choice

2009-01-13 Thread Tim Mattox
Hi Bogdan, Sorry for such a late reply to your e-mail. Glad to hear that the performance anomaly you mentioned below is now gone with 1.3rc3. But I noticed that we either didn't explain something well enough, or not at all... The cm PML does not use BTLs..., only MTLs, so your suggested

Re: [OMPI devel] [Pkg-openmpi-maintainers] Building with rpath disabled

2009-01-13 Thread Jeff Squyres
See this thread on the pkg-openmpi-maintainers list: http://lists.alioth.debian.org/pipermail/pkg-openmpi-maintainers/2009-January/001278.html On Jan 13, 2009, at 12:52 PM, Ralf Wildenhues wrote: Hello Jeff, * Jeff Squyres wrote on Tue, Jan 13, 2009 at 03:39:28PM CET: On Jan 13, 2009,

[OMPI devel] Open MPI v1.3rc4 has been posted

2009-01-13 Thread Tim Mattox
Hi All, The fourth release candidate of Open MPI v1.3 is now available: http://www.open-mpi.org/software/ompi/v1.3/ Please run it through its paces as best you can. Anticipated release of 1.3 is tonight/tomorrow. -- Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/ tmat...@gmail.com ||

Re: [OMPI devel] [Pkg-openmpi-maintainers] Building with rpath disabled

2009-01-13 Thread Ralf Wildenhues
Hello Jeff, * Jeff Squyres wrote on Tue, Jan 13, 2009 at 03:39:28PM CET: > On Jan 13, 2009, at 4:54 AM, Manuel Prinz wrote: >> >> You have to pass --disable-rpath explicitly. Building with rpath is >> still the default. I verified by building without passing any option >> to configure and the

Re: [OMPI devel] RFC: Component-izing MPI_Op

2009-01-13 Thread Jeff Squyres
On the call today, no one had any objections to bringing this stuff to the trunk. v1.2.9 and v1.3.0 releases have a higher priority, so I'll bring this stuff over to the trunk when those two releases are done (hopefully tomorrow!). On Jan 10, 2009, at 2:21 PM, Jeff Squyres wrote: FWIW,

Re: [OMPI devel] OpenMPI rpm build 1.3rc3r20226 build failed

2009-01-13 Thread Lenny Verkhovsky
I don't want to move the changes (default value of the flag), since there are important people for whom it works :) I also think that this is a VT issue, but I guess we are the only ones who experience the errors. We can now override these params from the environment as a workaround; Mike committed

Re: [OMPI devel] OpenMPI rpm build 1.3rc3r20226 build failed

2009-01-13 Thread Jeff Squyres
I'm still guessing that this is a distro / compiler issue -- I can build with the default flags just fine...? Can you specify what distro / compiler you were using? Also, if you want to move the changes that have been made to buildrpm.sh to the v1.3 branch, just file a CMR. That file is

Re: [OMPI devel] [Pkg-openmpi-maintainers] Building with rpath disabled

2009-01-13 Thread Jeff Squyres
Just for the web archives: per some off-list discussion, we decided not to take the patch because the Debian guys have a simpler workaround for what they want. On Jan 13, 2009, at 4:54 AM, Manuel Prinz wrote: Am Montag, den 12.01.2009, 18:04 -0500 schrieb Jeff Squyres: I don't see much

Re: [OMPI devel] FLOSS Weekly and comment about Mercurial

2009-01-13 Thread Jeff Squyres
On Jan 12, 2009, at 11:40 PM, Paul Franz wrote: I will see what I can do. Many thanks! I'm going to take full advantage of your offer and ask for the moon, even though it's a large list. :-) BTW, what kind of integration are you looking for? Do you just want the changeset to be logged

Re: [OMPI devel] Open MPI v1.3rc3 has been posted

2009-01-13 Thread Jeff Squyres
Here's the diff from the NEWS files in the two tarballs (note that some of the items are listed in the [unreleased] 1.2.9 section, meaning that the fixes were applied to both the 1.2 series and 1.3): --- openmpi-1.3rc2/NEWS 2008-12-02 13:50:46.0 -0500 +++ openmpi-1.3rc3/NEWS

[OMPI devel] OpenMPI question

2009-01-13 Thread Alex A. Granovsky
Dear OpenMPI developers, Am I correct assuming that OpenMPI memory registration/cache module is completely broken by design on any 32-bit system allowing physical address space larger than 4 GB, and especially when compiled for 32-bit under 64-bit OS (e.g., Linux)? Thanks so much! Best

Re: [OMPI devel] OpenMPI rpm build 1.3rc3r20226 build failed

2009-01-13 Thread Lenny Verkhovsky
It seems that setting use_default_rpm_opt_flags to 0 solves the problem. Maybe the VT developers should take a look at it. Lenny. On Sun, Jan 11, 2009 at 2:40 PM, Jeff Squyres wrote: > This sounds like a distro/compiler version issue. > > Can you narrow down the issue at all? >

Re: [OMPI devel] Open MPI v1.3rc3 has been posted

2009-01-13 Thread Gregor Dschung
Hi, could you please outline the changes between RC2 and RC3? Regards, Gregor > Hi All, > The "third" release candidate of v1.3 is now up on the website: > http://www.open-mpi.org/software/ompi/v1.3/ > Please run it through its paces as best you can

Re: [OMPI devel] [Pkg-openmpi-maintainers] Building with rpath disabled

2009-01-13 Thread Manuel Prinz
Am Montag, den 12.01.2009, 18:04 -0500 schrieb Jeff Squyres: > I don't see much harm in including this as long as rpath builds are > still the default. If there's a non-default option to disable rpath > builds, that would be fine with me. > > Does this patch disable rpath by default, or do

Re: [OMPI devel] size of shared-memory backing file + maffinity

2009-01-13 Thread Lenny Verkhovsky
Actually the size is supposed to be the same. It is just supposed to bind the process to its closer memory node, instead of leaving it to the OS. see: mpool_sm_module.c:82: opal_maffinity_base_bind(, 1, mpool_sm->mem_node); Best regards Lenny. On Mon, Jan 12, 2009 at 10:02 PM, Eugene Loh