George, I suggest that you file a CMR for r20267 and we can
go from there. If it makes 1.3 it makes it, otherwise we have
it ready for 1.3.1. At this point the earliest 1.3 will go out is
Wednesday late morning (presuming I'm the one moving
the bits), and is more likely to hit the website in the
Let's debate tomorrow when people are around, but first you have to
file a CMR... :-)
On Jan 13, 2009, at 10:28 PM, George Bosilca wrote:
Unfortunately, this pinpoints the fact that we didn't test the
collective module mixing enough. I went over the tuned collective
functions and changed all instances to use the correct module
information. It is now on the trunk, revision 20267. Simultaneously, I
checked that all other
I ran the 1.2.9rc2 tarball through a full gamut of MTT and gave it a
thumbs up.
On Jan 13, 2009, at 1:35 PM, Tim Mattox wrote:
Hi All,
The second release candidate of Open MPI v1.2.9 is now available:
http://www.open-mpi.org/software/ompi/v1.2/
Please run it through its paces as best you
That's pretty weird, but you're right. Here is the code that does
exactly what you state.
} else if (strcmp(ompi_mtl_base_selected_component->mtl_version.mca_component_name,
                  "psm") != 0) {
    /* if mtl is not PSM then back down priority, and require the user to */
    /* specify
George -
I don't care what we end up doing, but what you state is wrong. We do
not use the CM for all other MTLs by default. PSM is the *ONLY* MTL
that will cause CM to be used by default. Portals still falls back to
OB1 by default. Again, don't care, don't want to change, just want
This topic was raised on the mailing list quite a few times. There is
a major difference between the PSM and the MX support. For PSM there
is just an MTL, which makes everything a lot simpler. The problem with
MX is that we have an MTL and a BTL. In order to figure out which one
to use, we
Thanks Brian -- I updated the README here and will CMR it over to v1.3:
https://svn.open-mpi.org/trac/ompi/changeset/20265
On Jan 13, 2009, at 8:18 PM, Brian Barrett wrote:
On Jan 13, 2009, at 5:48 PM, Patrick Geoffray wrote:
Jeff Squyres wrote:
Gaah! I specifically asked Patrick and George about this and they
said that the README text was fine. Grr...
When I looked at that time, I vaguely remember that _both_ PMLs were
initialized but CM was eventually
The simple answer is you can't. The mpool is loaded before the BTLs,
and on Linux the loader uses the RTLD_NOW flag (i.e., all symbols have
to be defined or the dlopen call will fail).
Moreover, there is no way in Open MPI to exchange information between
components except a global variable or
Thanks for digging into this. Can you file a bug? Let's mark it for
v1.3.1.
I say 1.3.1 instead of 1.3.0 because this *only* affects hierarch, and
since hierarch isn't currently selected by default (you must
specifically elevate hierarch's priority to get it to run), there's no
danger
It looks like the TracMercurial plugin does most of what is on my wish
list -- the IU sysadmins graciously installed a sandbox version of it
today that is feeding off the official HG mirror of OMPI's SVN (so it
updates about once an hour):
https://svn.open-mpi.org/trac/ompi_hg/
Jeff Squyres wrote:
Gaah! I specifically asked Patrick and George about this and they said
that the README text was fine. Grr...
When I looked at that time, I vaguely remember that _both_ PMLs were
initialized but CM was eventually used because it was the last one. It
looked broken, but it
With the sm BTL, there is a file that each process mmaps in for shared
memory.
I'm trying to get mpool_sm to size the file appropriately. So, I would
like mpool_sm to call some mca_btl_sm function that provides a good
guess of the size. (mpool_sm creates and mmaps the file, but the size
Gaah! I specifically asked Patrick and George about this and they
said that the README text was fine. Grr...
I'll update.
On Jan 13, 2009, at 3:38 PM, Bogdan Costescu wrote:
On Tue, 13 Jan 2009, Brian W. Barrett wrote:
The bottom line, however, is that the OB1 PML will be the default
OK, fair enough answer.
-Paul
Jeff Squyres wrote:
He hasn't committed anything while he has been at Myricom.
On Jan 13, 2009, at 5:57 PM, Paul H. Hargrove wrote:
AUTHORS in 1.3rc4 doesn't list csbell's current affiliation (Myricom).
-Paul
--
Paul H. Hargrove
He hasn't committed anything while he has been at Myricom.
On Jan 13, 2009, at 5:57 PM, Paul H. Hargrove wrote:
AUTHORS in 1.3rc4 doesn't list csbell's current affiliation (Myricom).
-Paul
--
Paul H. Hargrove phhargr...@lbl.gov
Future Technologies Group
AUTHORS in 1.3rc4 doesn't list csbell's current affiliation (Myricom).
-Paul
--
Paul H. Hargrove phhargr...@lbl.gov
Future Technologies Group Tel: +1-510-495-2352
HPC Research Department Fax: +1-510-486-6900
Lawrence Berkeley National
Thanks Paul!
https://svn.open-mpi.org/trac/ompi/changeset/20260
On Jan 13, 2009, at 4:44 PM, Paul H. Hargrove wrote:
Again, sorry for the last-minute input.
I noticed the following in README:
- Open MPI includes support for a wide variety of supplemental
hardware and software package.
Paul,
Thanks for noticing the Elan problem. It appears we missed one patch in
the 1.3 branch (https://svn.open-mpi.org/trac/ompi/changeset/20122). I'll
file a CMR ASAP.
Thanks,
george.
On Jan 13, 2009, at 16:31, Paul H. Hargrove wrote:
Since it looks like you guys are very close to
Again, sorry for the last-minute input.
I noticed the following in README:
- Open MPI includes support for a wide variety of supplemental
hardware and software package. When configuring Open MPI, you may
need to supply additional flags to the "configure" script in order
to tell Open MPI
Since it looks like you guys are very close to release, I just grabbed
the 1.3rc4 tarball to give it a spin.
Unfortunately, the elan BTL is not building:
$ ../configure --prefix= CC= CXX=g++-4.3.2 FC=
...
$ make
...
Making all in mca/btl/elan
make[2]: Entering directory
On Tue, 13 Jan 2009, Brian W. Barrett wrote:
The bottom line, however, is that the OB1 PML will be the default
*UNLESS* the PSM (PathScale/Qlogic) MTL can be chosen, in which case
the CM PML is used by default.
OK, then the README file is wrong or I'm not reading it properly... In
the section
Hmmm...well, I can't do either for 1.3.0 as it is departing this
afternoon.
The first option would be very hard to do. I would have to expose the
display-map option across the code base and check it prior to printing
anything about resolving node names. I guess I should ask: do you only
The selection logic for the PML is very confusing and doesn't follow the
standard priority selection. The reasons for this are convoluted and not
worth discussing here. The bottom line, however, is that the OB1 PML will
be the default *UNLESS* the PSM (PathScale/Qlogic) MTL can be chosen, in
On Tue, 13 Jan 2009, Tim Mattox wrote:
The cm PML does not use BTLs..., only MTLs, so
... the BTL selection is ignored.
OK, thanks for clarifying this bit, but...
The README for 1.3b2 specifies that CM is now chosen if possible; in my
trials, when I specify CM+BTL, it doesn't complain and
On Jan 13, 2009, at 7:37 AM, Alex A. Granovsky wrote:
Am I correct assuming that OpenMPI memory registration/cache module
is completely broken by design on any 32-bit system allowing
physical address space larger than 4 GB, and especially when
compiled for 32-bit under 64-bit OS (e.g., Linux)?
Lenny Verkhovsky wrote:
Actually the size is supposed to be the same.
Yes, I would think that that is how it is supposed to work.
It's just supposed to bind the process to its closest memory node,
instead of leaving it to the OS.
mpool_sm_module.c:82: opal_maffinity_base_bind(, 1,
Ralph,
The XML is looking better now, but there is still one problem. To be
valid, there needs to be exactly one root element, but currently you
have none (or several). So rather than:
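(The concrete example is truncated in the archive; as a hypothetical illustration with made-up element names, the point is that output like this is not well-formed XML:)

```xml
<map> ... </map>
<noderesolve> ... </noderesolve>
```

whereas wrapping everything in a single enclosing root element makes it valid:

```xml
<mpirun>
  <map> ... </map>
  <noderesolve> ... </noderesolve>
</mpirun>
```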
I just debugged the Reduce_scatter bug mentioned previously. The bug is
unfortunately not in hierarch, but in tuned.
Here is the code snippet causing the problem:
int reduce_scatter (, mca_coll_base_module_t *module)
{
    ...
    err = comm->c_coll.coll_reduce (, module);
    ...
}
but
Per the teleconf this morning:
1. Cisco sanity checks on 1.3rc4 look good
2. Cisco MTT failures that I saw were mostly due to:
- coll hierarch is failing with intel test Reduce_scatter_user_c
- intel tests failing when OMPI configured --without-mpi-param-check
The first of which is
Hi All,
The second release candidate of Open MPI v1.2.9 is now available:
http://www.open-mpi.org/software/ompi/v1.2/
Please run it through its paces as best you can, if you care.
--
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
tmat...@gmail.com || timat...@open-mpi.org
Hi Bogdan,
Sorry for such a late reply to your e-mail. Glad to hear that the
performance anomaly you mentioned below is now gone with 1.3rc3.
But I noticed that we either didn't explain something well enough, or not
at all... The cm PML does not use BTLs..., only MTLs, so your
suggested
See this thread on the pkg-openmpi-maintainers list:
http://lists.alioth.debian.org/pipermail/pkg-openmpi-maintainers/2009-January/001278.html
On Jan 13, 2009, at 12:52 PM, Ralf Wildenhues wrote:
Hello Jeff,
* Jeff Squyres wrote on Tue, Jan 13, 2009 at 03:39:28PM CET:
On Jan 13, 2009,
Hi All,
The fourth release candidate of Open MPI v1.3 is now available:
http://www.open-mpi.org/software/ompi/v1.3/
Please run it through its paces as best you can.
Anticipated release of 1.3 is tonight/tomorrow.
--
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
tmat...@gmail.com ||
Hello Jeff,
* Jeff Squyres wrote on Tue, Jan 13, 2009 at 03:39:28PM CET:
> On Jan 13, 2009, at 4:54 AM, Manuel Prinz wrote:
>>
>> You have to pass --disable-rpath explicitly. Building with rpath is
>> still the default. I verified by building without passing any option
>> to configure and the
On the call today, no one had any objections to bringing this stuff to
the trunk. v1.2.9 and v1.3.0 releases have a higher priority, so I'll
bring this stuff over to the trunk when those two releases are done
(hopefully tomorrow!).
On Jan 10, 2009, at 2:21 PM, Jeff Squyres wrote:
FWIW,
I don't want to move the changes ( default value of the flag), since
there are important people for whom it works :)
I also think that this is a VT issue, but I guess we are the only ones
who experience the errors.
We can now override these params from the environment as a workaround;
Mike committed
I'm still guessing that this is a distro / compiler issue -- I can
build with the default flags just fine...?
Can you specify what distro / compiler you were using?
Also, if you want to move the changes that have been made to
buildrpm.sh to the v1.3 branch, just file a CMR. That file is
Just for the web archives: per some off-list discussion, we decided
not to take the patch because the Debian guys have a simpler
workaround for what they want.
On Jan 13, 2009, at 4:54 AM, Manuel Prinz wrote:
Am Montag, den 12.01.2009, 18:04 -0500 schrieb Jeff Squyres:
I don't see much
On Jan 12, 2009, at 11:40 PM, Paul Franz wrote:
I will see what I can do.
Many thanks! I'm going to take full advantage of your offer and ask
for the moon, even though it's a large list. :-)
BTW, what kind of integration are you looking for? Do you just want
the changeset to be logged
Here's the diff from the NEWS files in the two tarballs (note that
some of the items are listed in the [unreleased] 1.2.9 section,
meaning that the fixes were applied to both the 1.2 series and 1.3):
--- openmpi-1.3rc2/NEWS 2008-12-02 13:50:46.0 -0500
+++ openmpi-1.3rc3/NEWS
Dear OpenMPI developers,
Am I correct assuming that OpenMPI memory registration/cache module
is completely broken by design on any 32-bit system allowing
physical address space larger than 4 GB, and especially when
compiled for 32-bit under 64-bit OS (e.g., Linux)?
Thanks so much!
Best
it seems that setting use_default_rpm_opt_flags to 0 solves the problem.
Maybe the VT developers should take a look at it.
Lenny.
On Sun, Jan 11, 2009 at 2:40 PM, Jeff Squyres wrote:
> This sounds like a distro/compiler version issue.
>
> Can you narrow down the issue at all?
>
Hi,
could you please outline the changes between RC2 and RC3?
Regards,
Gregor
> Hi All,
> The "third" release candidate of v1.3 is now up on the website:
> http://www.open-mpi.org/software/ompi/v1.3/
> Please run it through its paces as best you can
Am Montag, den 12.01.2009, 18:04 -0500 schrieb Jeff Squyres:
> I don't see much harm in including this as long as rpath builds are
> still the default. If there's a non-default option to disable rpath
> builds, that would be fine with me.
>
> Does this patch disable rpath by default, or do
Actually the size is supposed to be the same.
It's just supposed to bind the process to its closest memory node,
instead of leaving it to the OS.
see:
mpool_sm_module.c:82: opal_maffinity_base_bind(, 1, mpool_sm->mem_node);
Best regards
Lenny.
On Mon, Jan 12, 2009 at 10:02 PM, Eugene Loh