Re: [OMPI users] How to keep multiple installations at same time

2014-08-05 Thread David Turner

I checked with my colleague, who is one of the
module developers.  His response:

> That's a surprise to me?!
> I will admit that I'm a little slow on releases,
> but it's still quite active.

On 8/5/14 11:39 AM, Fabricio Cannini wrote:

On 05-08-2014 13:54, Ralph Castain wrote:

Check the repo - hasn't been touched in a very long time


Yes, the cvs repo hasn't been touched in a long long time, but they have
apparently migrated to git.

cvs:
http://modules.cvs.sourceforge.net/viewvc/modules/

git:
http://sourceforge.net/p/modules/git/ci/master/tree/


There is still activity in git, e.g. patches for the newest Tcl version.
It may not be bursting with activity, but I wouldn't call it "dead". Yet. ;)

[ ]'s
___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2014/08/24921.php



--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] current release?

2013-03-13 Thread David Turner

Hi,

On Feb 21, Jeff Squyres announced Open MPI 1.6.4.  However,
on the Open MPI home page, 1.6.3 is still indicated as the
current release.  Going to the download page shows 1.6.4 as
the current release, so I think the problem is isolated to
the home page.  Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] limiting tasks/ranks

2012-11-01 Thread David Turner

Hi,

Is there a way to limit the number of tasks started by mpirun?
For example, on our 48-core SMP, I'd like to limit MPI jobs to
a maximum of 12 tasks.  That is, "mpirun -np 16 ..." would
return an error.  Note that this is a strictly interactive
system; no batch environment available.

I've just quickly scanned the MCA parameters:

ompi_info --param all all

and couldn't find the answer to my question.
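
In the meantime, the only workaround I can think of is a site wrapper
around the real mpirun that rejects large task counts up front.  A rough
sketch follows; the path, the limit, and the option handling are only
illustrative (it only catches the simple -np/-n forms):

  #!/bin/sh
  # site wrapper for mpirun; the real binary lives at a hypothetical path
  REAL_MPIRUN=/usr/local/openmpi/bin/mpirun.real
  MAX_NP=12
  np=0
  prev=""
  for arg in "$@"; do
      case "$prev" in
          -np|-n|--np) np=$arg ;;   # value following the task-count flag
      esac
      prev=$arg
  done
  if [ "$np" -gt "$MAX_NP" ] 2>/dev/null; then
      echo "mpirun: interactive jobs are limited to $MAX_NP tasks" >&2
      exit 1
  fi
  exec "$REAL_MPIRUN" "$@"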

Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


Re: [OMPI users] problems with 1.6

2012-05-15 Thread David Turner

Hi Ralph,

Sorry for the false alarm, and thanks for the tip:


... version confusion where the mpirun being used doesn't match the backend 
daemons.


Yes, my test environment was wonky.  All is well now.
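
For anyone who trips over the same ORTE_ERROR_LOG messages: the sanity
check that would have saved me some time was simply confirming that the
mpirun in my PATH and the installation on the compute nodes are the same
build, e.g.:

  % which mpirun
  % mpirun --version
  % ompi_info | head -5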


On May 14, 2012, at 3:41 PM, David Turner wrote:

...

[c0667:24962] [[39579,1],11] ORTE_ERROR_LOG: Data unpack had inadequate space 
in file util/nidmap.c at line 118


--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] problems with 1.6

2012-05-14 Thread David Turner

Hi all,

I am having troubles with the newly-available 1.6 release (tar.gz).
I built it with my "normal" configure options, with no obvious
configure or make errors.  I used both PGI 12.4, and GCC 4.7.0, under
Scientific Linux 5.5.

I then compiled my "normal" matrix-multiply test case.  Upon execution,
I get (with either compiler):

[c0667:24962] [[39579,1],11] ORTE_ERROR_LOG: Data unpack had inadequate 
space in file util/nidmap.c at line 118
[c0667:24962] [[39579,1],11] ORTE_ERROR_LOG: Data unpack had inadequate 
space in file ess_env_module.c at line 174
[c0667:24966] [[39579,1],15] ORTE_ERROR_LOG: Data unpack had inadequate 
space in file util/nidmap.c at line 118
[c0667:24966] [[39579,1],15] ORTE_ERROR_LOG: Data unpack had inadequate 
space in file ess_env_module.c at line 174

...
It looks like orte_init failed for some reason; ...
...
It looks like MPI_INIT failed for some reason; ...

I can provide additional details if needed, but again:  I did nothing
different than what I have done with previous OMPI and compiler
releases.  Thoughts?  Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] open-mpi.org site

2012-05-07 Thread David Turner

Hi all,

Currently getting "You don't have permission to access / on this
server" on the www.open-mpi-org website.

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


Re: [OMPI users] UC EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-04 Thread David Turner

Indeed, my terminology is inexact.  I believe you are correct; our
diskless nodes use tmpfs, not ramdisk.  Thanks for the clarification!

On 11/4/11 11:00 AM, Rushton Martin wrote:

There appears to be some confusion about ramdisks and tmpfs.  A ramdisk
sets aside a fixed amount of memory for its exclusive use, so that a
file being written to ramdisk goes first to the cache, then to ramdisk,
and may exist in both for some time.  tmpfs however opens up the cache
to programs so that a file being written goes to cache and stays there.
The "size" of a tmpfs pseudo-disk is the maximum it can grow to (which
according to the mount man page defaults to 50% of memory).  Hence only
enough memory to hold the data is actually used which ties up with David
Turner's figures.

You can easily tell which method is in use from df.  A traditional
ramdisk will appear as /dev/ramN (N = 0, 1 ...) whereas a tmpfs device
will be a simple name, often tmpfs.  I would guess that the single "-"
in David's df command is precisely this.  On our diskless nodes root
shows as device compute_x86_64, whilst /tmp, /dev/shm and /var/tmp show
as "none".

HTH,

Martin Rushton
HPC System Manager, Weapons Technologies
Tel: 01959 514777, Mobile: 07939 219057
email: jmrush...@qinetiq.com
www.QinetiQ.com
QinetiQ - Delivering customer-focused solutions

Please consider the environment before printing this email.
-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Blosch, Edwin L
Sent: 04 November 2011 16:19
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node
/tmp for OpenMPI usage

OK, I wouldn't have guessed that the space for /tmp isn't actually in
RAM until it's needed.  That's the key piece of knowledge I was missing;
I really appreciate it.  So you can allow /tmp to be reasonably sized,
but if you aren't actually using it, then it doesn't take up 11 GB of
RAM.  And you prevent users from crashing the node by setting mem limit
to 4 GB less than the available memory. Got it.

I agree with your earlier comment:  these are fairly common systems now.
We have program- and owner-specific disks where I work, and after the
program ends, the disks are archived or destroyed.  Before the stateless
configuration option, the entire computer, nodes and switches as well as
disks, were archived or destroyed after each program.  Not too
cost-effective.

Is this a reasonable final summary? :  OpenMPI uses temporary files in
such a way that it is performance-critical that these so-called session
files, used for shared-memory communications, must be "local".  For
state-less clusters, this means the node image must include a /tmp or
/wrk partition, intelligently sized so as not to enable an application
to exhaust the physical memory of the node, and care must be taken not
to mask this in-memory /tmp with an NFS mounted filesystem.  It is not
uncommon for cluster enablers to exclude /tmp from a typical base Linux
filesystem image or mount it over NFS, as a means of providing users
with a larger-sized /tmp that is not limited to a fraction of the node's
physical memory, or to avoid garbage accumulation in /tmp taking up the
physical RAM.  But not having /tmp or mounting it over NFS is not a
viable stateless-node configuration option if you intend to run OpenMPI.
Instead you could have a /bigtmp which is NFS-mounted and a /tmp which
is local, for example. Starting in OpenMPI 1.7.x, shared-memory
communication will no longer go through memory-mapped files, and
vendors/users will no longer need to be vigilant concerning this OpenMPI
performance requirement on stateless node configuration.
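
For concreteness, I gather the "intelligently sized" part usually comes
down to a single fstab line in the node image, something like the one
below, with the size chosen well under physical RAM (the 8g here is
purely illustrative):

  tmpfs   /tmp   tmpfs   size=8g,mode=1777   0 0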


Is that a reasonable summary?

If so, would it be helpful to include this as an FAQ entry under General
category?  Or the "shared memory" category?  Or the "troubleshooting"
category?


Thanks



-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of David Turner
Sent: Friday, November 04, 2011 1:38 AM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node
/tmp for OpenMPI usage

% df /tmp
Filesystem   1K-blocks  Used Available Use% Mounted on
-             12330084    822848  11507236   7% /
% df /
Filesystem   1K-blocks  Used Available Use% Mounted on
-             12330084    822848  11507236   7% /

That works out to 11GB.  But...

The compute nodes have 24GB.  Freshly booted, about 3.2GB is consumed by
the kernel, various services, and the root file system.
At this time, usage of /tmp is essentially nil.

We set user memory limits to 20GB.

I would imagine that the size of the session directories depends on a
number of factors; perhaps the developers can comment on that.  I have
only seen total sizes in the 10s of MBs on our 8-node, 24GB nodes.

As long as they're removed after each job, they don't really compete
with the application for available memory.

Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-04 Thread David Turner

I should have been more careful.  When we first started using OpenMPI,
version 1.4.1, there was a bug that caused session directories to be
left behind.  This was fixed in subsequent releases (and via a patch
for 1.4.1).

Our batch epilogue still removes everything in /tmp that belongs to the
owner of the batch job.  It is invoked after the user's application has
terminated, so the session directories are already gone by that time.

Sorry for the confusion!

On 11/4/11 3:43 AM, TERRY DONTJE wrote:

David, are you saying your jobs consistently leave behind session files
after the job exits? They really shouldn't; even in the case when a job
aborts, I thought mpirun took great pains to clean up after itself. Can
you tell us what version of OMPI you are running with? I could see a
kill -9 of mpirun and the processes below it causing turds to be left
behind.

--td

On 11/4/2011 2:37 AM, David Turner wrote:

% df /tmp
Filesystem 1K-blocks Used Available Use% Mounted on
- 12330084 822848 11507236 7% /
% df /
Filesystem 1K-blocks Used Available Use% Mounted on
- 12330084 822848 11507236 7% /

That works out to 11GB. But...

The compute nodes have 24GB. Freshly booted, about 3.2GB is
consumed by the kernel, various services, and the root file system.
At this time, usage of /tmp is essentially nil.

We set user memory limits to 20GB.

I would imagine that the size of the session directories depends on a
number of factors; perhaps the developers can comment on that. I have
only seen total sizes in the 10s of MBs on our 8-node, 24GB nodes.

As long as they're removed after each job, they don't really compete
with the application for available memory.

On 11/3/11 8:40 PM, Ed Blosch wrote:

Thanks very much, exactly what I wanted to hear. How big is /tmp?

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of David Turner
Sent: Thursday, November 03, 2011 6:36 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node
/tmp
for OpenMPI usage

I'm not a systems guy, but I'll pitch in anyway. On our cluster,
all the compute nodes are completely diskless. The root file system,
including /tmp, resides in memory (ramdisk). OpenMPI puts these
session directories therein. All our jobs run through a batch
system (torque). At the conclusion of each batch job, an epilogue
process runs that removes all files belonging to the owner of the
current batch job from /tmp (and also looks for and kills orphan
processes belonging to the user). This epilogue had to be written
by our systems staff.

I believe this is a fairly common configuration for diskless
clusters.

On 11/3/11 4:09 PM, Blosch, Edwin L wrote:

Thanks for the help. A couple of follow-up questions; maybe this starts to

go outside OpenMPI:


What's wrong with using /dev/shm? I think you said earlier in this
thread

that this was not a safe place.


If the NFS-mount point is moved from /tmp to /work, would a /tmp
magically

appear in the filesystem for a stateless node? How big would it be,
given
that there is no local disk, right? That may be something I have to
ask the
vendor, which I've tried, but they don't quite seem to get the question.


Thanks




-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On

Behalf Of Ralph Castain

Sent: Thursday, November 03, 2011 5:22 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less
node /tmp

for OpenMPI usage



On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:


I might be missing something here. Is there a side-effect or
performance

loss if you don't use the sm btl? Why would it exist if there is a
wholly
equivalent alternative? What happens to traffic that is intended for
another process on the same node?


There is a definite performance impact, and we wouldn't recommend doing

what Eugene suggested if you care about performance.


The correct solution here is to get your sys admin to make /tmp local.
Making

/tmp NFS mounted across multiple nodes is a major "faux pas" in the
Linux
world - it should never be done, for the reasons stated by Jeff.





Thanks


-Original Message-
From: users-boun...@open-mpi.org
[mailto:users-boun...@open-mpi.org] On

Behalf Of Eugene Loh

Sent: Thursday, November 03, 2011 1:23 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node

/tmp for OpenMPI usage


Right. Actually "--mca btl ^sm". (Was missing "btl".)

On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:

I don't tell OpenMPI what BTLs to use. The default uses sm and puts a

session file on /tmp, which is NFS-mounted and thus not a good choice.


Are you suggesting something like --mca ^sm?


-Original Message-
From: users-boun...@open-mpi.org
[mailto:users-boun...@open-mpi.org] On

Behalf Of Eugene Loh

Sent: Thursday, November 03, 2011 12:54 PM
To: us...@open-mpi.org
Subject: Re: [OMPI user

Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-04 Thread David Turner

% df /tmp
Filesystem   1K-blocks  Used Available Use% Mounted on
-             12330084    822848  11507236   7% /
% df /
Filesystem   1K-blocks  Used Available Use% Mounted on
-             12330084    822848  11507236   7% /

That works out to 11GB.  But...

The compute nodes have 24GB.  Freshly booted, about 3.2GB is
consumed by the kernel, various services, and the root file system.
At this time, usage of /tmp is essentially nil.

We set user memory limits to 20GB.

I would imagine that the size of the session directories depends on a
number of factors; perhaps the developers can comment on that.  I have
only seen total sizes in the 10s of MBs on our 8-node, 24GB nodes.

As long as they're removed after each job, they don't really compete
with the application for available memory.

On 11/3/11 8:40 PM, Ed Blosch wrote:

Thanks very much, exactly what I wanted to hear. How big is /tmp?

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of David Turner
Sent: Thursday, November 03, 2011 6:36 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp
for OpenMPI usage

I'm not a systems guy, but I'll pitch in anyway.  On our cluster,
all the compute nodes are completely diskless.  The root file system,
including /tmp, resides in memory (ramdisk).  OpenMPI puts these
session directories therein.  All our jobs run through a batch
system (torque).  At the conclusion of each batch job, an epilogue
process runs that removes all files belonging to the owner of the
current batch job from /tmp (and also looks for and kills orphan
processes belonging to the user).  This epilogue had to be written
by our systems staff.

I believe this is a fairly common configuration for diskless
clusters.

On 11/3/11 4:09 PM, Blosch, Edwin L wrote:

Thanks for the help.  A couple of follow-up questions; maybe this starts to

go outside OpenMPI:


What's wrong with using /dev/shm?  I think you said earlier in this thread

that this was not a safe place.


If the NFS-mount point is moved from /tmp to /work, would a /tmp magically

appear in the filesystem for a stateless node?  How big would it be, given
that there is no local disk, right?  That may be something I have to ask the
vendor, which I've tried, but they don't quite seem to get the question.


Thanks




-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On

Behalf Of Ralph Castain

Sent: Thursday, November 03, 2011 5:22 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp

for OpenMPI usage



On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:


I might be missing something here. Is there a side-effect or performance

loss if you don't use the sm btl?  Why would it exist if there is a wholly
equivalent alternative?  What happens to traffic that is intended for
another process on the same node?


There is a definite performance impact, and we wouldn't recommend doing

what Eugene suggested if you care about performance.


The correct solution here is to get your sys admin to make /tmp local. Making

/tmp NFS mounted across multiple nodes is a major "faux pas" in the Linux
world - it should never be done, for the reasons stated by Jeff.





Thanks


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On

Behalf Of Eugene Loh

Sent: Thursday, November 03, 2011 1:23 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node

/tmp for OpenMPI usage


Right.  Actually "--mca btl ^sm".  (Was missing "btl".)

On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:

I don't tell OpenMPI what BTLs to use. The default uses sm and puts a

session file on /tmp, which is NFS-mounted and thus not a good choice.


Are you suggesting something like --mca ^sm?


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On

Behalf Of Eugene Loh

Sent: Thursday, November 03, 2011 12:54 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node

/tmp for OpenMPI usage


I've not been following closely.  Why must one use shared-memory
communications?  How about using other BTLs in a "loopback" fashion?
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-03 Thread David Turner

I'm not a systems guy, but I'll pitch in anyway.  On our cluster,
all the compute nodes are completely diskless.  The root file system,
including /tmp, resides in memory (ramdisk).  OpenMPI puts these
session directories therein.  All our jobs run through a batch
system (torque).  At the conclusion of each batch job, an epilogue
process runs that removes all files belonging to the owner of the
current batch job from /tmp (and also looks for and kills orphan
processes belonging to the user).  This epilogue had to be written
by our systems staff.
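
The heart of it is nothing exotic; roughly the following, run as root by
the resource manager with the job owner's username as an argument (a
simplified sketch, not our production script):

  #!/bin/sh
  # torque epilogue sketch; $2 is the job owner's username
  JOB_USER=$2
  # remove the user's leftovers (session directories included) from /tmp
  find /tmp -maxdepth 1 -user "$JOB_USER" -exec rm -rf {} +
  # kill any orphaned processes still owned by the user
  pkill -9 -u "$JOB_USER" || true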

I believe this is a fairly common configuration for diskless
clusters.

On 11/3/11 4:09 PM, Blosch, Edwin L wrote:

Thanks for the help.  A couple of follow-up questions; maybe this starts to go 
outside OpenMPI:

What's wrong with using /dev/shm?  I think you said earlier in this thread that 
this was not a safe place.

If the NFS-mount point is moved from /tmp to /work, would a /tmp magically 
appear in the filesystem for a stateless node?  How big would it be, given that 
there is no local disk, right?  That may be something I have to ask the vendor, 
which I've tried, but they don't quite seem to get the question.

Thanks




-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Ralph Castain
Sent: Thursday, November 03, 2011 5:22 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for 
OpenMPI usage


On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:


I might be missing something here. Is there a side-effect or performance loss 
if you don't use the sm btl?  Why would it exist if there is a wholly 
equivalent alternative?  What happens to traffic that is intended for another 
process on the same node?


There is a definite performance impact, and we wouldn't recommend doing what 
Eugene suggested if you care about performance.

The correct solution here is to get your sys admin to make /tmp local. Making /tmp NFS 
mounted across multiple nodes is a major "faux pas" in the Linux world - it 
should never be done, for the reasons stated by Jeff.




Thanks


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Eugene Loh
Sent: Thursday, November 03, 2011 1:23 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for 
OpenMPI usage

Right.  Actually "--mca btl ^sm".  (Was missing "btl".)

On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:

I don't tell OpenMPI what BTLs to use. The default uses sm and puts a session 
file on /tmp, which is NFS-mounted and thus not a good choice.

Are you suggesting something like --mca ^sm?


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Eugene Loh
Sent: Thursday, November 03, 2011 12:54 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for 
OpenMPI usage

I've not been following closely.  Why must one use shared-memory
communications?  How about using other BTLs in a "loopback" fashion?
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] Displaying MAIN in Totalview

2011-03-21 Thread David Turner

Hi,

About a month ago, this topic was discussed with no real resolution:

http://www.open-mpi.org/community/lists/users/2011/02/15538.php

We noticed the same problem (TV does not display the user's MAIN
routine upon initial startup), and contacted the TV developers.
They suggested a simple OMPI code modification, which we implemented
and tested; it seems to work fine.  Hopefully, this capability
can be restored in future releases.

Here is the body of our communication with the TV developers:

--

Interestingly enough, someone else asked this very same question 
recently and I finally dug into it last week and figured out what was 
going on. TotalView publishes a public interface which allows any MPI 
implementor to set things up so that it should work fairly seamless with 
TotalView. I found that one of the defines in the interface is


MPIR_force_to_main

and when we find this symbol defined in mpirun (or orterun in Open MPI's 
case) then we spend a bit more effort to focus the source pane on the 
main routine. As you may guess, this is NOT being defined in OpenMPI 
1.4.2. It was being defined in the 1.2.x builds though, in a routine 
called totalview.c. OpenMPI has been re-worked significantly since then, 
and totalview.c has been replaced by debuggers.c in orte/tools/orterun. 
About line 130 to 140 (depending on any changes since my look at the 
1.4.1 sources) you should find a number of MPIR_ symbols being defined.


struct MPIR_PROCDESC *MPIR_proctable = NULL;
int MPIR_proctable_size = 0;
int MPIR_being_debugged = 0;
volatile int MPIR_debug_state = 0;
volatile int MPIR_i_am_starter = 0;
volatile int MPIR_partial_attach_ok = 1;


I believe you should be able to insert the line:

int MPIR_force_to_main = 0;

into this section, and then the behavior you are looking for should work 
after you rebuild OpenMPI. I haven't yet had the time to do that myself, 
but that was all that existed in the 1.2.x sources, and I know those 
achieved the desired effect. It's quite possible that someone realized
the symbol was initialized but wasn't being used anyplace, so they just
removed it, without realizing we were looking for it in the debugger.
When I pointed this out to the other user, he said he would try it out 
and pass it on to the Open MPI group. I just checked on that thread, and 
didn't see any update, so I passed on the info myself.


--

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


Re: [OMPI users] memory limits on remote nodes

2010-10-12 Thread David Turner

Hi,

Various people contributed:


Isn't it possible to set this up in torque/moab directly? In SGE I would simply 
define h_vmem and it's per slot then; and with a tight integration all Open MPI 
processes will be children of sge_execd and the limit will be enforced.


I could be wrong, but I -think- the issue here is that the soft limits need to 
be set on a per-job basis.


This I also thought, and `qsub -l h_vmem=4G ...` should do it. It can be 
requested on a per job basis (with further limits on a queue level if 
necessary).


Well, this sent me in the right direction.  I believe h_vmem is
SGE-specific, but our torque environment provides the "pmem"
(physical memory) and "pvmem" (virtual memory) resources on
a per-job basis.  These seem to provide exactly the functionality
we need.
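
In practice that just means a per-job request such as the following (the
value is illustrative, and subject to our local queue defaults):

  #PBS -l pvmem=5gb

or equivalently "qsub -l pvmem=5gb ..." on the command line.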

Sorry to bother you with an issue that ended up being independent
of Open MPI!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


Re: [OMPI users] memory limits on remote nodes

2010-10-07 Thread David Turner

Hi Ralph,


There is an MCA param that tells the orted to set its usage limits to the hard 
limit:

     MCA opal: parameter "opal_set_max_sys_limits"
               (current value: <0>, data source: default value)
               Set to non-zero to automatically set any
               system-imposed limits to the maximum allowed

The orted could be used to set the soft limit down from that value on a per-job 
basis, but we didn't provide a mechanism for specifying it. Would be relatively 
easy to do, though.

What version are you using? If I create a patch, would you be willing to test 
it?


1.4.2, with 1.4.1 available, and 1.4.3 waiting in the wings.
I would love to test any patch you could come up with.
The ability to set any valid limit to any valid value,
applied equally to all processes, would go a long way in
making our environment more stable.  Thanks!
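
For the record, I assume turning on the existing parameter is just

  % mpirun --mca opal_set_max_sys_limits 1 ...

or an "opal_set_max_sys_limits = 1" line in the default MCA parameter
file; what's missing is the per-job control of the *soft* limit that
you describe.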


Hi,

We would like to set process memory limits (vmemoryuse, in csh
terms) on remote processes.  Our batch system is torque/moab.

The nodes of our cluster each have 24GB of physical memory, of
which 4GB is taken up by the kernel and the root file system.
Note that these are diskless nodes, so no swap either.

We can globally set the per-process limit to 2.5GB.  This works
fine if applications run "packed":  8 MPI tasks running on each
8-core node, for an aggregate limit of 20GB.  However, if a job
only wants to run 4 tasks, the soft limit can safely be raised
to 5GB.  2 tasks, 10GB.  1 task, the full 20GB.

Upping the soft limit in the batch script itself only affects
the "head node" of the job.  Since limits are not part of the
"environment", I can find no way propagate them to remote nodes.

If I understand how this all works, the remote processes are
started by orted, and therefore inherit its limits.  Is there
any sort of orted configuration that can help here?  Any other
thoughts about how to approach this?

Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] memory limits on remote nodes

2010-10-06 Thread David Turner

Hi,

We would like to set process memory limits (vmemoryuse, in csh
terms) on remote processes.  Our batch system is torque/moab.

The nodes of our cluster each have 24GB of physical memory, of
which 4GB is taken up by the kernel and the root file system.
Note that these are diskless nodes, so no swap either.

We can globally set the per-process limit to 2.5GB.  This works
fine if applications run "packed":  8 MPI tasks running on each
8-core node, for an aggregate limit of 20GB.  However, if a job
only wants to run 4 tasks, the soft limit can safely be raised
to 5GB.  2 tasks, 10GB.  1 task, the full 20GB.
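
In shell terms, what we'd like is the effect of something like the
following at the top of a 4-tasks-per-node job's batch script, except
applied on every node of the job (the value is per-process; bash's
ulimit takes KB, so 5 GB is shown here):

  ulimit -S -v 5242880           # bash/sh
  # or:  limit vmemoryuse 5120m    (csh/tcsh)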

Upping the soft limit in the batch script itself only affects
the "head node" of the job.  Since limits are not part of the
"environment", I can find no way propagate them to remote nodes.

If I understand how this all works, the remote processes are
started by orted, and therefore inherit its limits.  Is there
any sort of orted configuration that can help here?  Any other
thoughts about how to approach this?

Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


Re: [OMPI users] location of ompi libraries

2010-10-05 Thread David Turner

Hi Jeff,

Thanks for the response.  Reviewing my builds, I realized that for
1.4.2, I had configured using

contrib/platform/lanl/tlcc/optimized-nopanasas

per Ralph Castain's suggestion.  That file includes both:

enable_dlopen=no
enable_shared=yes
enable_static=yes

Here is my *real* issue.  I am trying to test Voltaire's Fabric
Collective Accelerator, which extends mca_component_path, and
adds a few additional .so files.  It appears I must have
enable_dlopen=yes for this to work, which makes sense.

I assume that the shared/static settings above result in
*both* .a and .so versions of the ompi libraries getting
built.  I'm not sure if this will affect my ability to
use Voltaire's mca plugins, but I have determined that
simply removing the enable_dlopen=no is not sufficient
to restore all the ompi .so files.  I assume (haven't
tried it yet) that removing the enable_static=yes will
result in the ompi .so files getting created.

I guess I'm just looking for some guidance in the use
of the above options.  I have read many warnings on
the ompi website about trying to link statically.
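
Concretely, the next configure I plan to try (dropping the platform
file's static/dlopen settings) looks roughly like this; the prefix is
just illustrative and the compiler variables are elided:

  ./configure --prefix=/usr/local/openmpi-1.4.2 \
              --enable-shared --disable-static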

Thanks!

On 10/5/10 7:17 AM, Jeff Squyres wrote:

It is more than likely that you compiled Open MPI with --enable-static and/or 
--disable-dlopen.  In this case, all of Open MPI's plugins are slurped up into 
the libraries themselves (e.g., libmpi.so or libmpi.a).  That's why everything 
continues to work properly.


On Oct 4, 2010, at 6:58 PM, David Turner wrote:


Hi,

In Open MPI 1.4.1, the directory lib/openmpi contains about 130
entries, including such things as mca_btl_openib.so.  In my
build of Open MPI 1.4.2, lib/openmpi contains exactly three
items:
libompi_dbg_msgq.a  libompi_dbg_msgq.la  libompi_dbg_msgq.so

I have searched my 1.4.2 installation for mca_btl_openib.so,
to no avail.  And yet, 1.4.2 seems to work "fine".  Is my
installation broken, or is the organization significantly
different between the two versions?  A quick scan of the
release notes didn't help.

Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users






--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] location of ompi libraries

2010-10-04 Thread David Turner

Hi,

In Open MPI 1.4.1, the directory lib/openmpi contains about 130
entries, including such things as mca_btl_openib.so.  In my
build of Open MPI 1.4.2, lib/openmpi contains exactly three
items:
libompi_dbg_msgq.a  libompi_dbg_msgq.la  libompi_dbg_msgq.so

I have searched my 1.4.2 installation for mca_btl_openib.so,
to no avail.  And yet, 1.4.2 seems to work "fine".  Is my
installation broken, or is the organization significantly
different between the two versions?  A quick scan of the
release notes didn't help.

Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] compiler upgrades require openmpi rebuild?

2010-08-30 Thread David Turner

Hi,

We have recently upgraded our default compiler suite from
PGI 10.5 to PGI 10.8.  We use the "module" system to manage
third-party software.  The module for PGI sets PATH and
LD_LIBRARY_PATH.

Using Open MPI 1.4.2, built with PGI 10.5, I have verified
that changing PATH is sufficient for the Open MPI compiler
wrappers to pick up version 10.8 of the PGI compilers.
However, it appears that the 10.5 PGI libraries are "wired"
into the wrappers somehow.  So I get an executable that
has been compiled with PGI 10.8 but linked against PGI 10.5
libraries.

Short of rebuilding Open MPI with PGI 10.8, is there any
(safe, reliable) way to get the compiler wrappers to link
against the PGI 10.8 libraries?  Thanks!
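
For reference, the link flags that are baked into the wrappers can at
least be inspected with their --showme option, e.g.:

  % mpif90 --showme:link

which should show whether the 10.5 library paths come from the wrapper
data itself rather than from my environment.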

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


Re: [OMPI users] problem with -npernode

2010-06-18 Thread David Turner

Hi,

On 06/17/2010 03:34 PM, Ralph Castain wrote:

No more info required - it's a bug. Fixed and awaiting release of 1.4.3.


I downloaded openmpi-1.4.3a1r23261.tar.gz, dated June 9.  It behaves the
same as 1.4.2.  Is there a newer version available for testing?


On Jun 17, 2010, at 3:50 PM, David Turner wrote:


Hi,

Recently, Christopher Maestas reported a problem with -npernode in
Open MPI 1.4.2 ("running a ompi 1.4.2 job with -np versus -npernode").
I have also encountered this problem, with a simple "hello, world"
program:

% mpirun -np 16 ./a.out
myrank, icount = 0   16
myrank, icount = 2   16
myrank, icount = 5   16
myrank, icount = 7   16
myrank, icount = 1   16
myrank, icount = 4   16
myrank, icount = 6   16
myrank, icount = 3   16
myrank, icount = 8   16
myrank, icount = 9   16
myrank, icount =10   16
myrank, icount =12   16
myrank, icount =13   16
myrank, icount =15   16
myrank, icount =11   16
myrank, icount =14   16
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP

% mpirun -np 16 -npernode 8 ./a.out
[c1146:15313] *** Process received signal ***
[c1146:15313] Signal: Segmentation fault (11)
[c1146:15313] Signal code: Address not mapped (1)
[c1146:15313] Failing at address: 0x50
[c1146:15313] *** End of error message ***
Segmentation fault
[c1138:26571] [[62315,0],1] routed:binomial: Connection to lifeline 
[[62315,0],0] lost
 % module swap openmpi openmpi/1.4.1
 % mpirun -np 16 -npernode 8 ./a.out
myrank, icount = 8   16
myrank, icount =13   16
myrank, icount =10   16
myrank, icount =11   16
myrank, icount =15   16
myrank, icount =14   16
myrank, icount =12   16
myrank, icount = 5   16
myrank, icount = 2   16
myrank, icount = 3   16
myrank, icount = 1   16
myrank, icount = 0   16
myrank, icount = 9   16
myrank, icount = 6   16
myrank, icount = 7   16
myrank, icount = 4   16
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP

Compilers are PGI/10.5, OS is Scientific Linux 5.4, resource manager is
torque 2.4.5.  Please let me know if you need more information.  Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] problem with -npernode

2010-06-17 Thread David Turner

Hi,

Recently, Christopher Maestas reported a problem with -npernode in
Open MPI 1.4.2 ("running a ompi 1.4.2 job with -np versus -npernode").
I have also encountered this problem, with a simple "hello, world"
program:

% mpirun -np 16 ./a.out
 myrank, icount = 0   16
 myrank, icount = 2   16
 myrank, icount = 5   16
 myrank, icount = 7   16
 myrank, icount = 1   16
 myrank, icount = 4   16
 myrank, icount = 6   16
 myrank, icount = 3   16
 myrank, icount = 8   16
 myrank, icount = 9   16
 myrank, icount =10   16
 myrank, icount =12   16
 myrank, icount =13   16
 myrank, icount =15   16
 myrank, icount =11   16
 myrank, icount =14   16
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP


% mpirun -np 16 -npernode 8 ./a.out
[c1146:15313] *** Process received signal ***
[c1146:15313] Signal: Segmentation fault (11)
[c1146:15313] Signal code: Address not mapped (1)
[c1146:15313] Failing at address: 0x50
[c1146:15313] *** End of error message ***
Segmentation fault
[c1138:26571] [[62315,0],1] routed:binomial: Connection to lifeline 
[[62315,0],0] lost


 % module swap openmpi openmpi/1.4.1

 % mpirun -np 16 -npernode 8 ./a.out
 myrank, icount = 8   16
 myrank, icount =13   16
 myrank, icount =10   16
 myrank, icount =11   16
 myrank, icount =15   16
 myrank, icount =14   16
 myrank, icount =12   16
 myrank, icount = 5   16
 myrank, icount = 2   16
 myrank, icount = 3   16
 myrank, icount = 1   16
 myrank, icount = 0   16
 myrank, icount = 9   16
 myrank, icount = 6   16
 myrank, icount = 7   16
 myrank, icount = 4   16
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP

Compilers are PGI/10.5, OS is Scientific Linux 5.4, resource manager is
torque 2.4.5.  Please let me know if you need more information.  Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316



[OMPI users] Threading models with openib

2010-06-08 Thread David Turner

Hi all,

Please verify:  if using openib BTL, the only threading model
is MPI_THREAD_SINGLE?

Is there a timeline for full support of MPI_THREAD_MULTIPLE
in Open MPI's openib BTL?
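
For what it's worth, the thread support a given build reports can be
checked with something like

  % ompi_info | grep -i thread

though I realize that only reflects how the library was configured, not
what the openib BTL itself supports.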

Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


Re: [OMPI users] excluding hosts

2010-04-06 Thread David Turner

Hi Ralph,


Are you using a scheduler of some kind? If so, you can add this to your default 
mca param file:


Yes, we are running torque/moab.


orte_allocation_required = 1

This will prevent anyone running without having an allocation. You can also set


Ah.  An "allocation".  Not much info on this on the open-mpi website.
I believe this is what we will want, to prevent mpirun on login nodes.
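
For our own notes, I assume that just means a line in the default
parameter file shipped with the install, i.e. something like:

  echo "orte_allocation_required = 1" >> <prefix>/etc/openmpi-mca-params.conf

with <prefix> replaced by our install prefix.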


rmaps_base_no_schedule_local = 1

which tells mpirun not to schedule any MPI procs on the local node.


In our batch environment, mpirun will be executing on one of the
compute nodes.  That is, we don't have dedicated MOM nodes.
Therefore, I think we will want to schedule (at least) one MPI
task on the same node.  Actually, when somebody wants to run
(for example) 256 tasks packed on 32 8-core nodes, I think we'll
need mpirun to share a *core* with one of the MPI tasks.  The above
option would prevent that, correct?


Does that solve the problem?


I'll give it a try and let you know.  Thanks!


Ralph


On Apr 6, 2010, at 3:28 PM, David Turner wrote:


Hi,

Our cluster has a handful of login nodes, and then a bunch of
compute nodes.  OpenMPI is installed in a global file system
visible from both sets of nodes.  This means users can type
"mpirun" from an interactive prompt, and quickly oversubscribe
the login node.

So, is there a way to explicitly exclude hosts from consideration
for mpirun?  To prevent (what is usually accidental) running
MPI apps on our login nodes?  Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] excluding hosts

2010-04-06 Thread David Turner

Hi,

Our cluster has a handful of login nodes, and then a bunch of
compute nodes.  OpenMPI is installed in a global file system
visible from both sets of nodes.  This means users can type
"mpirun" from an interactive prompt, and quickly oversubscribe
the login node.

So, is there a way to explicitly exclude hosts from consideration
for mpirun?  To prevent (what is usually accidental) running
MPI apps on our login nodes?  Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] IPoIB

2010-04-05 Thread David Turner

Hi all,

Please, some clarification.  I have built Open MPI 1.4.1 against
our IB verbs layer, and all seems well.  But a question has come
up about IPoIB.  While all communications are using the "native"
IB interface (verbs), will mpirun use IPoIB during job launch
and teardown?

If it matters, resource allocation is via torque.
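
To narrow the question a bit: my understanding is that the out-of-band
launch/teardown traffic goes over TCP, and that the interface it uses
can be pinned explicitly with the oob TCP include/exclude parameters,
e.g. (the interface name is just illustrative):

  % mpirun --mca oob_tcp_if_include ib0 ...

but I'd like confirmation that this is the right knob.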

Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


Re: [OMPI users] Questions on /tmp/openmpi-sessions-userid directory

2010-03-07 Thread David Turner

Hi Ralph

> ... that is fixed in the upcoming 1.4.2 release.

Can you say when this release will be generally available?
Proliferating session directories are a problem for us too.

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


Re: [OMPI users] sm btl choices

2010-03-01 Thread David Turner

Hi Ralph,


Which version of OMPI are you using? We know that the 1.2 series was unreliable 
about removing the session directories, but 1.3 and above appear to be quite 
good about it. If you are having problems with the 1.3 or 1.4 series, I would 
definitely like to know about it.

When I was at LANL, I ran a number of tests in exactly this configuration. 
While the sm btl did provide some performance advantage, it wasn't very much 
(the bandwidth was only about 10% greater, and the latency wasn't all that 
different either). I set the default configuration for users to include sm as 
10% isn't something to sneer at, but you could disable it without an enormous 
impact.


I realize I have another question about this.  When you say "exactly"
this configuration, do you mean the mmap files were backed to /tmp
via ramdisk, or to a remote file system over the communications fabric?

We have historically redefined TMPDIR to point somewhere other than
/tmp, and have told our users *never* to use /tmp (if possible).
I suppose that if OMPI cleans up after itself, and we use a
prologue/epilogue, and regular scrubbing, we can keep /tmp under
control.
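
If we do end up relocating things again, I assume the relevant knob is
just the session-directory base: either exporting TMPDIR in the batch
prologue, or pointing OMPI at a different base directly, e.g. (the path
is purely illustrative):

  % mpirun --mca orte_tmpdir_base /scratch/$USER ...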


Another option would be to run an epilog that hammers the session directory. 
That's what LANL does, even though we didn't see much trouble with cleanup 
starting with the 1.3 series (still have a bunch of users stuck on 1.2). 
Depending on what environment you are running, you might contact folks there 
and get a copy of their epilog script.


On Mar 1, 2010, at 1:42 AM, David Turner wrote:


Hi all,

Running on a large cluster of 8-core nodes.  I understand
that the SM BTL is a "good thing".  But I'm curious about
its use of memory-mapped files.  I believe these files will
be in $TMPDIR, which defaults to /tmp.

In our cluster, the compute nodes are stateless, so /tmp
is actually in RAM.  Keeping memory-mapped "files" in
memory seems kind of circular, although I know little
about these things.  A bigger problem is that it appears
OMPI does not remove the files upon completion.

Another option is to redefine $TMPDIR to point to a
"real" file system.  In our cluster, all the available
file systems are accessed over the IB fabric.  So it
seems that there will be IB traffic, even though the
point of the SM BTL is to avoid this traffic.

Given the above two constraints, might it just be
better to disable the SM BTL entirely, and use the
IB BTL even within a node?  Of course, the "self"
BTL should still be used if appropriate.

Any thoughts clarifying these issues would be
greatly appreciated.  Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


Re: [OMPI users] sm btl choices

2010-03-01 Thread David Turner

On 3/1/10 1:51 AM, Ralph Castain wrote:

Which version of OMPI are you using? We know that the 1.2 series was unreliable 
about removing the session directories, but 1.3 and above appear to be quite 
good about it. If you are having problems with the 1.3 or 1.4 series, I would 
definitely like to know about it.


Oops; sorry!  OMPI 1.4.1, compiled with PGI 10.0 compilers,
running on Scientific Linux 5.4, ofed 1.4.2.

The session directories are *frequently* left behind.  I have
not really tried to characterize under what circumstances they
are removed. But please confirm:  they *should* be removed by
OMPI.


When I was at LANL, I ran a number of tests in exactly this configuration. 
While the sm btl did provide some performance advantage, it wasn't very much 
(the bandwidth was only about 10% greater, and the latency wasn't all that 
different either). I set the default configuration for users to include sm as 
10% isn't something to sneer at, but you could disable it without an enormous 
impact.


I'd prefer to provide as much performance as possible, also.


Another option would be to run an epilog that hammers the session directory. 
That's what LANL does, even though we didn't see much trouble with cleanup 
starting with the 1.3 series (still have a bunch of users stuck on 1.2). 
Depending on what environment you are running, you might contact folks there 
and get a copy of their epilog script.


Yes, we are already planning our prologues and epilogues, just
haven't implemented them yet.  Even if I can find and fix a
reason why OMPI is currently not doing this, we will probably
do it in an epilogue anyway.

Thanks for your help!


On Mar 1, 2010, at 1:42 AM, David Turner wrote:


Hi all,

Running on a large cluster of 8-core nodes.  I understand
that the SM BTL is a "good thing".  But I'm curious about
its use of memory-mapped files.  I believe these files will
be in $TMPDIR, which defaults to /tmp.

In our cluster, the compute nodes are stateless, so /tmp
is actually in RAM.  Keeping memory-mapped "files" in
memory seems kind of circular, although I know little
about these things.  A bigger problem is that it appears
OMPI does not remove the files upon completion.

Another option is to redefine $TMPDIR to point to a
"real" file system.  In our cluster, all the available
file systems are accessed over the IB fabric.  So it
seems that there will be IB traffic, even though the
point of the SM BTL is to avoid this traffic.

Given the above two constraints, might it just be
better to disable the SM BTL entirely, and use the
IB BTL even within a node?  Of course, the "self"
BTL should still be used if appropriate.

Any thoughts clarifying these issues would be
greatly appreciated.  Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] sm btl choices

2010-03-01 Thread David Turner

Hi all,

Running on a large cluster of 8-core nodes.  I understand
that the SM BTL is a "good thing".  But I'm curious about
its use of memory-mapped files.  I believe these files will
be in $TMPDIR, which defaults to /tmp.

In our cluster, the compute nodes are stateless, so /tmp
is actually in RAM.  Keeping memory-mapped "files" in
memory seems kind of circular, although I know little
about these things.  A bigger problem is that it appears
OMPI does not remove the files upon completion.

Another option is to redefine $TMPDIR to point to a
"real" file system.  In our cluster, all the available
file systems are accessed over the IB fabric.  So it
seems that there will be IB traffic, even though the
point of the SM BTL is to avoid this traffic.

Given the above two constraints, might it just be
better to disable the SM BTL entirely, and use the
IB BTL even within a node?  Of course, the "self"
BTL should still be used if appropriate.
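
If we did go that route, I assume it would just be a matter of either

  % mpirun --mca btl ^sm ...

per job, or an equivalent "btl = ^sm" line in the default MCA parameter
file to make it the site default; my understanding is the ^ form leaves
self and openib in place.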

Any thoughts clarifying these issues would be
greatly appreciated.  Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] which ofed rpms for openmpi

2010-01-23 Thread David Turner

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


Re: [OMPI users] Problem building OpenMPI with PGI compilers

2009-12-11 Thread David Turner

Jeff,


Subject: Re: [OMPI users] Problem building OpenMPI with PGI compilers
From: Jeff Squyres <jsquy...@cisco.com>
Date: Thu, 10 Dec 2009 10:20:32 -0500
To: Open MPI Users <us...@open-mpi.org>

...

Actually, I was wrong.  You *can't* just take the SVN trunk's autogen.sh and 
use it with a v1.4 tarball (for various uninteresting reasons).

Given that we haven't moved this patch to the v1.4 branch yet (i.e., it's not 
yet in a nightly v1.4 tarball), probably the easiest thing to do is to apply 
the attached patch to a v1.4 tarball.  I tried it with my PGI 10.0 install and 
it seems to work.  So -- forget everything about autogen.sh and just apply the 
attached patch.


Thanks; I was able to complete the make process using the provided
patch.

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316


[OMPI users] Problem building OpenMPI with PGI compilers

2009-12-09 Thread David Turner

Hi all,

My first ever attempt to build OpenMPI.  Platform is Sun Sunfire x4600
M2 servers, running Scientific Linux version 5.3.  Trying to build
OpenMPI 1.4 (as of today; same problems yesterday with 1.3.4).
Trying to use PGI version 10.0.

As a first attempt, I set CC, CXX, F77, and FC, then did "configure"
and "make".  Make ends with:

libtool: link:  pgCC --prelink_objects --instantiation_dir Template.dir 
  .libs/mpicxx.o .libs/intercepts.o .libs/comm.o .libs/datatype.o 
.libs/win.o .libs/file.o   -Wl,--rpath 
-Wl,/project/projectdirs/mpccc/usg/software/tnt/openmpi/openmpi-1.4/ompi/.libs 
-Wl,--rpath 
-Wl,/project/projectdirs/mpccc/usg/software/tnt/openmpi/openmpi-1.4/orte/.libs 
-Wl,--rpath 
-Wl,/project/projectdirs/mpccc/usg/software/tnt/openmpi/openmpi-1.4/opal/.libs 
-Wl,--rpath -Wl,/global/common/tesla/usg/openmpi/1.4/lib 
-L/project/projectdirs/mpccc/usg/software/tnt/openmpi/openmpi-1.4/orte/.libs 
-L/project/projectdirs/mpccc/usg/software/tnt/openmpi/openmpi-1.4/opal/.libs 
../../../ompi/.libs/libmpi.so 
/project/projectdirs/mpccc/usg/software/tnt/openmpi/openmpi-1.4/orte/.libs/libopen-rte.so 
/project/projectdirs/mpccc/usg/software/tnt/openmpi/openmpi-1.4/opal/.libs/libopen-pal.so 
-ldl -lnsl -lutil -lpthread

pgCC-Error-Unknown switch: --instantiation_dir
make[2]: *** [libmpi_cxx.la] Error 1

So I Googled "instantiation_dir openmpi", which led me to:

http://cia.vc/stats/project/OMPI?s_message=3

where I see:

There's still something wrong with the C++ support, however; I get
errors about a template directory switch when compiling the C++ MPI
bindings (doesn't happen with PGI 9.0). Still working on this... it
feels like it's still a Libtool issue because OMPI is not putting in
this compiler flag as far as I can tell:

{{{
/bin/sh ../../../libtool --tag=CXX --mode=link pgCC -g -version-info 
0:0:0 -export-dynamic -o libmpi_cxx.la -rpath /home/jsquyres/bogus/lib 
mpicxx.lo intercepts.lo comm.lo datatype.lo win.lo file.lo 
../../../ompi/libmpi.la -lnsl -lutil -lpthread

libtool: link: tpldir=Template.dir
libtool: link: rm -rf Template.dir
libtool: link: pgCC --prelink_objects --instantiation_dir Template.dir 
.libs/mpicxx.o .libs/intercepts.o .libs/comm.o .libs/datatype.o 
.libs/win.o .libs/file.o -Wl,--rpath 
-Wl,/users/jsquyres/svn/ompi-1.3/ompi/.libs -Wl,--rpath 
-Wl,/users/jsquyres/svn/ompi-1.3/orte/.libs -Wl,--rpath 
-Wl,/users/jsquyres/svn/ompi-1.3/opal/.libs -Wl,--rpath 
-Wl,/home/jsquyres/bogus/lib -L/users/jsquyres/svn/ompi-1.3/orte/.libs 
-L/users/jsquyres/svn/ompi-1.3/opal/.libs ../../../ompi/.libs/libmpi.so 
/users/jsquyres/svn/ompi-1.3/orte/.libs/libopen-rte.so 
/users/jsquyres/svn/ompi-1.3/opal/.libs/libopen-pal.so -ldl -lnsl -lutil 
-lpthread

pgCC-Error-Unknown switch: --instantiation_dir
make: *** [libmpi_cxx.la] Error 1
}}}

I noticed the comment "doesn't happen with PGI 9.0", so I re-did the
entire process with PGI 9.0 instead of 10.0, but I get the same error!

Any suggestions?  Let me know if I should provide full copies of the
configure and make output.  Thanks!

--
Best regards,

David Turner
User Services Group    email: dptur...@lbl.gov
NERSC Division         phone: (510) 486-4027
Lawrence Berkeley Lab  fax:   (510) 486-4316