Re: [OMPI users] MPI_Info for MPI_Open_port

2006-07-11 Thread Ralph H Castain
On 7/11/06 11:59 AM, "Edgar Gabriel" wrote: > Abhishek Agarwal wrote: >> Hello, >> >> Is there a way of providing a specific port number in MPI_Info when using a >> MPI_Open_port command so that clients know which port number to connect. > > the MPI port-name in Open MPI has nothing to do with

Re: [OMPI users] OpenMPI / PBS / TM interaction

2006-08-03 Thread Ralph H Castain
Depending upon what version you are using, this could be resolved fairly simply. Check to see if your version supports the "nooversubscribe" command line option. If it does, then setting that option may (I believe) resolve the problem - at the least, it will only allow you to run one application p

Re: [OMPI users] LSF with OpenMPI

2006-08-30 Thread Ralph H Castain
On 8/30/06 6:40 AM, "Michael Kluskens" wrote: > I suspect that the problem is not that LSF does not copy the > environment over but that Open MPI is accessing the other nodes not > using LSF's method. Below is a related message by you that I have > not tried to figure out yet, I was hoping fo

Re: [OMPI users] Perl and MPI

2006-09-13 Thread Ralph H Castain
I can't speak to the Perl bindings, but Open MPI's runtime already supports SGE, so all you have to do is "mpirun" like usual and we take care of the rest. You may have to check your version of Open MPI as this capability was added in the more recent releases. Ralph On 9/13/06 8:52 AM, "Renato G

Re: [OMPI users] MPI_Comm_spawn_multiple and BProc

2006-09-27 Thread Ralph H Castain
Could you please clarify - what "Bproc kernel patch" are you referring to? Thanks Ralph On 9/27/06 2:37 AM, "laurent.po...@fr.thalesgroup.com" wrote: > Hi, > > I'm using MPI_Comm_spawn_multiple with Open MPI 1.1.1. > It used to work well, until I used the Bproc kernel patch. > > When I use

Re: [OMPI users] job fails to terminate

2006-10-18 Thread Ralph H Castain
Hi Lydia Could you confirm the version you are using? I think there is a typo there. Also, could you tell us how you configured the code (the configure command line would be nice). Thanks Ralph On 10/18/06 11:03 AM, "Lydia Heck" wrote: > > I have recently installed openmpi 1.3r1212a over tc

Re: [OMPI users] job fails to terminate

2006-10-20 Thread Ralph H Castain
Hi Lydia Thanks - that does help! Could you try this without threads? We have tried to make the system work with threads, but our testing has been limited. First thing I would try is to make sure that we aren't hitting a thread-lock. Thanks Ralph On 10/20/06 2:11 AM, "Lydia Heck" wrote: >

Re: [OMPI users] users Digest, Vol 411, Issue 2

2006-10-20 Thread Ralph H Castain
-threads \ >>> --enable-progress-threads \ >>> --with-threads=solaris > > all of them? > > Lydia > >> >> ---------- >> >> Message: 1 >> Date: Fri, 20 Oct 2006 06:30:36

Re: [OMPI users] how do i link to .la library files?

2006-10-26 Thread Ralph H Castain
Easiest method is just to use the "mpicc" command to compile your code. It will automatically link you to the right libraries, include directories, etc. You can check the $prefix/bin directory to see all the compiler wrappers we provide. Ralph On 10/26/06 7:12 AM, "shane kennedy" wrote: > i'm

Re: [OMPI users] mpirun crashes when compiled in 64-bit mode on Apple Mac Pro

2006-10-26 Thread Ralph H Castain
If you wouldn't mind, could you try it again after applying the attached patch? This looks like a problem we encountered on another release where something in the runtime didn't get initialized early enough. It only shows up in certain circumstances, but this seems to fix it. You can apply the pat

Re: [OMPI users] MPI_Comm_spawn multiple bproc support problem

2006-10-30 Thread Ralph H Castain
On 1.1.2, what that error is telling you is that it didn't find any nodes in the environment. The bproc allocator looks for an environmental variable NODES that contains a list of nodes assigned to you. This error indicates it didn't find anything. Did you get an allocation prior to running the jo
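A quick way to check whether the current shell actually carries an allocation is to inspect the NODES variable before calling mpirun. The sketch below is only an illustration: the helper name `allocated_nodes` is hypothetical, and it assumes the variable holds a comma-separated list, which the message above does not state.

```python
import os


def allocated_nodes(environ=None):
    """Return the entries of the NODES variable, or [] if it is unset or empty."""
    env = os.environ if environ is None else environ
    raw = env.get("NODES", "")
    # Assumption: entries are comma-separated; blank entries are ignored.
    return [entry.strip() for entry in raw.split(",") if entry.strip()]


if __name__ == "__main__":
    nodes = allocated_nodes()
    if nodes:
        print("allocation found:", nodes)
    else:
        print("NODES is unset or empty; no allocation is visible in this shell")
```

If the list comes back empty, the shell that launches the job most likely never received an allocation.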

Re: [OMPI users] MPI_Comm_spawn multiple bproc support

2006-10-31 Thread Ralph H Castain
h, nor do I have one for comm_spawn_multiple that uses the "host" field. I can try to concoct something over the next few days, though, and verify that our code is working correctly. > > Regards. > > Herve > > Date: Mon, 30 Oct 2006 09:00:47 -0700 > From: Ralph H C

Re: [OMPI users] Problem starting rank other than zero

2006-10-31 Thread Ralph H Castain
Just out of curiosity - what environment (i.e., allocator and launcher) are you running in? POE? I'm not sure the POE support is all that good, which is why I ask. Ralph On 10/31/06 12:37 PM, "Nader Ahmadi" wrote: > Hello, > > I am a new OpenMPI user. We are planning to move from IBM AIX POE

Re: [OMPI users] MPI_Comm_spawn multiple bproc support

2006-11-03 Thread Ralph H Castain
Okay, I picked up some further info that may help you. >> The "bjsub -i /bin/env" only sets up the NODES for the session of >> /bin/env. Probably what he wants is "bjssub -i /bin/bash" and start >> bpsh/mpirun from the new shell. I would recommend doing as they suggest. Also, they noted that you

Re: [OMPI users] MPI_Comm_spawn multiple bproc support

2006-11-07 Thread Ralph H Castain
if there is really a > incompatibility problem in open mpi. > > Thank you so much for all you support, I wish it is not succesful yet. > > Regards. > > Herve > > Date: Fri, 03 Nov 2006 14:10:20 -0700 > From: Ralph H Castain > Subject: Re: [OMPI users] MPI_Co

Re: [OMPI users] x11 forwarding

2006-11-30 Thread Ralph H Castain
Actually, I believe at least some of this may be a bug on our part. We currently pick up the local environment and forward it on to the remote nodes as the environment for use by the backend processes. I have seen quite a few environment variables in that list, including DISPLAY, which would create
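To make the forwarding issue concrete, here is a minimal sketch of filtering an environment before it is passed to remote processes. It is not Open MPI's code: the function name `forwardable_env` and the exclusion set `LOCAL_ONLY` are hypothetical, and DISPLAY is listed only because the message above names it.

```python
import os

# Hypothetical exclusion list: variables that only make sense on the local host.
LOCAL_ONLY = frozenset({"DISPLAY"})


def forwardable_env(environ=None, exclude=LOCAL_ONLY):
    """Return a copy of the environment without the excluded variables."""
    env = os.environ if environ is None else environ
    return {key: value for key, value in env.items() if key not in exclude}


if __name__ == "__main__":
    filtered = forwardable_env()
    print("DISPLAY forwarded:", "DISPLAY" in filtered)
```

An explicit exclusion list keeps the default behaviour (forward everything) while allowing known host-specific variables to be dropped.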

Re: [OMPI users] Any known issues with ksh?

2006-12-05 Thread Ralph H Castain
Hi Katherine Could you tell us which version of OpenMPI you are using? Thanks Ralph On 12/5/06 10:02 AM, "Katherine Holcomb" wrote: > We are in the process of converting our clusters from MPICH to OpenMPI > and have encountered some odd problems. For historical reasons, the > default shell w

Re: [OMPI users] Pernode request

2006-12-13 Thread Ralph H Castain
On 12/12/06 9:18 AM, "Maestas, Christopher Daniel" wrote: > Ralph, > > I figured I should of run an mpi program ...here's what it does (seems > to be by-X-slot style): > --- > $ /apps/x86_64/system/mpiexec-0.82/bin/mpiexec -npernode 2 mpi_hello > Hello, I am node an41 with rank 0 > Hello, I a

Re: [OMPI users] Pernode request

2006-12-13 Thread Ralph H Castain
now. Dang those smp systems. :-) Ja, ist definitely confusing... > > -cdm > >> -Original Message- >> From: users-boun...@open-mpi.org >> [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph H Castain >> Sent: Wednesday, December 13, 2006 6:57 AM >> T

Re: [OMPI users] crashed openmpi job fails to clean up ....

2006-12-19 Thread Ralph H Castain
Hi Lydia I would like to say we clean up perfectly, but :-( The system does try its best. I'm a little surprised here since we usually clean up when an application process fails. Our only known problems are when one or more of the orteds fail, usually due to a node rebooting or failing. We ho

Re: [OMPI users] orted: command not found

2007-01-03 Thread Ralph H Castain
Hi Jose Sorry for entering the discussion late. From tracing the email thread, I somewhat gather the following: 1. you have installed Open MPI 1.1.2 on two 686 boxes 2. you created a hostfile on one of the nodes and execute mpirun from that node. You gave us a prefix indicating where we should f

Re: [OMPI users] openmpi / mpirun problem on aix: poll failed with errno=25, opal_event_loop: ompi_evesel->dispatch() failed.

2007-01-09 Thread Ralph H Castain
Hi Michael I would suggest using the nightly snapshot off of the trunk - the poe module compiles correctly there. I suspect we need an update to bring that fix over to the 1.2 branch. Ralph On 1/9/07 7:55 AM, "Michael Marti" wrote: > Thanks Jeff for the hint. > > Unfortunately neither openm

Re: [OMPI users] Can't start more than one process in a node as normal user

2007-01-17 Thread Ralph H Castain
Hi Eddie Open MPI needs to create a temporary file system - what we call our "session directory" - where it stores things like the shared memory file. From this output, it appears that your /tmp directory is "locked" to root access only. You have three options for resolving this problem: (a) you
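Whether a normal user can create a directory tree under /tmp can be checked directly before involving mpirun. The sketch below is a generic probe, not part of Open MPI; the helper name `can_create_session_dir` is hypothetical, and it only tests that a subdirectory and a file can be created under the given base.

```python
import os
import tempfile


def can_create_session_dir(base):
    """Return True if the current user can create a directory and a file under base."""
    try:
        # The temporary directory and its contents are removed on exit.
        with tempfile.TemporaryDirectory(dir=base) as probe_dir:
            with open(os.path.join(probe_dir, "probe"), "w") as handle:
                handle.write("ok")
        return True
    except OSError:
        # Covers a missing base directory as well as permission errors.
        return False


if __name__ == "__main__":
    base = tempfile.gettempdir()
    print(base, "usable:", can_create_session_dir(base))
```

Running this as the affected user on each node shows quickly whether the temporary directory is the obstacle.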

Re: [OMPI users] large jobs hang on startup (deadlock?)

2007-02-06 Thread Ralph H Castain
Well, I can't say for sure about LDAP. I did a quick search and found two things: 1. there are limits imposed in LDAP that may apply to your situation, and 2. that statement varies tremendously depending upon the specific LDAP implementation you are using I would suggest you see which LDAP you a

Re: [OMPI users] large jobs hang on startup (deadlock?)

2007-02-06 Thread Ralph H Castain
ested in any > suggestion, semi-fixes, etc. which might help get to the bottom of this. Right > now: whether the daemons are indeed up and running, or if there are some that > are not (causing MPI_Init to hang). Thanks, Todd -Original > Message- From: users-boun...@open-

Re: [OMPI users] Does Open MPI "Realy" support AIX?

2007-02-08 Thread Ralph H Castain
Hi Ali After conferring with my colleagues, it appears we don't have the cycles right now to really support AIX. As you have noted, the problem is with the io forwarding subsystem - a considerable issue. We will revise the web site to indicate this situation. We will provide an announcement of an

Re: [OMPI users] Does Open MPI "Really" support AIX?

2007-02-13 Thread Ralph H Castain
plan) from Open MPI group to make OMPI support > available for major UNIX and RTOS, > will make the Open MPI the leader in the market, and could open new doors for > R&D grants. > > Ali, > > > Ralph H Castain > Sent by: users-boun...@open-mpi.org 02/08/20

Re: [OMPI users] Open MPI and PBS Pro 8

2007-02-13 Thread Ralph H Castain
On 2/13/07 11:30 AM, "Brock Palen" wrote: > On Feb 13, 2007, at 12:55 PM, Troy Telford wrote: > >> First, the good news: >> I've recently tried PBS Pro 8 with Open MPI 1.1.4. >> >> At least with PBS Pro version 8, you can (finally) do a dynamic/shared >> object for the TM module, rather than

Re: [OMPI users] Open MPI and PBS Pro 8

2007-02-13 Thread Ralph H Castain
Oh, I should have made something clear - I believe those command line options aren't available in the 1.1 series. You'll have to upgrade to 1.2 (available in beta at the moment). On 2/13/07 12:20 PM, "Ralph H Castain" wrote: > > > > On 2/13/07 11:30 AM, "

Re: [OMPI users] MPI_Comm_Spawn

2007-02-27 Thread Ralph H Castain
Now that's interesting! There shouldn't be a limit, but to be honest, I've never tested that mode of operation - let me look into it and see. It sounds like there is some counter that is overflowing, but I'll look. Thanks Ralph On 2/27/07 8:15 AM, "rozzen.vinc...@fr.thalesgroup.com" wrote: > D

Re: [OMPI users] Orted freezes on launch of application

2007-03-13 Thread Ralph H Castain
Hi David I think your tar file didn't get attached - at least, it didn't reach me. Can you please send it again? Thanks Ralph On 3/13/07 1:00 AM, "David Minor" wrote: > Hi, > I'm an MPICH2 user trying out openmpi. I'm running a 1G network under Red Hat > 9, but using the g++ 3.4.3 compiler. O

Re: [OMPI users] MPI_Comm_Spawn

2007-03-13 Thread Ralph H Castain
gc=6, argv=0xb854) at main.c:13 >> (gdb) >> -Original Message- >> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On behalf >> of Tim Prins >> Sent: Monday, March 5, 2007 22:34 >> To: Open MPI Users >> Subject: R

Re: [OMPI users] Fun with threading

2007-03-15 Thread Ralph H Castain
I can't speak to the MPI problems mentioned in here as my area of focus is solely on the RTE. With that caveat, I can say that - despite the fact there is little thread safety testing in the system - I haven't heard of any trouble launching non-MPI apps. We do it regularly, in both threaded and non

Re: [OMPI users] Orted freezes on launch of application

2007-03-15 Thread Ralph H Castain
resulting output so we can figure out what is going on. Ralph On 3/13/07 9:09 AM, "David Minor" wrote: > with tar > > > > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf > Of Ralph H Castain > Sent: Tuesday, March 13, 2007 3:25 PM

Re: [OMPI users] Signal 13

2007-03-15 Thread Ralph H Castain
It isn't a /dev issue. The problem is likely that the system lacks sufficient permissions to either: 1. create the Open MPI session directory tree. We create a hierarchy of subdirectories for temporary storage used for things like your shared memory file - the location of the head of that tree can

Re: [OMPI users] Open MPI error when using MPI_Comm_spawn

2007-04-04 Thread Ralph H Castain
Hi Prakash I can't really test this solution as the Torque dynamic host allocator appears to be something you are adding to that system (so it isn't part of the released code). However, the attached code should cleanly add any nodes to any existing allocation known to OpenRTE. I hope to resume wo

Re: [OMPI users] MPI_Comm_Spawn

2007-04-04 Thread Ralph H Castain
lightly tested, but I >> doubt it is the problem since it always fails after 31 spawns. >> >> Again, I have tried with these configure options and the same version >> of Open MPI and have still have been able to replicate this (after >> letting it spawn over

Re: [OMPI users] Issues running a basic program with spawn

2007-06-05 Thread Ralph H Castain
Hmmm...I think I know what may be happening. Could you send me: 1. what Open MPI version you are using? 2. any MCA parameters you might be setting in your environment (remember that we may be picking up some system configuration file for those) This isn't related to the problem, but I also note

Re: [OMPI users] mpirun hanging when processes started on head node

2007-06-11 Thread Ralph H Castain
Hi Sean Could you please clarify something? I'm a little confused by your comments about where things are running. I'm assuming that you mean everything works fine if you type the mpirun command on the head node and just let it launch on your compute nodes - that the problems only occur when you s
