Re: [OMPI users] Pros and cons of --enable-heterogeneous

2010-10-07 Thread David Ronis
Ralph, thanks for the reply.   

If I build with --enable-heterogeneous and then decide to run on a
homogeneous set of nodes, does the additional "overhead" go away or
become completely negligible, i.e., when no conversion is necessary?

David


On Thu, 2010-10-07 at 15:17 -0600, Ralph Castain wrote:
> Hetero operations tend to lose a little performance due to the need to
> convert data, but otherwise there is no real negative. We don't do it
> by default solely because the majority of installations don't need to,
> and there is no reason to lose even a little performance if it isn't
> necessary.
> 
> 
> If you want an application to be able to span that mix, then you'll
> need to set that configure flag.
> 
> On Thu, Oct 7, 2010 at 1:44 PM, David Ronis <david.ro...@mcgill.ca>
> wrote:
> I have various boxes that run openmpi and I can't seem to use all of
> them at once because they have different CPUs (e.g., Pentiums and Athlons
> (both 32-bit) vs. an Intel i7 (64-bit)).  I'm about to build 1.4.3 and was
> wondering if I should add --enable-heterogeneous to the configure flags.
> Any advice as to why or why not would be appreciated.
> 
> David
> 
> 
> 
> 
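(A hedged illustration of the data-conversion point above; this sketch is not code from the thread. In a heterogeneous build, messages described with MPI datatypes such as MPI_DOUBLE can be converted between representations by the library, whereas anything shipped as raw MPI_BYTE is passed through unchanged.)

/* hetero_ping.c -- hedged sketch, not from the original thread.
 * Rank 0 sends doubles to rank 1 using the MPI_DOUBLE datatype, so a
 * heterogeneous-enabled library can perform any representation
 * conversion; raw MPI_BYTE transfers would not be converted. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    double buf[4] = {1.0, 2.0, 3.0, 4.0};
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(buf, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %g %g %g %g\n", buf[0], buf[1], buf[2], buf[3]);
    }

    MPI_Finalize();
    return 0;
}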




[OMPI users] Pros and cons of --enable-heterogeneous

2010-10-07 Thread David Ronis
I have various boxes that run openmpi and I can't seem to use all of
them at once because they have different CPUs (e.g., Pentiums and Athlons
(both 32-bit) vs. an Intel i7 (64-bit)).  I'm about to build 1.4.3 and was
wondering if I should add --enable-heterogeneous to the configure flags.
Any advice as to why or why not would be appreciated.

David






Re: [OMPI users] Abort

2010-08-16 Thread David Ronis
Hi Jeff,

I've reproduced your test here, with the same results.  Moreover, if I
put the nodes with rank>0 into a blocking MPI call (MPI_Bcast or
MPI_Barrier) I still get the same behavior; namely, rank 0's calling
abort() generates a core file and leads to termination, which is the
behavior I want.  I'll look at my code a bit more, but the only
difference I see now is that in my code a floating point exception
triggers a signal-handler that calls abort().   I don't see why that
should be different from your test.

Thanks for your help.

David
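(For reference, a minimal sketch of the pattern described above; this is a hedged reconstruction, not David's actual code: a SIGFPE handler that calls abort() inside an MPI run, with the other ranks sitting in a blocking call. feenableexcept() is glibc-specific.)

/* fpe_abort.c -- hedged sketch, not David's actual code.
 * Rank 0 unmasks FP exceptions and divides by zero; the SIGFPE handler
 * calls abort(), which should drop a core file and take the job down. */
#define _GNU_SOURCE
#include <mpi.h>
#include <fenv.h>
#include <signal.h>
#include <stdlib.h>

static void fpe_handler(int sig)
{
    (void)sig;
    abort();                              /* SIGABRT -> core file */
}

int main(int argc, char **argv)
{
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    signal(SIGFPE, fpe_handler);
    feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW);  /* glibc extension */

    if (rank == 0) {
        volatile double zero = 0.0;
        volatile double x = 1.0 / zero;   /* traps once exceptions are unmasked */
        (void)x;
    } else {
        MPI_Barrier(MPI_COMM_WORLD);      /* mimic ranks > 0 blocked in MPI */
    }

    MPI_Finalize();
    return 0;
}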

On Mon, 2010-08-16 at 09:54 -0700, Jeff Squyres wrote:
> FWIW, I'm unable to replicate your behavior.  This is with Open MPI 1.4.2 on 
> RHEL5:
> 
> 
> [9:52] svbu-mpi:~/mpi % cat abort.c
> #include <mpi.h>
> #include <stdio.h>
> #include <unistd.h>
> 
> int main(int argc, char **argv)
> {
> int rank;
> 
> MPI_Init(&argc, &argv);
> MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> if (0 == rank) {
> abort();
> }
> printf("Rank %d sleeping...\n", rank);
> sleep(600);
> printf("Rank %d finalizing...\n", rank);
> MPI_Finalize();
> return 0;
> }
> [9:52] svbu-mpi:~/mpi % mpicc abort.c -o abort
> [9:52] svbu-mpi:~/mpi % ls -l core*
> ls: No match.
> [9:52] svbu-mpi:~/mpi % mpirun -np 4 --bynode --host svbu-mpi055,svbu-mpi056 
> ./abort
> Rank 1 sleeping...
> [svbu-mpi055:03991] *** Process received signal ***
> [svbu-mpi055:03991] Signal: Aborted (6)
> [svbu-mpi055:03991] Signal code:  (-6)
> [svbu-mpi055:03991] [ 0] /lib64/libpthread.so.0 [0x2b45caac87c0]
> [svbu-mpi055:03991] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x2b45cad05265]
> [svbu-mpi055:03991] [ 2] /lib64/libc.so.6(abort+0x110) [0x2b45cad06d10]
> [svbu-mpi055:03991] [ 3] ./abort(main+0x36) [0x4008ee]
> [svbu-mpi055:03991] [ 4] /lib64/libc.so.6(__libc_start_main+0xf4) 
> [0x2b45cacf2994]
> [svbu-mpi055:03991] [ 5] ./abort [0x400809]
> [svbu-mpi055:03991] *** End of error message ***
> Rank 3 sleeping...
> Rank 2 sleeping...
> --
> mpirun noticed that process rank 0 with PID 3991 on node svbu-mpi055 exited 
> on signal 6 (Aborted).
> --
> [9:52] svbu-mpi:~/mpi % ls -l core*
> -rw--- 1 jsquyres eng5 26009600 Aug 16 09:52 core.abort-1281977540-3991
> [9:52] svbu-mpi:~/mpi % file core.abort-1281977540-3991 
> core.abort-1281977540-3991: ELF 64-bit LSB core file AMD x86-64, version 1 
> (SYSV), SVR4-style, from 'abort'
> [9:52] svbu-mpi:~/mpi % 
> -
> 
> You can see that all processes die immediately, and I get a corefile from the 
> process that called abort().
> 
> 
> On Aug 16, 2010, at 9:25 AM, David Ronis wrote:
> 
> > I've tried both--as you said, MPI_Abort doesn't drop a core file, but
> > does kill off the entire MPI job.   abort() drops core when I'm running
> > on 1 processor, but not in a multiprocessor run.  In addition, a node
> > calling abort() doesn't lead to the entire run being killed off.
> > 
> > David
> > On Mon, 2010-08-16 at 08:51 -0700, Jeff Squyres wrote:
> >> On Aug 13, 2010, at 12:53 PM, David Ronis wrote:
> >> 
> >>> I'm using mpirun and the nodes are all on the same machine (an 8-cpu box
> >>> with an Intel i7).  Core size is unlimited:
> >>> 
> >>> ulimit -a
> >>> core file size  (blocks, -c) unlimited
> >> 
> >> That looks good.
> >> 
> >> In reviewing the email thread, it's not entirely clear: are you calling 
> >> abort() or MPI_Abort()?  MPI_Abort() won't drop a core file.  abort() 
> >> should.
> >> 
> > 
> 
> 



Re: [OMPI users] Abort

2010-08-16 Thread David Ronis
I've tried both--as you said, MPI_Abort doesn't drop a core file, but
does kill off the entire MPI job.   abort() drops core when I'm running
on 1 processor, but not in a multiprocessor run.  In addition, a node
calling abort() doesn't lead to the entire run being killed off.

David
On Mon, 2010-08-16 at 08:51 -0700, Jeff Squyres wrote:
> On Aug 13, 2010, at 12:53 PM, David Ronis wrote:
> 
> > I'm using mpirun and the nodes are all on the same machine (an 8-cpu box
> > with an Intel i7).  Core size is unlimited:
> > 
> > ulimit -a
> > core file size  (blocks, -c) unlimited
> 
> That looks good.
> 
> In reviewing the email thread, it's not entirely clear: are you calling 
> abort() or MPI_Abort()?  MPI_Abort() won't drop a core file.  abort() should.
> 



Re: [OMPI users] Abort

2010-08-13 Thread David Ronis
I'm using mpirun and the nodes are all on the same machine (an 8-cpu box
with an Intel i7).  Core size is unlimited:


ulimit -a
core file size  (blocks, -c) unlimited

David


On Fri, 2010-08-13 at 13:47 -0400, Jeff Squyres wrote:
> On Aug 13, 2010, at 1:18 PM, David Ronis wrote:
> 
> > Second coredumpsize is unlimited, and indeed I DO get core dumps when
> > I'm running a single-processor version.  
> 
> What launcher are you using underneath Open MPI?
> 
> You might want to make sure that the underlying launcher actually sets the 
> coredumpsize to unlimited on each server where you're running.  E.g., if 
> you're using rsh/ssh, check that your shell startup files set coredumpsize to 
> unlimited for non-interactive logins.  Or, if you're using (for example) 
> Torque, check to ensure that jobs launched under Torque don't have their 
> coredumpsize automatically reset to 0, etc.
> 
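(A hedged aside, not advice given in the thread: another way to sidestep per-node shell limits is to raise the core limit from inside the program itself, early in main(), e.g. right after MPI_Init. A minimal sketch:)

/* Hedged sketch: raise the core-file size soft limit to the hard limit
 * so a core can be written even if the launcher started the process
 * with "ulimit -c 0".  If the hard limit is also 0, this cannot help. */
#include <stdio.h>
#include <sys/resource.h>

static void enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("getrlimit(RLIMIT_CORE)");
        return;
    }
    rl.rlim_cur = rl.rlim_max;            /* soft limit up to the hard limit */
    if (setrlimit(RLIMIT_CORE, &rl) != 0)
        perror("setrlimit(RLIMIT_CORE)");
}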



Re: [OMPI users] Abort

2010-08-13 Thread David Ronis
Thanks to all who replied.  

First, I'm running openmpi 1.4.2.  

Second, coredumpsize is unlimited, and indeed I DO get core dumps when
I'm running a single-processor version.  Third, the problem isn't
stopping the program (MPI_Abort does that just fine); rather, it's getting
a coredump.  According to the man page, MPI_Abort sends a SIGTERM, not a
SIGABRT, so perhaps that's the expected behavior.

Finally, my guess as to what's happening when I use the libc abort() is that
the other nodes get stuck in an MPI call (I do lots of MPI_Reduce and
MPI_Bcast calls in this code), but that doesn't explain why the node calling
abort() doesn't exit with a coredump.

David
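(A hedged workaround sketch, not something suggested in this thread: one way to try to get both a corefile and a full job shutdown is to dump core from a forked child with abort() and then call MPI_Abort() in the parent. Forking inside an MPI process is not guaranteed to be safe; this is only an illustration.)

/* Hedged sketch of a fatal-error path: the child abort()s and drops the
 * core, the parent waits for it and then tears the whole job down with
 * MPI_Abort().  fork() inside an MPI process is not guaranteed safe. */
#include <mpi.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

void die_with_core(MPI_Comm comm, int errcode)
{
    pid_t pid = fork();

    if (pid == 0)
        abort();                    /* child: SIGABRT, writes the core file */
    if (pid > 0)
        waitpid(pid, NULL, 0);      /* parent: wait until the core is written */
    MPI_Abort(comm, errcode);       /* then kill off the entire MPI job */
}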

On Thu, 2010-08-12 at 20:44 -0600, Ralph Castain wrote:
> Sounds very strange - what OMPI version, on what type of machine, and how was 
> it configured?
> 
> 
> On Aug 12, 2010, at 7:49 PM, David Ronis wrote:
> 
> > I've got an MPI program that is supposed to generate a core file if
> > problems arise on any of the nodes.   I tried to do this by adding a
> > call to abort() to my exit routines, but this doesn't work; I get no core
> > file, and worse, mpirun doesn't detect that one of my nodes has
> > aborted(?) and doesn't kill off the entire job, except in the trivial
> > case where the number of processors I'm running on is 1.   I've replaced
> > abort() with MPI_Abort(), which kills everything off, but leaves no core
> > file.  Any suggestions on how I can get one and still have MPI exit?
> > 
> > Thanks in advance.
> > 
> > David
> > 
> > 
> > 
> > 
> 




[OMPI users] Abort

2010-08-12 Thread David Ronis
I've got an MPI program that is supposed to generate a core file if
problems arise on any of the nodes.   I tried to do this by adding a
call to abort() to my exit routines, but this doesn't work; I get no core
file, and worse, mpirun doesn't detect that one of my nodes has
aborted(?) and doesn't kill off the entire job, except in the trivial
case where the number of processors I'm running on is 1.   I've replaced
abort() with MPI_Abort(), which kills everything off, but leaves no core
file.  Any suggestions on how I can get one and still have MPI exit?

Thanks in advance.

David






Re: [OMPI users] Do MPI calls ever sleep?

2010-07-22 Thread David Ronis
That did it.  Thanks.

David

On Wed, 2010-07-21 at 15:29 -0500, Dave Goodell wrote:
> On Jul 21, 2010, at 2:54 PM CDT, Jed Brown wrote:
> 
> > On Wed, 21 Jul 2010 15:20:24 -0400, David Ronis <david.ro...@mcgill.ca> 
> > wrote:
> >> Hi Jed,
> >> 
> >> Thanks for the reply and suggestion.  I tried adding -mca
> >> yield_when_idle 1 (and later mpi_yield_when_idle 1 which is what
> >> ompi_info reports the variable as) but it seems to have had 0 effect.
> >> My master goes into fftw planning routines for a minute or so (I see the
> >> threads being created), but the overall usage of the slaves remains
> >> close to 100% during this time.  Just to be sure, I put the slaves into
> >> a MPI_Barrier(MPI_COMM_WORLD) while they were waiting for the fftw
> >> planner to finish.   It also didn't help.
> > 
> > They still spin (instead of using e.g. select()), but call sched_yield()
> > so should only be actively spinning when nothing else is trying to run.
> > Are you sure that the planner is always running in parallel?  What OS
> > and OMPI version are you using?
> 
> sched_yield doesn't work as expected in late 2.6 Linux kernels: 
> http://kerneltrap.org/Linux/CFS_and_sched_yield
> 
> If this scheduling behavior change is affecting you, you might be able to fix 
> it with:
> 
> echo "1" >/proc/sys/kernel/sched_compat_yield
> 
> -Dave
> 




Re: [OMPI users] Do MPI calls ever sleep?

2010-07-21 Thread David Ronis
I'm running linux (slackware 12.2), openmpi 1.4.2 and fftw3 3.2.4.

As to the planner always running in parallel, I suspect it isn't.   It's
trying to optimize how to split up the FFT computation between different
codelets and different numbers of threads (including none).  It tries
something and measures the result.   In fact, if it determines that
threads don't provide any advantage (irrespective of MPI), then it won't
use them, and indeed this might be the case here.  Unfortunately, fftw
doesn't seem to return or provide a way to get the actual number of
threads it plans to use.

David
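(For context, a hedged sketch of the threaded-planning calls being discussed, assuming the standard fftw3 threads API; none of this is David's actual code.)

/* Hedged sketch of threaded FFTW planning (standard fftw3 threads API;
 * link with -lfftw3_threads -lfftw3 -lpthread).  With FFTW_MEASURE the
 * planner benchmarks candidate plans, which is the phase described
 * above; it may choose fewer threads than requested and, as noted,
 * does not report how many it actually uses. */
#include <fftw3.h>

int plan_threaded_fft(int n, int nthreads, fftw_complex *in, fftw_complex *out,
                      fftw_plan *plan_out)
{
    if (!fftw_init_threads())          /* call once, before creating any plans */
        return -1;
    fftw_plan_with_nthreads(nthreads); /* upper bound for subsequent plans */

    *plan_out = fftw_plan_dft_1d(n, in, out, FFTW_FORWARD, FFTW_MEASURE);
    return (*plan_out == NULL) ? -1 : 0;
}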



On Wed, 2010-07-21 at 21:54 +0200, Jed Brown wrote:
> On Wed, 21 Jul 2010 15:20:24 -0400, David Ronis <david.ro...@mcgill.ca> wrote:
> > Hi Jed,
> > 
> > Thanks for the reply and suggestion.  I tried adding -mca
> > yield_when_idle 1 (and later mpi_yield_when_idle 1 which is what
> > ompi_info reports the variable as) but it seems to have had 0 effect.
> > My master goes into fftw planning routines for a minute or so (I see the
> > threads being created), but the overall usage of the slaves remains
> > close to 100% during this time.  Just to be sure, I put the slaves into
> > a MPI_Barrier(MPI_COMM_WORLD) while they were waiting for the fftw
> > planner to finish.   It also didn't help.
> 
> They still spin (instead of using e.g. select()), but call sched_yield()
> so should only be actively spinning when nothing else is trying to run.
> Are you sure that the planner is always running in parallel?  What OS
> and OMPI version are you using?
> 
> Jed
> 



Re: [OMPI users] Do MPI calls ever sleep?

2010-07-21 Thread David Ronis
Hi Jed,

Thanks for the reply and suggestion.  I tried adding -mca
yield_when_idle 1 (and later mpi_yield_when_idle 1, which is what
ompi_info reports the variable as), but it seems to have had no effect.
My master goes into fftw planning routines for a minute or so (I see the
threads being created), but the overall usage of the slaves remains
close to 100% during this time.  Just to be sure, I put the slaves into
an MPI_Barrier(MPI_COMM_WORLD) while they were waiting for the fftw
planner to finish.   It also didn't help.

Do you know where yield_when_idle is documented?

David



On Wed, 2010-07-21 at 20:24 +0200, Jed Brown wrote:
> On Wed, 21 Jul 2010 14:10:53 -0400, David Ronis <david.ro...@mcgill.ca> wrote:
> > Is there another MPI routine that polls for data and then gives up its
> > time-slice?
> 
> You're probably looking for the runtime option -mca yield_when_idle 1.
> This will slightly increase latency, but allows other threads to run
> without competing with the spinning MPI.
> 
> Jed
> 



[OMPI users] Do MPI calls ever sleep?

2010-07-21 Thread David Ronis
I've got an MPI program on an 8-core box that runs in a master-slave
mode.   The slaves calculate something, pass data to the master, and
then call MPI_Bcast, waiting for the master to update and return some
data via an MPI_Bcast originating on the master.  

One of the things the master does while the slaves are waiting is to
make heavy use of fftw3 FFT routines, which can support multi-threading.
However, for threading to make sense, the slaves on the same physical
machine have to give up their CPU time, and this doesn't seem to be the
case (top shows them running at close to 100%).  Is there another MPI
routine that polls for data and then gives up its time-slice? 

Any other suggestions?

Thanks in advance.

David
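(A hedged workaround sketch, not a suggestion made in this thread: have the slaves poll for a small "go" message with MPI_Iprobe, sleeping between polls, and only enter the blocking MPI_Bcast once the master is ready. The tag and sleep interval below are arbitrary illustration values.)

/* Hedged sketch: a slave polls for a tiny "go" message and sleeps
 * between polls instead of spinning inside MPI_Bcast, then joins the
 * broadcast.  The master would MPI_Send the "go" integer to each slave
 * just before calling MPI_Bcast itself. */
#include <mpi.h>
#include <unistd.h>

#define GO_TAG 42   /* arbitrary tag for the illustration */

void slave_wait_then_bcast(double *data, int n, int root, MPI_Comm comm)
{
    int flag = 0, go;

    while (!flag) {
        MPI_Iprobe(root, GO_TAG, comm, &flag, MPI_STATUS_IGNORE);
        if (!flag)
            usleep(1000);                 /* give up the CPU for ~1 ms */
    }
    MPI_Recv(&go, 1, MPI_INT, root, GO_TAG, comm, MPI_STATUS_IGNORE);
    MPI_Bcast(data, n, MPI_DOUBLE, root, comm);
}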





[OMPI users] Network Problem?

2009-06-30 Thread David Ronis
(This may be a duplicate.  An earlier post seems to have been lost).

I'm using openmpi (1.3.2) to run on 3 dual-processor machines (running
linux, slackware-12.1, gcc-4.4.0).  Two are directly on my LAN, while
the 3rd is connected to my LAN via VPN and NAT (I can communicate in
either direction from any of the machines to the remote machine using
its NAT address).

The program I'm trying to run is very simple in terms of MPI.
Basically it is:

main()
{
[snip];

  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);   /* "nprocs" is a reconstructed name */
  MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

[snip]; 

  if(myrank==0)
i=MPI_Reduce(MPI_IN_PLACE, C, N, MPI_DOUBLE, 
 MPI_SUM, 0, MPI_COMM_WORLD);
  else
i=MPI_Reduce(C, MPI_IN_PLACE, N, MPI_DOUBLE, 
 MPI_SUM, 0, MPI_COMM_WORLD);

  if(i!=MPI_SUCCESS)
{

  fprintf(stderr,"MPI_Reduce (C) fails on processor %d\n", myrank);
  MPI_Finalize();
  exit(1);
}
  MPI_Barrier(MPI_COMM_WORLD);


[snip];

}

I run by invoking:

mpirun -v -np ${NPROC} -hostfile ${HOSTFILE} --stdin none $* > /dev/null

If I run on the 4 nodes that are physically on the LAN, it works as
expected.  When I add the nodes on the remote machine, things don't
work properly:

1.  If I start with NPROC=6 on one of the LAN machines, all 6 nodes
start (as shown by running ps), and all get to the MPI_HARVEST
calls.  At that point things hang (I see no network traffic, which,
given the size of the array I'm trying to reduce, is strange).

2.  If I start on the remote with NPROC=6, only the mpirun call
shows up under ps on the remote, while nothing shows up on the other
nodes.  Killing the process gives messages like:

 hostname - daemon did not report back when launched

3.  If I start on the remote with NPROC=2, the 2 processes start on
the remote and finish properly.

My suspicion is that there's some bad interaction with NAT and
authentication. 

Any suggestions?

David