Re: [OMPI users] Pros and cons of --enable-heterogeneous
Ralph, thanks for the reply. If I build with --enable-heterogeneous and then decide to run on a homogeneous set of nodes, does the additional "overhead" go away or become completely negligible, i.e., if no conversion is necessary?

David

On Thu, 2010-10-07 at 15:17 -0600, Ralph Castain wrote:
> Hetero operations tend to lose a little performance due to the need to convert data, but otherwise there is no real negative. We don't do it by default solely because the majority of installations don't need to, and there is no reason to lose even a little performance if it isn't necessary.
>
> If you want an application to be able to span that mix, then you'll need to set that configure flag.
>
> On Thu, Oct 7, 2010 at 1:44 PM, David Ronis <david.ro...@mcgill.ca> wrote:
> > I have various boxes that run Open MPI and I can't seem to use all of them at once because they have different CPUs (e.g., Pentiums and Athlons (both 32-bit) vs. an Intel i7 (64-bit)). I'm about to build 1.4.3 and was wondering if I should add --enable-heterogeneous to the configure flags. Any advice as to why or why not would be appreciated.
> >
> > David
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
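Since the choice is baked in at configure time, switching behavior means rebuilding. A minimal sketch of the build being discussed, with the flag added (the install prefix is a placeholder, not from the thread):

```shell
# Build Open MPI 1.4.3 with heterogeneous support; the prefix is a
# placeholder -- pick any install location you like.
./configure --prefix=$HOME/openmpi-1.4.3-hetero --enable-heterogeneous
make all
make install
```

Afterward, `ompi_info` should report whether heterogeneous support was compiled in, which is a quick way to confirm the flag took effect.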
[OMPI users] Pros and cons of --enable-heterogeneous
I have various boxes that run Open MPI and I can't seem to use all of them at once because they have different CPUs (e.g., Pentiums and Athlons (both 32-bit) vs. an Intel i7 (64-bit)). I'm about to build 1.4.3 and was wondering if I should add --enable-heterogeneous to the configure flags. Any advice as to why or why not would be appreciated.

David
Re: [OMPI users] Abort
Hi Jeff,

I've reproduced your test here, with the same results. Moreover, if I put the nodes with rank > 0 into a blocking MPI call (MPI_Bcast or MPI_Barrier) I still get the same behavior; namely, rank 0's calling abort() generates a core file and leads to termination, which is the behavior I want. I'll look at my code a bit more, but the only difference I see now is that in my code a floating-point exception triggers a signal handler that calls abort(). I don't see why that should be different from your test.

Thanks for your help.

David

On Mon, 2010-08-16 at 09:54 -0700, Jeff Squyres wrote:
> FWIW, I'm unable to replicate your behavior. This is with Open MPI 1.4.2 on RHEL5:
>
> [9:52] svbu-mpi:~/mpi % cat abort.c
> #include <stdio.h>
> #include <unistd.h>
> #include <mpi.h>
>
> int main(int argc, char **argv)
> {
>     int rank;
>
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     if (0 == rank) {
>         abort();
>     }
>     printf("Rank %d sleeping...\n", rank);
>     sleep(600);
>     printf("Rank %d finalizing...\n", rank);
>     MPI_Finalize();
>     return 0;
> }
> [9:52] svbu-mpi:~/mpi % mpicc abort.c -o abort
> [9:52] svbu-mpi:~/mpi % ls -l core*
> ls: No match.
> [9:52] svbu-mpi:~/mpi % mpirun -np 4 --bynode --host svbu-mpi055,svbu-mpi056 ./abort
> Rank 1 sleeping...
> [svbu-mpi055:03991] *** Process received signal ***
> [svbu-mpi055:03991] Signal: Aborted (6)
> [svbu-mpi055:03991] Signal code: (-6)
> [svbu-mpi055:03991] [ 0] /lib64/libpthread.so.0 [0x2b45caac87c0]
> [svbu-mpi055:03991] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x2b45cad05265]
> [svbu-mpi055:03991] [ 2] /lib64/libc.so.6(abort+0x110) [0x2b45cad06d10]
> [svbu-mpi055:03991] [ 3] ./abort(main+0x36) [0x4008ee]
> [svbu-mpi055:03991] [ 4] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b45cacf2994]
> [svbu-mpi055:03991] [ 5] ./abort [0x400809]
> [svbu-mpi055:03991] *** End of error message ***
> Rank 3 sleeping...
> Rank 2 sleeping...
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 3991 on node svbu-mpi055 exited on signal 6 (Aborted).
> --------------------------------------------------------------------------
> [9:52] svbu-mpi:~/mpi % ls -l core*
> -rw------- 1 jsquyres eng5 26009600 Aug 16 09:52 core.abort-1281977540-3991
> [9:52] svbu-mpi:~/mpi % file core.abort-1281977540-3991
> core.abort-1281977540-3991: ELF 64-bit LSB core file AMD x86-64, version 1 (SYSV), SVR4-style, from 'abort'
> [9:52] svbu-mpi:~/mpi %
>
> You can see that all processes die immediately, and I get a corefile from the process that called abort().
>
> On Aug 16, 2010, at 9:25 AM, David Ronis wrote:
>
> > I've tried both--as you said, MPI_Abort doesn't drop a core file, but it does kill off the entire MPI job. abort() drops core when I'm running on 1 processor, but not in a multiprocessor run. In addition, a node calling abort() doesn't lead to the entire run being killed off.
> >
> > David
> >
> > On Mon, 2010-08-16 at 08:51 -0700, Jeff Squyres wrote:
> >> On Aug 13, 2010, at 12:53 PM, David Ronis wrote:
> >>
> >>> I'm using mpirun and the nodes are all on the same machine (an 8-CPU box with an Intel i7). coresize is unlimited:
> >>>
> >>> ulimit -a
> >>> core file size (blocks, -c) unlimited
> >>
> >> That looks good.
> >>
> >> In reviewing the email thread, it's not entirely clear: are you calling abort() or MPI_Abort()? MPI_Abort() won't drop a core file. abort() should.
Re: [OMPI users] Abort
I've tried both--as you said, MPI_Abort doesn't drop a core file, but it does kill off the entire MPI job. abort() drops core when I'm running on 1 processor, but not in a multiprocessor run. In addition, a node calling abort() doesn't lead to the entire run being killed off.

David

On Mon, 2010-08-16 at 08:51 -0700, Jeff Squyres wrote:
> On Aug 13, 2010, at 12:53 PM, David Ronis wrote:
>
> > I'm using mpirun and the nodes are all on the same machine (an 8-CPU box with an Intel i7). coresize is unlimited:
> >
> > ulimit -a
> > core file size (blocks, -c) unlimited
>
> That looks good.
>
> In reviewing the email thread, it's not entirely clear: are you calling abort() or MPI_Abort()? MPI_Abort() won't drop a core file. abort() should.
Re: [OMPI users] Abort
I'm using mpirun and the nodes are all on the same machine (an 8-CPU box with an Intel i7). coresize is unlimited:

ulimit -a
core file size (blocks, -c) unlimited

David

On Fri, 2010-08-13 at 13:47 -0400, Jeff Squyres wrote:
> On Aug 13, 2010, at 1:18 PM, David Ronis wrote:
>
> > Second, coredumpsize is unlimited, and indeed I DO get core dumps when I'm running a single-processor version.
>
> What launcher are you using underneath Open MPI?
>
> You might want to make sure that the underlying launcher actually sets the coredumpsize to unlimited on each server where you're running. E.g., if you're using rsh/ssh, check that your shell startup files set coredumpsize to unlimited for non-interactive logins. Or, if you're using (for example) Torque, check to ensure that jobs launched under Torque don't have their coredumpsize automatically reset to 0, etc.
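Jeff's rsh/ssh point is easy to check: the limit that matters is the one a non-interactive remote shell gets, which can differ from what an interactive login reports. A quick sketch (the hostname is a placeholder):

```shell
# The limit your interactive shell reports:
ulimit -c
# The limit a NON-interactive login on a remote node actually gets:
ssh somenode 'ulimit -c'
# If the second command prints 0, set "ulimit -c unlimited" near the top
# of the remote shell startup file, above any early-exit guard for
# non-interactive shells.
```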
Re: [OMPI users] Abort
Thanks to all who replied. First, I'm running Open MPI 1.4.2. Second, coredumpsize is unlimited, and indeed I DO get core dumps when I'm running a single-processor version. Third, the problem isn't stopping the program--MPI_Abort does that just fine--rather it's getting a coredump. According to the man page, MPI_Abort sends a SIGTERM, not a SIGABRT, so perhaps that's what should happen. Finally, my guess as to what's happening if I use the libc abort is that the other nodes get stuck in an MPI call (I do lots of MPI_Reduces or MPI_Bcasts in this code), but this doesn't explain why the node calling abort doesn't exit with a coredump.

David

On Thu, 2010-08-12 at 20:44 -0600, Ralph Castain wrote:
> Sounds very strange - what OMPI version, on what type of machine, and how was it configured?
>
> On Aug 12, 2010, at 7:49 PM, David Ronis wrote:
>
> > I've got an MPI program that is supposed to generate a core file if problems arise on any of the nodes. I tried to do this by adding a call to abort() to my exit routines, but this doesn't work; I get no core file, and worse, mpirun doesn't detect that one of my nodes has aborted(?) and doesn't kill off the entire job, except in the trivial case where the number of processors I'm running on is 1. I've replaced abort with MPI_Abort, which kills everything off, but leaves no core file. Any suggestions how I can get one and still have MPI exit?
> >
> > Thanks in advance.
> >
> > David
[OMPI users] Abort
I've got an MPI program that is supposed to generate a core file if problems arise on any of the nodes. I tried to do this by adding a call to abort() to my exit routines, but this doesn't work; I get no core file, and worse, mpirun doesn't detect that one of my nodes has aborted(?) and doesn't kill off the entire job, except in the trivial case where the number of processors I'm running on is 1. I've replaced abort with MPI_Abort, which kills everything off, but leaves no core file. Any suggestions how I can get one and still have MPI exit?

Thanks in advance.

David
Re: [OMPI users] Do MPI calls ever sleep?
That did it. Thanks.

David

On Wed, 2010-07-21 at 15:29 -0500, Dave Goodell wrote:
> On Jul 21, 2010, at 2:54 PM CDT, Jed Brown wrote:
>
> > On Wed, 21 Jul 2010 15:20:24 -0400, David Ronis <david.ro...@mcgill.ca> wrote:
> >> Hi Jed,
> >>
> >> Thanks for the reply and suggestion. I tried adding -mca yield_when_idle 1 (and later mpi_yield_when_idle 1, which is what ompi_info reports the variable as) but it seems to have had zero effect. My master goes into fftw planning routines for a minute or so (I see the threads being created), but the overall usage of the slaves remains close to 100% during this time. Just to be sure, I put the slaves into an MPI_Barrier(MPI_COMM_WORLD) while they were waiting for the fftw planner to finish. It also didn't help.
> >
> > They still spin (instead of using e.g. select()), but call sched_yield() so should only be actively spinning when nothing else is trying to run. Are you sure that the planner is always running in parallel? What OS and OMPI version are you using?
>
> sched_yield doesn't work as expected in late 2.6 Linux kernels:
> http://kerneltrap.org/Linux/CFS_and_sched_yield
>
> If this scheduling behavior change is affecting you, you might be able to fix it with:
>
> echo "1" > /proc/sys/kernel/sched_compat_yield
>
> -Dave
Re: [OMPI users] Do MPI calls ever sleep?
I'm running Linux (Slackware 12.2), Open MPI 1.4.2, and fftw3 3.2.4. As to the planner always running in parallel, I suspect it isn't. It's trying to optimize how the FFT computation is split up between different codelets and different numbers of threads (including none). It tries something and measures the result. In fact, if it determines that threads don't provide any advantage (irrespective of MPI) then it won't use them, and indeed this might be the case here. Unfortunately, fftw doesn't seem to return or provide a way to get the actual number of threads it plans to use.

David

On Wed, 2010-07-21 at 21:54 +0200, Jed Brown wrote:
> On Wed, 21 Jul 2010 15:20:24 -0400, David Ronis <david.ro...@mcgill.ca> wrote:
> > Hi Jed,
> >
> > Thanks for the reply and suggestion. I tried adding -mca yield_when_idle 1 (and later mpi_yield_when_idle 1, which is what ompi_info reports the variable as) but it seems to have had zero effect. My master goes into fftw planning routines for a minute or so (I see the threads being created), but the overall usage of the slaves remains close to 100% during this time. Just to be sure, I put the slaves into an MPI_Barrier(MPI_COMM_WORLD) while they were waiting for the fftw planner to finish. It also didn't help.
>
> They still spin (instead of using e.g. select()), but call sched_yield() so should only be actively spinning when nothing else is trying to run. Are you sure that the planner is always running in parallel? What OS and OMPI version are you using?
>
> Jed
Re: [OMPI users] Do MPI calls ever sleep?
Hi Jed,

Thanks for the reply and suggestion. I tried adding -mca yield_when_idle 1 (and later mpi_yield_when_idle 1, which is what ompi_info reports the variable as) but it seems to have had zero effect. My master goes into fftw planning routines for a minute or so (I see the threads being created), but the overall usage of the slaves remains close to 100% during this time. Just to be sure, I put the slaves into an MPI_Barrier(MPI_COMM_WORLD) while they were waiting for the fftw planner to finish. It also didn't help.

Do you know where yield_when_idle is documented?

David

On Wed, 2010-07-21 at 20:24 +0200, Jed Brown wrote:
> On Wed, 21 Jul 2010 14:10:53 -0400, David Ronis <david.ro...@mcgill.ca> wrote:
> > Is there another MPI routine that polls for data and then gives up its time-slice?
>
> You're probably looking for the runtime option -mca yield_when_idle 1. This will slightly increase latency, but allows other threads to run without competing with the spinning MPI.
>
> Jed
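For reference, the parameter tried above can be set on the mpirun command line under its full name, or exported through the MCA environment-variable prefix; ./my_program is a placeholder:

```shell
# Set the parameter by its full name at launch time:
mpirun -np 8 --mca mpi_yield_when_idle 1 ./my_program
# Equivalent via the environment (OMPI_MCA_ prefix + parameter name):
export OMPI_MCA_mpi_yield_when_idle=1
mpirun -np 8 ./my_program
```

Running `ompi_info --param mpi all` lists the parameter along with its current default and a one-line description.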
[OMPI users] Do MPI calls ever sleep?
I've got an MPI program on an 8-core box that runs in a master-slave mode. The slaves calculate something, pass data to the master, and then call MPI_Bcast, waiting for the master to update and return some data via an MPI_Bcast originating on the master. One of the things the master does while the slaves are waiting is to make heavy use of fftw3 FFT routines, which can support multi-threading. However, for threading to make sense, the slaves on the same physical machine have to give up their CPU usage, and this doesn't seem to be the case (top shows them running at close to 100%). Is there another MPI routine that polls for data and then gives up its time-slice? Any other suggestions?

Thanks in advance.

David
[OMPI users] Network Problem?
(This may be a duplicate. An earlier post seems to have been lost.)

I'm using Open MPI (1.3.2) to run on 3 dual-processor machines (running Linux, Slackware 12.1, gcc 4.4.0). Two are directly on my LAN while the 3rd is connected to my LAN via VPN and NAT (I can communicate in either direction from any of the machines to the remote machine using its NAT address). The program I'm trying to run is very simple in terms of MPI. Basically it is:

main()
{
  [snip];
  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
  MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
  [snip];
  if (myrank == 0)
    i = MPI_Reduce(MPI_IN_PLACE, C, N, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
  else
    i = MPI_Reduce(C, MPI_IN_PLACE, N, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

  if (i != MPI_SUCCESS) {
    fprintf(stderr, "MPI_Reduce (C) fails on processor %d\n", myrank);
    MPI_Finalize();
    exit(1);
  }

  MPI_Barrier(MPI_COMM_WORLD);
  [snip];
}

I run by invoking:

mpirun -v -np ${NPROC} -hostfile ${HOSTFILE} --stdin none $* > /dev/null

If I run on the 4 nodes that are physically on the LAN it works as expected. When I add the nodes on the remote machine things don't work properly:

1. If I start with NPROC=6 on one of the LAN machines, all 6 nodes start (as shown by running ps), and all get to the MPI_HARVEST calls. At that point things hang (I see no network traffic, which given the size of the array I'm trying to reduce is strange).

2. If I start on the remote with NPROC=6, only the mpirun call shows up under ps on the remote, while nothing shows up on the other nodes. Killing the process gives messages like:

hostname - daemon did not report back when launched

3. If I start on the remote with NPROC=2, the 2 processes start on the remote and finish properly.

My suspicion is that there's some bad interaction with NAT and authentication. Any suggestions?

David
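One common culprit in mixed LAN/VPN setups is Open MPI choosing an interface or address that a peer cannot actually reach. A hedged sketch of things worth trying (the interface name eth0 and ./my_program are placeholders; the MCA parameter names are from the Open MPI 1.3 series):

```shell
# Restrict both the runtime wire-up (oob) and MPI traffic (btl) to the
# one TCP interface that is routable from every node:
mpirun -np 6 -hostfile ${HOSTFILE} \
       --mca oob_tcp_if_include eth0 \
       --mca btl_tcp_if_include eth0 \
       ./my_program

# Turn up launch verbosity to see where the remote daemon start fails:
mpirun -np 6 -hostfile ${HOSTFILE} --mca plm_base_verbose 5 ./my_program
```

This would not fix a true NAT problem (the daemons advertise their local addresses, which a NATed peer cannot connect back to), but it narrows down whether the hang is in the launch phase or in the MPI connections themselves.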