Dear Sir/Madam,

I'm running Open MPI version 1.4.2. The operating system is Ubuntu 9.10 with kernel version 2.6.31-14.

$ mpirun -np 1 -cpus-per-proc 1 -bind-to-core a.out
This works fine on a single-core P4 machine.

$ mpirun -np 1 -bind-to-core a.out
This also works fine.

$ mpirun -np 1 -cpus-per-proc 1 -bind-to-core a.out
This too works fine.

But when I specify the rank file as

rank 0=127.0.0.1 slot=0

and run the application as

$ mpirun -np 1 -rf rankfile a.out

it gives:

[ucsc-laptop:03027] *** Process received signal ***
[ucsc-laptop:03027] Signal: Segmentation fault (11)
[ucsc-laptop:03027] Signal code: Address not mapped (1)
[ucsc-laptop:03027] Failing at address: 0x8
[ucsc-laptop:03027] [ 0] [0x867410]
[ucsc-laptop:03027] [ 1] a.out(main+0x5f) [0x8048843]
[ucsc-laptop:03027] [ 2] /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0x44cb56]
[ucsc-laptop:03027] [ 3] a.out [0x8048751]
[ucsc-laptop:03027] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 3027 on node ucsc-laptop exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

And for the following execution,

$ mpirun -np 1 -rf rankfile --bind-to-core a.out

it gives:

[ucsc-laptop:03053] *** Process received signal ***
[ucsc-laptop:03053] Signal: Segmentation fault (11)
[ucsc-laptop:03053] Signal code: Address not mapped (1)
[ucsc-laptop:03053] Failing at address: 0x8
[ucsc-laptop:03053] [ 0] [0xab0410]
[ucsc-laptop:03053] [ 1] a.out(main+0x5f) [0x8048843]
[ucsc-laptop:03053] [ 2] /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0x234b56]
[ucsc-laptop:03053] [ 3] a.out [0x8048751]
[ucsc-laptop:03053] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 3053 on node ucsc-laptop exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

I need to execute my program as

$ mpirun -np 5 -rf rankfile a.out

where the rank file is:

rank 0=10.16.71.14 slot=0   # 10.16.71.14 is a dual-core machine
rank 1=10.16.71.14 slot=1
rank 2=10.16.71.15 slot=0   # 10.16.71.15 is a dual-core machine
rank 3=10.16.71.15 slot=1
rank 4=10.16.71.16 slot=0   # 10.16.71.16 is a P4 machine with a single core

This gives the same segmentation fault as

$ mpirun -np 1 -rf rankfile a.out

But if I comment out the line

rank 4=10.16.71.16 slot=0

and execute the program as

$ mpirun -np 4 -rf rankfile a.out

then it executes fine.

Please help me. How can I overcome this?

Yours faithfully,
Chamila Janath.

On Tue, Jun 8, 2010 at 10:11 PM, Terry Dontje <terry.don...@oracle.com> wrote:

> Which version of OMPI are you running on and the OS version?
> Can you try and replace the rankfile specification with --bind-to-core and
> tell me if that works any better?
>
> --td
>
> Chamila Janath wrote:
>
> rankfile
> rank 0=10.16.71.1 slot=0
>
> I launched my mpi app using,
>
> $ mpirun -np 1 -rf rankfile appname
>
> I can run the application on Intel dual-core machine with Linux based OS
> nicely. But i can't run it on single core machine(P4).
> The execution terminates specifying a problem of slot number. What is the
> reason for this? A bug or problem of the slot number I specified.(I tried by
> using rank 0=10.16.71.1 slot=p0:0 but it too failed)
> Please help me.
>
> Thanks a lot....
>
> ------------------------------
>
> _______________________________________________
> users mailing list
> users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.650.633.7054
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com