Hi Ralph

One more thing I noticed while trying out orte-iof again: the option --report-pid crashes mpirun:

[jody@localhost neander]$ mpirun -report-pid -np 2 ./MPITest
[localhost:31146] *** Process received signal ***
[localhost:31146] Signal: Segmentation fault (11)
[localhost:31146] Signal code: Address not mapped (1)
[localhost:31146] Failing at address: 0x24
[localhost:31146] [ 0] [0x11040c]
[localhost:31146] [ 1] /opt/openmpi/lib/openmpi/mca_odls_default.so [0x1e8f9d]
[localhost:31146] [ 2] /opt/openmpi/lib/libopen-rte.so.0(orte_daemon_cmd_processor+0x4d1) [0x132541]
[localhost:31146] [ 3] /opt/openmpi/lib/libopen-pal.so.0 [0x170248]
[localhost:31146] [ 4] /opt/openmpi/lib/libopen-pal.so.0(opal_event_loop+0x27) [0x170497]
[localhost:31146] [ 5] /opt/openmpi/lib/libopen-pal.so.0(opal_progress+0xcb) [0x16399b]
[localhost:31146] [ 6] /opt/openmpi/lib/libopen-rte.so.0(orte_plm_base_launch_apps+0x30d) [0x1441ad]
[localhost:31146] [ 7] /opt/openmpi/lib/openmpi/mca_plm_rsh.so [0x1c833b]
[localhost:31146] [ 8] mpirun [0x804acf6]
[localhost:31146] [ 9] mpirun [0x804a0a6]
[localhost:31146] [10] /lib/libc.so.6(__libc_start_main+0xe0) [0x98d390]
[localhost:31146] [11] mpirun [0x8049fd1]
[localhost:31146] *** End of error message ***
Segmentation fault
This always happens, irrespective of the number of processes, and regardless of whether it runs locally only or with remote machines.

Jody

On Mon, Feb 2, 2009 at 10:55 AM, jody <jody....@gmail.com> wrote:
> Hi Ralph
> The new options are great stuff!
> Following your suggestion, I downloaded and installed
>
>   http://www.open-mpi.org/nightly/trunk/openmpi-1.4a1r20392.tar.gz
>
> and tested the new options. (I have a simple cluster of 8 machines over
> tcp.) Not everything worked as specified, though:
>
> * timestamp-output: works
>
> * xterm: doesn't work completely -
>   With a comma-separated rank list, an xterm is opened only for the local
>   processes. The other processes (the ones on remote machines) only write
>   to the stdout of the calling window.
>   (Just to be sure, I started my own script for opening separate xterms -
>   that did work for the remote ones, too.)
>
>   If a '-1' is given instead of a list of ranks, it fails (locally and
>   with remotes):
>   [jody@localhost neander]$ mpirun -np 4 --xterm -1 ./MPITest
>   --------------------------------------------------------------------------
>   Sorry!  You were supposed to get help about:
>       orte-odls-base:xterm-rank-out-of-bounds
>   from the file:
>       help-odls-base.txt
>   But I couldn't find any file matching that name.  Sorry!
>   --------------------------------------------------------------------------
>   --------------------------------------------------------------------------
>   mpirun was unable to start the specified application as it encountered
>   an error on node localhost. More information may be available above.
>   --------------------------------------------------------------------------
>
> * output-filename: doesn't work here:
>   [jody@localhost neander]$ mpirun -np 4 --output-filename gnagna ./MPITest
>   [jody@localhost neander]$ ls -l gna*
>   -rw-r--r-- 1 jody morpho 549 2009-02-02 09:07 gnagna.%10lu
>
>   There is output from the processes on remote machines on stdout, but
>   none from the local ones.
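[Editor's note: while --output-filename is broken, the kind of hand-rolled wrapper jody alludes to can stand in for it. The sketch below is hypothetical (not part of Open MPI); it assumes the OMPI_COMM_WORLD_RANK environment variable that Open MPI sets in each launched process, and the script name and file naming are made up for illustration.]

```shell
#!/bin/sh
# rank-out.sh -- hypothetical per-rank wrapper (not an Open MPI feature):
# redirect each rank's stdout to its own file, e.g. "gnagna.0", "gnagna.1", ...
# Usage: mpirun -np 4 ./rank-out.sh gnagna ./MPITest
base="$1"; shift
# OMPI_COMM_WORLD_RANK is set by Open MPI in each process's environment;
# fall back to 0 if it is missing (e.g. when testing outside mpirun).
rank="${OMPI_COMM_WORLD_RANK:-0}"
# Replace this shell with the real program, stdout going to the rank's file.
exec "$@" > "${base}.${rank}"
```

Because the redirection happens inside each launched process, it works the same for local and remote ranks, which sidesteps the local/remote asymmetry reported above.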
>
> A question about installing: I installed the usual way (configure,
> make all install), but the new man pages apparently weren't copied to
> their destination: if I do 'man mpirun' I am shown the contents of an
> old man page (without the new options). I had to do
>   less /opt//openmpi-1.4a1r20394/share/man/man1/mpirun.1
> to see them.
>
> About the xterm option: when the application ends, all xterms are closed
> immediately. (When doing things 'by hand' I used the -hold option for
> xterm.) Would it be possible to add this feature to your xterm option?
> Perhaps by adding a '!' at the end of the rank list?
>
> About orte-iof: with the new version it works, but no matter which rank
> I specify, it only prints out rank 0's output:
> [jody@localhost ~]$ orte-iof --pid 31049 --rank 4 --stdout
> [localhost]I am #0/9 before the barrier
>
> Thanks
>   Jody
>
> On Sun, Feb 1, 2009 at 10:49 PM, Ralph Castain <r...@lanl.gov> wrote:
>> I'm afraid we discovered a bug in optimized builds with r20392. Please
>> use any tarball with r20394 or above.
>>
>> Sorry for the confusion
>> Ralph
>>
>>
>> On Feb 1, 2009, at 5:27 AM, Jeff Squyres wrote:
>>
>>> On Jan 31, 2009, at 11:39 AM, Ralph Castain wrote:
>>>
>>>> For anyone following this thread:
>>>>
>>>> I have completed the IOF options discussed below. Specifically, I have
>>>> added the following:
>>>>
>>>> * a new "timestamp-output" option that timestamps each line of output
>>>>
>>>> * a new "output-filename" option that redirects each proc's output to
>>>>   a separate rank-named file.
>>>>
>>>> * a new "xterm" option that redirects the output of the specified
>>>>   ranks to a separate xterm window.
>>>>
>>>> You can obtain a copy of the updated code at:
>>>>
>>>>   http://www.open-mpi.org/nightly/trunk/openmpi-1.4a1r20392.tar.gz
>>>
>>> Sweet stuff.
>>> :-)
>>>
>>> Note that the URL/tarball that Ralph cites is a nightly snapshot and
>>> will expire after a while -- we only keep the 5 most recent nightly
>>> tarballs available. You can find Ralph's new IOF stuff in any 1.4a1
>>> nightly tarball after the one he cited above. Note that the last part
>>> of the tarball name refers to the subversion commit number (which
>>> increases monotonically); any 1.4 nightly snapshot tarball beyond
>>> "r20392" will contain this new IOF stuff. Here's where to get our
>>> nightly snapshot tarballs:
>>>
>>>   http://www.open-mpi.org/nightly/trunk/
>>>
>>> Don't read anything into the "1.4" version number -- we've just bumped
>>> the version number internally to be different than the current stable
>>> series (1.3). We haven't yet branched for the v1.4 series; hence,
>>> "1.4a1" currently refers to our development trunk.
>>>
>>> --
>>> Jeff Squyres
>>> Cisco Systems
>>>
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users