On 18 July 2007 at 19:14, Dirk Eddelbuettel wrote:
|
| Hi Tim,
|
| Thanks for the follow-up
|
| On 18 July 2007 at 17:22, Tim Prins wrote:
| | <snip>
| | > Yes, this helps tremendously. I installed rsh, and now it pretty much
| | > works.
| | Glad this worked out for you.
| |
| | >
| | > The one missing detail is that I can't seem to get the stdout/stderr
| | > output. For example:
| | >
| | > $ orterun -np 1 uptime
| | > $ uptime
| | > 18:24:27 up 13 days, 3:03, 0 users, load average: 0.00, 0.03, 0.00
| | >
| | > The man page indicates that stdout/stderr is supposed to come back to
| | > the stdout/stderr of the orterun process. Any ideas on why this isn't
| | > working?
| | It should work. However, we currently have some I/O forwarding problems which
| | show up in some environments that will (hopefully) be fixed in the next
| | release. As far as I know, the problem seems to happen mostly with non-mpi
| | applications.
| |
| | Try running a simple mpi application, such as:
| |
| | #include <stdio.h>
| | #include "mpi.h"
| |
| | int main(int argc, char* argv[])
| | {
| |     int rank, size;
| |
| |     MPI_Init(&argc, &argv);
| |     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
| |     MPI_Comm_size(MPI_COMM_WORLD, &size);
| |     printf("Hello, world, I am %d of %d\n", rank, size);
| |     MPI_Finalize();
| |
| |     return 0;
| | }
| |
| | If that works fine, then it is probably our problem, and not a problem with
| | your setup.
| |
| | Sorry I don't have a better answer :(
|
| That works (and I use the same Debian openmpi 1.2.3-1 set of packages Adam
| has):
|
| edd@basebud:~> opalcc -o /tmp/openmpitest /tmp/openmpitest.c -lmpi
| edd@basebud:~> orterun -np 4 /tmp/openmpitest
| Hello, world, I am 2 of 4
| Hello, world, I am 1 of 4
| Hello, world, I am 0 of 4
| Hello, world, I am 3 of 4
| edd@basebud:~>
|
| I was toying with this at work earlier, and it was hanging there (using
| hostname or uptime as the token binaries) as soon as I increased the np
| parameter beyond 1.
|
| It works here:
|
| edd@basebud:~> orterun -np 4 hostname
| basebud
| basebud
| basebud
| basebud
| edd@basebud:~>
|
| I have slurm-llnl test packages installed at work but not here. Maybe I need
| to dig a bit more into slurm. (Adam: slurm package should be forthcoming.
| I can point you to the snapshots from the fellow whom I mentor on this.)

Indeed, at work it hangs once I up the np parameter:

foo:~> orterun -np 4 ./openmpitest
Hello, world, I am 0 of 4
Hello, world, I am 1 of 4
Hello, world, I am 2 of 4
Hello, world, I am 3 of 4
orterun: killing job...
Killed
foo:~> orterun -np 4 -H localhost ./openmpitest
Hello, world, I am 1 of 4
Hello, world, I am 0 of 4
Hello, world, I am 2 of 4
Hello, world, I am 3 of 4
foo:~>

Restricting it to localhost helps. Any ideas? x86 multicore/multicpu,
Open MPI 1.2.3, Slurm 1.2.11, Ubuntu 7.04 plus a handful of handcompiled
packages from Debian unstable.

More details available; just tell me what is needed and how best to compile it.

Dirk

-- 
Hell, there are no rules here - we're trying to accomplish something.
                                                  -- Thomas A. Edison