Rahul Nabar wrote:

On Tue, Mar 31, 2009 at 6:43 PM, Don Holmgren <[email protected]> wrote:
Instead of logging into the node directly, you might want to try an interactive
job (use "qsub -I") and then try your mpirun.  This may give you messages that
for some reason aren't getting back to you in your job's .o or .e files.

I tried an interactive job; this seems to be the key:

forrtl: error (78): process killed (SIGTERM)
mpirun noticed that job rank 5 with PID 10580 on node node17 exited on
signal 11 (Segmentation fault).

I do not get this segfault when I run directly on the node but only
when I run via Torque.

Any clues?


We had a problem where resources_max.pmem was accidentally set too low for the Torque queue, and the user's login shell was segfaulting. Torque showed an Exit_status of 267.
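
One way to check whether a limit like this is in play is to dump the process resource limits once from a direct login on the node and once from inside a "qsub -I" session, then compare the two. The rough Python sketch below does that, and also decodes an Exit_status value. It is only an illustration: the helper names (current_limits, decode_exit_status) are made up here, and the decoding assumes that Exit_status values above 256 mean "256 + signal number", which should be confirmed against the documentation for your Torque version. Under that assumption, 267 - 256 = 11, the same signal number mpirun reported above.

```python
import resource
import signal

# Limits that commonly differ between a login shell and a batch job.
# Which of these a queue's pmem setting actually affects is not assumed here.
LIMITS = {
    "address space (RLIMIT_AS)": resource.RLIMIT_AS,
    "data segment (RLIMIT_DATA)": resource.RLIMIT_DATA,
    "stack (RLIMIT_STACK)": resource.RLIMIT_STACK,
    "resident set (RLIMIT_RSS)": resource.RLIMIT_RSS,
}


def fmt(value):
    """Render a raw rlimit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {limit name: (soft, hard)} for the current process."""
    return {
        name: tuple(fmt(v) for v in resource.getrlimit(res))
        for name, res in LIMITS.items()
    }


def decode_exit_status(status, offset=256):
    """Split an Exit_status into a plain exit code or a signal.

    ASSUMPTION: values above `offset` encode offset + signal number.
    Check this convention against your batch system's documentation.
    """
    if status > offset:
        num = status - offset
        try:
            name = signal.Signals(num).name
        except ValueError:
            name = "unknown"
        return ("signal", num, name)
    return ("exit", status, None)


if __name__ == "__main__":
    # Run once directly on the node and once inside "qsub -I", then diff.
    for name, (soft, hard) in current_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
    print(decode_exit_status(267))
```

If the soft or hard values printed inside the Torque session are much smaller than those from the direct login, the queue or server limits (for example resources_max.pmem) would be the first thing to look at.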

...
ling
_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
