Re: [OMPI users] Job does not quit even when the simulation dies

2007-11-07 Thread Ralph H Castain
As Jeff indicated, the degree of capability has improved over time - I'm not sure which version this represents. The type of failure also plays a major role in our ability to respond. If a process actually segfaults or dies, we usually pick that up pretty well and abort the rest of the job (certai

Re: [OMPI users] Job does not quit even when the simulation dies

2007-11-07 Thread Jeff Squyres
Support for failure scenarios is something that is getting better over time in Open MPI. It looks like the version you are using either didn't properly detect the failure, didn't cleanly exit all MPI processes afterward, or both. On Nov 6, 2007, at 9:01 PM, Teng Lin wrote: Hi, Just realiz

[OMPI users] Job does not quit even when the simulation dies

2007-11-06 Thread Teng Lin
Hi, I just realized I have a job that has been running for a long time even though some of the nodes have already died. Is there any way to tell the other nodes to quit?
[kyla-0-1.local:09741] mca_btl_tcp_frag_send: writev failed with errno=104
[kyla-0-1.local:09742] mca_btl_tcp_frag_send: writev failed with errno=104
T
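
For readers trying to interpret the `errno=104` in the log lines above, here is a minimal sketch of how an OS error number can be decoded. It assumes the cluster runs Linux, where 104 is `ECONNRESET` (connection reset by peer), which would be consistent with the remote node having died; the thread does not state the platform, and errno numbering differs between operating systems. The helper name `describe_errno` is hypothetical and is not part of Open MPI.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for an OS error number on the current platform."""
    # errno.errorcode maps numeric values to symbolic names for this platform;
    # fall back to "UNKNOWN" if the number has no name here.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # 104 is the value from the log; its meaning depends on the platform
    # (assumed Linux here, where it is ECONNRESET).
    print(describe_errno(104))
    # Looking up by symbolic name is portable across platforms.
    print(describe_errno(errno.ECONNRESET))
```

Running this on one of the compute nodes themselves gives the most reliable answer, since the number is interpreted by the operating system that reported it.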