Hello all
The wiki page has been updated with the latest test results from a new
branch that implemented inbound collectives on the modex and barrier
operations. As you will see from the graphs, ORTE/OMPI now exhibits a
negative 2nd-derivative on the launch time curve for mpi_no_op (i.e.,
MPI_Init
Still no luck here,
I launch these three processes:
term1$ ompi-server -d --report-uri URIFILE
term2$ mpirun -mca routed unity -ompi-server file:URIFILE -np 1 simple_accept
term3$ mpirun -mca routed unity -ompi-server file:URIFILE -np 1 simple_connect
The output of ompi-server shows a s
On 4/8/08 2:19 PM, "Ralph H Castain" wrote:
>
>
>
> On 4/8/08 12:10 PM, "Pak Lui" wrote:
>
>> Richard Graham wrote:
>>> What happens if I deliver SIGUSR2 to mpirun? What I observe (for both
>>> ssh/rsh and torque) is that if I deliver a SIGUSR2 to mpirun, the signal does
>>> get propagated
On 4/8/08 12:10 PM, "Pak Lui" wrote:
> Richard Graham wrote:
>> What happens if I deliver SIGUSR2 to mpirun? What I observe (for both
>> ssh/rsh and torque) is that if I deliver a SIGUSR2 to mpirun, the signal does
>> get propagated to the MPI procs, which do invoke the signal handler I
>> regi
Richard Graham wrote:
What happens if I deliver SIGUSR2 to mpirun? What I observe (for both
ssh/rsh and torque) is that if I deliver a SIGUSR2 to mpirun, the signal does
get propagated to the MPI procs, which do invoke the signal handler I
registered, but the job is terminated right after that. H
Hmmm...well, I'll take a look. I haven't seen that behavior, but I haven't
checked it in some time.
On 4/8/08 11:54 AM, "Richard Graham" wrote:
> What happens if I deliver SIGUSR2 to mpirun? What I observe (for both
> ssh/rsh and torque) is that if I deliver a SIGUSR2 to mpirun, the signal does
What happens if I deliver SIGUSR2 to mpirun? What I observe (for both
ssh/rsh and torque) is that if I deliver a SIGUSR2 to mpirun, the signal does
get propagated to the MPI procs, which do invoke the signal handler I
registered, but the job is terminated right after that. However, if I
deliver the
I found what Pak said a little confusing as the wait_daemon function doesn't
actually receive a signal itself - it only detects that a proc has exited
and checks to see if that happened due to a signal. If so, it flags that
situation and will order the job aborted.
So if the proc continues alive,
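The detection step described above (noticing that a process has exited and checking whether a signal caused it) can be illustrated with plain POSIX calls. This is only a sketch, not the ORTE wait_daemon code: the helper name terminating_signal_of_child is hypothetical, and it forks a throwaway child that sends itself a signal with the default action restored.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not an ORTE function): fork a child that sends
 * itself `sig` with the default disposition, then inspect how it ended.
 * Returns the terminating signal number, 0 if the child exited on its
 * own, or -1 if fork() or waitpid() failed. */
int terminating_signal_of_child(int sig)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: undo any inherited ignore/block so the default applies. */
        sigset_t set;
        sigemptyset(&set);
        sigaddset(&set, sig);
        sigprocmask(SIG_UNBLOCK, &set, NULL);
        signal(sig, SIG_DFL);
        raise(sig);
        _exit(0); /* reached only if the signal did not terminate us */
    }

    /* Parent: this is the "did it die from a signal?" check. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    if (WIFSIGNALED(status)) {
        return WTERMSIG(status);
    }
    return 0;
}
```

Calling terminating_signal_of_child(SIGUSR2) should report SIGUSR2, because the child never installs a handler. A child that catches the signal and keeps running would not show up as signaled in waitpid() at all.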
First, can your user executable install a signal handler that catches
SIGUSR2 so that the process does not exit? By default on Solaris the process
will exit unless you catch the signal and have the handler do nothing.
from signal(3HEAD)
Name       Value   Default   Event
SIGUSR1    16      Ex
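For reference, catching SIGUSR2 so that the default action (exit) is not taken might look like the following. This is a minimal sketch using standard POSIX sigaction(); the names on_sigusr2, install_sigusr2_handler and sigusr2_seen are made up for illustration and do not come from Open MPI or the application in question.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Counter updated from the handler; sig_atomic_t keeps the access safe. */
static volatile sig_atomic_t usr2_count = 0;

/* Handler: only record that the signal arrived, then return so the
 * process keeps running instead of taking the default action. */
static void on_sigusr2(int signo)
{
    (void)signo;
    usr2_count++;
}

/* Hypothetical helper: install the handler for SIGUSR2.
 * Returns 0 on success, -1 on failure (same as sigaction()). */
int install_sigusr2_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr2;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART; /* restart interrupted system calls */
    return sigaction(SIGUSR2, &sa, NULL);
}

/* Hypothetical helper: how many SIGUSR2 deliveries have been seen. */
int sigusr2_seen(void)
{
    return (int)usr2_count;
}
```

The application would call install_sigusr2_handler() early, for example right after MPI_Init, and poll sigusr2_seen() from its main loop. Whether mpirun still terminates the job afterwards is a separate question that this sketch does not address.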
I am running into a situation where I am trying to deliver a signal to the
MPI procs (SIGUSR2). I deliver this to mpirun, which propagates it to the
MPI procs, but then proceeds to kill the children. Is there an easy way
that I can get around this? I am using this mechanism in a situation where
I'm aware - as we discussed on a recent telecon, I put it on my list of
things to resolve. Solution is known - just busy with other things at the
moment.
On 4/8/08 6:06 AM, "Tim Prins" wrote:
> Hi all,
>
> I reported this before, but it seems that the report got lost. I have
> found some situa
Hi all,
I reported this before, but it seems that the report got lost. I have
found some situations where mpirun will return a '0' when there is an error.
An easy way to reproduce this is to edit the file
'orte/mca/plm/base/plm_base_launch_support.c' and on line 154 put in
'return ORTE_ERROR