Also, if possible, could you give us a guest account on your cluster?
Debugging there directly would be much more efficient than going back and
forth by email.
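
In case it helps, the test I suggested in my previous message (quoted
further down) might look roughly like this. The host names, port, and binary
are copied from your report; the variable names are only for illustration,
and I am assuming nothing else in your launch line needs to change.

```shell
# Coordinator host and port, as given in the original report.
COORD_HOST=acme11
COORD_PORT=7779

# Same two-node launch as the failing run, with the --ib flag dropped.
CMD="dmtcp_launch -h $COORD_HOST -p $COORD_PORT mpirun_rsh -n 2 acme11 acme12 ./helloWorldMPI"
echo "$CMD"

# Only run the command where DMTCP is actually installed.
if command -v dmtcp_launch >/dev/null 2>&1; then
    $CMD
fi
```

If both ranks print their messages without --ib, the problem is most likely
in the InfiniBand plugin; if only rank 0 prints again, it is probably
somewhere else in DMTCP.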


Best,
Jiajun

On Mon, Oct 19, 2015 at 11:26 AM, Jiajun Cao <jia...@ccs.neu.edu> wrote:

> Hi Manuel,
>
> The infiniband plugin shouldn't affect application launching. Could you
> try removing the "--ib" flag and see if the application still crashes? This
> can help diagnose whether the issue is in the ib plugin or other dmtcp
> modules.
>
> Best,
> Jiajun
>
>
> On Sun, Oct 18, 2015 at 10:57 PM, Kapil Arya <ka...@ccs.neu.edu> wrote:
>
>> Hey Jiajun,
>>
>> Can you take a look at this problem as it is closer to your area of
>> expertise :-).
>>
>> Best,
>> Kapil
>>
>> On Sat, Oct 17, 2015 at 11:31 PM, Manuel Rodríguez Pascual <
>> manuel.rodriguez.pasc...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I am trying to checkpoint an MVAPICH application. It does not behave as
>>> expected, so maybe you can give me some support.
>>>
>>> I have compiled DMTCP with "--enable-infiniband-support" as the only flag.
>>> I have MVAPICH installed.
>>>
>>> I can execute a test MPI application on two nodes without DMTCP. I can also
>>> execute the application on a single node with DMTCP. However, if I
>>> execute it on two nodes with DMTCP, only the first one runs.
>>>
>>> Below there is a series of test commands with a lot of output, together
>>> with the versions of everything.
>>>
>>> Any ideas?
>>>
>>> thanks for your help,
>>>
>>>
>>> Manuel
>>>
>>>
>>> ---
>>> ---
>>>
>>> # mpichversion
>>>
>>> MVAPICH2 Version:     2.2a
>>>
>>> MVAPICH2 Release date: Mon Aug 17 20:00:00 EDT 2015
>>>
>>> MVAPICH2 Device:      ch3:mrail
>>>
>>> MVAPICH2 configure:   --disable-mcast
>>>
>>> MVAPICH2 CC:  gcc    -DNDEBUG -DNVALGRIND -O2
>>>
>>> MVAPICH2 CXX: g++   -DNDEBUG -DNVALGRIND -O2
>>>
>>> MVAPICH2 F77: gfortran -L/lib -L/lib   -O2
>>>
>>> MVAPICH2 FC:  gfortran   -O2
>>>
>>> # dmtcp_coordinator --version
>>>
>>> dmtcp_coordinator (DMTCP) 2.4.1
>>>
>>> ---
>>>
>>> ---
>>>
>>>
>>> I can execute a test MPI application on two nodes (acme11 and acme12) with:
>>>
>>> ---
>>> ---
>>> # mpirun_rsh  -n 2  acme11 acme12 ./helloWorldMPI
>>>
>>> Process 0 of 2 is on acme11.ciemat.es
>>>
>>> Process 1 of 2 is on acme12.ciemat.es
>>>
>>> Hello world from process 0 of 2
>>>
>>> Hello world from process 1 of 2
>>>
>>> Goodbye world from process 0 of 2
>>>
>>> Goodbye world from process 1 of 2
>>> ---
>>> ---
>>>
>>> As you can see, it works correctly.
>>>
>>>
>>> If I try to execute the application with DMTCP, however, it does not.
>>>
>>> I run the coordinator on acme11, with port 7779.
>>>
>>>
>>> I can execute the application on a single node. For example,
>>>
>>> ---
>>> ---
>>>
>>> #  dmtcp_launch -h acme11 -p 7779 --ib mpirun_rsh  -n 1  acme12
>>> ./helloWorldMPI
>>>
>>> [41000] NOTE at ssh.cpp:369 in prepareForExec; REASON='New ssh command'
>>>
>>>      newCommand = /home/localsoft/dmtcp/bin/dmtcp_ssh
>>> /home/localsoft/dmtcp/bin/dmtcp_nocheckpoint /usr/bin/ssh -q acme12 cd
>>> /home/slurm/tests;/home/localsoft/dmtcp/bin/dmtcp_launch --coord-host
>>> 172.17.29.173 --coord-port 7779 --ckptdir /home/slurm/tests --infiniband
>>> /home/localsoft/dmtcp/bin/dmtcp_sshd /usr/bin/env  MPISPAWN_MPIRUN_MPD=0
>>> USE_LINEAR_SSH=1 MPISPAWN_MPIRUN_HOST=acme11.ciemat.es
>>> MPISPAWN_MPIRUN_HOSTIP=172.17.29.173 MPIRUN_RSH_LAUNCH=1
>>> MPISPAWN_CHECKIN_PORT=33687 MPISPAWN_MPIRUN_PORT=33687 MPISPAWN_NNODES=1
>>> MPISPAWN_GLOBAL_NPROCS=1 MPISPAWN_MPIRUN_ID=40000 MPISPAWN_ARGC=1
>>> MPDMAN_KVS_TEMPLATE=kvs_885_acme11.ciemat.es_40000 MPISPAWN_LOCAL_NPROCS=1
>>> MPISPAWN_ARGV_0='./helloWorldMPI' MPISPAWN_ARGC=1
>>> MPISPAWN_GENERIC_ENV_COUNT=0  MPISPAWN_ID=0
>>> MPISPAWN_WORKING_DIR=/home/slurm/tests MPISPAWN_MPIRUN_RANK_0=0
>>> /usr/local/bin/mpispawn 0
>>>
>>> Process 0 of 1 is on acme12.ciemat.es
>>>
>>> Hello world from process 0 of 1
>>>
>>> Goodbye world from process 0 of 1
>>>
>>>
>>> COORDINATOR OUTPUT
>>>
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1d64b124afe30f29-4029-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = mpirun_rsh
>>>
>>>      msg.from = 1d64b124afe30f29-52000-562310a2
>>>
>>>      client->identity() = 1d64b124afe30f29-4029-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1d64b124afe30f29-52000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating
>>> process Information after fork()'
>>>
>>>      client->hostname() = acme11.ciemat.es
>>>
>>>      client->progname() = mpirun_rsh_(forked)
>>>
>>>      msg.from = 1d64b124afe30f29-53000-562310a2
>>>
>>>      client->identity() = 1d64b124afe30f29-52000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1d64b124afe30f29-53000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating
>>> process Information after fork()'
>>>
>>>      client->hostname() = acme11.ciemat.es
>>>
>>>      client->progname() = dmtcp_ssh_(forked)
>>>
>>>      msg.from = 1d64b124afe30f29-54000-562310a2
>>>
>>>      client->identity() = 1d64b124afe30f29-53000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1d64b124afe30f29-54000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = dmtcp_ssh
>>>
>>>      msg.from = 1d64b124afe30f29-53000-562310a2
>>>
>>>      client->identity() = 1d64b124afe30f29-53000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1b69d09fb3238b30-23945-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = dmtcp_sshd
>>>
>>>      msg.from = 1b69d09fb3238b30-55000-562310a2
>>>
>>>      client->identity() = 1b69d09fb3238b30-23945-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1b69d09fb3238b30-55000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating
>>> process Information after fork()'
>>>
>>>      client->hostname() = acme12.ciemat.es
>>>
>>>      client->progname() = dmtcp_sshd_(forked)
>>>
>>>      msg.from = 1b69d09fb3238b30-56000-562310a2
>>>
>>>      client->identity() = 1b69d09fb3238b30-55000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1b69d09fb3238b30-56000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating
>>> process Information after fork()'
>>>
>>>      client->hostname() = acme12.ciemat.es
>>>
>>>      client->progname() = mpispawn_(forked)
>>>
>>>      msg.from = 1b69d09fb3238b30-57000-562310a2
>>>
>>>      client->identity() = 1b69d09fb3238b30-56000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = env
>>>
>>>      msg.from = 1b69d09fb3238b30-56000-562310a2
>>>
>>>      client->identity() = 1b69d09fb3238b30-56000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = mpispawn
>>>
>>>      msg.from = 1b69d09fb3238b30-56000-562310a2
>>>
>>>      client->identity() = 1b69d09fb3238b30-56000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = helloWorldMPI
>>>
>>>      msg.from = 1b69d09fb3238b30-57000-562310a2
>>>
>>>      client->identity() = 1b69d09fb3238b30-57000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1b69d09fb3238b30-57000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1b69d09fb3238b30-56000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1b69d09fb3238b30-55000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1d64b124afe30f29-53000-562310a2
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1d64b124afe30f29-52000-562310a2
>>>
>>>
>>> ---
>>> ---
>>>
>>> So we can see that it works correctly: the processes connect to the
>>> coordinator and so on.
>>>
>>> However, if I run the application on more than one node, as in the first
>>> example, it crashes. What happens is that the first node in the node list
>>> executes the application, and the rest do not.
>>>
>>> ----
>>> ----
>>>
>>> [root@acme11 tests]#  dmtcp_launch -h acme11 -p 7779 --ib mpirun_rsh
>>> -n 2  acme11 acme12 ./helloWorldMPI
>>>
>>> [59000] NOTE at ssh.cpp:369 in prepareForExec; REASON='New ssh command'
>>>
>>>      newCommand = /home/localsoft/dmtcp/bin/dmtcp_ssh
>>> /home/localsoft/dmtcp/bin/dmtcp_nocheckpoint /usr/bin/ssh -q acme11 cd
>>> /home/slurm/tests;/home/localsoft/dmtcp/bin/dmtcp_launch --coord-host
>>> 172.17.29.173 --coord-port 7779 --ckptdir /home/slurm/tests --infiniband
>>> /home/localsoft/dmtcp/bin/dmtcp_sshd /usr/bin/env  MPISPAWN_MPIRUN_MPD=0
>>> USE_LINEAR_SSH=1 MPISPAWN_MPIRUN_HOST=acme11.ciemat.es
>>> MPISPAWN_MPIRUN_HOSTIP=172.17.29.173 MPIRUN_RSH_LAUNCH=1
>>> MPISPAWN_CHECKIN_PORT=34203 MPISPAWN_MPIRUN_PORT=34203 MPISPAWN_NNODES=2
>>> MPISPAWN_GLOBAL_NPROCS=2 MPISPAWN_MPIRUN_ID=58000 MPISPAWN_ARGC=1
>>> MPDMAN_KVS_TEMPLATE=kvs_481_acme11.ciemat.es_58000 MPISPAWN_LOCAL_NPROCS=1
>>> MPISPAWN_ARGV_0='./helloWorldMPI' MPISPAWN_ARGC=1
>>> MPISPAWN_GENERIC_ENV_COUNT=0  MPISPAWN_ID=0
>>> MPISPAWN_WORKING_DIR=/home/slurm/tests MPISPAWN_MPIRUN_RANK_0=0
>>> /usr/local/bin/mpispawn 0
>>>
>>> [60000] NOTE at ssh.cpp:369 in prepareForExec; REASON='New ssh command'
>>>
>>>      newCommand = /home/localsoft/dmtcp/bin/dmtcp_ssh
>>> /home/localsoft/dmtcp/bin/dmtcp_nocheckpoint /usr/bin/ssh -q acme12 cd
>>> /home/slurm/tests;/home/localsoft/dmtcp/bin/dmtcp_launch --coord-host
>>> 172.17.29.173 --coord-port 7779 --ckptdir /home/slurm/tests --infiniband
>>> /home/localsoft/dmtcp/bin/dmtcp_sshd /usr/bin/env  MPISPAWN_MPIRUN_MPD=0
>>> USE_LINEAR_SSH=1 MPISPAWN_MPIRUN_HOST=acme11.ciemat.es
>>> MPISPAWN_MPIRUN_HOSTIP=172.17.29.173 MPIRUN_RSH_LAUNCH=1
>>> MPISPAWN_CHECKIN_PORT=34203 MPISPAWN_MPIRUN_PORT=34203 MPISPAWN_NNODES=2
>>> MPISPAWN_GLOBAL_NPROCS=2 MPISPAWN_MPIRUN_ID=58000 MPISPAWN_ARGC=1
>>> MPDMAN_KVS_TEMPLATE=kvs_481_acme11.ciemat.es_58000 MPISPAWN_LOCAL_NPROCS=1
>>> MPISPAWN_ARGV_0='./helloWorldMPI' MPISPAWN_ARGC=1
>>> MPISPAWN_GENERIC_ENV_COUNT=0  MPISPAWN_ID=1
>>> MPISPAWN_WORKING_DIR=/home/slurm/tests MPISPAWN_MPIRUN_RANK_0=1
>>> /usr/local/bin/mpispawn 0
>>>
>>> Process 0 of 2 is on acme11.ciemat.es
>>>
>>> Hello world from process 0 of 2
>>>
>>> Goodbye world from process 0 of 2
>>>
>>> COORDINATOR OUTPUT
>>>
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1d64b124afe30f29-4070-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = mpirun_rsh
>>>
>>>      msg.from = 1d64b124afe30f29-58000-56231173
>>>
>>>      client->identity() = 1d64b124afe30f29-4070-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1d64b124afe30f29-58000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1d64b124afe30f29-58000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating
>>> process Information after fork()'
>>>
>>>      client->hostname() = acme11.ciemat.es
>>>
>>>      client->progname() = mpirun_rsh_(forked)
>>>
>>>      msg.from = 1d64b124afe30f29-59000-56231173
>>>
>>>      client->identity() = 1d64b124afe30f29-58000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating
>>> process Information after fork()'
>>>
>>>      client->hostname() = acme11.ciemat.es
>>>
>>>      client->progname() = mpirun_rsh_(forked)
>>>
>>>      msg.from = 1d64b124afe30f29-60000-56231173
>>>
>>>      client->identity() = 1d64b124afe30f29-58000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1d64b124afe30f29-59000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1d64b124afe30f29-60000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating
>>> process Information after fork()'
>>>
>>>      client->hostname() = acme11.ciemat.es
>>>
>>>      client->progname() = dmtcp_ssh_(forked)
>>>
>>>      msg.from = 1d64b124afe30f29-61000-56231173
>>>
>>>      client->identity() = 1d64b124afe30f29-59000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating
>>> process Information after fork()'
>>>
>>>      client->hostname() = acme11.ciemat.es
>>>
>>>      client->progname() = dmtcp_ssh_(forked)
>>>
>>>      msg.from = 1d64b124afe30f29-62000-56231173
>>>
>>>      client->identity() = 1d64b124afe30f29-60000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1d64b124afe30f29-61000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1d64b124afe30f29-62000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = dmtcp_ssh
>>>
>>>      msg.from = 1d64b124afe30f29-59000-56231173
>>>
>>>      client->identity() = 1d64b124afe30f29-59000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = dmtcp_ssh
>>>
>>>      msg.from = 1d64b124afe30f29-60000-56231173
>>>
>>>      client->identity() = 1d64b124afe30f29-60000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1b69d09fb3238b30-24001-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1d64b124afe30f29-4094-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = dmtcp_sshd
>>>
>>>      msg.from = 1d64b124afe30f29-64000-56231173
>>>
>>>      client->identity() = 1d64b124afe30f29-4094-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = dmtcp_sshd
>>>
>>>      msg.from = 1b69d09fb3238b30-63000-56231173
>>>
>>>      client->identity() = 1b69d09fb3238b30-24001-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1d64b124afe30f29-64000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1b69d09fb3238b30-63000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating
>>> process Information after fork()'
>>>
>>>      client->hostname() = acme11.ciemat.es
>>>
>>>      client->progname() = dmtcp_sshd_(forked)
>>>
>>>      msg.from = 1d64b124afe30f29-65000-56231173
>>>
>>>      client->identity() = 1d64b124afe30f29-64000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating
>>> process Information after fork()'
>>>
>>>      client->hostname() = acme12.ciemat.es
>>>
>>>      client->progname() = dmtcp_sshd_(forked)
>>>
>>>      msg.from = 1b69d09fb3238b30-66000-56231173
>>>
>>>      client->identity() = 1b69d09fb3238b30-63000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = env
>>>
>>>      msg.from = 1d64b124afe30f29-65000-56231173
>>>
>>>      client->identity() = 1d64b124afe30f29-65000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = mpispawn
>>>
>>>      msg.from = 1d64b124afe30f29-65000-56231173
>>>
>>>      client->identity() = 1d64b124afe30f29-65000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1b69d09fb3238b30-66000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker
>>> connected'
>>>
>>>      hello_remote.from = 1d64b124afe30f29-65000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating
>>> process Information after fork()'
>>>
>>>      client->hostname() = acme11.ciemat.es
>>>
>>>      client->progname() = mpispawn_(forked)
>>>
>>>      msg.from = 1d64b124afe30f29-68000-56231173
>>>
>>>      client->identity() = 1d64b124afe30f29-65000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating
>>> process Information after fork()'
>>>
>>>      client->hostname() = acme12.ciemat.es
>>>
>>>      client->progname() = mpispawn_(forked)
>>>
>>>      msg.from = 1b69d09fb3238b30-67000-56231173
>>>
>>>      client->identity() = 1b69d09fb3238b30-66000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = env
>>>
>>>      msg.from = 1b69d09fb3238b30-66000-56231173
>>>
>>>      client->identity() = 1b69d09fb3238b30-66000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = mpispawn
>>>
>>>      msg.from = 1b69d09fb3238b30-66000-56231173
>>>
>>>      client->identity() = 1b69d09fb3238b30-66000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = helloWorldMPI
>>>
>>>      msg.from = 1d64b124afe30f29-68000-56231173
>>>
>>>      client->identity() = 1d64b124afe30f29-68000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating
>>> process Information after exec()'
>>>
>>>      progname = helloWorldMPI
>>>
>>>      msg.from = 1b69d09fb3238b30-67000-56231173
>>>
>>>      client->identity() = 1b69d09fb3238b30-67000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1d64b124afe30f29-68000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1b69d09fb3238b30-67000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1d64b124afe30f29-65000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1b69d09fb3238b30-66000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1d64b124afe30f29-64000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1b69d09fb3238b30-63000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1d64b124afe30f29-59000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1d64b124afe30f29-60000-56231173
>>>
>>> [3984] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client
>>> disconnected'
>>>
>>>      client->identity() = 1d64b124afe30f29-58000-56231173
>>>
>>>
>>> ----
>>>
>>> ----
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Dr. Manuel Rodríguez-Pascual
>>> skype: manuel.rodriguez.pascual
>>> phone: (+34) 913466173 // (+34) 679925108
>>>
>>> CIEMAT-Moncloa
>>> Edificio 22, desp. 1.25
>>> Avenida Complutense, 40
>>> 28040- MADRID
>>> SPAIN
>>>
>>>
>>> ------------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> Dmtcp-forum mailing list
>>> Dmtcp-forum@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
>>>
>>>
>>
>
