Hi,
I'm trying to use DMTCP with an Open MPI (1.6.5) application (BT from the
NAS benchmarks).
When I request a checkpoint from the coordinator by pressing "c", the
running application terminates, printing this error message:
[40000] ERROR at connectionidentifier.h:96 in assertValid;
REASON='JASSERT(strcmp(sign, HANDSHAKE_SIGNATURE_MSG) == 0) failed'
sign =
Message: read invalid message, signature mismatch. (External socket?)
orterun (40000): Terminating...
mcastrol@chubut:~/disconfs/software/NPB3.3.1/NPB3.3-MPI/bin$
[48000] ERROR at connectionidentifier.h:96 in assertValid;
REASON='JASSERT(strcmp(sign, HANDSHAKE_SIGNATURE_MSG) == 0) failed'
sign =
Message: read invalid message, signature mismatch. (External socket?)
bt.A.4 (48000): Terminating...
[49000] ERROR at connectionidentifier.h:96 in assertValid;
REASON='JASSERT(strcmp(sign, HANDSHAKE_SIGNATURE_MSG) == 0) failed'
sign =
Message: read invalid message, signature mismatch. (External socket?)
bt.A.4 (49000): Terminating...
I'm using two identical nodes. They have the same user, and the SSH public
keys (id_dsa.pub) have been exchanged between them. The OS is Ubuntu 12.04
with kernel 3.13.0-46.
I'd appreciate any clue that helps solve this issue.
Thank you very much in advance.
Marcela