Hi Edson,

The error is expected. DMTCP treats a computation as a whole: every process involved in the computation must run under DMTCP. Technically, this is because DMTCP must handle the network communication. At checkpoint time, DMTCP needs to drain the data in the sockets so that no in-flight data is lost. In your case, the other side of the socket is not under the control of DMTCP, so the drain cannot complete, which is what the warning reports.
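To make the draining step more concrete, here is a toy sketch in Python. It is not DMTCP's code. It assumes a simple scheme in which each side of a connection sends a marker at checkpoint time and reads until the peer's marker arrives. The names `MARKER`, `drain`, and `demo` are made up for this illustration, and the timeout value is arbitrary.

```python
import socket

# Hypothetical marker for this sketch only; not the token DMTCP uses.
MARKER = b"<<DRAIN-MARKER>>"


def drain(sock, timeout_sec=0.2):
    """Read from sock until the peer's marker arrives.

    Returns (data, drained_ok). drained_ok is False if the peer never
    sent the marker (timeout) or closed the connection.
    """
    sock.settimeout(timeout_sec)
    buf = b""
    try:
        while not buf.endswith(MARKER):
            chunk = sock.recv(4096)
            if not chunk:  # peer closed the connection
                return buf, False
            buf += chunk
    except socket.timeout:
        # Analogous to a "still draining" situation: the marker never came.
        return buf, False
    return buf[:-len(MARKER)], True


def demo(peer_cooperates):
    """Simulate one connection; the peer may or may not take part in the drain."""
    a, b = socket.socketpair()
    try:
        b.sendall(b"in-flight payload")
        if peer_cooperates:
            b.sendall(MARKER)  # a cooperating peer flushes, then marks the end
        return drain(a)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo(True))
    print(demo(False))
```

With a cooperating peer, the drain finishes and the in-flight data is captured. With a peer that knows nothing about the marker, the reader can only time out, because it has no way to tell whether more data is still on the way.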
Also, if possible, could you tell us what kind of application you are running? I haven't tested DMTCP on MPI applications that communicate with the external world. This could be a good test case for us.

Best,
Jiajun

On Mon, Oct 26, 2015 at 6:46 AM, Edson Tavares de Camargo <etcamarg...@gmail.com> wrote:

> Hi Everyone!
>
> I have a question: What is the expected behaviour of DMTCP when I use
> DMTCP on a MPI application that exchanges messages with another application
> that is not running on dmtcp_launch?
>
> I ask because I have an error when I execute a MPI application that
> exchanges message via TCP with another application. Both application are
> running on my cluster. But I only need to make the checkpoint the MPI
> application. The error is the following:
>
> ========
> WARNING at kernelbufferdrainer.cpp:120 in onTimeoutInterval;
> REASON='JWARNING(false) failed'
> _dataSockets[i]->socket().sockfd() = 15
> buffer.size() = 1059
> WARN_INTERVAL_SEC = 10
> Message: Still draining socket... perhaps remote host is not running under
> DMTCP?
> =======
>
> Thanks!
>
> Edson
> -------
>
> _______________________________________________
> Dmtcp-forum mailing list
> Dmtcp-forum@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dmtcp-forum