Hello Rohan,

Thank you for the reply. You were right: our Java program was indeed running a command (which created a process) while the checkpoint took place. I was able to get it working by moving the external process into a bash script and calling that script from Java instead.

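For readers who want to try the same workaround, here is a minimal sketch of calling a bash script from Java with the standard ProcessBuilder API. It is not the exact code from our program: the class name ScriptRunner, the helper run(), and the script name external_step.sh are hypothetical and used only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class ScriptRunner {

    /** Runs the command, waits for it to finish, and returns its combined stdout/stderr. */
    static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout so one reader drains both
        Process process = pb.start();

        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("command exited with status " + exitCode);
        }
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        // "external_step.sh" is a hypothetical script name; pass a real path as the first argument.
        String script = args.length > 0 ? args[0] : "external_step.sh";
        System.out.print(run(Arrays.asList("bash", script)));
    }
}
```

The helper blocks until the script has exited, so the child process is short-lived from the Java program's point of view. Whether that is enough to avoid the checkpointing problem in other setups is an assumption, not something verified here.
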
I have one further question. When I use DMTCP to checkpoint a simple JDBC application (example: http://www.vogella.com/tutorials/MySQLJava/article.html#javaconnection) after it has opened the database connection, DMTCP completes the checkpoint, but the program then fails with the following error:

$ dmtcp_launch java MyProgram
dmtcp_coordinator starting...
    Host: PC (127.0.1.1)
    Port: 7779
    Checkpoint Interval: disabled (checkpoint manually instead)
    Exit on last client: 1
Backgrounding...
/* Program Output */
/* Program Output */
/* Checkpoint command issued, checkpoint complete */
Exception in thread "main" java.sql.SQLException: Could not retrieve transation read-only status server
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1094)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:997)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:983)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:928)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:959)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:949)
    at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3967)
    at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3938)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2295)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2262)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2246)
    at MySQLAccess.readDataBase(Java2MySql.java:66)
    at MySQLAccess.main(Java2MySql.java:133)
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 3,026 milliseconds ago. The last packet sent successfully to the server was 3,027 milliseconds ago.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
    at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1137)
    at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3965)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2578)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2758)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2820)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2769)
    at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1569)
    at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3961)
    ... 6 more
Caused by: java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3946)
    ... 12 more

When I restart the program using dmtcp_restart_script.sh, the same error as above appears. Please note that the program does work if executed normally. I believe that somehow, checkpointing is changing the state(?) of the database connection. Is there a way to get this working?

Thank you,
Pratyush

On Tue, Apr 26, 2016 at 7:00 PM, Rohan Garg <rohg...@ccs.neu.edu> wrote:
> Hi Pratyush,
>
> Checkpointing of files and Java is supported out of the box. There are
> various runtime options (plugins) that you can use to modify the default
> behavior according to your requirements.
> I'm unable to reproduce the issue
> that you have reported with your example locally. I'm using the latest DMTCP
> source from Github. Here's what I did:
>
>   $ javac AppMain.java
>   $ dmtcp_launch java AppMain
>   # Checkpoint and kill the application
>   $ dmtcp_restart ckpt_java_*.dmtcp
>
> The error messages you are getting have nothing to do with checkpointing
> of open files. It seems like your application has some connections to
> other processes that are not running under DMTCP. Could you please verify
> if that's the case? Also, could you please try running with the latest
> release?
>
> Thanks,
> Rohan
>
> On Tue, Apr 26, 2016 at 11:57:30AM +0530, Pratyush Patel wrote:
>> Hello,
>>
>> I am using DMTCP to try and checkpoint a simple Java program, source
>> of which can be found at http://pastie.org/10801783.
>>
>> In the program input.txt is a large file which contains several lines
>> which I am trying to print. I am checkpointing the program by sending
>> the signal to checkpoint externally through the dmtcp_coordinator.
>>
>> Although, I expected the checkpointing process to work with the open
>> file descriptor, it appears that dmtcp is unable to checkpoint the
>> program properly, and results in some error messages like:
>>
>> [43000] NOTE at timerlist.cpp:107 in removeStaleClockIds;
>> REASON='Removing stale clock'
>> staleClockIds[i] = -100842
>> [40000] WARNING at kernelbufferdrainer.cpp:125 in onTimeoutInterval;
>> REASON='JWARNING(false) failed'
>> _dataSockets[i]->socket().sockfd() = 11
>> buffer.size() = 140
>> WARN_INTERVAL_SEC = 10
>> Message: Still draining socket... perhaps remote host is not running
>> under DMTCP?
>> [40000] WARNING at kernelbufferdrainer.cpp:125 in onTimeoutInterval;
>> REASON='JWARNING(false) failed'
>> _dataSockets[i]->socket().sockfd() = 11
>> buffer.size() = 140
>> WARN_INTERVAL_SEC = 10
>> Message: Still draining socket... perhaps remote host is not running
>> under DMTCP?
>>
>> In case it matters, I am using Ubuntu 15.10 with 4.3.0-040300-generic kernel.
>> I am also using the latest dmtcp source code available in Ubuntu repository.
>>
>> Could anyone please let me know why this happens and whether there is
>> a way to get it working?
>>
>> Thanks,
>> Pratyush Patel
>>
>> ------------------------------------------------------------------------------
>> Find and fix application performance issues faster with Applications Manager
>> Applications Manager provides deep performance insights into multiple tiers of
>> your business applications. It resolves application problems quickly and
>> reduces your MTTR. Get your free trial!
>> https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
>> _______________________________________________
>> Dmtcp-forum mailing list
>> Dmtcp-forum@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
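
A possible workaround for the JDBC failure described at the top of this message, sketched under assumptions rather than confirmed against DMTCP: if the MySQL server is not running under DMTCP, the socket behind the JDBC Connection may no longer be usable after a checkpoint or restart. One option is to validate the connection with the standard JDBC 4.0 method Connection.isValid(int) before each use and reopen it when validation fails. The names ReconnectSketch, ConnectionFactory, and fakeConnection below are hypothetical, and the fake connection exists only so the sketch runs without a database.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;

public class ReconnectSketch {

    /** Hypothetical factory; a real program would call DriverManager.getConnection(...) here. */
    interface ConnectionFactory {
        Connection open() throws SQLException;
    }

    private final ConnectionFactory factory;
    private Connection current;
    private int opens = 0;

    ReconnectSketch(ConnectionFactory factory) {
        this.factory = factory;
    }

    /** Returns a connection, reopening it if the previous one no longer validates. */
    Connection get() throws SQLException {
        // isValid(timeoutSeconds) returns false when the driver cannot confirm the link.
        if (current == null || !current.isValid(2)) {
            if (current != null) {
                try {
                    current.close(); // best effort; the socket may already be dead
                } catch (SQLException ignored) {
                    // nothing useful to do if closing a dead connection fails
                }
            }
            current = factory.open();
            opens++;
        }
        return current;
    }

    int timesOpened() {
        return opens;
    }

    /** Stand-in for a real connection (assumption: no database available): isValid() reports alive[0]. */
    static Connection fakeConnection(boolean[] alive) {
        return (Connection) Proxy.newProxyInstance(
                ReconnectSketch.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                (proxy, method, args) ->
                        method.getName().equals("isValid") ? (Object) alive[0] : null);
    }

    public static void main(String[] args) throws SQLException {
        boolean[] alive = {true};
        ReconnectSketch holder = new ReconnectSketch(() -> fakeConnection(alive));
        holder.get();      // first use opens a connection
        alive[0] = false;  // simulate the link breaking around a checkpoint
        holder.get();      // the stale connection is detected and replaced
        System.out.println("connections opened: " + holder.timesOpened());
    }
}
```

In a real program, statements prepared on the old connection would also have to be prepared again on the new one, and any open transaction would be lost, so this only suits code that can safely retry from a fresh connection.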