Hi Rick,

Could you please try the attached patch? You'd have to set the pause
environment variable, DMTCP_RESTART_PAUSE, prior to launch.

  $ DMTCP_RESTART_PAUSE=1 dmtcp_launch a.out

-Rohan

On Wed, Aug 09, 2017 at 11:48:48AM -0400, dmtcp-fo...@gusbus.org wrote:
> 
> Hi again Jiajun,
> 
> I've gone ahead and downloaded and tested the latest 2.4 release of
> DMTCP (version 2.4.8) and it appears that this --no-coordinator bug
> has not yet been fixed.  I am getting a return value of zero from
> the dmtcp_checkpoint() function after returning from a
> checkpoint (instead of the expected DMTCP_AFTER_CHECKPOINT).
> 
> For this reason, DMTCP version 2.4 is not an option for us as we
> require our program to be run without any coordinator.
> 
> If anyone could respond with some information on getting DMTCP
> version 2.3.1 to pause correctly upon restart using the
> environment variable MTCP_RESTART_PAUSE mentioned in the
> documentation, it would be greatly appreciated.  As stated at
> the start of this thread, I've not been able to get my restarted
> processes to pause using this environment variable prior to the
> dmtcp_launch command as stated in the documentation.
> 
> Thank you,
> Rick
> 
> 
> -----------------------------------------------------------------------
> 
> On Tue, 8 Aug 2017, Jiajun Cao wrote:
> 
> > Hi Rick,
> > 
> > The support for allowing gdb attach on restart was not added until
> > the 2.4 release.
> > 
> > Is there any possibility that you could upgrade the installation to a
> > newer version? Note you don't need to have root privilege to do
> > that. If you want to test it locally, just compile the source code,
> > and add the bin path to your $PATH env var.
> > 
> > Best,
> > Jiajun
> > 
> > On Tue, Aug 08, 2017 at 03:53:02PM -0400, dmtcp-fo...@gusbus.org wrote:
> > > 
> > > Hi there,
> > > 
> > > I'm trying to understand the documented options for debugging a
> > > restarted DMTCP process.
> > > 
> > > I've been testing a few different environment variables trying to
> > > get the dmtcp_restart process to pause for 15 seconds like it is
> > > supposed to, so that I can attach to it with gdb.
> > > 
> > > There are various options in the documentation, environment variables
> > > like:
> > > 
> > > DMTCP_RESTART_PAUSE
> > > DMTCP_RESTART_PAUSE2
> > > DMTCP_GDB_ATTACH_ON_RESTART
> > > 
> > > I am currently locked into using DMTCP version 2.3.1 at the present
> > > time.  My program was compiled using gcc 6.4.0.  I am running under
> > > CentOS release 6.8 Final, kernel 2.6.32-642.4.2.el6.x86_64 #1 SMP.
> > > 
> > > 
> > > I have noted that the documentation states that prior to version 2.4
> > > of DMTCP, the environment variable to use is named MTCP_RESTART_PAUSE.
> > > 
> > > I have tried setting MTCP_RESTART_PAUSE prior to the dmtcp_launch
> > > command, and prior to both dmtcp_launch and dmtcp_restart commands
> > > with no luck.  I am able to use gdb to attach to the process (more
> > > easily from a separate terminal because of the standard output being
> > > produced from my process in the dmtcp_restart process terminal) but
> > > since there is no pause, gdb doesn't attach until later in the process
> > > after the restart, all based on how quickly I can get the gdb attach
> > > command typed in and entered.  Not really a reliable way to go :)
> > > 
> > > In any event, if there are known issues with getting the restart
> > > process to pause in DMTCP version 2.3.1, then that would explain
> > > it.  If, however, I am doing something wrong, then any help would
> > > be appreciated.
> > > 
> > > My normal series of run commands goes like this (just like the DMTCP
> > > documentation):
> > > 
> > > MTCP_RESTART_PAUSE=1 dmtcp_launch --disable-alloc-plugin --no-coordinator --port 0 program.exe < input.dat
> > > 
> > > dmtcp_restart --port 0 --port-file dmtcpportfile checkpoint.dmtcp < input.int &
> > > 
> > > Do you think any of the command line options are causing any issues, like
> > > my lack of using the coordinator?  Also, my program always reads a file
> > > from standard input, even during a restart.
> > > 
> > > Thanks,
> > > Rick
> > > 
> > > 
> > > 
> > > ------------------------------------------------------------------------------
> > > Check out the vibrant tech community on one of the world's most
> > > engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> > > _______________________________________________
> > > Dmtcp-forum mailing list
> > > Dmtcp-forum@lists.sourceforge.net
> > > https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
> > 
> > 
> 
> 
diff --git a/src/threadlist.cpp b/src/threadlist.cpp
index b312bac..c84b3b5 100644
--- a/src/threadlist.cpp
+++ b/src/threadlist.cpp
@@ -614,6 +614,20 @@ void ThreadList::postRestart(void)
 
   restoreInProgress = 1;
 
+  /* If DMTCP_RESTART_PAUSE is set, sleep 15 seconds to allow gdb attach.*/
+  char *pause_param = getenv("DMTCP_RESTART_PAUSE");
+  if (pause_param != NULL) {
+#ifdef HAS_PR_SET_PTRACER
+    prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, 0, 0, 0); // For: gdb attach
+#endif // ifdef HAS_PR_SET_PTRACER
+    struct timespec delay = { 15, 0 }; /* 15 seconds */
+    JNOTE("Pausing 15 seconds. Do:  gdb <PROGNAME> <PID>")(dmtcp_virtual_to_real_pid(getpid()));
+    nanosleep(&delay, NULL);
+#ifdef HAS_PR_SET_PTRACER
+    prctl(PR_SET_PTRACER, 0, 0, 0, 0); // Revert permission to default.
+#endif // ifdef HAS_PR_SET_PTRACER
+  }
+
   sigfillset(&tmp);
   for (thread = activeThreads; thread != NULL; thread = thread->next) {
     struct MtcpRestartThreadArg mtcpRestartThreadArg;
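
For anyone who wants to try the pause-for-gdb technique outside of DMTCP,
here is a minimal standalone sketch of what the patch above does: check an
environment variable, relax the ptrace restriction where the kernel headers
provide PR_SET_PTRACER, print the PID, and sleep.  The function name
pause_for_debugger and the variable name DEMO_RESTART_PAUSE are hypothetical
and are not part of DMTCP.  It prints with fprintf instead of JNOTE and uses
the plain getpid(), since there is no virtual-to-real PID translation outside
DMTCP.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <unistd.h>
#ifdef __linux__
#include <sys/prctl.h>
#endif

// Hypothetical helper (not a DMTCP function): if the environment variable
// `env_name` is set, print the PID and sleep for `seconds` so that a
// debugger can be attached. Returns true if a pause was performed.
bool pause_for_debugger(const char *env_name, long seconds) {
  assert(env_name != nullptr && seconds >= 0);
  if (std::getenv(env_name) == nullptr) {
    return false;  // Variable unset: continue without pausing.
  }
#ifdef PR_SET_PTRACER
  // Under Yama ptrace restrictions, allow any process to attach for now.
  // The return value is ignored: the call fails harmlessly without Yama.
  prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, 0UL, 0UL, 0UL);
#endif
  std::fprintf(stderr, "Pausing %ld seconds. Do:  gdb <PROGNAME> %d\n",
               seconds, static_cast<int>(getpid()));
  struct timespec delay = { static_cast<time_t>(seconds), 0 };
  struct timespec remaining = { 0, 0 };
  // Resume the sleep if a signal interrupts it.
  while (nanosleep(&delay, &remaining) == -1 && errno == EINTR) {
    delay = remaining;
  }
#ifdef PR_SET_PTRACER
  prctl(PR_SET_PTRACER, 0UL, 0UL, 0UL, 0UL);  // Revert to the default.
#endif
  return true;
}
```

To use it, call pause_for_debugger("DEMO_RESTART_PAUSE", 15) early in
main(), start the program with DEMO_RESTART_PAUSE=1 set, and run
"gdb <PROGNAME> <PID>" from a second terminal while it is sleeping.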