Hi Rohan,

Any time Friday afternoon is OK too. Please suggest a time so I can set up a WebEx meeting.

Thanks.
David Davies
ASIC Verification Project Manager, PLX Technology, Inc.
408-962-3474

-----Original Message-----
From: David Davies
Sent: Wednesday, September 18, 2013 4:22 PM
To: 'Rohan'
Cc: dmtcp-forum@lists.sourceforge.net; Kapil Arya
Subject: RE: [Dmtcp-forum] DMTCP version 1.2.8 issue

Thanks. Anytime tomorrow is fine.

David Davies
ASIC Verification Project Manager, PLX Technology, Inc.
408-962-3474

-----Original Message-----
From: Rohan [mailto:rohg...@ccs.neu.edu]
Sent: Wednesday, September 18, 2013 3:04 PM
To: David Davies
Cc: dmtcp-forum@lists.sourceforge.net; Kapil Arya
Subject: Re: [Dmtcp-forum] DMTCP version 1.2.8 issue

Hi David,

I have been trying to reproduce the "gettime" error on my system here
but I haven't encountered that, with the trunk or with dmtcp-1.2.8.

The "qemu_cond_wait" error is a known race condition. It has been a
particularly hard one to keep track of. If it is occurring frequently
for you we should look at it. I think a webex session would be more
efficient. We can have a session tomorrow before afternoon, or Friday
late in the afternoon.

Thanks,
Rohan

On Wed, Sep 18, 2013 at 07:10:08PM +0000, David Davies wrote:
> Hi Rohan,
>
> I tried with svn co svn://svn.code.sf.net/p/dmtcp/code/trunk dmtcp-trunk and
> got different types of errors with different checkpoints. Here are a few of them.
> If you think it would help, I can host a webex and we can see it
> interactively?
>
> Synopsys VCS:
> [95000] ERROR at fileconnection.cpp:693 in refill;
> REASON='JASSERT(jalib::Filesystem::FileExists(_path)) failed'
>     _path = /proc/self/exe
> Message: File not found.
> dpi_sim_tb (95000): Terminating...
>
> QEMU:
> [96000] WARNING at jsocket.cpp:289 in readAll; REASON='JWARNING(cnt>=0)
> failed'
>     sockfd() = 12
>     cnt = -1
>     len = 128
>     (strerror((*__errno_location ()))) = Connection reset by peer
> Message: JSocket read failure
> [96000] ERROR at connectionidentifier.h:96 in assertValid;
> REASON='JASSERT(strcmp(sign, HANDSHAKE_SIGNATURE_MSG) == 0) failed'
>     sign =
> Message: read invalid message, signature mismatch. (External socket?)
> qemu-system-i386 (96000): Terminating...
>
>
> David Davies
> ASIC Verification Project Manager, PLX Technology, Inc.
> 408-962-3474
>
>
> -----Original Message-----
> From: David Davies
> Sent: Tuesday, September 17, 2013 4:37 PM
> To: 'Rohan Garg'
> Cc: dmtcp-forum@lists.sourceforge.net; Kapil Arya
> Subject: RE: [Dmtcp-forum] DMTCP version 1.2.8 issue
>
> Hi Rohan,
>
> Thanks for the assistance and sorry for the delayed response.
> No, it is not the exact same QEMU error every time with different checkpoint
> images. For example, I also see this type: "qemu: qemu_cond_wait: Operation
> not permitted".
> I'll try the version Gene suggested at svn co
> svn://svn.code.sf.net/p/dmtcp/code/trunk dmtcp-trunk and let you know.
>
> [root@demeter qemu-1.2.0]# /opt/dmtcp/dmtcp-1.2.8/bin/dmtcp_restart -j
> ckpt_qemu-system-i386_1d4a8584596cf84-17103-5238e56e.dmtcp
> dmtcp_checkpoint (DMTCP + MTCP) 1.2.8
> Copyright (C) 2006-2011 Jason Ansel, Michael Rieker, Kapil Arya, and
> Gene Cooperman
> This program comes with ABSOLUTELY NO WARRANTY.
> This is free software, and you are welcome to redistribute it under certain
> conditions; see COPYING file for details.
> (Use flag "-q" to hide this message.)
>
> [17103] mtcp_restart_nolibc.c:160 mtcp_restoreverything:
> error: new/current break (0x60C000) != saved break (0x7FFB36621000)
> qemu: qemu_cond_wait: Operation not permitted
> Abort
> [root@demeter qemu-1.2.0]#
>
>
> David Davies
> ASIC Verification Project Manager, PLX Technology, Inc.
> 408-962-3474
>
>
> -----Original Message-----
> From: Rohan Garg [mailto:rohg...@ccs.neu.edu]
> Sent: Thursday, September 12, 2013 11:31 PM
> To: David Davies
> Cc: dmtcp-forum@lists.sourceforge.net; Kapil Arya
> Subject: Re: [Dmtcp-forum] DMTCP version 1.2.8 issue
>
> The exit that you are seeing now is caused by QEMU. It could be that DMTCP is
> not restoring the state of a timer. I'm trying to reproduce the issue here.
>
> Do you see the same issue every time, that is, with different checkpoint
> images?
>
> ----- Original Message -----
> From: "David Davies" <ddav...@plxtech.com>
> To: "Rohan" <rohg...@ccs.neu.edu>
> Cc: dmtcp-forum@lists.sourceforge.net, "Kapil Arya" <ka...@ccs.neu.edu>
> Sent: Thursday, September 12, 2013 4:09:09 PM GMT -05:00 US/Canada Eastern
> Subject: RE: [Dmtcp-forum] DMTCP version 1.2.8 issue
>
> Hi Rohan,
>
> Thanks for the quick response. After the change and re-compile, it hits this
> issue when restarting.
>
> [root@demeter qemu-1.2.0]# /opt/dmtcp/dmtcp-1.2.8/bin/dmtcp_restart
> ckpt_qemu-system-i386_1d4a8584596cf84-25076-5232172e.dmtcp
> dmtcp_checkpoint (DMTCP + MTCP) 1.2.8
> Copyright (C) 2006-2011 Jason Ansel, Michael Rieker, Kapil Arya, and
> Gene Cooperman
> This program comes with ABSOLUTELY NO WARRANTY.
> This is free software, and you are welcome to redistribute it under certain
> conditions; see COPYING file for details.
> (Use flag "-q" to hide this message.)
>
> dmtcp_coordinator starting...
>     Port: 7779
>     Checkpoint Interval: disabled (checkpoint manually instead)
>     Exit on last client: 1
> Backgrounding...
> [25076] mtcp_restart_nolibc.c:160 mtcp_restoreverything:
> error: new/current break (0x60C000) != saved break (0x7FE932274000)
> gettime: Invalid argument
> Internal timer error: aborting
> [root@demeter qemu-1.2.0]#
>
> David Davies
> ASIC Verification Project Manager, PLX Technology, Inc.
> 408-962-3474
>
>
> -----Original Message-----
> From: Rohan [mailto:rohg...@ccs.neu.edu]
> Sent: Thursday, September 12, 2013 11:37 AM
> To: David Davies
> Cc: dmtcp-forum@lists.sourceforge.net; Rohan Garg; Kapil Arya
> Subject: Re: [Dmtcp-forum] DMTCP version 1.2.8 issue
>
> Hi David,
>
> Could you please comment out the following line in
> $DMTCP_SRC_DIR/mtcp/mtcp_restart_nolibc.c, re-compile, and test:
>
> 157   else {
> 158     if (new_brk == current_brk)
> 159       MTCP_PRINTF("error: new/current break (%p) != saved break (%p)\n",
> 160                   current_brk, mtcp_saved_break);
> 161     else
> 162       MTCP_PRINTF("error: new break (%p) != current break (%p)\n",
> 163                   new_brk, current_brk);
> 164     // mtcp_abort ();   /* COMMENT THIS LINE */
> 165   }
>
> This should fix the problem without affecting other functionality.
>
> Thanks,
> Rohan
>
> On Thu, Sep 12, 2013 at 11:06:38AM -0400, Kapil Arya wrote:
> > Hi Rohan,
> >
> > Can you take a quick look and see what is going on?
> >
> > thanks,
> > Kapil
> >
> >
> > On Thu, Sep 12, 2013 at 10:51 AM, David Davies <ddav...@plxtech.com> wrote:
> >
> > > Hi,
> > >
> > > I’m trying to checkpoint and restart a Synopsys VCS simulation
> > > that runs concurrently with a Virtual Machine (QEMU) that
> > > communicate together over TCP sockets.
> > >
> > > I can successfully checkpoint and restart the Synopsys VCS
> > > simulation alone, but not the QEMU.
> > > The restart of QEMU gives the following:
> > >
> > > [root@demeter qemu-1.2.0]#
> > > /opt/dmtcp/dmtcp-1.2.8/bin/dmtcp_restart
> > > ckpt_qemu-system-i386_1d4a8584596cf84-15246-5231ce7f.dmtcp
> > >
> > > dmtcp_checkpoint (DMTCP + MTCP) 1.2.8
> > > Copyright (C) 2006-2011 Jason Ansel, Michael Rieker, Kapil Arya, and
> > > Gene Cooperman
> > > This program comes with ABSOLUTELY NO WARRANTY.
> > > This is free software, and you are welcome to redistribute it
> > > under certain conditions; see COPYING file for details.
> > > (Use flag "-q" to hide this message.)
> > >
> > > [15246] mtcp_restart_nolibc.c:160 mtcp_restoreverything:
> > > error: new/current break (0x60C000) != saved break (0x7F5F3751D000)
> > > Segmentation fault
> > > [root@demeter qemu-1.2.0]#
> > >
> > > Machine details:
> > > ---------------------
> > > [root@demeter qemu-1.2.0]# uname -a
> > > Linux demeter 2.6.32-220.el6.x86_64 #1 SMP Tue Dec 6 19:48:22 GMT 2011 x86_64 x86_64 x86_64 GNU/Linux
> > >
> > > [root@demeter qemu-1.2.0]# cat /etc/redhat-release
> > > CentOS release 6.2 (Final)
> > >
> > > Any advice would be greatly appreciated.
> > >
> > > David Davies
> > > ASIC Verification Project Manager, PLX Technology, Inc.
> > > 408-962-3474
> > >
> > > _______________________________________________
> > > Dmtcp-forum mailing list
> > > Dmtcp-forum@lists.sourceforge.net
> > > https://lists.sourceforge.net/lists/listinfo/dmtcp-forum