On Mon, Jan 4, 2010 at 7:11 AM, Nathan Gray <[email protected]> wrote:
> On Wed, Dec 30, 2009 at 04:55:37PM +0100, Petr Rockai wrote:
> > Nathan Gray <[email protected]> writes:
> > > In another terminal I investigated a little bit. It looks like
> > > 'darcs unpull' is hanging:
> > >
> > > $ date
> > > Tue Dec 29 16:29:33 EST 2009
> > >
> > > $ ps ax|grep darcs
> > > 5311 pts/0 S+ 0:17 darcs-benchmark /home/kolibrie/bin/darcs-2.3.1 /home/kolibrie/bin/darcs
> > > 6136 pts/0 Sl+ 0:21 /home/kolibrie/bin/darcs-2.3.1 unpull --last 1000 --all +RTS -sdarcs-stats -RTS
> > >
> > > $ uptime
> > > 16:29:31 up 1:05, 3 users, load average: 0.10, 0.32, 0.47
> >
> > I have seen this happen (actually, it happens on buildbot as well, and
> > it doesn't happen on my laptop...), too. The question is what's going
> > wrong... On the buildbot, I haven't noticed any hanging darcs
> > processes. I am starting to suspect, that darcs is waiting for some user
> > input in that case. It could be the "this will make unrevert impossible"
> > one, but I don't see anything in the benchmark code that could produce
> > one and there is none in the tarballs. Also, it should be reproducible
> > that way.
> >
> > > Any ideas what I should do, besides kill it?
> >
> > If you could find out what that hanging darcs process is doing (strace
> > -p is your friend), that'd be great. Also, running that command manually
> > could help to find out: go to _playground/repo and run darcs-2.3.1
> > unpull --last 1000 --all.
>
> Strace repeats the same output forever for the 'unpull' process:
>
> $ strace -p 6136
> Process 6136 attached - interrupt to quit
> futex(0x91bcbc0, FUTEX_WAIT_PRIVATE, 7, NULL) = ? ERESTARTSYS (To be restarted)
> --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
> sigreturn() = ? (mask now [INT])
> futex(0x91bcbc0, FUTEX_WAIT_PRIVATE, 7, NULL) = ? ERESTARTSYS (To be restarted)
> --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
> sigreturn() = ? (mask now [INT])
> ...
>
> Strace on the 'darcs-benchmark' process has much less output, but
> appears to repeat as well:
>
> $ strace -p 5311
> Process 5311 attached - interrupt to quit
> select(10, [9], [], NULL, {9, 508000}) = 0 (Timeout)
> gettimeofday({1262616419, 519725}, NULL) = 0
> select(10, [9], [], NULL, {134, 217727}) = 0 (Timeout)
> gettimeofday({1262616553, 739716}, NULL) = 0
> select(10, [9], [], NULL, {134, 217727}...
>

My vague understanding of the GHC runtime system is that it uses select to
poll for IO. This allows the RTS to have very cheap threads for Haskell
code. I expect it should be fine if the RTS keeps polling for IO. So I
think the above "forever"-ness is ignorable.

> Any guidance what I should do?
>

I'm not sure what to do about it either. Hopefully someone else will chime
in.

Jason
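
For what it's worth, here is a rough sketch of one way to check the waiting-for-user-input theory from earlier in the thread. It assumes GNU coreutils `timeout` is installed; the function name `run_noninteractive` is made up for illustration and is not part of darcs or darcs-benchmark. With stdin redirected from /dev/null, a command that tries to read an answer to a prompt sees end-of-file immediately instead of blocking, so if it still runs into the time limit it is probably stuck on something other than input.

```shell
#!/bin/sh
# Hypothetical helper (not part of darcs): run a command with stdin
# attached to /dev/null and a time limit in seconds.
# Assumes GNU coreutils `timeout`, which exits with status 124 when
# the limit is reached.
run_noninteractive() {
    limit=$1
    shift
    status=0
    # Any read from stdin returns EOF at once, so a prompt cannot block.
    timeout "$limit" "$@" </dev/null || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*"
    else
        echo "exited with status $status: $*"
    fi
    return "$status"
}
```

From _playground/repo this could be tried as `run_noninteractive 600 darcs-2.3.1 unpull --last 1000 --all`, where the 600-second limit is an arbitrary choice.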
_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users
