On Friday, Nov 24th 2006 at 22:33 -0500, quoth Steven W. Orr:

=>On Wednesday, Nov 22nd 2006 at 10:51 -0500, quoth Charlie Brady:
=>
=>=>
=>=>On Wed, 22 Nov 2006, Steven W. Orr wrote:
=>=>
=>=>> =>Whatever is feeding the standard input of that process has not terminated.
=>=>> =>What does "ps fax" tell you?
=>=>> =>
=>=>> =>> Do we need to modify flexbackup to set SIG_IGN for SIGCHLD?
=>=>> =>
=>=>> =>I don't know why you are suggesting that.
=>=>> 
=>=>> Right. It's not a zombie like I said above, but since it's not, you're 
=>=>> correct that the issue of SIG_IGN for SIGCHLD would be a red herring. From
=>=>> the ps output above, it's in a sleep state. Your question about who the 
=>=>> parent is is good. I don't remember because I just killed the process 
=>=>> after I sent this message but I believe (from previous incidents) it is 
=>=>> the child of flexbackup. So the tree should be
=>=>> 
=>=>> cron
=>=>> \_bash
=>=>>      \_flexbackup
=>=>>           \_gzip
=>=>
=>=>No, the tree should never be just that. Something should be feeding gzip, 
=>=>and gzip should be feeding something. Both "somethings" should be children 
=>=>of flexbackup. The exact identity of the "somethings" will depend on your 
=>=>configuration.
=>=> 
=>=>> What I think is happening is that flexbackup is waiting for gzip to 
=>=>> complete before it exits. But gzip doesn't exit because it's waiting for 
=>=>> more input, not knowing that more isn't coming. 
=>=>
=>=>Yes, and you need to determine why no more input is coming, and yet the 
=>=>program providing such input to gzip has not exited.
=>=>
=>=>> Sometimes I can go a month without a hangup, and sometimes it hangs 
=>=>> multiple times per week. Do we need to wait for a recurrence or is this 
=>=>> enough to be able to work with?
=>=>
=>=>It's not enough because you haven't given us the full information. Since 
=>=>you've killed the gzip process, we can't determine what was feeding it 
=>=>input and why it was blocked. If you can show the actual process tree 
=>=>rather than what you think "should" be there, then we can provide more 
=>=>debugging instructions.
=>=>
=>=>Perhaps if you describe your configuration someone can speculate about 
=>=>what process was blocked and why.
=>
=>Ok. I got a new one today and I'm leaving it around so we can figure this 
=>thing out.
=>
=>Here's the cron tree:
=>
=> 3480 ?        Ss     0:01 crond
=>16571 ?        S      0:00  \_ crond
=>16572 ?        Ss     0:00      \_ /usr/bin/perl -w /usr/bin/flexbackup -set backup -incremental
=>16846 ?        Z      0:00      |   \_ [sh] <defunct>
=>16573 ?        S      0:00      \_ /usr/sbin/sendmail -FCronDaemon -i -odi -oem -oi -t
=>
=>And here's the gzip:
=>
=>16860 ?        S      0:01 gzip -9
=>
=>and ps -ef shows
=>
=>root     16860     1  0 03:31 ?        00:00:01 gzip -9
=>root     16571  3480  0 03:31 ?        00:00:00 crond
=>root     16572 16571  0 03:31 ?        00:00:00 /usr/bin/perl -w /usr/bin/flexbackup -set backup -incremental
=>smmsp    16573 16571  0 03:31 ?        00:00:00 /usr/sbin/sendmail -FCronDaemon -i -odi -oem -oi -t
=>root     16846 16572  0 03:31 ?        00:00:00 [sh] <defunct>
=>
=>which shows that gzip is now the child of init, which means that its parent 
=>exited and orphaned it. And 16846 seems not to be getting cleaned up by 
=>flexbackup.
=>
=>Anyone have an idea of what this all means?
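
For anyone who wants to pick these cases out of `ps -ef` output mechanically, here is a minimal Python sketch. It is not part of flexbackup; the function names are made up for illustration, and it assumes the usual eight-column `ps -ef` layout (UID PID PPID C STIME TTY TIME CMD). The sample data is the listing above with the wrapped lines rejoined.

```python
def parse_ps_ef(text):
    """Parse `ps -ef` style lines into dicts; a header line, if present, is skipped."""
    procs = []
    for line in text.strip().splitlines():
        fields = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) < 8 or not fields[1].isdigit():
            continue
        procs.append({"uid": fields[0], "pid": int(fields[1]),
                      "ppid": int(fields[2]), "cmd": fields[7]})
    return procs

def orphans(procs, name):
    """PIDs of processes running `name` whose parent is init (PID 1)."""
    return [p["pid"] for p in procs
            if p["ppid"] == 1 and p["cmd"].split()[0] == name]

def zombies(procs):
    """Map each <defunct> process to the parent that has not reaped it."""
    return {p["pid"]: p["ppid"] for p in procs if "<defunct>" in p["cmd"]}

SAMPLE = """\
root     16860     1  0 03:31 ?        00:00:01 gzip -9
root     16571  3480  0 03:31 ?        00:00:00 crond
root     16572 16571  0 03:31 ?        00:00:00 /usr/bin/perl -w /usr/bin/flexbackup -set backup -incremental
smmsp    16573 16571  0 03:31 ?        00:00:00 /usr/sbin/sendmail -FCronDaemon -i -odi -oem -oi -t
root     16846 16572  0 03:31 ?        00:00:00 [sh] <defunct>
"""

procs = parse_ps_ef(SAMPLE)
print(orphans(procs, "gzip"))  # [16860]
print(zombies(procs))          # {16846: 16572}
```

On this listing it reports gzip 16860 as reparented to init, and the defunct sh 16846 as still waiting to be reaped by flexbackup (16572).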

Next day and we got lucky. It happened again:


root      3480  0.0  0.0   2668   468 ?        Ss   Aug27   0:01 crond
root     16571  0.0  0.0   3292   988 ?        S    Nov24   0:00  \_ crond
root     16572  0.0  0.5   8320  5960 ?        Ss   Nov24   0:00  |   \_ /usr/bin/perl -w /usr/bin/flexbackup -set backup -incremental
root     16846  0.0  0.0      0     0 ?        Z    Nov24   0:00  |   |  \_ [sh] <defunct>
smmsp    16573  0.0  0.2   7344  2744 ?        S    Nov24   0:00  |   \_ /usr/sbin/sendmail -FCronDaemon -i -odi -oem -oi -t
root     21193  0.0  0.0   3292   988 ?        S    03:31   0:00  \_ crond
root     21194  0.0  0.5   8320  5952 ?        Ss   03:31   0:00      \_ /usr/bin/perl -w /usr/bin/flexbackup -set backup -differential
root     21377  0.0  0.0      0     0 ?        Z    03:31   0:00      |  \_ [sh] <defunct>
smmsp    21195  0.0  0.2   7344  2728 ?        S    03:31   0:00      \_ /usr/sbin/sendmail -FCronDaemon -i -odi -oem -oi -t

and we now have two gzips owned by init.

526 > ps -ef | grep gzip
root     16860     1  0 Nov24 ?        00:00:01 gzip -9
root     21419     1  0 03:31 ?        00:00:03 gzip -9
steveo    5813  9890  0 10:28 pts/3    00:00:00 grep gzip
527 > 
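
One possibility that fits Charlie's earlier point (whatever feeds gzip's standard input has not gone away) is that some process still holds a copy of the write end of gzip's input pipe. A pipe reader only sees end-of-file once every copy of the write end is closed. The Python sketch below only illustrates that pipe behaviour; it is not flexbackup code, and `leaked_fd` is a hypothetical stand-in for a descriptor inherited by some other process.

```python
import os

def eof_seen(read_fd):
    """Return True if a non-blocking read on read_fd reports end-of-file."""
    try:
        return os.read(read_fd, 1) == b""
    except BlockingIOError:
        # No data available, but at least one write end is still open.
        return False

def demo():
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)
    # Hypothetical stand-in for a write end inherited by another process.
    leaked_fd = os.dup(write_fd)
    os.close(write_fd)            # the "real" writer goes away
    before = eof_seen(read_fd)    # leaked copy still open: no EOF yet
    os.close(leaked_fd)
    after = eof_seen(read_fd)     # last write end closed: EOF
    os.close(read_fd)
    return before, after

print(demo())  # (False, True)
```

If that is what is happening here, something like `ls -l /proc/16860/fd` (on Linux) or `lsof -p 16860` should show what gzip's stdin and stdout are connected to.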

-- 
Time flies like the wind. Fruit flies like a banana. Stranger things have  .0.
happened but none stranger than this. Does your driver's license say Organ ..0
Donor?Black holes are where God divided by zero. Listen to me! We are all- 000
individuals! What if this weren't a hypothetical question?
steveo at syslang.net

_______________________________________________
flexbackup-help mailing list
flexbackup-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/flexbackup-help
