On Thu, Apr 11, 2013 at 12:07 PM, Millsap, James <[email protected]> wrote:
> Unfortunately It is difficult as this machine is critical to our operations,
> I don't have a whole lot of time to troubleshoot, before I must have it up
> and running. It usually takes around two days for this issue to come up.
> -TERM will kill it, no need to use --KILL. This is built from source so no
> redhat packages. This is what I have in the qrunner log.
>
> Apr 10 10:01:08 2013 (17606) ArchRunner qrunner caught SIGTERM. Stopping.
> Apr 10 10:01:08 2013 (17606) ArchRunner qrunner exiting.
> Apr 10 10:01:08 2013 (17611) OutgoingRunner qrunner caught SIGTERM. Stopping.
> Apr 10 10:01:08 2013 (17612) VirginRunner qrunner caught SIGTERM. Stopping.
> Apr 10 10:01:08 2013 (17612) VirginRunner qrunner exiting.
> Apr 10 10:01:08 2013 (17607) BounceRunner qrunner caught SIGTERM. Stopping.
> Apr 10 10:01:08 2013 (17608) CommandRunner qrunner caught SIGTERM. Stopping.
> Apr 10 10:01:08 2013 (17608) CommandRunner qrunner exiting.
> Apr 10 10:01:08 2013 (17609) IncomingRunner qrunner caught SIGTERM. Stopping.
> Apr 10 10:01:08 2013 (17609) IncomingRunner qrunner exiting.
> Apr 10 10:01:08 2013 (17610) NewsRunner qrunner caught SIGTERM. Stopping.
> Apr 10 10:01:08 2013 (17610) NewsRunner qrunner exiting.
> Apr 10 10:01:08 2013 (17613) RetryRunner qrunner caught SIGTERM. Stopping.
> Apr 10 10:01:08 2013 (17613) RetryRunner qrunner exiting.
> Apr 10 10:01:08 2013 (17604) Master watcher caught SIGTERM. Exiting.
> Apr 10 10:01:08 2013 (17604) Master qrunner detected subprocess exit
> (pid: 17606, sig: None, sts: 15, class: ArchRunner, slice: 1/1)
> Apr 10 10:01:08 2013 (17604) Master qrunner detected subprocess exit
> (pid: 17608, sig: None, sts: 15, class: CommandRunner, slice: 1/1)
> Apr 10 10:01:08 2013 (17604) Master qrunner detected subprocess exit
> (pid: 17609, sig: None, sts: 15, class: IncomingRunner, slice: 1/1)
> Apr 10 10:01:08 2013 (17604) Master qrunner detected subprocess exit
> (pid: 17610, sig: None, sts: 15, class: NewsRunner, slice: 1/1)
> Apr 10 10:01:08 2013 (17604) Master qrunner detected subprocess exit
> (pid: 17612, sig: None, sts: 15, class: VirginRunner, slice: 1/1)
> Apr 10 10:01:08 2013 (17604) Master qrunner detected subprocess exit
> (pid: 17613, sig: None, sts: 15, class: RetryRunner, slice: 1/1)
> Apr 10 10:01:08 2013 (17607) BounceRunner qrunner exiting.
> Apr 10 10:01:08 2013 (17604) Master qrunner detected subprocess exit
> (pid: 17607, sig: None, sts: 15, class: BounceRunner, slice: 1/1)
> Apr 10 10:01:37 2013 (17611) OutgoingRunner qrunner caught SIGTERM. Stopping.
> Apr 10 10:01:37 2013 (17604) Master watcher caught SIGTERM. Exiting.
> Apr 10 10:01:37 2013 (17611) OutgoingRunner qrunner caught SIGTERM. Stopping.
> Apr 10 10:01:37 2013 (17611) OutgoingRunner qrunner exiting.
> Apr 10 10:01:38 2013 (17604) Master qrunner detected subprocess exit
> (pid: 17611, sig: None, sts: 15, class: OutgoingRunner, slice: 1/1)

.... approx 20 seconds time ....

> Apr 10 10:01:58 2013 (15858) CommandRunner qrunner started.
> Apr 10 10:01:59 2013 (15859) IncomingRunner qrunner started.
> Apr 10 10:01:59 2013 (15856) ArchRunner qrunner started.
> Apr 10 10:01:59 2013 (15857) BounceRunner qrunner started.
> Apr 10 10:01:59 2013 (15862) VirginRunner qrunner started.
> Apr 10 10:01:59 2013 (15860) NewsRunner qrunner started.
> Apr 10 10:01:59 2013 (15863) RetryRunner qrunner started.
> Apr 10 10:01:59 2013 (15861) OutgoingRunner qrunner started.

To me, the above looks like a system reboot. Is something rebooting the
box at 10am?

-Jim P.
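
If it helps to check other days for the same pattern, here is a rough
sketch that reads qrunner log lines, finds the last "qrunner exiting."
entry and the first "qrunner started." entry after it, and reports the
gap plus whether the new PIDs are lower than the old ones. The function
names (parse, summarize_restart) are made up for this example and are not
part of Mailman; the line layout is assumed to match the excerpt quoted
above. Lower PIDs after a restart are consistent with a reboot but do not
prove one, since PIDs can also wrap around on a long-running box.

```python
import re
from datetime import datetime

# Assumed layout, taken from the quoted excerpt:
#   Apr 10 10:01:08 2013 (17606) ArchRunner qrunner exiting.
LINE_RE = re.compile(r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2} \d{4}) \((\d+)\) (.*)$")


def parse(lines):
    """Yield (timestamp, pid, message) for each line matching the layout."""
    for line in lines:
        # Drop mail quoting ("> ") and surrounding whitespace, if present.
        m = LINE_RE.match(line.lstrip("> ").strip())
        if m:
            ts = datetime.strptime(m.group(1), "%b %d %H:%M:%S %Y")
            yield ts, int(m.group(2)), m.group(3)


def summarize_restart(lines):
    """Return gap and PID info for the last stop/start cycle, or None."""
    events = list(parse(lines))
    exits = [(ts, pid) for ts, pid, msg in events
             if msg.endswith("qrunner exiting.")]
    if not exits:
        return None
    last_exit = max(ts for ts, _ in exits)
    starts = [(ts, pid) for ts, pid, msg in events
              if msg.endswith("qrunner started.") and ts >= last_exit]
    if not starts:
        return None
    first_start = min(ts for ts, _ in starts)
    old_pids = [pid for _, pid in exits]
    new_pids = [pid for _, pid in starts]
    return {
        "gap_seconds": (first_start - last_exit).total_seconds(),
        # True when every new PID is below every old PID.
        "pids_went_down": max(new_pids) < min(old_pids),
    }


SAMPLE = [
    "> Apr 10 10:01:37 2013 (17611) OutgoingRunner qrunner exiting.",
    "> Apr 10 10:01:58 2013 (15858) CommandRunner qrunner started.",
    "> Apr 10 10:01:59 2013 (15861) OutgoingRunner qrunner started.",
]
print(summarize_restart(SAMPLE))
```

To run it against a real log, pass the open file instead of SAMPLE, e.g.
summarize_restart(open(path_to_qrunner_log)), where path_to_qrunner_log is
wherever your source build writes its logs.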
