Re: [gentoo-user] Re: Emergency shutdown, how to?
> I agree that your script is nice and simple, and hence less prone to
> errors. I coded mine in C++ because I use it not only for a machine
> type watchdog, but also a task based watchdog that reboots the machine
> based on certain tasks living or not. Each task has to register with
> the watchdog server and continually tell the server they're alive, or
> reboot! But that's a story for another thread...

#!/path/to/perl

use strict;
use IO::Handle;
use Sys::Syslog;

open my $fh, '>', '/dev/watchdog'
    or die "/dev/watchdog: $!";

# unbuffered, so that each poke reaches the device right away.

$fh->autoflush( 1 );

# if any of these go away we need to notice it.
# ok... you'll notice the first one anyway.

my @watchz
= qw
(
    init
    ntpd
    apache
    /opt/sybase/ASE-12_5/bin/dataserver
);

# wd timeout / 2, or 1 for minimum sleep
# (avoid usleep: too much overhead).

my $cycle = 15;

# get the syslog handle.
# ident, options and facility here are placeholders.

openlog 'wd-check', 'pid', 'daemon'
    or die 'Et tu, syslog?';

CYCLE:
for(;;)
{
    sleep( $cycle - ( time % $cycle ) );

    # ps options vary by O/S, this works on linux.
    # only the first word of each command line is kept, so that
    # arguments and the trailing newline do not spoil the match;
    # the names in @watchz have to be spelled the way ps reports them.

    my @procz = map { ( split ' ' )[0] } qx( ps -e -o args= );

    my %chechz = ();
    @chechz{ @watchz } = ();
    delete @chechz{ @procz };

    if( %chechz )
    {
        # oops, current proc's don't include the
        # list of processes being watched.
        #
        # this can happen twice in a w/d interval
        # before the system goes down.

        my $nastygram = join "\t", "Missing proc's:", keys %chechz;

        syslog 'crit', '%s', $nastygram;

        # alternative here is to close $fh here and
        # bounce the system immediately, the
        # approach of looping allows an
        # intentional restart of the service
        # (in less than 1 w/d cycle) w/o bouncing the box.

        next CYCLE;
    }

    # if the proc check got this far then the w/d
    # file gets poked and we live for another loop.

    print $fh "\n";
}

# this isn't a module

0

__END__

--
Steven Lembark                                85-09 90th St.
Workhorse Computing                     Woodhaven, NY, 11421
[EMAIL PROTECTED]                            +1 888 359 3508
--
gentoo-user@lists.gentoo.org mailing list
Re: [gentoo-user] Re: Emergency shutdown, how to?
Iain Buchanan wrote:
> On Sat, 2008-04-05 at 12:42 -0400, Steven Lembark wrote:
>>> I tried ALT + SysRq + EISUB today on my MythTV backend server which
>>> has been crashing lately. Unfortunately it's crashing so badly that
>>> even at the server's keyboard this didn't work. I guess my weekend
>>> fate of building a new server is sealed...
>>
>> Have fun. Check out motherboards with watchdog capability and enable
>> it in the kernel.
>
> watchdogs are nice, and linux makes them ultra-easy to program, but of
> course if your watchdog task dies, then the machine effectively hits
> the reset button for you - no nice shutdown whatsoever! (Which is what
> you want in a hard lock-up, but not if your programming skills are the
> cause of the problem :)

- Have the system turn off the watchdog if the file is closed.
- After that just open it and poke a bit out now and then.
- Make a point of closing the file on exit.

#!/usr/bin/perl

use strict;

open my $fh, '>', '/path/to/watchdog/file'
    or die "Failed opening watchdog file: $!";

# watchdog is now watching...

# the handlers have to be installed before the loop,
# otherwise they are never reached.

my $graceful_exit = sub { close $fh; exit 0 };

for my $sig ( qw( TERM QUIT INT __DIE__ ) )
{
    $SIG{ $sig } = $graceful_exit;
}

for my $sig ( qw( HUP ) )
{
    $SIG{ $sig } = 'IGNORE';
}

# unbuffered output, so that each poke reaches the device.

select $fh;
$| = 1;

for(;;)
{
    print "\n";
    sleep 1;    # watchdog timeout / 2
}

__END__

--
Steven Lembark                                85-09 90th St.
Workhorse Computing                     Woodhaven, NY, 11421
[EMAIL PROTECTED]                            +1 888 359 3508
Re: [gentoo-user] Re: Emergency shutdown, how to?
On Mon, 2008-04-07 at 13:28 -0400, Steven Lembark wrote:
> Iain Buchanan wrote:
>> watchdogs are nice, and linux makes them ultra-easy to program, but of
>> course if your watchdog task dies, then the machine effectively hits
>> the reset button for you - no nice shutdown whatsoever! (Which is what
>> you want in a hard lock-up, but not if your programming skills are the
>> cause of the problem :)
>
> - Have the system turn off the watchdog if the file is closed.

maybe, maybe not :) I personally like setting CONFIG_WATCHDOG_NOWAYOUT
on systems with hardware watchdogs, especially remote unattended
systems. Usually your watchdog task never dies on such a system, and
when it does (be it from a nice kill or not) you want the watchdog to
fire. However, if this is a semi-used system (you ssh or log in to it
in any way to do stuff) you may not want this.

> - After that just open it and poke a bit out now and then.
> - Make a point of closing the file on exit.
>
> #!/usr/bin/perl
> use strict;
> open my $fh, '>', '/path/to/watchdog/file'

it's usually /dev/watchdog if you're using the linux kernel interface.

I agree that your script is nice and simple, and hence less prone to
errors. I coded mine in C++ because I use it not only for a machine
type watchdog, but also a task based watchdog that reboots the machine
based on certain tasks living or not. Each task has to register with
the watchdog server and continually tell the server they're alive, or
reboot! But that's a story for another thread...

--
Iain Buchanan iaindb at netspace dot net dot au

Linux - Where do you want to fly today?
        -- Unknown source
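Whether closing the device really stops the watchdog depends on the driver and on the kernel configuration. With the Linux watchdog interface, drivers that support the "magic close" feature only disarm the timer when the character V was written immediately before the close, and a kernel built with CONFIG_WATCHDOG_NOWAYOUT keeps the timer running regardless. What follows is a minimal Perl sketch of that open / poke / close cycle, not anyone's production code: the sub names wd_open, wd_poke and wd_close are made up for illustration, and the device path is passed in by the caller rather than assumed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;

# Open the watchdog device for writing. The path is a parameter
# (hypothetical helper; /dev/watchdog is the usual Linux device).
sub wd_open
{
    my $path = shift;

    open my $fh, '>', $path or die "$path: $!";

    # unbuffered: every poke has to reach the driver immediately.
    $fh->autoflush( 1 );

    return $fh;
}

# Any write to the device counts as a keepalive.
sub wd_poke
{
    my $fh = shift;

    print {$fh} "\n" or die "watchdog write failed: $!";

    return;
}

# Write the magic character right before closing. Drivers with
# "magic close" support disarm the timer only in that case; with
# CONFIG_WATCHDOG_NOWAYOUT the timer keeps running anyway.
sub wd_close
{
    my $fh = shift;

    print {$fh} 'V' or die "watchdog write failed: $!";
    close $fh       or die "watchdog close failed: $!";

    return;
}

# Example (as root, on a machine with a watchdog driver loaded):
#
#   my $wd = wd_open( '/dev/watchdog' );
#   $SIG{ $_ } = sub { wd_close( $wd ); exit 0 } for qw( TERM INT );
#   for(;;) { wd_poke( $wd ); sleep 1 }
```

Leaving out the V on purpose gives the behaviour Iain describes for unattended machines: if the task goes away for any reason, the timer is left to fire.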
Re: [gentoo-user] Re: Emergency shutdown, how to?
On Sat, 2008-04-05 at 12:42 -0400, Steven Lembark wrote:
>> I tried ALT + SysRq + EISUB today on my MythTV backend server which
>> has been crashing lately. Unfortunately it's crashing so badly that
>> even at the server's keyboard this didn't work. I guess my weekend
>> fate of building a new server is sealed...
>
> Have fun. Check out motherboards with watchdog capability and enable
> it in the kernel.

watchdogs are nice, and linux makes them ultra-easy to program, but of
course if your watchdog task dies, then the machine effectively hits
the reset button for you - no nice shutdown whatsoever! (Which is what
you want in a hard lock-up, but not if your programming skills are the
cause of the problem :)

cya,
--
Iain Buchanan iaindb at netspace dot net dot au

Evil is that which one believes of others. It is a sin to believe evil
of others, but it is seldom a mistake.
        -- H.L. Mencken
Re: [gentoo-user] Re: Emergency shutdown, how to?
> I tried ALT + SysRq + EISUB today on my MythTV backend server which
> has been crashing lately. Unfortunately it's crashing so badly that
> even at the server's keyboard this didn't work. I guess my weekend
> fate of building a new server is sealed...

Have fun. Check out motherboards with watchdog capability and enable it
in the kernel.

--
Steven Lembark                                85-09 90th St.
Workhorse Computing                     Woodhaven, NY, 11421
[EMAIL PROTECTED]                            +1 888 359 3508
Re: [gentoo-user] Re: Emergency shutdown, how to?
On Wed, Apr 2, 2008 at 10:48 AM, Neil Bothwick [EMAIL PROTECTED] wrote:
> On Wed, 02 Apr 2008 19:40:37 +0200, Michael Schmarck wrote:
>>> Neil even proposed ALT + SysRq + EISUB, to be sure everything is
>>> killed, sync'd and unmounted.
>>
>> Which might or might not work. But note that I was also talking
>> about applications being in a corrupted state (the database example).
>
> E sends a SIGTERM to all applications. Any well behaved application
> should shut down cleanly on this. I sends a SIGKILL, but it only
> affects programs that were so locked up they ignored E, so you have
> nothing to lose by then.

I tried ALT + SysRq + EISUB today on my MythTV backend server which has
been crashing lately. Unfortunately it's crashing so badly that even at
the server's keyboard this didn't work. I guess my weekend fate of
building a new server is sealed...

Cheers,
Mark
Re: [gentoo-user] Re: Emergency shutdown, how to?
On Fri, 4 Apr 2008 14:05:42 -0700, Mark Knecht wrote:
> I tried ALT + SysRq + EISUB today on my MythTV backend server which
> has been crashing lately. Unfortunately it's crashing so badly that
> even at the server's keyboard this didn't work.

Do you have CONFIG_MAGIC_SYSRQ=y in your kernel config?

--
Neil Bothwick

A woman walked into a bar and asked the barman for a large double
entendre, so he gave her one.
Re: [gentoo-user] Re: Emergency shutdown, how to?
On Fri, Apr 4, 2008 at 2:50 PM, Neil Bothwick [EMAIL PROTECTED] wrote:
> On Fri, 4 Apr 2008 14:05:42 -0700, Mark Knecht wrote:
>> I tried ALT + SysRq + EISUB today on my MythTV backend server which
>> has been crashing lately. Unfortunately it's crashing so badly that
>> even at the server's keyboard this didn't work.
>
> Do you have CONFIG_MAGIC_SYSRQ=y in your kernel config?
>
> --
> Neil Bothwick

No Neil, it turns out that on that one machine it isn't set. Thanks.
I'll make sure it's set on the new server.

Cheers,
Mark
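On a running system the same question can usually be answered without digging out the kernel config. A kernel built with CONFIG_MAGIC_SYSRQ exposes the control file /proc/sys/kernel/sysrq, which holds 0 (disabled), 1 (all functions enabled) or a larger number (a bitmask of the allowed functions). The Perl sketch below only classifies that value. The sub name sysrq_state and its return strings are made up for illustration, and treating a missing file as "not compiled in" is an assumption that holds on typical kernels but is not guaranteed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Classify the content of the SysRq control file.
# The path is a parameter so the sub can be pointed at any file;
# the default is the usual Linux location.
sub sysrq_state
{
    my $path = shift // '/proc/sys/kernel/sysrq';

    # no such file: most likely the kernel was built
    # without CONFIG_MAGIC_SYSRQ (assumption, see above).
    -e $path
        or return 'missing';

    open my $fh, '<', $path or die "$path: $!";
    my $value = readline $fh;
    close $fh;

    defined $value or die "$path: empty";
    chomp $value;
    $value =~ /^\d+$/ or die "$path: unexpected content '$value'";

    return $value == 0 ? 'disabled'
         : $value == 1 ? 'enabled'
         :               'restricted';  # bitmask of allowed functions
}

# Example:
#
#   print sysrq_state(), "\n";
```

If the result is 'disabled' or 'restricted', the keyboard combination may be refused even though the support is compiled in; writing 1 to the same file as root enables all functions until the next reboot.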
[gentoo-user] Re: Emergency shutdown, how to?
Liviu Andronic [EMAIL PROTECTED] wrote:
> Are there any potential harms to the hardware / system in case one
> tends to abuse (i.e. use more often than necessary) of this command?

You're not shutting down the system in a clean way. Because of this,
filesystems and/or applications might get corrupted (e.g. think of a
database which was in the middle of writing to some of its tables).

> It's so often so tempting to shut down your system fast.

Yeah, it sure is :)

Michael
Re: [gentoo-user] Re: Emergency shutdown, how to?
On Wednesday, 2 April 2008, ext Michael Schmarck wrote:
> You're not shutting down the system in a clean way.

You're not? I thought that's the purpose of the whole thing?

Bye...

Dirk
--
Dirk Heinrichs          | Tel:  +49 (0)162 234 3408
Configuration Manager   | Fax:  +49 (0)211 47068 111
Capgemini Deutschland   | Mail: [EMAIL PROTECTED]
Wanheimerstraße 68      | Web:  http://www.capgemini.com
D-40468 Düsseldorf      | ICQ#: 110037733
GPG Public Key C2E467BB | Keyserver: www.keyserver.net
[gentoo-user] Re: Emergency shutdown, how to?
· Dirk Heinrichs [EMAIL PROTECTED]:
> On Wednesday, 2 April 2008, ext Michael Schmarck wrote:
>> Dirk Heinrichs [EMAIL PROTECTED] wrote:
>>> On Wednesday, 2 April 2008, ext Michael Schmarck wrote:
>>>> You're not shutting down the system in a clean way.
>>>
>>> You're not? I thought that's the purpose of the whole thing?
>>
>> It's more like pulling the plug, isn't it? At least none of the
>> shutdown scripts is run. And if you don't run ALT + SysRq + U, or
>> if it just doesn't work (like hangs at some (remote) fs),
>
> But nobody proposed _not_ to run ALT + SysRq + U,

True, but if worse comes to worst, you've got to do an ALT+SysRq+B or
+O, even before +U has completely returned. As I said, it can happen
that U(nmount) doesn't work - and then you'd need to shut down anyway.

> Neil even proposed ALT + SysRq + EISUB, to be sure everything is
> killed, sync'd and unmounted.

Which might or might not work. But note that I was also talking about
applications being in a corrupted state (the database example).

>> filesystems aren't even unmounted and thus dirty and thus need a
>> fsck run on next boot.
>
> XFS to the rescue :-)

Yep. Well, to be honest, I haven't had a fs die on me because of an
Alt+SysRq+B.

Michael Schmarck
--
Inspiration without perspiration is usually sterile.
Re: [gentoo-user] Re: Emergency shutdown, how to?
On Wed, 02 Apr 2008 19:40:37 +0200, Michael Schmarck wrote:
>> Neil even proposed ALT + SysRq + EISUB, to be sure everything is
>> killed, sync'd and unmounted.
>
> Which might or might not work. But note that I was also talking about
> applications being in a corrupted state (the database example).

E sends a SIGTERM to all applications. Any well behaved application
should shut down cleanly on this. I sends a SIGKILL, but it only affects
programs that were so locked up they ignored E, so you have nothing to
lose by then.

--
Neil Bothwick

Weird enough for government work.
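When the keyboard is out of reach (an ssh session, say), the same SysRq functions can be requested by writing the key letter to /proc/sysrq-trigger as root. The Perl sketch below covers only the last three steps of the sequence: s (sync), u (remount read-only) and b (reboot). E and I are left out on purpose, because signalling every process except init would very likely take the script itself down before it reached the later keys. The sub name sysrq_sequence, the lookup table and the pause between keys are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# What each supported key asks the kernel to do.
my %sysrq_action =
(
    s => 'sync all mounted filesystems',
    u => 'remount all filesystems read-only',
    b => 'reboot immediately, without syncing or unmounting',
);

# Write one key at a time to the trigger file, pausing between keys
# so that sync and remount get a chance to finish.
# Returns the list of keys that were written.
sub sysrq_sequence
{
    my ( $trigger, $pause, @keys ) = @_;

    for my $key ( @keys )
    {
        exists $sysrq_action{ $key }
            or die "unsupported SysRq key '$key'";

        # the file is reopened for every key: one write, one request.
        open my $fh, '>', $trigger or die "$trigger: $!";
        print {$fh} $key           or die "$trigger: $!";
        close $fh                  or die "$trigger: $!";

        sleep $pause if $pause;
    }

    return @keys;
}

# Example (as root; this REBOOTS the machine):
#
#   sysrq_sequence( '/proc/sysrq-trigger', 2, qw( s u b ) );
```

As with the keyboard version, none of the shutdown scripts run, so applications get no chance to close their files; the sync and the read-only remount only protect the filesystems.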