Re: software watchdog
On Sat, Jul 08, 2000 at 12:55:13AM +0200, Robert Waldner wrote:
> Hi! My home-debian-box starts to behave rather odd lately, now and then it will freeze completely. The only thing working is ICMP, I can't even get a TCP connection open, the screen is frozen, neither mouse nor keyboard will generate any event. I've already tried changing all I have on spare (read RAM and graphics adapter). Since there's not even a single syslog-entry, I don't really know where to start debugging.
>
> Would it make sense if I installed the software watchdog into the kernel in this case, so that the machine would (eventually) reboot when it hangs? This would be great because I'll be on a trip next week and my girl-friend needs the debian-box as gateway/mailserver in the meantime...

What kernel version? Are you running Samba? I had a long string of mysterious hangs with 2.2.14 and smbfs, finally resolved by upgrading to 2.2.15. You *should* upgrade to 2.2.16 for security reasons.

-- 
Karsten M. Self kmself@ix.netcom.com http://www.netcom.com/~kmself
Evangelist, Opensales, Inc. http://www.opensales.org
What part of Gestalt don't you understand? Debian GNU/Linux rocks!
http://gestalt-system.sourceforge.net/ K5: http://www.kuro5hin.org
GPG fingerprint: F932 8B25 5FDD 2528 D595 DC61 3847 889F 55F2 B9B0
Re: software watchdog
On Fri, 07 Jul 2000 19:33:25 EDT, paul writes:
>> My home-debian-box starts to behave rather odd lately, now and then it will freeze completely.
>
> Is there anything consistent about the behavior? How long between reboot and freeze? Are there any error messages during startup? What applications are running when the machine freezes? (my bet is netscape) What Debian version are you running (Slink, Potato)? What kernel version are you running? Have you tried telnetting to the machine to see if it is a console only problem? Is the behavior in ANY way predictable? More info is necessary if anyone is to be able to help.

The box is running slink, with all packages, except the kernel and samba, apt-get upgrade'd to potato. Uptime is between 30 min and 5 days. I'm running X w/ fvwm95, setiathome in the background and, when I'm home, netscrap _may_ run. The system is IDE with a SCSI-streamer and -CD-recorder on an ASUS P54C-mobo with 4x16 MB FP-RAM and a Matrox Mystique as gfx card. 3 PCI-ethernet-cards, one ISA, and a SB16.

Problem is that I can't find any similarities between the hangs. The box can run ok for a few days, burning CDs, ripping and encoding mp3s, looking for ETs ;-), and doing ~1k mails per day, and at 3 am (according to my ping stats from outside) it'll freeze. Since the connection to the outside is via PPTP, which hangs too, I can't reach it anymore. It still answers ICMP from the local networks, but I can't even get a TCP connection open (it sends nothing back, not even a SYN-ACK).

The box is running as-is now for about a year, but is ~3 years old, so I guess it's simply getting old. Cheap hardware isn't built for running a 24/7 server, I guess. It shouldn't get too hot, since I already underclocked it from 66/2.5 to 60/2 and I have 2 extra fans, one cooling the PCI and ISA cards, one cooling RAM. Also it sometimes freezes when loaded, sometimes when idling (I've disabled setiathome for testing).

But that's not the real problem. I'll simply get a new used PC (used P1's are for sale at about $100) when I'm back from the trip. What I need for the time being is a solution where it would simply reboot when having trouble, and I _guess_ the software watchdog _may_ be what I'm looking for in this case.

tia, rw
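For readers following the thread, here is a minimal sketch of the userspace half of a software watchdog. It is not from any poster and is not the Debian watchdog daemon. It assumes a Linux kernel with a watchdog driver (for example the softdog module) exposing /dev/watchdog: opening the device arms a countdown, each write resets it, and if the writes stop the kernel reboots the machine. Writing `V` just before closing is the "magic close" that asks the driver to disarm; whether it is honoured depends on the driver and kernel version. The names `pet_watchdog` and `run`, the interval, and the round count are all hypothetical. Since the frozen box still answers ICMP, the kernel is presumably still servicing interrupts, which is the situation in which a timer-based software watchdog has a chance of firing; a hard lockup with interrupts off would defeat it.

```python
import os
import time

WATCHDOG_DEVICE = "/dev/watchdog"  # usual Linux watchdog device node


def pet_watchdog(fd, interval_s, rounds, sleep=time.sleep):
    """Write one byte to the watchdog, then wait interval_s; repeat `rounds` times."""
    for _ in range(rounds):
        os.write(fd, b"\0")  # any write resets the kernel's countdown
        sleep(interval_s)


def run(device=WATCHDOG_DEVICE, interval_s=10, rounds=6):
    """Arm the watchdog, pet it for a while, then try to disarm it cleanly."""
    fd = os.open(device, os.O_WRONLY)  # opening the real device arms the timer
    try:
        pet_watchdog(fd, interval_s, rounds)
        os.write(fd, b"V")  # "magic close": request disarm (driver-dependent)
    finally:
        os.close(fd)
```

A real daemon would loop forever instead of a fixed number of rounds, and the petting interval must stay well below the driver's timeout. Calling `run()` against the real device as root arms a reboot timer, so try it on a file path first.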
Re: software watchdog
On Sat, 08 Jul 2000 12:32:32 +1200, C. Falconer writes:
> 1) Temperature... has a CPU fan, case fan, or PSU fan seized up and died?

no, all fans running fine (see [EMAIL PROTECTED])

> 2) Have you changed anything recently? moved it, rebooted it, run a new kernel?

no, the hardware hasn't changed in about 6 months.

> 3) Run top, procinfo, vmstat -1, pppstats -w 1, netstat, free, df, and look for anything odd or wrong.

all looking fine, as far as I can tell (which doesn't mean much, since I'm a lowly network engineer ;-) and know criscos better than *n*x).

> 4) Take the GF with you on your trip - they make great company.

ah, no, it's definitely a men-only-vacation, e.g. more about beer and playing quake than sun-tanning and sightseeing ;-)

cheers, rw
software watchdog
Hi!

My home-debian-box starts to behave rather odd lately, now and then it will freeze completely. The only thing working is ICMP, I can't even get a TCP connection open, the screen is frozen, neither mouse nor keyboard will generate any event. I've already tried changing all I have on spare (read RAM and graphics adapter). Since there's not even a single syslog-entry, I don't really know where to start debugging.

Would it make sense if I installed the software watchdog into the kernel in this case, so that the machine would (eventually) reboot when it hangs? This would be great because I'll be on a trip next week and my girl-friend needs the debian-box as gateway/mailserver in the meantime...

tia, rw
Re: software watchdog
Have you checked the processor fan?

Jeff

Robert Waldner wrote:
> Hi! My home-debian-box starts to behave rather odd lately, now and then it will freeze completely. The only thing working is ICMP, I can't even get a TCP connection open, the screen is frozen, neither mouse nor keyboard will generate any event. I've already tried changing all I have on spare (read RAM and graphics adapter). Since there's not even a single syslog-entry, I don't really know where to start debugging.
>
> Would it make sense if I installed the software watchdog into the kernel in this case, so that the machine would (eventually) reboot when it hangs? This would be great because I'll be on a trip next week and my girl-friend needs the debian-box as gateway/mailserver in the meantime...
>
> tia, rw

-- 
Unsubscribe? mail -s unsubscribe [EMAIL PROTECTED] /dev/null
Re: software watchdog
rw wrote:
> Hi! My home-debian-box starts to behave rather odd lately, now and then it will freeze completely.

Is there anything consistent about the behavior? How long between reboot and freeze? Are there any error messages during startup? What applications are running when the machine freezes? (my bet is netscape) What Debian version are you running (Slink, Potato)? What kernel version are you running? Have you tried telnetting to the machine to see if it is a console only problem? Is the behavior in ANY way predictable? More info is necessary if anyone is to be able to help.

I've had uninterpretable (for me) problems on systems in the past. I'd try booting into single user mode and running fsck on all partitions (except /) to check for and repair filesystem problems. If my /usr, /var, /tmp, and /home partitions were severely hosed (usually from running a hosed kernel with a buggy HD / BIOS combo) I'd then use a rescue disk to clean up the / partition. This usually would work somewhat. For some reason, upgrading to Potato helped with my current el-cheapo box ($40 motherboard, $45 processor, and $90 hard drive) but YMMV. The lack of syslog entries is something I've never experienced, so I can't help you there.

-- 
ptw miscelaneous endeavors ([EMAIL PROTECTED])
Re: software watchdog
1) Temperature... has a CPU fan, case fan, or PSU fan seized up and died?
2) Have you changed anything recently? moved it, rebooted it, run a new kernel?
3) Run top, procinfo, vmstat -1, pppstats -w 1, netstat, free, df, and look for anything odd or wrong.
4) Take the GF with you on your trip - they make great company.

At 12:55 AM 7/8/00 +0200, you wrote:
> My home-debian-box starts to behave rather odd lately, now and then it will freeze completely. The only thing working is ICMP, I can't even get a TCP connection open, the screen is frozen, neither mouse nor keyboard will generate any event. I've already tried changing all I have on spare (read RAM and graphics adapter). Since there's not even a single syslog-entry, I don't really know where to start debugging.
>
> Would it make sense if I installed the software watchdog into the kernel in this case, so that the machine would (eventually) reboot when it hangs? This would be great because I'll be on a trip next week and my girl-friend needs the debian-box as gateway/mailserver in the meantime...
>
> tia, rw

-- 
Criggie
Re: Software watchdog full process table
> Hi, One of our network servers went down for reboot in the middle of last night, which was done by the software watchdog. The log message was as follows:
>
> daemon.log:Apr 10 03:15:44 seldon watchdog[102]: process table is full!
> daemon.log:Apr 10 03:15:44 seldon watchdog[102]: shutting down the system
>
> Does anyone know what does this mean, and what would cause this to happen? Is there anyway to determine what was happening on the machine?

Basically, the kernel keeps track of each process by having an entry for it in the process table. This table is a finite resource and thus can fill up, so that no new process can be created. I guess that in order to see what the cause of your problem was, you can:

1) Take a look at the log files to see what processes were running.
2) Take a look at the cron entries.
3) Maybe there is a bug in the watchdog software?

-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
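The advice above (find out what was filling the process table) can be sketched as a small script to run from cron before the next incident. This is only an illustration, not what the watchdog daemon does internally. It assumes a Linux /proc where each process has a numeric directory, and a kernel recent enough to provide /proc/<pid>/comm (kernels of the era discussed here did not have that file). The function names are hypothetical.

```python
import os


def count_processes(proc_root="/proc"):
    """Count entries under proc_root whose names are all digits (one per PID)."""
    return sum(1 for name in os.listdir(proc_root) if name.isdigit())


def top_commands(proc_root="/proc", limit=5):
    """Return the most frequent command names as (name, count) pairs."""
    counts = {}
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as meminfo
        try:
            with open(os.path.join(proc_root, name, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # process exited between listdir() and open()
        counts[comm] = counts.get(comm, 0) + 1
    # Sort by descending count, then by name for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]
```

Logging `count_processes()` and `top_commands()` every few minutes would show whether one program (a runaway cron job, say) is steadily accumulating processes before the table fills.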
Re: Software watchdog full process table
On Fri, Apr 10, 1998 at 12:02:28PM +1000, Chris wrote:
> Hi, One of our network servers went down for reboot in the middle of last night, which was done by the software watchdog. The log message was as follows:
>
> daemon.log:Apr 10 03:15:44 seldon watchdog[102]: process table is full!
> daemon.log:Apr 10 03:15:44 seldon watchdog[102]: shutting down the system
>
> Does anyone know what does this mean, and what would cause this to happen? Is there anyway to determine what was happening on the machine?
>
> Thanks for any assistance,

The kernel has a limit on the maximum number of processes; I suspect that it was reached.

As far as the software watchdog goes, all that it's managed to do on my system was annoy me and shut down my computer for many odd reasons (i.e. load average too high, etc.). I realize that this can be changed using a command line option; however, it never did anything during a genuine crash, so I just stopped using it. :)
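To make the "load average too high" trigger concrete, here is a rough sketch of that kind of check. It is not the watchdog daemon's actual code; `load_too_high` and its `max_load_1` threshold are hypothetical names. It relies only on `os.getloadavg()`, which returns the 1, 5 and 15 minute load averages on Unix systems.

```python
import os


def load_too_high(max_load_1, loadavg=None):
    """Return True if the 1-minute load average exceeds max_load_1.

    loadavg may be passed in as a (1min, 5min, 15min) tuple for testing;
    otherwise the current values are read from the operating system.
    """
    one_min, _five_min, _fifteen_min = (
        loadavg if loadavg is not None else os.getloadavg()
    )
    return one_min > max_load_1
```

A threshold set too low makes a busy but healthy machine look hung, which may be what caused the unwanted shutdowns described above.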
Software watchdog full process table
Hi,

One of our network servers went down for reboot in the middle of last night, which was done by the software watchdog. The log message was as follows:

daemon.log:Apr 10 03:15:44 seldon watchdog[102]: process table is full!
daemon.log:Apr 10 03:15:44 seldon watchdog[102]: shutting down the system

Does anyone know what this means, and what would cause this to happen? Is there any way to determine what was happening on the machine?

Thanks for any assistance,
Chris