help with flaky reboot on 3.1
Greetings,

We are running 3.1-RELEASE with a kernel pulled on May 1, 1999 from the RELENG_3 branch (used this to take advantage of the KVA modifications that were rolled in after the release). The machines are dual PII 450s (N440BX) with 512MB RAM. We are also using the built-in Ethernet and SCSI controllers. Our kernel configuration is fairly standard, with the following exceptions:

    maxusers 512
    options NMBCLUSTERS=33280
    options SMP
    options APIC_IO
    options "VM_KMEM_SIZE=(128*1024*1024)"
    options "VM_KMEM_SIZE_MAX=(128*1024*1024)"

Here are the symptoms we are seeing:

One machine running a caching Squid reverse proxy would spontaneously reboot, with no error messages, every week or so. This machine had a single CPU only. We were seeing an excessive number of sockets in the CLOSING state via netstat, and the reboots seemed to be correlated with having many such sockets. Suspecting bad TCP stacks on the Internet, we ran

    sysctl -w net.inet.tcp.always_keepalive=1

This fixed the problem of many CLOSING sockets, but did not fix the reboots.

Other machines running custom software (dual CPU) would also spontaneously reboot, again with no error messages. The reboots are happening with increasing frequency, almost to the point of a couple of times a day. Sometimes a machine would reboot a couple of times a day, then be OK for another week or so.

Our software exercises the disk, CPU, and network quite a bit, but not excessively. The only machines having problems are production machines directly connected to the Internet. We've had the same machines running internally with longer uptimes and heavier volumes.

Any suggestions/ideas? Sorry about the super-post, I thought detail was important.

- Stevan Arychuk

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
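For anyone wanting to track the CLOSING-socket symptom described above over time, here is a minimal sketch in Python that tallies TCP sockets per state from `netstat -an`-style text. The function name `count_tcp_states` and the sample output are hypothetical and were not part of the original post; the parser assumes the usual six-column layout (protocol, receive queue, send queue, local address, foreign address, state) and may need adjusting for other netstat variants.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP sockets per connection state in `netstat -an`-style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows are assumed to look like:
        #   proto recv-q send-q local-address foreign-address state
        # Header lines and UDP rows (which carry no state) are skipped.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Hypothetical sample, NOT taken from the machines discussed in this thread.
SAMPLE = """\
Active Internet connections (including servers)
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp        0      0  10.0.0.1.80            192.0.2.10.1042        CLOSING
tcp        0      0  10.0.0.1.80            192.0.2.11.3310        ESTABLISHED
tcp        0      0  10.0.0.1.80            192.0.2.12.2201        CLOSING
tcp        0      0  *.80                   *.*                    LISTEN
udp        0      0  *.514                  *.*
"""

if __name__ == "__main__":
    # Print states from most to least common.
    for state, n in count_tcp_states(SAMPLE).most_common():
        print(f"{state:12} {n}")
```

Logging these counts periodically (for example from cron, feeding in real `netstat -an` output) would show whether a reboot is actually preceded by a build-up of CLOSING sockets, or whether the two are unrelated.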
Re: help with flaky reboot on 3.1
On Tue, 14 Sep 1999, Stevan Arychuk wrote:

> Greetings, We are running 3.1-RELEASE with a kernel pulled on May 1, 1999 from the RELENG_3 branch (used this to take advantage of the KVA modifications that were rolled in after the release).

Are the kernel and user-land out of sync (kernel sources newer than system sources)?

Cheers,
Chris

p.s. sorry about the prev. reply without comments... Pine's send and cancel keys are too close together :)

-
Chris D. Faulhaber [EMAIL PROTECTED] | All the true gurus I've met never
System/Network Administrator,        | claimed they were one, and always
Reality Check Information, Inc.      | pointed to someone better.
Re: help with flaky reboot on 3.1
Yes. The KVA patches were introduced after the initial release of 3.1, so I guess you could say we're running 3.1 1/2 - RELEASE. We held off on upgrading to 3.2 as there was still a problem with GDB.

I'm testing 3.3-19990909-RC, and I've been able to get a backtrace from a core dump of an SMP-enabled kernel 4 out of 6 times. I haven't been following the other lists. Does anyone know how close this latest RC is? Will it be 3.3-RELEASE by tomorrow?

- Stevan

Chris D. Faulhaber wrote:

> On Tue, 14 Sep 1999, Stevan Arychuk wrote:
>
> > Greetings, We are running 3.1-RELEASE with a kernel pulled on May 1, 1999 from the RELENG_3 branch (used this to take advantage of the KVA modifications that were rolled in after the release).
>
> Are the kernel and user-land out of sync (kernel sources newer than system sources)?
>
> Cheers,
> Chris
>
> p.s. sorry about the prev. reply without comments... Pine's send and cancel keys are too close together :)