Re: FreeBSD 6.2-RC2 Available - networking zoneli freeze problem still exist.

2007-01-10 Thread Thomas Herrlin
Bruce A. Mah wrote:
 If memory serves me right, LI Xin wrote:
 Ken Smith wrote:
 On Thu, 2006-12-28 at 16:01 +0100, Thomas Herrlin wrote:
 It still runs networking daemons into a frozen zoneli state on
 heavy/(D)DoS network loads. Such processes can't be killed with kill -9, so there is
 no way to recover from it. (think frozen sshd and a very remote/headless
 server).
 See the stress test panic called 'Ran out of 128 Bucket
 http://people.FreeBSD.org/%7Epho/stress/log/cons210.html' on the 6.2
 todo list and my own latest test here:
 http://www.maniacs.se/~junics/temp/vmstat-z.txt
 This test was on a new 6.2-RC2 install with no zone limit tweaks nor any
 sbsize limits in /etc/login.conf.
 I just made a VM disk image with replication instructions; however, Peter
 Holm has replicated it with his own tools, so I have not bothered with
 it until now. 
 That problem is being worked on but won't be fixed for 6.2-REL.
 Depending on how complex the fix winds up being it may be an Errata
 candidate when the time comes.
 Perhaps we should mention some known workarounds in the errata
 documentation.  E.g. raising nmbclusters limit, etc.?
 
 That's a good idea.  Do you have more specifics (e.g. any particular
 nmbclusters value, other workarounds, etc.)?
 
 Thanks,
 
 Bruce.
 

According to my tests, the most reliable way to avoid zoneli is to set
an sbsize limit in /etc/login.conf to a value lower than the
mbuf_cluster zone size limit. Note that there are 2048 bytes per
cluster (see vmstat -z for details).
Alternatively, set the login.conf sbsize to a fraction of available RAM
and combine this with the kern.ipc.nmbclusters=0 (unlimited) setting
that some recommend.
Combining these two workarounds would probably be best, as letting mbufs
use unlimited RAM for networking would cause a panic or freeze sooner
or later anyway. I have not tested the combination yet, as my system has
been running stably for some time now with my current workarounds.
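
A rough sketch of the arithmetic behind the first workaround: the
mbuf_cluster zone can hold at most (cluster limit x 2048) bytes, and the
sbsize value should stay below that. The function names, the example
limit of 25600 clusters and the 0.5 safety fraction are all assumptions
made for illustration; they are not values from the tests described
above. Read the real limit from the mbuf_cluster row of vmstat -z.

```python
CLUSTER_BYTES = 2048  # bytes per mbuf cluster, as stated in the message


def zone_capacity_bytes(cluster_limit: int) -> int:
    """Total bytes the mbuf_cluster zone can hold at its limit."""
    if cluster_limit <= 0:
        raise ValueError("cluster_limit must be positive (0 means unlimited)")
    return cluster_limit * CLUSTER_BYTES


def suggested_sbsize(cluster_limit: int, fraction: float = 0.5) -> int:
    """An sbsize value strictly below the zone capacity.

    The fraction is a hypothetical safety margin, not a tested value.
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1 (exclusive)")
    return int(zone_capacity_bytes(cluster_limit) * fraction)


# Hypothetical limit of 25600 clusters:
print(zone_capacity_bytes(25600))    # 52428800
print(suggested_sbsize(25600, 0.5))  # 26214400
```

The result is a byte count that could be used as the sbsize value in a
login class; whether one half is a sensible margin depends on how many
users and daemons share the zone.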

Problems with the sbsize limit:
Setting sbsize in login.conf means that some processes will be unable to
allocate socket buffers in some extreme cases; however, this does not
affect overall system stability, which is my first priority.

I have also thrown together a small executable that attempts a local
connection to the machine's sshd, including the preliminary ssh
handshake, and that can be used with the watchdogd -e parameter to
reboot the box. This is mainly for headless/remote servers that MUST NOT
have their sshd frozen.
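
The executable itself is not included in this thread. A minimal sketch
of the same idea, assuming sshd listens on 127.0.0.1 port 22 and sends
its "SSH-" identification string first, might look like the following.
The names sshd_alive and main, the 5 second timeout and the 256 byte
read are assumptions for illustration only.

```python
import socket


def sshd_alive(host: str = "127.0.0.1", port: int = 22,
               timeout: float = 5.0) -> bool:
    """Return True if the server sends an SSH identification string.

    A completed TCP connect is not enough: the kernel can accept the
    connection even when the daemon is stuck, so wait for the banner.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.settimeout(timeout)
            banner = s.recv(256)
    except OSError:
        # Covers refused connections and timeouts alike.
        return False
    return banner.startswith(b"SSH-")


def main() -> int:
    """Exit status for a watchdog check: 0 if healthy, 1 otherwise."""
    return 0 if sshd_alive() else 1
```

Saved as a script that ends with raise SystemExit(main()), this could be
passed to watchdogd -e so that a failing check stops the watchdog from
being reset. The sketch only checks the banner and does not go further
into the key exchange.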

You can also read my mail to the freebsd-current list with the subject
"Re: zonelimit livelock, some possable workarounds".

/Thomas Herrlin
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: FreeBSD 6.2-RC2 Available - networking zoneli freeze problem still exist.

2007-01-09 Thread Bruce A. Mah
If memory serves me right, LI Xin wrote:
 Ken Smith wrote:
 On Thu, 2006-12-28 at 16:01 +0100, Thomas Herrlin wrote:
 It still runs networking daemons into a frozen zoneli state on
 heavy/(D)DoS network loads. Such processes can't be killed with kill -9, so there is
 no way to recover from it. (think frozen sshd and a very remote/headless
 server).
 See the stress test panic called 'Ran out of 128 Bucket
 http://people.FreeBSD.org/%7Epho/stress/log/cons210.html' on the 6.2
 todo list and my own latest test here:
 http://www.maniacs.se/~junics/temp/vmstat-z.txt
 This test was on a new 6.2-RC2 install with no zone limit tweaks nor any
 sbsize limits in /etc/login.conf.
 I just made a VM disk image with replication instructions; however, Peter
 Holm has replicated it with his own tools, so I have not bothered with
 it until now. 
 That problem is being worked on but won't be fixed for 6.2-REL.
 Depending on how complex the fix winds up being it may be an Errata
 candidate when the time comes.
 
 Perhaps we should mention some known workarounds in the errata
 documentation.  E.g. raising nmbclusters limit, etc.?

That's a good idea.  Do you have more specifics (e.g. any particular
nmbclusters value, other workarounds, etc.)?

Thanks,

Bruce.





Re: FreeBSD 6.2-RC2 Available - networking zoneli freeze problem still exist.

2007-01-09 Thread LI Xin
Bruce A. Mah wrote:
 If memory serves me right, LI Xin wrote:
 Ken Smith wrote:
 On Thu, 2006-12-28 at 16:01 +0100, Thomas Herrlin wrote:
 It still runs networking daemons into a frozen zoneli state on
 heavy/(D)DoS network loads. Such processes can't be killed with kill -9, so there is
 no way to recover from it. (think frozen sshd and a very remote/headless
 server).
 See the stress test panic called 'Ran out of 128 Bucket
 http://people.FreeBSD.org/%7Epho/stress/log/cons210.html' on the 6.2
 todo list and my own latest test here:
 http://www.maniacs.se/~junics/temp/vmstat-z.txt
 This test was on a new 6.2-RC2 install with no zone limit tweaks nor any
 sbsize limits in /etc/login.conf.
 I just made a VM disk image with replication instructions; however, Peter
 Holm has replicated it with his own tools, so I have not bothered with
 it until now. 
 That problem is being worked on but won't be fixed for 6.2-REL.
 Depending on how complex the fix winds up being it may be an Errata
 candidate when the time comes.
 Perhaps we should mention some known workarounds in the errata
 documentation.  E.g. raising nmbclusters limit, etc.?
 
 That's a good idea.  Do you have more specifics (e.g. any particular
 nmbclusters value, other workarounds, etc.)?

The current workaround is to set the following in /boot/loader.conf:

kern.ipc.nmbclusters=0

Then reboot. Note that this is not perfect, as it can lead to the need
to increase KVA space under certain loads.

Cheers,
-- 
Xin LI [EMAIL PROTECTED]  http://www.delphij.net/
FreeBSD - The Power to Serve!





Re: FreeBSD 6.2-RC2 Available - networking zoneli freeze problem still exist.

2007-01-08 Thread Abdullah Al-Marrie

On 12/28/06, Thomas Herrlin [EMAIL PROTECTED] wrote:

Ken Smith wrote:

snip
 All problems we felt needed to be addressed before 6.2 could be released
 have been taken care of.
It still runs networking daemons into a frozen zoneli state on
heavy/(D)DoS network loads. Such processes can't be killed with kill -9, so there is
no way to recover from it. (think frozen sshd and a very remote/headless
server).
See the stress test panic called 'Ran out of 128 Bucket
http://people.FreeBSD.org/%7Epho/stress/log/cons210.html' on the 6.2
todo list and my own latest test here:
http://www.maniacs.se/~junics/temp/vmstat-z.txt
This test was on a new 6.2-RC2 install with no zone limit tweaks nor any
sbsize limits in /etc/login.conf.
I just made a VM disk image with replication instructions; however, Peter
Holm has replicated it with his own tools, so I have not bothered with
it until now.

 Unless further testing turns up something new
 RC2, which is available now for downloading, will be the last of the
 Release Candidates and 6.2-RELEASE should be ready in about 2 weeks.
 Your continued help with testing would be greatly appreciated.  If you
 notice any problems with RC2 you can submit a PR or send mail to this
 list.

snip

/Thomas Herrlin


Have you tried these options in the kernel configuration?

options DEVICE_POLLING
options HZ=1000

Then add this line to the end of your /etc/sysctl.conf:

kern.polling.enable=1

DEVICE_POLLING changes the method through which data gets from your
network card to the kernel. Traditionally, each time the network card
needs attention (for example when it receives a packet), it generates
an interrupt request. The request causes a context switch and a call
to an interrupt handler. A context switch is when the CPU and kernel
have to switch between user land (the user's programs or daemons) and
kernel land (dealing with device drivers, hardware, and other
kernel-bound tasks). The last few years have seen significant
improvements in the efficiency of context switching but it is still an
extremely expensive operation. Furthermore, the amount of time the
system can have to spend when dealing with an interrupt can be almost
limitless. It is completely possible for an interrupt to never free
the kernel, leaving your machine unresponsive. Those of us unfortunate
enough to be on the wrong side of certain Denial of Service attacks
will know about this.

More info here:
A guide to server and workstation optimization, by Avleen Vig
http://silverwraith.com/papers/freebsd-tuning.php

--
Regards,

-Abdullah Ibn Hamad Al-Marri
Arab Portal
http://www.WeArab.Net/


Re: FreeBSD 6.2-RC2 Available - networking zoneli freeze problem still exist.

2006-12-28 Thread Thomas Herrlin
Ken Smith wrote:

snip
 All problems we felt needed to be addressed before 6.2 could be released
 have been taken care of.  
It still runs networking daemons into a frozen zoneli state on
heavy/(D)DoS network loads. Such processes can't be killed with kill -9, so there is
no way to recover from it. (think frozen sshd and a very remote/headless
server).
See the stress test panic called 'Ran out of 128 Bucket
http://people.FreeBSD.org/%7Epho/stress/log/cons210.html' on the 6.2
todo list and my own latest test here:
http://www.maniacs.se/~junics/temp/vmstat-z.txt
This test was on a new 6.2-RC2 install with no zone limit tweaks nor any
sbsize limits in /etc/login.conf.
I just made a VM disk image with replication instructions; however, Peter
Holm has replicated it with his own tools, so I have not bothered with
it until now.

 Unless further testing turns up something new
 RC2, which is available now for downloading, will be the last of the
 Release Candidates and 6.2-RELEASE should be ready in about 2 weeks.
 Your continued help with testing would be greatly appreciated.  If you
 notice any problems with RC2 you can submit a PR or send mail to this
 list.
   
snip

/Thomas Herrlin


Re: FreeBSD 6.2-RC2 Available - networking zoneli freeze problem still exist.

2006-12-28 Thread Ken Smith
On Thu, 2006-12-28 at 16:01 +0100, Thomas Herrlin wrote:
 It still runs networking daemons into a frozen zoneli state on
 heavy/(D)DoS network loads. Such processes can't be killed with kill -9, so there is
 no way to recover from it. (think frozen sshd and a very remote/headless
 server).
 See the stress test panic called 'Ran out of 128 Bucket
 http://people.FreeBSD.org/%7Epho/stress/log/cons210.html' on the 6.2
 todo list and my own latest test here:
 http://www.maniacs.se/~junics/temp/vmstat-z.txt
 This test was on a new 6.2-RC2 install with no zone limit tweaks nor any
 sbsize limits in /etc/login.conf.
 I just made a VM disk image with replication instructions; however, Peter
 Holm has replicated it with his own tools, so I have not bothered with
 it until now. 

That problem is being worked on but won't be fixed for 6.2-REL.
Depending on how complex the fix winds up being it may be an Errata
candidate when the time comes.

-- 
Ken Smith
- From there to here, from here to  |   [EMAIL PROTECTED]
  there, funny things are everywhere.   |
  - Theodore Geisel |





Re: FreeBSD 6.2-RC2 Available - networking zoneli freeze problem still exist.

2006-12-28 Thread LI Xin
Ken Smith wrote:
 On Thu, 2006-12-28 at 16:01 +0100, Thomas Herrlin wrote:
 It still runs networking daemons into a frozen zoneli state on
 heavy/(D)DoS network loads. Such processes can't be killed with kill -9, so there is
 no way to recover from it. (think frozen sshd and a very remote/headless
 server).
 See the stress test panic called 'Ran out of 128 Bucket
 http://people.FreeBSD.org/%7Epho/stress/log/cons210.html' on the 6.2
 todo list and my own latest test here:
 http://www.maniacs.se/~junics/temp/vmstat-z.txt
 This test was on a new 6.2-RC2 install with no zone limit tweaks nor any
 sbsize limits in /etc/login.conf.
 I just made a VM disk image with replication instructions; however, Peter
 Holm has replicated it with his own tools, so I have not bothered with
 it until now. 
 
 That problem is being worked on but won't be fixed for 6.2-REL.
 Depending on how complex the fix winds up being it may be an Errata
 candidate when the time comes.

Perhaps we should mention some known workarounds in the errata
documentation.  E.g. raising nmbclusters limit, etc.?

Cheers,
-- 
Xin LI [EMAIL PROTECTED]  http://www.delphij.net/
FreeBSD - The Power to Serve!


