On Tue, 26 Jun 2012, Claudio Jeker wrote:

> On Mon, Jun 25, 2012 at 04:57:04PM -0500, Jeremy C. Reed wrote:
> > I am running as a VMware guest.
> >
> > OpenBSD 5.1 GENERIC#181 amd64 amd64
> >
> > dmesg says it is:
> > cpu0: Intel(R) Xeon(R) CPU E5620 @ 2.40GHz, 2399.73 MHz
> >
> > I also periodically had the same following issue on the same system with
> > OpenBSD 5.0.
> >
> > I have various socket unit tests that are run numerous times a week.
> > After a few days of running, one of our tests doing a setsockopt to
> > increase a send buffer size fails with "No buffer space available". Once
> > this happens, I can't find any way to recover other than rebooting. I
> > also can't figure out the trigger for the problem, but over time the
> > problem happens. It may happen after one day or after a couple weeks.
> > When we reboot the problem goes away. We have repeated this. But don't
> > know what causes it. This problem has been going on for months. We reboot
> > periodically due to it.
> >
> > A simple program to show the problem is at:
> > http://git.bind10.isc.org/~jreed/sndbuf.cc
> >
> > For example:
> >
> > $ ftp http://git.bind10.isc.org/~jreed/sndbuf.cc
> > Trying 149.20.48.84...
> > Requesting http://git.bind10.isc.org/~jreed/sndbuf.cc
> > 100% |**************************************************|  1538  00:00
> > 1538 bytes received in 0.00 seconds (1.97 MB/s)
> >
> > $ less sndbuf.cc
> >
> > $ g++ -o sndbuf sndbuf.cc
> >
> >
> > $ ./sndbuf
> >
> > current send buf size: 9216
> > failed to increase sendbuf size: 9217: No buffer space available
> >
> > I see some operating systems have a resource limit, like RLIMIT_SBSIZE,
> > that can limit it. I don't see that for OpenBSD.
> >
> > It is acting like the net.inet.udp.sendspace sysctl is the limit. When I
> > raise net.inet.udp.sendspace, the current send buf size and attempt
> > (using the code above) increases.
> >
> > $ sysctl net.inet.udp.sendspace
> > net.inet.udp.sendspace=9216
> >
> > When I reboot, the sample code does not fail (until sometime later).
> >
> > setsockopt(2) says "The system places an absolute limit on these
> > values." Where is this documented further?
> >
> > I am looking at sbcheckreserve and sbchecklowmem in
> > src/sys/kern/uipc_socket2.c.
> >
> > What does sbchecklowmem() return if between the two limits? (Can that
> > condition occur?)
> >
> > My vmstat -m output is at
> > http://git.bind10.isc.org/~jreed/vmstat.out.20120625.txt
> >
> > Do you have any suggestions on how to troubleshoot this further? The
> > system as of now has not been rebooted, so if there is anything you would
> > like me to test or details to provide, please let me know.
> >
>
> You may want to have a look at sbcheckreserve() and sbchecklowmem() in
> sys/kern/uipc_socket2.c. The kernel does not give you more resources than
> he thinks is good for you and especially for himself. Running the kernel
> out of mbufs is a bad thing. The heuristic is very simple but effective
> (at least for TCP, UDP is a bit a different story).

Thank you very much for your response. I mentioned those same two
functions above. Where is this limit documented? I also asked above what
sbchecklowmem() returns when the value is between the two limits, and
whether that can produce a wrong response. (Can someone explain that?)

> Note: Busy servers may need to tune the kern.maxclusters sysctl so that
> more sockets can be served at the same time.

The system is not busy at all except when it does scheduled builds and
unit tests. Once the problem is triggered, the setsockopt always fails
(until reboot), even when the system does nothing extra for hours or
days.

When I first checked, the setting was:

kern.maxclusters=6144

I was able to increase SO_SNDBUF by one (from 9216 to 9217) after I did:

sysctl -w kern.maxclusters=6187

(At 6186 and below I could not.)

I am curious how the system may be running out of mbufs. (Normally
nothing is running except the default, standard processes present after
a new bare install.) Is there something in the vmstat (or other tool)
output that shows why it doesn't reclaim memory?
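
To make the "between the two limits" question concrete, here is a toy
model of a two-threshold check that remembers its last answer. It is not
the code in uipc_socket2.c: the class name, the percentages and the
numbers are hypothetical and only illustrate the behaviour I am asking
about. If the kernel check works anything like this, a usage value that
sits between the two limits would simply return whatever was decided
last.

```cpp
// Toy model only, NOT the kernel code. LowMemGate and its parameters are
// hypothetical names chosen for illustration.
#include <cassert>

class LowMemGate {
public:
    // limit:     stand-in for a cluster limit such as kern.maxclusters
    // clear_pct: below this share of the limit, the low-memory flag clears
    // set_pct:   above this share of the limit, the low-memory flag is set
    LowMemGate(long limit, int clear_pct, int set_pct)
        : limit_(limit), clear_pct_(clear_pct), set_pct_(set_pct), low_(false) {
        assert(clear_pct <= set_pct);
    }

    void set_limit(long limit) { limit_ = limit; }

    // Returns true while the gate considers memory to be low.
    // Between the two thresholds neither branch fires, so the previous
    // answer is returned unchanged.
    bool check(long in_use) {
        if (in_use < limit_ * clear_pct_ / 100)
            low_ = false;
        if (in_use > limit_ * set_pct_ / 100)
            low_ = true;
        return low_;
    }

private:
    long limit_;
    int clear_pct_;
    int set_pct_;
    bool low_;
};
```

With a hypothetical limit of 1000 and thresholds of 60 and 80 percent,
check(850) sets the flag, and a later check(700) still reports low memory
because 700 lies between 600 and 800. Raising the limit to 1200 moves the
lower threshold to 720, so the next check(700) clears the flag. That is
the kind of "stuck until something moves" behaviour I am seeing, but I
have not confirmed that the kernel works this way.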

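For anyone who cannot fetch the URL, the test boils down to reading
SO_SNDBUF on a UDP socket and then asking for one byte more. The sketch
below is not the original sndbuf.cc; the function names are made up for
this mail and only the standard socket calls are used.

```cpp
// Minimal sketch, not the original sndbuf.cc. try_grow_sndbuf and
// run_sndbuf_check are hypothetical names used for illustration.
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>

// Reads the current send buffer size into *before, then requests one
// byte more. Returns 0 on success or the errno of the call that failed.
// *before is left untouched if getsockopt itself fails.
int try_grow_sndbuf(int fd, int* before) {
    assert(before != nullptr);
    int current = 0;
    socklen_t len = sizeof(current);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &current, &len) == -1)
        return errno;
    *before = current;

    int wanted = current + 1;
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &wanted, sizeof(wanted)) == -1)
        return errno;
    return 0;
}

// Opens a UDP socket, runs the check and prints the result.
// Returns 0 if the buffer could be grown, 1 otherwise.
int run_sndbuf_check() {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd == -1) {
        std::perror("socket");
        return 1;
    }

    int before = -1;
    int err = try_grow_sndbuf(fd, &before);
    if (before < 0) {
        std::printf("failed to read sendbuf size: %s\n", std::strerror(err));
    } else {
        std::printf("current send buf size: %d\n", before);
        if (err != 0)
            std::printf("failed to increase sendbuf size: %d: %s\n",
                        before + 1, std::strerror(err));
    }
    close(fd);
    return err == 0 ? 0 : 1;
}
```

Calling run_sndbuf_check() from main() gives a small standalone program.
On the affected system I would expect the setsockopt call to fail with
ENOBUFS, matching the "No buffer space available" message quoted above;
after a reboot it should succeed.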