It is not only a boot-time issue - I also see it during normal operation:
2018-04-24T10:30:10.466723+00:00 gateway blacklistd 611 - - bl_recv:
recvmsg failed (No buffer space available)
2018-04-24T10:30:10.466821+00:00 gateway blacklistd 611 - - no message
(No buffer space available)
2018-04-24T10:56:47.223562+00:00 gateway sshd 13053 - - error: maximum
authentication attempts exceeded for invalid user root from
106.113.147.190 port 63303 ssh2 [preauth]
2018-04-24T11:15:09.240247+00:00 gateway blacklistd 611 - - bl_recv:
recvmsg failed (No buffer space available)
2018-04-24T11:15:09.240791+00:00 gateway blacklistd 611 - - no message
(No buffer space available)
I don't expect major resource usage for blacklistd though.
Also, named does not seem to be too happy and ceases interface scanning.
This does not yet give a warm fuzzy feeling :-) && :-(
Frank
On 04/24/18 09:56, Roy Marples wrote:
On 24/04/2018 08:26, Martin Husemann wrote:
On Tue, Apr 24, 2018 at 07:30:04AM +0200, Frank Kardel wrote:
syslogd sometimes has issues with /var/run/log:
2018-04-24T05:13:34.542548+00:00 gateway syslogd 408 - - recvfrom()
unix
`/var/run/log': No buffer space available
This is a separate change and unrelated to compatibility. It happens
with up to date binaries as well. I think it was a silent bug before
and has now been made more verbose. Still pretty annoying and happens
for me on various machines on every boot. Roy, did you have a chance to
look at it?
Not yet, no. But yes, in all prior releases it was a silent bug on all
types of socket and in all the BSDs as well. I know, I checked - only
OpenBSD has an overflow check like this and they solve that with a
magic message on route(4) only which is just yuck as it makes the
problem worse.
I only have one machine where I can reliably repro this, my erlite, and
it only happens there because route(4) overflows (detected in dhcpcd):
it's a router, the box isn't up yet, and a load of address validation
flows over the socket when the link comes up. Detecting this is a
good thing, because dhcpcd can then react to the error and sync its
state using getifaddrs().
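
As a rough illustration of that "react and resync" pattern (in Python for
brevity; this is not dhcpcd's actual code), a reader loop can treat ENOBUFS
as "some messages were dropped" rather than as a fatal error. The names
drain_events, handle_event and resync are hypothetical; resync stands in
for rebuilding state from an authoritative source such as getifaddrs().

```python
import errno


def drain_events(sock, handle_event, resync, bufsize=4096):
    """Read queued messages from a non-blocking socket.

    If the kernel reports ENOBUFS (receive buffer overflowed, so some
    messages were lost), keep reading what is left and then call the
    hypothetical resync() callback to rebuild state from scratch.
    Returns True if an overflow was seen.
    """
    overflowed = False
    while True:
        try:
            data = sock.recv(bufsize)
        except BlockingIOError:
            break  # nothing more queued right now
        except OSError as e:
            if e.errno == errno.ENOBUFS:
                overflowed = True  # remember the loss, keep draining
                continue
            raise
        if not data:
            break
        handle_event(data)
    if overflowed:
        resync()
    return overflowed
```

The point of the sketch is only that the overflow is reported once and the
remaining messages are still processed before the full resync.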
I think the easiest fix is to increase the default size of the socket
buffer. Where this is done, I don't know but could find out if pushed.
This would fix everything if the default buffer was big enough.
That said, from what I'm hearing this only happens at boot time, so if
we consider dynamically growing the buffer in the kernel, we could
potentially shrink it back down again afterwards. No idea if that's
even possible or what performance impact it would have.
The last option is to increase the socket buffer size in all affected
applications using ioctl (or is it setsockopt?). But to what value I
don't know. Trial and error?
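
For what it's worth, the per-socket knob is setsockopt() with SO_RCVBUF at
the SOL_SOCKET level. A minimal sketch of the per-application approach, in
Python for brevity (the affected daemons are C, but the calls map one to
one): grow_rcvbuf is a hypothetical helper, and the "wanted" size is an
assumption the caller would still have to find by trial and error.

```python
import socket


def grow_rcvbuf(sock, wanted):
    """Try to raise the receive buffer of sock to at least `wanted` bytes.

    Never shrinks the buffer. Returns the size the kernel reports
    afterwards, which may differ from `wanted` (some systems clamp or
    round the value, others reject requests above a system-wide limit).
    """
    current = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    if current < wanted:
        try:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, wanted)
        except OSError:
            pass  # request refused by the kernel; keep the current size
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    before = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    after = grow_rcvbuf(s, before * 4)  # the factor 4 is arbitrary
    print("SO_RCVBUF:", before, "->", after)
    s.close()
```

Reading the value back matters because the kernel is free to grant a
different size than the one requested.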
Roy