On Mon, Jan 05, 2015 at 07:02:47PM +0100, Alexander Bluhm wrote:
> On Sun, Jan 04, 2015 at 07:16:19PM -0700, Theo de Raadt wrote:
> > >On Sat, Jan 3, 2015 at 8:38 AM, Alexander Bluhm <[email protected]>
> > >wrote:
> > >> My goal is to make logging via syslog reliable. At least I want
> > >> to see when a message gets lost.
> > >>
> > >> So my idea is to write a kernel log message if sendsyslog(2) cannot
> > >> deliver a message. Then you see the problem on the console and in
> > >> the dmesg buffer. If syslogd comes back later, you will also get
> > >> the error into the log files via /dev/klog.
> > >>
> > >> comments? ok?
> > >
> > >Idea makes sense, but I wonder whether rate limiting is the best way
> > >to handle logging of the failures. My first instinct would be to log
> > >the first failure, and then just count failures until a success
> > >occurs, at which point the count of the intervening failures should be
> > >reported.
>
> Yes, that sounds better than rate limiting. Updated diff.
I think rate limiting still makes sense. For example in the case where
sosend() is returning ENOBUFS which, when the pressure is high enough,
could happen frequently but only for a few calls at a time.
> > I had a discussion with Alexander where I expressed concern about the
> > syncronous nature of console output, and that rate limiting may not be
> > enough. On slow console devices, this can have significant
> > performance effects even with the rate limiting.
>
> Would it reduce your concerns when I change it to something that
> logs only to dmesg buffer and not to console?
>
> I have to admit, that you may get this log line also on reboot.
>
> > To me this code block fundamentally feels like it is using the dmesg
> > buffer as a debug scaffold, because the idea is that syslogd should
> > not fail. I know what he wants, but I am wary of the kernel having
> > this heavy cost to a system call which was designed to be so simple
> > and clean.
>
> I do not use it as a debugging aid. We have the security requirement
> that no log message may get lost unnoticed. The syslogd process
> may die, in that case you get a line in dmesg buffer from the kernel.
> That can be checked later.
>
> bluhm
>
> Index: kern/subr_log.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/kern/subr_log.c,v
> retrieving revision 1.25
> diff -u -p -r1.25 subr_log.c
> --- kern/subr_log.c 13 Dec 2014 21:05:33 -0000 1.25
> +++ kern/subr_log.c 5 Jan 2015 16:58:14 -0000
> @@ -352,9 +352,14 @@ sys_sendsyslog(struct proc *p, void *v,
> struct file *f;
> size_t len;
> int error;
> +#ifndef SMALL_KERNEL
> + static unsigned int failed;
> +#endif
>
> - if (syslogf == NULL)
> - return (ENOTCONN);
> + if (syslogf == NULL) {
> + error = ENOTCONN;
> + goto out;
> + }
> f = syslogf;
> FREF(f);
>
> @@ -390,5 +395,19 @@ sys_sendsyslog(struct proc *p, void *v,
> }
> #endif
> FRELE(f, p);
> - return error;
> +
> + out:
> +#ifndef SMALL_KERNEL
> + if (error) {
> + if (failed == 0)
> + log(LOG_ERR, "send message to syslog failed\n");
> + failed++;
> + } else {
> + if (failed > 1)
> + log(LOG_ERR, "send message to syslog failed "
> + "%u more times\n", failed - 1);
> + failed = 0;
> + }
> +#endif
> + return (error);
> }
>
--
:wq Claudio