It was thus said that the Great Justin Lee once stated:
> 2014-07-13 3:31 GMT+08:00 David Lang <[email protected]>:
> > On Sat, 12 Jul 2014, Justin Lee wrote:
> >
> >> Hello List,
> >>
> >> I noticed that local messages are passed by socket rather than
> >> directly written to log files. The evidence is as follows:
> >
> >
> > Yes, this is how syslog works.
> >
> >
> >> Since log mechanism is an important infrastructure of many programs,
> >> it is supposed to be as fast as possible. So do we have some
> >> configurations or settings in rsyslog which make local messages
> >> written directly into log files without passing through socket?
> >
> >
> > actually, writing to disk can be slower than writing to a socket, so it's
> > not clear that it's a performance win to write to disk directly.
> 
> I think that writing to disk is supposed to be the final step of
> rsyslogd in processing local messages. After all, we couldn't see the
> messages in log files if rsyslogd didn't write to disk at all, right?
> But in addition to disk writing, use of socket by syslog to transfer
> local log messages introduces extra CPU time or memory to do the
> following things:

  A program (and for now, let's just assume a program under Unix to keep
things simple) has a few options for logging:

1. open a file and log stuff to that file.  Quick and simple.  Well, not so
simple actually.  Because of buffering (at the C stdio layer), if a program
were to suddenly die, some portion of the logged messages would be lost. 
You can minimize this by disabling C stdio buffering (setvbuf() with
_IONBF), so a sudden crash of the program won't lose logs, but this could
slow down the program (buffering was invented to speed things up).  Also,
just because you do

        /* append to the log, creating it if it doesn't exist yet */
        fh = open("logfile.txt",O_WRONLY | O_APPEND | O_CREAT,0644);
        write(fh,"Thar she blows\n",15);
        close(fh);

doesn't mean that the message "Thar she blows" will actually be stored on
spinning metal disks.  If the computer loses all power right after the close
(hey, we were planning on installing UPSes) then, again, log messages are
lost, this time due to buffering in the kernel (and Linux does some
aggressive file buffering).  Sure, you can add a call to fsync() after the
write, but if the operating system bothers to honor the semantics, then you
really lose performance in the program.
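
To make the fsync() variant concrete, here is a minimal sketch.  The function
name log_durably() is made up for illustration, and it assumes a POSIX system;
it opens, writes, syncs and closes on every message, which is exactly the slow
path described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: append `msg` to the file at `path` and ask the
 * kernel to push it to stable storage before returning.
 * Returns 0 on success, -1 on failure. */
int log_durably(const char *path, const char *msg)
{
    assert(path != NULL && msg != NULL);

    /* O_APPEND: every write goes to the current end of the file.
     * O_CREAT:  create the file if it does not exist yet. */
    int fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    const char *p = msg;
    size_t left = strlen(msg);
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            close(fd);
            return -1;
        }
        p += n;                     /* handle short writes */
        left -= (size_t)n;
    }

    /* Without this the data may still be sitting in the kernel's buffers. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Calling log_durably("logfile.txt","Thar she blows\n") once per message is
about as safe as a single process can get on its own, and it pays for an
fsync() on every line.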

A secondary problem with this is log rotation, the ability to stop recording
in one file (so it can be archived if need be, or just deleted) and start
recording in a new file.  You have a few solutions here:  the program itself
can rotate its logs, but then the rotation is either hard coded into the
program, set by a configuration option or a command line option (or both),
or triggered by a signal (SIGHUP is the usual convention).  All of these
require more code, more testing, and more places for things to go boom.
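
As a rough illustration of the signal-driven variant, here is a sketch.  The
names log_open(), log_write() and on_sighup() are made up, and it assumes the
conventional SIGHUP as the "reopen your log file" signal.  The handler only
sets a flag; the actual reopen happens on the next write, after an external
tool has moved the old file out of the way.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t reopen_requested = 0;
static int log_fd = -1;

/* Signal handler: only set a flag, do the real work outside the handler. */
static void on_sighup(int sig)
{
    (void)sig;
    reopen_requested = 1;
}

/* Open the log file and install the SIGHUP handler.  0 on success. */
int log_open(const char *path)
{
    assert(path != NULL);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sighup;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGHUP, &sa, NULL) < 0)
        return -1;

    log_fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
    return log_fd < 0 ? -1 : 0;
}

/* Write one message, reopening the file first if a rotation was requested. */
int log_write(const char *path, const char *msg)
{
    if (reopen_requested) {
        reopen_requested = 0;
        int fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
        if (fd < 0)
            return -1;          /* keep using the old descriptor */
        close(log_fd);
        log_fd = fd;
    }
    return write(log_fd, msg, strlen(msg)) < 0 ? -1 : 0;
}
```

Even this small version has to think about async-signal safety and what to do
when the reopen fails, which is the "more code, more testing" cost in
practice.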

Also, if you want to send the logs elsewhere, you'll need an external
program to copy the logs over, or to sit there reading the log file and
sending the log entries as they're appended (with a potential delay if C
stdio buffering is in use).

2. send the log to another process to deal with the information.  This is
where syslog() lives and yes, the current implementation uses local (Unix)
sockets to transfer the data, but really, your options are limited here:

        a) sockets (local or loopback or whatever)
        b) named pipes
        c) message queues
        d) shared memory

The first three all require the kernel to copy the data from the sending
process to the receiving process; the last one can skip the copying bit, but
there are nasty synchronization issues that would need to be worked out,
and those would probably negate any savings from not copying the data.

You will also need to work out how the client and server establish a
connection.  For sockets and named pipes, it's easy---the client just
connects to a well-known named resource (in the case of syslog(), it's
typically "/dev/log").  It's less clear how one would do that in the case
of message queues and shared memory (which, in my experience, tend to be
used where a master process creates child processes and uses message
queues/shared memory to communicate with them, rather than between
independent processes).

So it comes down to sockets or named pipes.  syslog() works with a
datagram-based local socket, which does mean a system call, but one that
only lasts until the data is copied out or, if the socket buffer is full,
just returns (I think---it's a datagram and the semantics of a datagram are
typically "best effort", though the details vary by operating system), so
there's little potential for blocking the client.  Even if the server isn't
running, the messages will just be tossed into the bit bucket.  With a named
pipe, there does exist the possibility of blocking the client (and a
definite possibility of blocking if the logging server isn't running).
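
For the curious, the datagram approach looks roughly like the sketch below.
This is not how any particular libc implements syslog(); dgram_listen() and
dgram_send() are made-up names, and the socket path is whatever you pass in
(a real client would use "/dev/log").  The sender is put into non-blocking
mode so that a full buffer or a missing server becomes an error the client
can ignore rather than something it waits on.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill in a Unix-domain socket address; -1 if the path is too long. */
static int fill_addr(struct sockaddr_un *addr, const char *path)
{
    assert(addr != NULL && path != NULL);
    if (strlen(path) >= sizeof addr->sun_path)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Server side: bind a datagram socket to a well-known path.
 * Returns the socket descriptor, or -1 on failure. */
int dgram_listen(const char *path)
{
    struct sockaddr_un addr;
    if (fill_addr(&addr, path) < 0)
        return -1;
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    unlink(path);                   /* remove a stale socket file, if any */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Client side: fire one datagram at the well-known path and forget it.
 * Returns 0 if the kernel accepted the message, -1 if it was dropped. */
int dgram_send(const char *path, const char *msg)
{
    struct sockaddr_un addr;
    if (fill_addr(&addr, path) < 0)
        return -1;
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* Non-blocking: a full receive queue gives an error, not a stall. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags >= 0)
        fcntl(fd, F_SETFL, flags | O_NONBLOCK);

    ssize_t n = sendto(fd, msg, strlen(msg), 0,
                       (struct sockaddr *)&addr, sizeof addr);
    close(fd);
    return n < 0 ? -1 : 0;
}
```

A client that ignores the return value of dgram_send() gets the "bit bucket"
behavior: if nobody is bound to the path, the send fails immediately and the
message is simply gone.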

Now, since you already have another process to receive the logs, it's not
that big of a deal to have it forward the logs elsewhere, and if you are
doing it over a local socket, that typically means the logic is already
there to transmit it over a network socket with very little work.

  -spc
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
