Hello.
I have rsyslogd 5.6.0 on CentOS 5.5.
I've tried to send some of its output to a FIFO and it doesn't work quite
the way I'd like it to. It seems I can reproduce the problem only when the
FIFO reader is started after rsyslogd, so the reader's end of the FIFO is not
opened yet. It should be possible to reproduce the problem regardless, but
it didn't exhibit itself when I started the reader first (which I tried just
once).
What happens is that rsyslogd tries to open the FIFO in non-blocking mode,
doesn't succeed and waits for notification that the reader has opened its
end. So far, so good. Then the reader starts, opens the reading end,
so rsyslogd opens the writing end and starts sending data. That's fine as
well.
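
To illustrate the open sequence described above, here is a minimal Python
sketch (not rsyslogd's actual C code). It assumes a POSIX system, where a
non-blocking open of a FIFO for writing fails with ENXIO while no reader
has it open. The helper try_open_writer and the path fifo_path are
hypothetical names made up for the example.

```python
import errno
import os
import tempfile

def try_open_writer(path):
    """Try to open the FIFO's write end without blocking.

    Returns the file descriptor, or None if no reader has the FIFO
    open yet (the kernel reports that case as ENXIO).
    """
    try:
        return os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        if exc.errno == errno.ENXIO:
            return None
        raise

# Hypothetical FIFO in a scratch directory.
fifo_path = os.path.join(tempfile.mkdtemp(), "demo.fifo")
os.mkfifo(fifo_path)

# No reader yet: the non-blocking open for writing is refused.
before_reader = try_open_writer(fifo_path)

# The reader shows up and opens its end; now the writer can open too.
reader_fd = os.open(fifo_path, os.O_RDONLY | os.O_NONBLOCK)
writer_fd = try_open_writer(fifo_path)

os.write(writer_fd, b"hello\n")
received = os.read(reader_fd, 64)
```

How rsyslogd learns that the reader has arrived is not shown here; the
sketch only demonstrates why the first open attempt fails.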
Then rsyslogd starts getting EAGAIN. I suppose that's because the rate of
incoming data is high enough to fill the kernel FIFO buffer before the
reader program gets any CPU time. The reader has no problem keeping
up with the data rate in other configurations I've tried, but the FIFO
buffer in the kernel is relatively small and writing to it won't block the
writing process until the buffer is full. The writer can therefore use its
whole CPU timeslice to fill the buffer, and the reader's processing speed
doesn't matter.
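
The buffer-full condition is easy to reproduce outside rsyslogd. The
Python sketch below is only an illustration: it uses an anonymous pipe
instead of a named FIFO (assuming both share the same kind of kernel
buffer, as they do on Linux) and writes until the kernel refuses with
EAGAIN, which Python reports as BlockingIOError. The chunk size is an
arbitrary choice.

```python
import os

read_fd, write_fd = os.pipe()
os.set_blocking(write_fd, False)      # same effect as O_NONBLOCK on the fd

chunk = b"x" * 4096                   # arbitrary chunk size for the demo
bytes_written = 0
hit_eagain = False

# Nobody is reading, so the kernel buffer fills up after a few writes.
try:
    while True:
        bytes_written += os.write(write_fd, chunk)
except BlockingIOError:               # errno EAGAIN / EWOULDBLOCK
    hit_eagain = True

# Once the reader drains the buffer, writing works again.
drained = os.read(read_fd, bytes_written)
written_after_drain = os.write(write_fd, chunk)
```

The point is that the writer reaches EAGAIN without the reader ever
being scheduled, no matter how fast the reader would be.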
I'm seeing a large number of write() attempts by rsyslogd (all failing
with EAGAIN) and eventually it stops trying to write and goes to sleep.
strace shows:
11109 gettimeofday({1289467839, 591385}, NULL) = 0
11109 select(0, NULL, NULL, NULL, {30, 0} <unfinished ...>
That is a pure 30 second pause: no file descriptors are passed to select(),
so it cannot notice when the FIFO becomes writable again.
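
The difference between the two kinds of wait can be seen in a small
Python sketch (an illustration only, using a 0.2 second timeout instead
of 30 seconds and an anonymous pipe in place of the FIFO). A select()
call with no descriptors always sleeps for the full timeout, while one
that watches the write end returns as soon as it is writable.

```python
import os
import select
import time

read_fd, write_fd = os.pipe()         # empty pipe: the write end is writable

# No descriptors at all: select() degenerates into a plain sleep.
start = time.monotonic()
select.select([], [], [], 0.2)
blind_wait = time.monotonic() - start

# Watching the write end: select() returns as soon as it is writable.
start = time.monotonic()
_, writable, _ = select.select([], [write_fd], [], 0.2)
watched_wait = time.monotonic() - start
```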
The reader manages to empty the pipe and then, after 30 seconds, rsyslogd
starts writing again. It writes several lines of data and then the whole
process repeats. The net result is that the data transfer rate is extremely
slow and the system can't come close to keeping up with the incoming data,
because rsyslogd spends most of its time in that 30 second sleep in select().
I can see two ways out of that:
1. Instead of going to sleep with a hard-coded time-out value, go to sleep
   on select/poll/epoll/whatever that wakes up immediately after the FIFO
   becomes writable.
2. Switch the FIFO file descriptor to blocking mode right after a successful
   open. This should be a lot easier to implement than the first option
   and it's in line with the way some other output modules behave (omprog
   has blocking writes, for example). I do not know if some part of the
   code expects the FIFO file descriptor to be in non-blocking mode,
   though.
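
As a rough sketch of the first option (in Python rather than rsyslogd's
C, and not a proposed patch), the writer below keeps the descriptor
non-blocking but, on EAGAIN, sleeps in select() on that descriptor so it
wakes up as soon as the reader has made room. The function write_all,
the payload size and the deliberately slow reader thread are all made up
for the demonstration, and an anonymous pipe stands in for the FIFO.

```python
import os
import select
import threading
import time

def write_all(fd, data, timeout=30.0):
    """Write all of data to a non-blocking fd.

    On EAGAIN, wait in select() until the fd is writable again
    instead of sleeping for a fixed interval.
    """
    view = memoryview(data)
    offset = 0
    while offset < len(view):
        try:
            offset += os.write(fd, view[offset:])
        except BlockingIOError:
            _, writable, _ = select.select([], [fd], [], timeout)
            if not writable:
                raise TimeoutError("reader did not drain the pipe in time")

read_fd, write_fd = os.pipe()
os.set_blocking(write_fd, False)

payload = b"x" * (256 * 1024)         # larger than the kernel pipe buffer
received = bytearray()

def slow_reader():
    # Simulates a reader that only gets scheduled every 10 ms.
    while len(received) < len(payload):
        time.sleep(0.01)
        received.extend(os.read(read_fd, 65536))

reader = threading.Thread(target=slow_reader)
reader.start()
start = time.monotonic()
write_all(write_fd, payload)
reader.join()
elapsed = time.monotonic() - start
```

The timeout is kept only as a safety net; in the normal case the wait
ends as soon as the reader has consumed some data.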
Blocking writes should enable me to use two queues and buffer the incoming
data on disk for (transient) cases when the consumer can't empty the pipe
fast enough. The configuration with the two queues doesn't work for me
at the moment because of a crash I reported in another mail, but I don't
think that's related to this problem. Once that is fixed, FIFOs should
become usable.
As it is, I can't use FIFO output at all, even with only one queue.
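
For reference, the second option above amounts to clearing O_NONBLOCK
with fcntl() once the open has succeeded. A minimal Python sketch,
assuming a POSIX system (make_blocking and fifo_path are hypothetical
names, and this is not rsyslogd code):

```python
import fcntl
import os
import tempfile

def make_blocking(fd):
    """Clear O_NONBLOCK so later writes block instead of failing with EAGAIN."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_NONBLOCK)

# Hypothetical FIFO in a scratch directory, with a reader already attached.
fifo_path = os.path.join(tempfile.mkdtemp(), "demo.fifo")
os.mkfifo(fifo_path)
reader_fd = os.open(fifo_path, os.O_RDONLY | os.O_NONBLOCK)

# Open non-blocking (so the open itself cannot hang), then switch modes.
writer_fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
nonblocking_after_open = not os.get_blocking(writer_fd)
make_blocking(writer_fd)
blocking_after_switch = os.get_blocking(writer_fd)
```

After the switch, a write to a full FIFO simply waits for the reader,
which is the behaviour a queue in front of the output could rely on.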
--
.-. .-. Yes, I am an agent of Satan, but my duties are largely
(_ \ / _) ceremonial.
|
| [email protected]