I have a forwarding setup between two rsyslog instances using omrelp. The 
relevant portion of my client conf looks like this:

$WorkDirectory /var/spool/rsyslog
$ActionQueueType LinkedList # use asynchronous processing
$ActionQueueFileName datanode_fwd # set file name, also enables disk mode
$ActionResumeRetryCount -1 # infinite retries on insert failure
$ActionQueueSaveOnShutdown on # save in-memory data if rsyslog shuts down
:programname,isequal,"datanode" :omrelp:opbhdname001:20514;RSYSLOG_ForwardFormat
& ~

And my server conf has:
$WorkDirectory /var/spool/rsyslog
$ActionQueueType LinkedList # use asynchronous processing
$ActionQueueFileName imrelp_recv # set file name, also enables disk mode
$ActionResumeRetryCount -1 # infinite retries on insert failure
$ActionQueueSaveOnShutdown on # save in-memory data if rsyslog shuts down
:inputname,isequal,"imrelp"     ?logDest;RSYSLOG_FileFormat
& ~

And I wanted to test just how reliable the setup was, so I wrote a script to 
log messages while taking the local and remote rsyslog servers up and down. I 
have 7 phases:

Phase 1: Both servers are up
Phase 2: Remote is down and local is up
Phase 3: Remote is up and local is up
Phase 4: Remote is down and local is up
Phase 5: Remote is down and local is down
Phase 6: Remote is down and local is up
Phase 7: Both servers are up

I send two log messages per phase. These are the log messages that 
eventually get to the remote:

2012-05-10T16:22:46.409971-04:00 opbhddata006 datanode[16500]: Phase 1: Message 
1 at Thu May 10 16:22:46 EDT 2012 while remote rsyslogd is up and local 
rsyslogd is up.
2012-05-10T16:22:47.419354-04:00 opbhddata006 datanode[16500]: Phase 1: Message 
2 at Thu May 10 16:22:47 EDT 2012 while remote rsyslogd is up and local 
rsyslogd is up.
2012-05-10T16:22:49.902709-04:00 opbhddata006 datanode[16500]: Phase 2: Message 
4 at Thu May 10 16:22:49 EDT 2012 while remote rsyslogd is down and local 
rsyslogd is up.
2012-05-10T16:22:51.312337-04:00 opbhddata006 datanode[16500]: Phase 3: Message 
5 at Thu May 10 16:22:51 EDT 2012 while remote rsyslogd is up and local 
rsyslogd is up.
2012-05-10T16:22:52.321468-04:00 opbhddata006 datanode[16500]: Phase 3: Message 
6 at Thu May 10 16:22:52 EDT 2012 while remote rsyslogd is up and local 
rsyslogd is up.
2012-05-10T16:22:53.794995-04:00 opbhddata006 datanode[16500]: Phase 4: Message 
7 at Thu May 10 16:22:53 EDT 2012 while remote rsyslogd is down and local 
rsyslogd is up.
2012-05-10T16:22:54.804737-04:00 opbhddata006 datanode[16500]: Phase 4: Message 
8 at Thu May 10 16:22:54 EDT 2012 while remote rsyslogd is down and local 
rsyslogd is up.
2012-05-10T16:22:58.228954-04:00 opbhddata006 datanode[16500]: Phase 6: Message 
11 at Thu May 10 16:22:58 EDT 2012 while remote rsyslogd is down and local 
rsyslogd is up.
2012-05-10T16:22:59.237901-04:00 opbhddata006 datanode[16500]: Phase 6: Message 
12 at Thu May 10 16:22:59 EDT 2012 while remote rsyslogd is down and local 
rsyslogd is up.
2012-05-10T16:23:00.654580-04:00 opbhddata006 datanode[16500]: Phase 7: Message 
13 at Thu May 10 16:23:00 EDT 2012 while remote rsyslogd is up and local 
rsyslogd is up.
2012-05-10T16:23:01.663476-04:00 opbhddata006 datanode[16500]: Phase 7: Message 
14 at Thu May 10 16:23:01 EDT 2012 while remote rsyslogd is up and local 
rsyslogd is up.
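
To make the gaps explicit, the received lines can be compared against what the 
test should have sent. The following is only a rough Python sketch, not the 
actual test script: it assumes messages are numbered consecutively across 
phases (1 and 2 in phase 1, 3 and 4 in phase 2, and so on, which is what the 
received numbers suggest), and the names PHASES, expected_messages, 
missing_by_phase and report are made up for illustration.

```python
import re

# Phase schedule copied from the list above: phase -> (remote_up, local_up).
PHASES = {
    1: (True, True),
    2: (False, True),
    3: (True, True),
    4: (False, True),
    5: (False, False),
    6: (False, True),
    7: (True, True),
}
MESSAGES_PER_PHASE = 2  # two log messages per phase, as stated above

# The archived lines are wrapped between "Message" and the number,
# so allow any whitespace (including a newline) there.
MSG_RE = re.compile(r"Phase (\d+): Message\s+(\d+)")


def expected_messages():
    """Return {message_number: phase}.

    Assumption: numbering is consecutive across phases.
    """
    expected = {}
    number = 0
    for phase in sorted(PHASES):
        for _ in range(MESSAGES_PER_PHASE):
            number += 1
            expected[number] = phase
    return expected


def missing_by_phase(log_text):
    """Return {phase: [missing message numbers]} for the given remote log."""
    received = {int(num) for _phase, num in MSG_RE.findall(log_text)}
    missing = {}
    for number, phase in expected_messages().items():
        if number not in received:
            missing.setdefault(phase, []).append(number)
    return missing


def report(log_text):
    """Print one line per phase that lost messages, with the server states."""
    for phase, numbers in sorted(missing_by_phase(log_text).items()):
        remote_up, local_up = PHASES[phase]
        print("Phase %d (remote %s, local %s): missing %s" % (
            phase,
            "up" if remote_up else "down",
            "up" if local_up else "down",
            numbers,
        ))
```

Calling report() on the text of the remote log file shows which phases lost 
messages. Under the numbering assumption, the lines above are missing message 
3 from phase 2 and messages 9 and 10 from phase 5.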

As you can see, no messages get through from phase 5, when both the local and 
remote servers are down. That's completely expected: I wouldn't expect any 
logs to get through while the local is down, regardless of the remote. What 
surprises me is phase 2, which is missing its first message. No matter how 
many messages I send per phase, phase 2 always loses that first one. I checked 
the working directory, and the local rsyslog seems to be creating the temp 
files just fine. (Earlier I had a problem where rsyslog could not write to 
/var/spool/rsyslog, so any time the local went down it would lose any messages 
not yet sent to the remote, but that was an SELinux issue.)

And it's not like the local was being brought up or down between phases 1 and 
2 and might have missed a log message; it was up and stayed up the whole time. 
The only thing I can think of is that the remote was still in the process of 
shutting down when the local sent the first log in phase 2: the local rsyslog 
thought the message had been completely sent and all was well, but the remote 
was stopped halfway through processing it and never wrote it to disk.

Any ideas?

-Steven Willis
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
