I have forwarding set up between two rsyslog instances using omrelp. The relevant portion of my client conf looks like this:

$WorkDirectory /var/spool/rsyslog
$ActionQueueType LinkedList # use asynchronous processing
$ActionQueueFileName datanode_fwd # set file name, also enables disk mode
$ActionResumeRetryCount -1 # infinite retries on insert failure
$ActionQueueSaveOnShutdown on # save in-memory data if rsyslog shuts down
:programname,isequal,"datanode" :omrelp:opbhdname001:20514;RSYSLOG_ForwardFormat
& ~

And my server conf has:

$WorkDirectory /var/spool/rsyslog
$ActionQueueType LinkedList # use asynchronous processing
$ActionQueueFileName imrelp_recv # set file name, also enables disk mode
$ActionResumeRetryCount -1 # infinite retries on insert failure
$ActionQueueSaveOnShutdown on # save in-memory data if rsyslog shuts down
:inputname,isequal,"imrelp" ?logDest;RSYSLOG_FileFormat
& ~

I wanted to test just how reliable the setup was, so I wrote a script to log messages while taking the local and remote rsyslog servers up and down. There are 7 phases:

Phase 1: Both servers are up
Phase 2: Remote is down and local is up
Phase 3: Remote is up and local is up
Phase 4: Remote is down and local is up
Phase 5: Remote is down and local is down
Phase 6: Remote is down and local is up
Phase 7: Both servers are up

I log two messages per phase. These are the log messages that eventually get to the remote:

2012-05-10T16:22:46.409971-04:00 opbhddata006 datanode[16500]: Phase 1: Message 1 at Thu May 10 16:22:46 EDT 2012 while remote rsyslogd is up and local rsyslogd is up.
2012-05-10T16:22:47.419354-04:00 opbhddata006 datanode[16500]: Phase 1: Message 2 at Thu May 10 16:22:47 EDT 2012 while remote rsyslogd is up and local rsyslogd is up.
2012-05-10T16:22:49.902709-04:00 opbhddata006 datanode[16500]: Phase 2: Message 4 at Thu May 10 16:22:49 EDT 2012 while remote rsyslogd is down and local rsyslogd is up.
2012-05-10T16:22:51.312337-04:00 opbhddata006 datanode[16500]: Phase 3: Message 5 at Thu May 10 16:22:51 EDT 2012 while remote rsyslogd is up and local rsyslogd is up.
2012-05-10T16:22:52.321468-04:00 opbhddata006 datanode[16500]: Phase 3: Message 6 at Thu May 10 16:22:52 EDT 2012 while remote rsyslogd is up and local rsyslogd is up.
2012-05-10T16:22:53.794995-04:00 opbhddata006 datanode[16500]: Phase 4: Message 7 at Thu May 10 16:22:53 EDT 2012 while remote rsyslogd is down and local rsyslogd is up.
2012-05-10T16:22:54.804737-04:00 opbhddata006 datanode[16500]: Phase 4: Message 8 at Thu May 10 16:22:54 EDT 2012 while remote rsyslogd is down and local rsyslogd is up.
2012-05-10T16:22:58.228954-04:00 opbhddata006 datanode[16500]: Phase 6: Message 11 at Thu May 10 16:22:58 EDT 2012 while remote rsyslogd is down and local rsyslogd is up.
2012-05-10T16:22:59.237901-04:00 opbhddata006 datanode[16500]: Phase 6: Message 12 at Thu May 10 16:22:59 EDT 2012 while remote rsyslogd is down and local rsyslogd is up.
2012-05-10T16:23:00.654580-04:00 opbhddata006 datanode[16500]: Phase 7: Message 13 at Thu May 10 16:23:00 EDT 2012 while remote rsyslogd is up and local rsyslogd is up.
2012-05-10T16:23:01.663476-04:00 opbhddata006 datanode[16500]: Phase 7: Message 14 at Thu May 10 16:23:01 EDT 2012 while remote rsyslogd is up and local rsyslogd is up.

As you can see, no messages are received from phase 5, when both the local and remote servers are down. That's completely expected: I wouldn't expect any logs to get through while the local is down (regardless of the remote). What surprises me is phase 2, which is missing its first message. No matter how many messages I send per phase, phase 2 is always missing that first one.

I checked the working directory and the local rsyslog seems to be creating the temp files just fine. (I had earlier issues with rsyslog not being able to write to /var/spool/rsyslog, so any time the local went down it would lose any messages not yet sent to the remote, but that was an SELinux issue.)

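For anyone repeating this test, the gaps can be found mechanically rather than by eye. The following is only a minimal Python sketch, assuming the numbering visible in the log lines above (two messages per phase, numbered consecutively across the seven phases). The function name and its defaults are hypothetical and not part of my test script.

```python
import re

# Matches the "Phase N: Message M" marker present in each test log line.
MARKER = re.compile(r"Phase (\d+): Message (\d+)")


def missing_messages(lines, phases=7, per_phase=2):
    """Return the (phase, message) pairs that were expected but never seen.

    Assumes messages are numbered consecutively across phases, so phase 1
    holds messages 1..per_phase, phase 2 the next per_phase, and so on.
    """
    seen = set()
    for line in lines:
        match = MARKER.search(line)
        if match:
            seen.add((int(match.group(1)), int(match.group(2))))

    expected = [
        (phase, (phase - 1) * per_phase + k)
        for phase in range(1, phases + 1)
        for k in range(1, per_phase + 1)
    ]
    return [pair for pair in expected if pair not in seen]
```

Any iterable of lines works as input, for example an open file handle on the remote's log file. Fed the eleven lines received above, it should report (2, 3), (5, 9) and (5, 10): the one unexpected loss in phase 2 plus the two expected losses in phase 5.
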
And it's not as if the local was being brought up or down between phases 1 and 2 and might have missed a log message; it was up and stayed up the whole time. The only thing I can think of is that the remote was still in the process of shutting down when the local sent the first log in phase 2: the local rsyslog thought the message had been completely sent and all was well, but the remote was stopped halfway through processing it and never wrote it to disk for some reason.

Any ideas?

-Steven Willis

