Hello, We have two rsyslog "writer" instances which are receiving syslog messages and writing them to an NFS mount on a NAS. The NAS is connected via 10gb ethernet but has a number of other applications writing to it as well. Typical write volume for our log writers is 2-3Mbps to the NAS. Yesterday I attempted to resolve some complaints with logs being "bursty" - events appear to be written in batches which sounded consistent with buffering on the rsyslog daemon or not syncing the file with every message. These bursts could come as infrequently as every 30 seconds however, which seems excessive. I took a number of steps to remedy this but ultimately had to reverse the changes because write volume (measured at the disks) to the NAS went from 2Mbps up to 9Mbps and iops went from 30 up to 900 at the peak. I'm sure there is an explanation for this - but there are some subtle configuration changes which had an impact and I don't understand exactly why.
First, the main change was to set "$ActionFileEnableSync on" in each logging servers configuration. From what I can tell this globally enables the ability to turn on file sync and it defaults to "off". This change, combined with removing the "-" from all the actions in my configuration should have resulted in all files being synced all the time - probably the worst case scenario and I'm not all that suprised this caused problems. Second, before rolling the change back 100% - I removed the "$ActionFileEnableSync on" configuration from the servers but left the action configurations without the "-" - assuming that the global configuration option would completely disable file sync and using a "-" would have no effect. This did not seem to be the case - it did dramatically decrease the volume of traffic to the NAS but still the volume was about double (5Mbps) the normal volume. Third, I added "-" before all the action lines in the configuration and this brought volume back down to the 2-3Mbps we are used to. This leaves me with a few questions: 1) I would expect file sync to increase transactions with the NAS but not increase the volume of data being written to the NAS as measured at the disk. Can someone shed light on why this would so profoundly impact write volume? Load on the rsyslog servers also went down substantially after making this change - presumably because there were fewer queued transactions - not sure. 2) Is there some difference between what "$ActionFileEnableSync on" does vs. adding a "-" before an action other than one is global and one is per-action? I thought these were just two different levels of granularity for configuring the same thing but that doesn't appear to be the case. 3) Is it expected to have rsyslog take up to 30 seconds to flush messages to a logfile? These are BUSY logs - being written to multiple times per second, so it's not a delay on the client. I can observe these same clients logging messages to another logging server (the old server) running syslog-ng and messages are observable in near real-time. Ultimate - #3 is the problem I am trying to solve, but I'm very curious why these changes had the impact they did and why globally disabling file sync using "$ActionFileEnableSync" would not entire decrease log volume until I add "-" to all the actions. Let me know if I can add info that would help shed light on this. Thanks, Aaron _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com

