David - I personally don't have experience with imfile - but I bet your assertion that imfile outperforms unspooling from a DA queue is correct. We always appeared to be CPU bound when unspooling a queue. Since we have pushed imtcp inputs up to over 700,000 messages a second in tests, the source of the issues had to have been specifically the queue code and not something about rsyslog in general.
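
For readers following the thread, a disk-assisted (DA) action queue of the kind discussed here, with a large in-memory portion and disk spooling only as a fallback, might be configured roughly as follows. This is only a sketch in rsyslog's legacy directive syntax: the spool directory, queue file name, sizes, watermarks, and target host are all illustrative assumptions, not settings taken from this thread or from any production system.

```
# Illustrative sketch only; every value below is an assumption.
$WorkDirectory /var/spool/rsyslog     # where spool files are written (hypothetical path)

$ActionQueueType LinkedList           # in-memory queue type
$ActionQueueFileName fwdq             # setting a file name enables disk-assisted mode (hypothetical name)
$ActionQueueSize 1000000              # large in-memory portion, counted in messages
$ActionQueueHighWaterMark 800000      # begin spooling to disk only above this fill level
$ActionQueueLowWaterMark 200000       # stop spooling once the queue drains below this level
$ActionQueueMaxDiskSpace 10g          # cap on disk used by the spool
$ActionQueueSaveOnShutdown on         # persist queued messages across restarts
$ActionResumeRetryCount -1            # retry the action indefinitely instead of discarding

*.* @@central.example.com:514         # forward over TCP (hypothetical target)
```

The idea is to keep the high watermark close to the in-memory queue size so that the disk portion is touched only in a genuine outage, since unspooling is the slow path described above.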
Brian

On Wed, Oct 31, 2012 at 5:27 PM, <[email protected]> wrote:
> On Wed, 31 Oct 2012, Brian Knox wrote:
>
>> I have two initial thoughts from experience concerning heavy use of
>> the queue system in high traffic (6+ billion log lines a day)
>> environments - specifically concerning DA queues.
>>
>> 1. The queues for the most part were very reliable
>> 2. The performance unspooling from a "fail" state (defined as "we had
>> to spool to disk") was suboptimal. Using fast io subsystems also
>> seemed to have little impact on the unspool speed - it seemed stuck at
>> around 50 to 60k messages a second in our environment, and seemed to
>> be more CPU bound than IO bound.
>>
>> So, in our production we always tried to set the in memory portion of
>> the queues very high - the disk backed portion was there in case of a
>> disaster and we daily prayed we didn't end up having to fail over to
>> them. We lived in fear of finding ourselves in a situation where we
>> could not unspool fast enough to catch back up.
>>
>> The queue performance was a definite limiting factor for us. It was
>> probably our biggest pain point with rsyslog.
>>
>> I don't say this to trivialize the complexity of the problem - and to
>> be clear, we rarely had -reliability- issues with the queues.
>
>
> I did some tests with the disk queues a few years ago (4.x days, but I
> understand there hasn't been much change since then). I was mostly focusing
> on the ultra-reliable mode of operation, seeing how fast I could push it
> (for the record, about 8K logs/sec on ext2 on a super-fast fusion-io pci
> SSD)
>
> one thing I remember seeing is that there was a huge volume of system calls
> (including a lot of open/close calls).
>
> the core of rsyslog had a similar problem back in the early 3.x days, and
> just finding ways to eliminate system calls from the fast path did wonders
> for the overall performance of rsyslog.
>
> Disk queues haven't received this treatment yet, and they will benefit
> tremendously from it.
>
>
> One thing to check. I believe from your comment above that it's now true
> that rsyslog can process data from flat files via imfile FASTER than it can
> process queued data that's already been parsed. This is an indication of a
> significant process problem.
>
> David Lang
>
>
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
> LIKE THAT.

