Great questions, David.

In a nutshell, it's all about load-balancing the destinations. Say, for example, that my downstream recipients have a cap of 20K messages per second; I'm trying to respect that. There's some question (which I'm following up on) whether it would be OK to dump 60K into one for a second, then switch to the next one for a second, and so forth, such that the *average* rate stays < 20K mps. I'm assuming that it won't be, so I need to keep working towards a truer distribution.

In an ideal world, I'd be able to use TCP to forward to the destinations and just let standard TCP throttling take care of it. But I'm stuck with UDP for now, so I can't do that.

A true round-robin would be the gold standard, but absent a "$messagecount" property or global variables (or a full-fledged load-balancing feature, which offhand I don't think rsyslog needs; rather, it needs to let people implement their own via the above methods), I'm left pursuing these avenues.

As for latency, I'm actually OK with several minutes of latency. I'm not concerned about message order, either.

Thanks!!!

Robert
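To make the burst-versus-average concern concrete, here is a minimal Python sketch (not rsyslog configuration) that compares rotating destinations once per second against a per-message round-robin. The function names, the 60K input rate, and the three destinations are assumptions chosen to match the figures above; a per-message counter such as the hypothetical "$messagecount" is simulated with a plain integer.

```python
from collections import Counter


def peak_rate(assign, rate_per_sec, seconds, n_dest):
    """Return the highest per-destination message count seen in any one second."""
    peak = 0
    counter = 0  # running message count across the whole run
    for sec in range(seconds):
        per_dest = Counter()
        for _ in range(rate_per_sec):
            per_dest[assign(sec, counter, n_dest)] += 1
            counter += 1
        peak = max(peak, max(per_dest.values()))
    return peak


def by_second(sec, counter, n_dest):
    # Rotate on the wall-clock second: every message in a given second
    # goes to the same destination.
    return sec % n_dest


def round_robin(sec, counter, n_dest):
    # Rotate on a per-message counter (assumed to be available).
    return counter % n_dest


if __name__ == "__main__":
    print("per-second rotation peak:", peak_rate(by_second, 60_000, 3, 3))
    print("round-robin peak:", peak_rate(round_robin, 60_000, 3, 3))
```

With these assumed numbers, per-second rotation averages 20K mps per destination but delivers the full 60K to one destination in any given second, while the per-message round-robin keeps each destination at 20K in every second.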
> Date: Wed, 23 Oct 2013 16:01:11 -0700
> From: [email protected]
> To: [email protected]
> Subject: Re: [rsyslog] Another approach to action load balancing
>
> On Wed, 23 Oct 2013, Robert McIntyre wrote:
>
> > So, I've had decent luck with Pavel's suggestion
> > (field($timegenerated,':',3)), and it rotates around nicely based on
> > the second.
> >
> > I'm trying a slightly different approach, though, to try to get
> > sub-second rotation.
>
> While I am in no way saying that you shouldn't continue trying to solve
> the problem, I will ask: do you really need the sub-second rotation in
> practice?
>
> If you can rotate between outputs and give each output its own action
> queue, with that queue having space for one second's worth of logs, then
> load balancing across 3 outputs will give you 3x the throughput at the
> cost of delaying logs up to ~3 seconds (if everything is on the edge of
> overload).
>
> If it takes N seconds for one output to process 1 second's worth of logs,
> and you spread the traffic across M outputs (where M > N, to give you a
> little headroom just in case), then your most delayed log will be delayed
> up to N seconds.
>
> Going to rotation every 1/10 second will change the delay to N/10.
>
> This will cause your logs to arrive at the destination in a different
> order than they were in originally, but any load balancing scheme has
> this potential.
>
> The batch mode processing in rsyslog, where each worker thread grabs up to
> N messages from the main queue and works on them while another worker
> thread grabs the next up to N messages, results in the same sort of thing
> (in this case N defaults to 128 or 256 for the main queue, while a full
> second's worth of logs will almost certainly be a much larger number :-)
>
> When doing load balancing on the network layer, I try to set the rebind
> interval large enough that it only rebalances once a second (or a handful
> of times per second), and when I put logs into something that doesn't
> take a stream well (like Splunk), I write the logs to disk and move them
> to the destination once a minute, and even that ends up not being really
> significant in practice.
>
> Yes, at lower traffic levels the balancing is choppy, with different
> workers going visibly idle for a bit, but finer-grained balancing still
> has the workers going idle, just for shorter periods that you are not
> seeing due to the coarse granularity of your measurements :-)
>
> If you balance per second across 10 outputs and you have 8 of them idle
> looking at top with a refresh interval of 1, then changed to balancing
> every 1/10 second, you would still have 8 of them idle if you were able
> to set top to give you a refresh interval of 1/10.
>
> The real question to think about is: what is the maximum overall delay
> that you can live with?
>
> David Lang
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> DON'T LIKE THAT.
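The delay estimate in the quoted reply can be written out as a small Python sketch. This only restates that rule of thumb (worst-case delay of about N seconds with per-second rotation, N/10 with ten rotations per second, and headroom when M > N); the function and parameter names are assumptions made for illustration, not anything rsyslog provides.

```python
def worst_case_delay(n, rotations_per_second=1):
    """Rough worst-case delay in seconds.

    n: seconds one output needs to work through one second's worth of
       full-rate logs (the "N" in the quoted reply).
    rotations_per_second: how often the sender switches outputs.
    """
    return n / rotations_per_second


def has_headroom(n, m):
    """True when the number of outputs m exceeds n, as the reply suggests."""
    return m > n


if __name__ == "__main__":
    print(worst_case_delay(3))       # rotate once per second
    print(worst_case_delay(3, 10))   # rotate every 1/10 second
    print(has_headroom(3, 4))
```

Under this model, finer rotation shortens the worst-case delay but does not change the total throughput, which is why the tolerable delay is the deciding number.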

