The devices you are sending to are unlikely to support encrypted TCP (and if they do, you should really move to encrypted RELP rather than encrypted TCP; if they support one, they are almost certainly going to support the other).

David Lang

On Thu, 6 Nov 2014, Damian wrote:

Actually, I've managed to find a workaround to improve this approach.

Rather than using uptime, which has 1-second resolution, I'm using 
timegenerated, which has microsecond resolution. I was concerned that running 
at high rates and large numbers (eg. 100keps across 100 devices) would result 
in each device being blasted by 100keps for 1 second.

This way, I flip every microsecond to a new destination. I chose not to use the event 
time, since it may not have subsecond resolution.  I also chose not to use the event 
index number that I get inside every event (I have one), since it would require regex of 
the form ".*?EventId=(\d+).*?", which could be expensive in large (6-8KB) log 
lines, and slow down processing.

Next step is to go from UDP to Encrypted TCP, which should hopefully be easy.

Code snippet for the load balancing (across 3 destinations in test):



# Load balance output based on system time

# Define a template that contains just the subsecond value of the event receipt
# time (not event timestamp!).
# For timegenerated, this is in microseconds
template(name="subseconds" type="string" 
string="%timegenerated:::date-subseconds%")

# Set a variable to that value
set $!subsecs = exec_template("subseconds");

# Perform a modulo of the subsecond value of the receipt time to decide which
# way to send it
if ($!subsecs % 3 == 0) then call output_0
if ($!subsecs % 3 == 1) then call output_1
if ($!subsecs % 3 == 2) then call output_2
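
The output_0 to output_2 rulesets themselves are not shown here. A minimal sketch of what they might look like, assuming plain UDP forwarding with omfwd; the target addresses and port are placeholders, not the real destinations. As far as I know, rsyslog needs each ruleset to be defined earlier in the config than the call statement that refers to it, so these would go above the load balancing lines:

```
# Hypothetical destinations: the 192.0.2.x addresses and port 514 are placeholders
ruleset(name="output_0") {
    action(type="omfwd" target="192.0.2.10" port="514" protocol="udp")
}
ruleset(name="output_1") {
    action(type="omfwd" target="192.0.2.11" port="514" protocol="udp")
}
ruleset(name="output_2") {
    action(type="omfwd" target="192.0.2.12" port="514" protocol="udp")
}
```

Adding a fourth destination would then mean one more ruleset and changing the modulo from 3 to 4.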




On Friday, 7 November 2014, 1:04, Damian <[email protected]> wrote:



Thanks David - I got it working:

In the end, the $$uptime % 3 == 0 test worked, and it reliably directed 
traffic to each address for one-second intervals.

Also, having read about field() at the link you mentioned, it's basically a substring 
operator, so field($timegenerated,":",3) would return the seconds, which should 
have a similar effect to the above, but based on the event timestamp. Hence, I assume 
(not yet tested) that rsyslog would treat the extracted value as an integer rather 
than as a string like its parent, so I could use it in a similar way, e.g.:

if (field($timegenerated,":",3) %3 == 0) then call destination_0

Thanks!


Damian


On Thursday, 6 November 2014, 12:56, David Lang <[email protected]> wrote:



On Thu, 6 Nov 2014, Damian Skeeles wrote:

Hi David,

Thanks, that's really good info. I'll have another go at uptime as my primary 
focus.

I noticed there are some properties for the replacement properties to show the 
time as epoch/Unix time, so that would be an integer (or could be converted to 
one from string, if rsyslog has such an operation). Any ideas if these are also 
available as properties, or only replacement properties?

all variables should be available in condition tests.

remember that you can also set a variable to the output of a format operation.

I think that variable contents that look like numbers can be treated as numbers. Rsyslog doesn't have types in its variables. But if you try to do a math operation on something that doesn't look like a number, it's not going to get evaluated the way you want it to.

Btw, any ideas on what the field() operator does? I couldn't find it anywhere in the docs, and it's quite hard to google for by its nature.

rainerscript functions are defined at http://www.rsyslog.com/doc/master/rainerscript/functions.html

I can't do clustering at the receiving end as there are existing products to receive the events, and I want rsyslog to take the entire config/maintenance load of the balancing. I need one machine, one install, one config, all free and reliable, as the entire load balancing glue.

Ok, I'd still suggest that you take a look at the presentations. It requires that you have access to the OS level to make changes, but it doesn't require that the software receiving the logs know anything about it. I've used this approach to deliver logs to proprietary software running on linux boxes in the past with great success.

David Lang



Damian



On 6 Nov 2014, at 12:21, David Lang <[email protected]> wrote:

On Wed, 5 Nov 2014, Damian wrote:

Hi,

I'm currently working on trying to use rsyslog as a basic load balancer, by 
selecting the output on a time basis. I'm using the discussion posted here as 
my starting point:

http://lists.adiscon.net/pipermail/rsyslog/2013-October/034442.html


In this discussion, the authors looked at using:

if ($uptime % 3 == 0) then action1
if ($uptime % 3 == 1) then action2
if ($uptime % 3 == 2) then action3

This uses the system uptime to decide which way to send the events (so the load 
would average out over the three destinations). However, this didn't work in 7.4, as 
uptime is not available outside templates. I also found 8.4.2 to not like this 
parameter.

try accessing $$uptime (yes it's ugly, but it's a combination of $ to refer to 
the property name and the property name being named $uptime for legacy 
reasons). In some versions I think this is magically combined so you can just 
use $uptime, but I don't remember what versions (if any) this worked in.

For the original discussion, what eventually seemed to work was:

field($timegenerated,':',3);
However - it's not clear how this was used, and I can't see how it would refer 
to three different destinations. It seems more of a string operation than a 
modulus. When I try using this, rsyslog debug mode generates no errors, so it 
seems to work. If I try something like:
if ($timegenerated % 3 == 0) then call output_0
if ($timegenerated % 3 == 1) then call output_1
if ($timegenerated % 3 == 2) then call output_2

Then it gives errors for these lines; it doesn't seem to work as an operation.

$timegenerated is a string, so it's not surprising that this fails.

Can anyone clarify what the field($...) operation does, and how I can use it? 
Alternatively, any suggestions as to how I can call a different ruleset when the 
system/event seconds value modulo 3 is 0, 1, or 2?

I would actually approach this on the receiving end instead.

on the sender, set the rebindinterval to something like 1000 and then on the 
receiving end set up your multiple receivers to share an IP address and split 
traffic between them using the iptables CLUSTERIP feature. I talk about this in 
the presentation I gave at LISA 2012, video and paper are available at:
https://www.usenix.org/conference/lisa12/technical-sessions/presentation/lang_david

This will spread the traffic across the machines roughly every 1000 messages, 
and while it uses a different mechanism, I think it ends up being cleaner. It's 
definitely easier to add new machines to the cluster as needed, and you can have 
something like corosync (http://clusterlabs.org/) detect failures of the 
receiving servers and adjust the traffic load appropriately.
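
On the sender side, the rebind setting described above might look roughly like this with omfwd. This is only a sketch: the target address is a placeholder standing in for the shared cluster IP, and the iptables CLUSTERIP rules on the receivers are not shown.

```
# Placeholder address standing in for the shared cluster IP.
# rebindInterval asks omfwd to re-open its socket after about 1000 messages,
# so the sender picks up a new source port that CLUSTERIP can hash on.
action(type="omfwd" target="192.0.2.100" port="514" protocol="udp"
       rebindInterval="1000")
```

Whether a rebind actually moves traffic to a different receiver depends on the hash mode chosen on the receiving cluster, so treat the interval as a tuning knob rather than an exact split.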

David Lang

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.