On Tue, 14 May 2013, Rainer Gerhards wrote:
> I am not sure if it is such a good idea to do this for RELP. While of
> course it is possible to do (and probably not so much work), the
> performance profile is quite different. There is much more overhead in
> creating and shutting down these sessions and I wouldn't do it too
> frequently. Anyhow, I am touching RELP currently as well, so I could see if
> I add support for this (depending on effort required, obviously not a
> priority at the moment). But I am still a bit skeptical...
I agree that there is a lot of overhead in the connection setup and frequent
reconnections would be a problem.
But there is setup overhead for any connection. If you reconnect every few
messages, it's going to be crippling even with plain TCP. On the other hand, if
it's done every second or so, it's really not going to be significant. I also
don't think that RELP is the highest-overhead protocol that rsyslog has.
That said, no matter what the connection type and how high its setup overhead,
there is going to be some point where you want to spread the traffic across
multiple destinations. Given rsyslog's performance, the reason is not likely to
be that rsyslog can't keep up on the far side, but rather that you want to spread the logs
across multiple systems. For example, I deliver logs to a farm of 20 Splunk
systems and I want roughly the same number of log messages on each machine so
that searching the logs is efficient, even though raw performance would allow a
single machine to receive and process the full log feed.
I think it would be good to have this capability for every network-based output
(not just RELP, but also TLS, databases, etc.), or even to make it a
general-purpose "close and re-open this output every X messages" option for
every possible output (essentially a HUP for that output).
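To illustrate what "close and re-open this output every X messages" could mean, here is a minimal Python sketch. It is not rsyslog code: RebindingOutput, ListConn, and the rebind_interval parameter are hypothetical names made up for this example, and the "connection" is just an in-memory list standing in for a real socket.

```python
class ListConn:
    """Stand-in for a real network connection; it only records what was sent."""

    def __init__(self):
        self.sent = []
        self.closed = False

    def send(self, msg):
        self.sent.append(msg)

    def close(self):
        self.closed = True


class RebindingOutput:
    """Hypothetical wrapper that re-opens its output every N messages."""

    def __init__(self, connect, rebind_interval):
        self._connect = connect              # callable returning a new connection
        self._rebind_interval = rebind_interval
        self._conn = None
        self._sent_on_conn = 0
        self.opens = 0                       # connections opened so far

    def send(self, msg):
        # Close the current connection once it has carried N messages.
        if self._conn is not None and self._sent_on_conn >= self._rebind_interval:
            self._conn.close()
            self._conn = None
        # Open lazily, so an external load balancer gets to pick a new target.
        if self._conn is None:
            self._conn = self._connect()
            self._sent_on_conn = 0
            self.opens += 1
        self._conn.send(msg)
        self._sent_on_conn += 1


if __name__ == "__main__":
    out = RebindingOutput(ListConn, rebind_interval=3)
    for i in range(7):
        out.send(f"msg {i}")
    print("connections opened:", out.opens)
```

Because the wrapper only needs something that can be opened, written to, and closed, the same idea would apply to any output type, which is the point of making it general-purpose.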
Ideally the documentation would indicate the overhead of the reconnection,
something like:
"a reconnection for TCP costs the same overhead as 6 messages"
"a reconnection for RELP costs the same overhead as 30 messages"
"a reconnection for TLS costs the same overhead as 200 messages"
"a reconnection for Postgres costs the same overhead as 40 batches of messages"
(the number of round-trips needed to tear down and establish a new connection is
probably a fair first-order approximation)
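A rough sketch of the arithmetic behind this, in Python. The costs are the made-up example figures from above, not measurements, and overhead_fraction is a hypothetical helper that simply treats a reconnection as costing a fixed number of "message equivalents".

```python
# Illustrative reconnect costs in "message equivalents", copied from the
# example figures above. They are placeholders, not measured values.
RECONNECT_COST = {"TCP": 6, "RELP": 30, "TLS": 200}


def overhead_fraction(cost_in_messages, messages_per_connection):
    """Share of total work spent reconnecting instead of delivering messages."""
    return cost_in_messages / (cost_in_messages + messages_per_connection)


if __name__ == "__main__":
    for proto, cost in RECONNECT_COST.items():
        for interval in (10, 10_000):
            share = overhead_fraction(cost, interval)
            print(f"{proto}: reconnect every {interval} messages -> {share:.1%} overhead")
```

With figures like these, reconnecting every handful of messages is dominated by setup cost for every protocol, while reconnecting every few thousand messages keeps the overhead small even for the expensive ones.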
The other thing that could be done would be to introduce the concept of load
balanced output queues, where the admin can specify multiple outputs for one
action and rsyslog would distribute the messages across the different outputs.
If I were to do this, I would look at introducing an omloadbalance module that
then could call any other output module (probably requiring an action queue),
but this is far more complex because you have to deal with the possibility of
one output failing and deciding how to re-dispatch logs that you had sent to
that output to other outputs.
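A minimal Python sketch of the dispatch side of such a module, to show where the complexity comes from. LoadBalancer and its outputs are hypothetical; each output is assumed to be a callable that raises ConnectionError when delivery fails. This only covers failures reported synchronously at send time. It does not handle the harder case of messages already handed to an output that fails later, which is what an action queue would be needed for.

```python
class LoadBalancer:
    """Hypothetical round-robin dispatcher with immediate re-dispatch on failure."""

    def __init__(self, outputs):
        # Each output is a callable taking one message; it is assumed to
        # raise ConnectionError if the message could not be delivered.
        self._outputs = list(outputs)
        self._next = 0

    def dispatch(self, msg):
        """Send msg to the next healthy output and return that output's index."""
        count = len(self._outputs)
        for attempt in range(count):
            idx = (self._next + attempt) % count
            try:
                self._outputs[idx](msg)
            except ConnectionError:
                continue                     # try the next output with the same message
            self._next = (idx + 1) % count   # resume round-robin after the one that worked
            return idx
        raise RuntimeError("all outputs failed; message not delivered")


if __name__ == "__main__":
    first, second = [], []

    def broken(msg):
        raise ConnectionError("output is down")

    balancer = LoadBalancer([first.append, broken, second.append])
    for i in range(4):
        balancer.dispatch(f"msg {i}")
    print(first, second)
```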
Closing and re-opening the outputs provides the ability to use external
load balancers with the minimum amount of complexity in rsyslog.
David Lang