Re: [rsyslog] [RFC] Log-forward destination-cluster support

singh.janmejay Thu, 04 Jun 2015 01:46:58 -0700

It won't be a very large change really if we develop it in an external library.


In the rsyslog codebase, it's a fairly small change, limited to the input
and output modules that we pick. Only small parts of the plugin code change:
the place where a new connection is established calls this library function
conditionally, and that's about it.



On Thu, Jun 4, 2015 at 2:09 PM, Rainer Gerhards
<[email protected]> wrote:
> Sorry if this sounds discouraging: I currently have such a large
> backlog that I can't engage in that effort and I think I am also
> unable to merge any change of this magnitude any time before the
> backlog has become shorter (Q4+ 2015 I guess).
>
> Sorry I have no better answer, but you see yourself what all is going
> on and I really need to make sure I can follow at least the bare
> essentials.
>
> Rainer
>
> 2015-06-04 5:53 GMT+02:00 singh.janmejay <[email protected]>:
>> Yes, L4 load-balancing will work at significant scale. L7
>> load-balancing will do even better at spreading load evenly, but I'm
>> not sure the syslog protocol is widely supported by load-balancers.
>>
>> DNS scaling and propagation delay are sometimes not acceptable, but
>> BGP anycast is something that'd work at data-center scale with very
>> large PODs.
>>
>> This is an alternative to that. It has fewer moving parts (just
>> producer and consumer), no LB and it doesn't require the complexity of
>> anycast.
>>
>> It trades the engineering complexity of a load-balancer and anycast for
>> smarter clients and servers (increasing the complexity of clients and
>> servers a little, but also simplifying the deployment topology
>> significantly).
>>
>> I think all three are valid approaches, and the best fit will vary
>> across deployments.
>>
>>
>> On Thu, Jun 4, 2015 at 8:45 AM, David Lang <[email protected]> wrote:
>>> I don't see the advantage of adding all this complexity as opposed to using
>>> existing load balancing approaches. With existing tools we can deliver the
>>> log stream to a cluster of systems, and deal with them failing. Yes, the
>>> easy approaches to doing this are limited to the throughput of a single
>>> wire, but since that single wire is commonly 10Gb/sec (and easily 40Gb/sec)
>>> with off-the-shelf technology, and the fact that the log stream can be
>>> compressed, this isn't likely to be an issue for much of anyone below Google
>>> scale.
>>>
>>> There are a lot of advantages to keeping the failover logic and config
>>> contained to as small an area of the network and as few devices as possible.
>>> The systems accepting the logs _must_ participate in the process (responding
>>> to health checks if nothing else), it only takes a couple other boxes (if
>>> any) to perform TCP load balancing. And having everything local increases
>>> the accuracy of the detection and speed of recovery.
>>>
>>> If you want to deal with larger failures (datacenter scale), then existing
>>> DNS/BGP failover tools can come into play.
>>>
>>> What advantage do we gain by pushing the configuration and failover logic to
>>> the senders?
>>>
>>> David Lang
>>>
>>>
>>> On Thu, 4 Jun 2015, singh.janmejay wrote:
>>>
>>>> Hi,
>>>>
>>>> This is a proposal for first-class support for the notion of a 'cluster'
>>>> as a log-forwarding destination. It describes a
>>>> technology-independent service-discovery implementation.
>>>>
>>>> Scenario / Context:
>>>> ------------------------
>>>> Say an environment is supposed to relay all logs to a logical
>>>> destination for aggregation/archival purposes. Such a setup at large
>>>> scale would have several log-producers and several log-receivers (or
>>>> listeners).
>>>>
>>>> Because several machines are involved, the probability of some
>>>> listener nodes failing is significant, and log-producers should
>>>> ideally be aware of the "current" log-receiving-cluster definition
>>>> (that is, the nodes that are healthy and receiving logs).
>>>>
>>>> This proposal talks about first-class definition and usage of
>>>> log-receiver-cluster.
>>>>
>>>> Configuration concepts:
>>>> -----------------------------
>>>>
>>>> action(...): gets an additional parameter called cluster, which
>>>> follows a URL-like convention (the scheme indicates the technology
>>>> used for service-discovery).
>>>>
>>>> input(...): gets an additional parameter called cluster, same as action
>>>>
>>>> The URL must resolve to currently-available host+port pairs, and may
>>>> support notifications, polling or both (depends on underlying
>>>> service-discovery technology).
>>>>
>>>> The resolution of URLs and representation of host+port pairs in the
>>>> service-endpoint is left to service-discovery technology.
>>>>
>>>> Configuration examples:
>>>> ------------------------------
>>>>
>>>> action(type="omfwd" cluster="zk://10.20.30.40,10.20.30.41/foo_cluster"
>>>> template="foo-fwd" protocol="tcp" tcp_framing="octet-counted"
>>>> ResendLastMSGOnReconnect="on")
>>>>
>>>> input(type="imptcp" port="10514" name="foo" ruleset="foo.recv"
>>>> cluster="zk://10.20.30.40,10.20.30.41/foo_cluster")
>>>>
>>>> The parameter "cluster" in both cases points to a url
>>>> "zk://10.20.30.40,10.20.30.41/foo_cluster" which does:
>>>>
>>>> [behaviour common to both input and action]
>>>>     - scheme 'zk' resolves to use of zookeeper based
>>>> service-discovery (picks one of the multiple supported
>>>> service-discovery technologies)
>>>>     - reach the zookeeper at IP address: 10.20.30.40 or 10.20.30.41
>>>>     - look for znode (zookeeper specific): /foo_cluster
>>>>
>>>>
>>>> [action specific behaviour]
>>>>     - enumerate ephemeral nodes under the said znode
>>>>     - pick one of the child-nodes at random
>>>>     - expect it to follow the naming convention host:port
>>>>     - break host and port apart and use them as the destination for
>>>> forwarding logs
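
A minimal Python sketch of the action-side selection, assuming the child-node names have already been fetched from the discovery service as plain strings. The function names split_endpoint and pick_endpoint are hypothetical; uniform random choice is used here only because that is what the proposal describes as the starting point.

```python
import random


def split_endpoint(name: str) -> tuple:
    """Break a child-node name of the form host:port into (host, port)."""
    # rpartition splits on the last colon, so the host part may itself
    # contain colons (for example a bracketed IPv6 address).
    host, sep, port = name.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"child node does not follow host:port: {name!r}")
    return host, int(port)


def pick_endpoint(children, rng=random):
    """Pick one currently registered receiver uniformly at random."""
    children = list(children)
    if not children:
        raise LookupError("no live receivers registered for this cluster")
    return split_endpoint(rng.choice(children))


host, port = pick_endpoint(["10.1.0.5:10514", "10.1.0.6:10514"])
print(host, port)
```

Keeping the choice inside one function is what would later allow a different policy (such as load-aware selection) without touching the callers.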
>>>>
>>>>
>>>> [input specific behaviour]
>>>>     - enumerate IP addresses that this input is listening to
>>>>     - create ephemeral child-znodes under the foo_cluster znode with
>>>> name IP:port (follow this convention, enforced by service discovery
>>>> technology specific implementation)
>>>>     - keep the session live as long as input is live (which means,
>>>> kill the session if queues are filled up, or the heartbeat-mechanism
>>>> will automatically kill it if rsyslog process dies, or host freezes up
>>>> or fails, or connecting switch fails, etc).
>>>>
>>>> Service-discovery technology independence:
>>>> --------------------------------------------------------
>>>>
>>>> URL-like naming convention allows for scheme to indicate
>>>> service-discovery technology. Cluster-url for simple polling based
>>>> discovery technology implemented over http would look like this:
>>>>
>>>> http://10.20.30.40/foo_cluster
>>>>
>>>> A log-cabin based impl would have something similar to:
>>>>
>>>> logcabin://10.20.30.40,10.20.30.40/foo_cluster
>>>>
>>>> Implementation sketch:
>>>> ----------------------------
>>>>
>>>> Service-discovery support can be implemented within rsyslog(runtime),
>>>> or as an independent library under rsyslog umbrella (I think I prefer
>>>> independent library).
>>>>
>>>> Any input or output modules that choose to support it basically link
>>>> with the library and do one of the following:
>>>>
>>>> - as an action:
>>>>     when doAction fails, it picks a random-endpoint from a collection
>>>> of currently-available-endpoints provided by service discovery (random
>>>> picking from the collection is implemented inside the library, because
>>>> then it allows us later to make use of other parameters (such as
>>>> current load, which will be better for load-balancing)).
>>>>
>>>> - as an input:
>>>>     if listen is successful and input is ready to take traffic, it
>>>> enumerates all IP addresses it is listening on, and registers with
>>>> service discovery technology those end-points (which can then be
>>>> discovered by senders).
>>>>
>>>> This can also be used at source or sink where producer or consumer is
>>>> aware of the service-discovery technology specific protocol and
>>>> rsyslog is only at one end of the pipe (as long as the same protocol
>>>> is followed, which will be easy considering the implementation is
>>>> available as a library).
>>>>
>>>> Host and port as required by to-be-supported input/output module will
>>>> become optional. Either host+port or cluster parameter must be
>>>> specified, specifying both will generate a warning, and it'll discard
>>>> cluster in favour of host+port (based on specificity).
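
The precedence rule above could look roughly like this in Python. It is a sketch under assumptions: the function name effective_target and its return shape are hypothetical, and a host given without a port is treated here as "no static destination", which the proposal does not specify.

```python
import warnings


def effective_target(host=None, port=None, cluster=None):
    """Decide whether a module uses a static destination or a cluster URL."""
    has_static = host is not None and port is not None
    if has_static and cluster is not None:
        # Both given: warn and prefer the more specific host+port.
        warnings.warn("both host+port and cluster given; ignoring cluster")
        return ("static", (host, port))
    if has_static:
        return ("static", (host, port))
    if cluster is not None:
        return ("cluster", cluster)
    raise ValueError("either host+port or cluster must be specified")


print(effective_target(cluster="zk://10.20.30.40,10.20.30.41/foo_cluster"))
```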
>>>>
>>>> The support for this can be built incrementally for each input/output
>>>> module as desired, I have omfwd and imptcp in mind at this time.
>>>>
>>>> Thoughts?
>>>>
>>>>
>>> _______________________________________________
>>> rsyslog mailing list
>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>> http://www.rsyslog.com/professional-services/
>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
>>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
>>> LIKE THAT.
>>
>>
>>
>> --
>> Regards,
>> Janmejay
>> http://codehunk.wordpress.com



-- 
Regards,
Janmejay
http://codehunk.wordpress.com
