Hi,

This is a proposal for first-class support for the notion of a 'cluster'
as a log-forwarding destination. It describes a
technology-independent service-discovery implementation.

Scenario / Context:
------------------------
Say an environment is supposed to relay all logs to a logical
destination for aggregation/archival purposes. Such a setup at large
scale would have several log-producers and several log-receivers (or
listeners).

With several machines involved, the probability of some listener
nodes failing is significant, and log-producers should ideally be
aware of the "current" log-receiving-cluster definition (that is, the
set of nodes that are healthy and receiving logs).

This proposal describes first-class definition and usage of a
log-receiver-cluster.

Configuration concepts:
-----------------------------

action(...): gets an additional parameter called cluster, which
follows a URL-like convention (the scheme indicates the technology
used for service-discovery).

input(...): gets an additional parameter called cluster, the same as
action.

The URL must resolve to currently-available host+port pairs, and may
support notifications, polling or both (depending on the underlying
service-discovery technology).

The resolution of URLs and the representation of host+port pairs in
the service-endpoint are left to the service-discovery technology.

Configuration examples:
------------------------------

action(type="omfwd" cluster="zk://10.20.30.40,10.20.30.41/foo_cluster"
template="foo-fwd" protocol="tcp" tcp_framing="octet-counted"
ResendLastMSGOnReconnect="on")

input(type="imptcp" port="10514" name="foo" ruleset="foo.recv"
cluster="zk://10.20.30.40,10.20.30.41/foo_cluster")

The parameter "cluster" in both cases points to the URL
"zk://10.20.30.40,10.20.30.41/foo_cluster", which works as follows:

[behaviour common to both input and action]
     - the scheme 'zk' selects ZooKeeper-based service-discovery
(picking one of the multiple supported service-discovery technologies)
     - reach the ZooKeeper ensemble at IP address 10.20.30.40 or 10.20.30.41
     - look for the znode (ZooKeeper-specific): /foo_cluster
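
Here is a minimal sketch of how such a cluster URL could be decomposed
into scheme, server list and cluster path; all names here are
illustrative, nothing is an existing rsyslog API:

#include <stdio.h>
#include <string.h>

/* Hypothetical cluster-URL decomposition: the scheme picks the
 * service-discovery technology, the rest is technology-specific. */
struct cluster_url {
    char scheme[16];   /* e.g. "zk" -> ZooKeeper-based discovery */
    char servers[256]; /* e.g. "10.20.30.40,10.20.30.41"         */
    char path[256];    /* e.g. "/foo_cluster"                    */
};

static int parse_cluster_url(const char *url, struct cluster_url *out)
{
    const char *sep = strstr(url, "://");
    if (sep == NULL)
        return -1;
    size_t slen = (size_t)(sep - url);
    if (slen >= sizeof(out->scheme))
        return -1;
    memcpy(out->scheme, url, slen);
    out->scheme[slen] = '\0';

    const char *rest = sep + 3;
    const char *slash = strchr(rest, '/');
    if (slash == NULL)
        return -1;
    size_t hlen = (size_t)(slash - rest);
    if (hlen >= sizeof(out->servers))
        return -1;
    memcpy(out->servers, rest, hlen);
    out->servers[hlen] = '\0';
    snprintf(out->path, sizeof(out->path), "%s", slash);
    return 0;
}

int main(void)
{
    struct cluster_url cu;
    if (parse_cluster_url("zk://10.20.30.40,10.20.30.41/foo_cluster",
                          &cu) == 0)
        printf("scheme=%s servers=%s path=%s\n",
               cu.scheme, cu.servers, cu.path);
    return 0;
}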


[action specific behaviour]
     - enumerate ephemeral nodes under the said znode
     - pick one of the child-nodes at random
     - expect it to follow the naming convention host:port
     - break host and port apart and use them as the destination for
forwarding logs
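
A minimal sketch of the pick-and-split step follows; the child names
would come from enumerating the ephemeral znodes, but are hard-coded
here for illustration:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(void)
{
    /* stand-ins for the enumerated ephemeral child-znode names */
    const char *children[] = { "10.0.0.1:10514", "10.0.0.2:10514" };
    size_t n = sizeof(children) / sizeof(children[0]);

    srand((unsigned)time(NULL));
    const char *picked = children[rand() % n];

    /* split on the last ':' so the host part survives intact */
    const char *colon = strrchr(picked, ':');
    if (colon == NULL)
        return 1;
    char host[64];
    snprintf(host, sizeof(host), "%.*s", (int)(colon - picked), picked);
    int port = atoi(colon + 1);
    printf("forwarding to host=%s port=%d\n", host, port);
    return 0;
}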


[input specific behaviour]
     - enumerate the IP addresses that this input is listening on
     - create ephemeral child-znodes under the foo_cluster znode with
the name IP:port (this convention is enforced by the
service-discovery-technology-specific implementation)
     - keep the session alive as long as the input is alive (that is,
kill the session if queues fill up; the heartbeat mechanism will
automatically kill it if the rsyslog process dies, the host freezes or
fails, the connecting switch fails, etc.)
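
For the ZooKeeper case, the registration step could look roughly like
the sketch below, using the ZooKeeper C client. The ensemble port
(:2181), the endpoint value and the assumption that /foo_cluster
already exists are all illustrative:

#include <stdio.h>
#include <unistd.h>
#include <zookeeper/zookeeper.h>

/* no-op watcher; a real input would track session state here */
static void watcher(zhandle_t *zh, int type, int state,
                    const char *path, void *ctx)
{
    (void)zh; (void)type; (void)state; (void)path; (void)ctx;
}

int main(void)
{
    /* connect to the ensemble named in the cluster URL */
    zhandle_t *zh = zookeeper_init("10.20.30.40:2181,10.20.30.41:2181",
                                   watcher, 30000, NULL, NULL, 0);
    if (zh == NULL) {
        fprintf(stderr, "zookeeper_init failed\n");
        return 1;
    }

    /* register this listener as an ephemeral child named "IP:port";
     * ZooKeeper deletes it automatically when the session dies */
    char path[128];
    snprintf(path, sizeof(path), "/foo_cluster/%s", "10.0.0.1:10514");
    int rc = zoo_create(zh, path, NULL, -1, &ZOO_OPEN_ACL_UNSAFE,
                        ZOO_EPHEMERAL, NULL, 0);
    if (rc != ZOK)
        fprintf(stderr, "zoo_create failed: %d\n", rc);

    sleep(60); /* stay registered while the input is "live" */
    zookeeper_close(zh);
    return 0;
}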

Service-discovery technology independence:
--------------------------------------------------------

The URL-like naming convention allows the scheme to indicate the
service-discovery technology. A cluster-URL for a simple polling-based
discovery technology implemented over HTTP would look like this:

http://10.20.30.40/foo_cluster

A LogCabin-based implementation would use something similar:

logcabin://10.20.30.40,10.20.30.41/foo_cluster
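
Internally, the scheme could simply dispatch to a per-technology
resolver, along these (purely illustrative) lines:

#include <stdio.h>
#include <string.h>

/* hypothetical resolver table: each service-discovery technology
 * registers under its URL scheme */
typedef int (*resolve_fn)(const char *url);

static int resolve_zk(const char *url)   { printf("zk: %s\n", url);   return 0; }
static int resolve_http(const char *url) { printf("http: %s\n", url); return 0; }

static const struct { const char *scheme; resolve_fn fn; } resolvers[] = {
    { "zk",   resolve_zk   },
    { "http", resolve_http },
};

static resolve_fn lookup(const char *url)
{
    const char *sep = strstr(url, "://");
    if (sep == NULL)
        return NULL;
    size_t n = (size_t)(sep - url);
    for (size_t i = 0; i < sizeof(resolvers) / sizeof(resolvers[0]); i++)
        if (strlen(resolvers[i].scheme) == n &&
            strncmp(url, resolvers[i].scheme, n) == 0)
            return resolvers[i].fn;
    return NULL;
}

int main(void)
{
    const char *url = "http://10.20.30.40/foo_cluster";
    resolve_fn fn = lookup(url);
    return fn ? fn(url) : 1;
}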

Implementation sketch:
----------------------------

Service-discovery support can be implemented within the rsyslog
runtime, or as an independent library under the rsyslog umbrella (I
prefer the independent library).

Any input or output module that chooses to support it links with the
library and does one of the following (see the API sketch after this
list):

- as an action:
     when doAction fails, it picks a random endpoint from the
collection of currently-available endpoints provided by service
discovery (random picking is implemented inside the library, which
later allows us to make use of other parameters, such as current
load, for better load-balancing).

- as an input:
     if listen is successful and the input is ready to take traffic,
it enumerates all IP addresses it is listening on and registers those
endpoints with the service-discovery technology (they can then be
discovered by senders).
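
A sketch of what the library's public surface could look like; every
name here is hypothetical, as nothing like this exists yet:

#ifndef SD_CLUSTER_H
#define SD_CLUSTER_H

#include <stddef.h>

typedef struct sd_cluster sd_cluster_t; /* opaque handle */

/* resolve a cluster URL ("zk://...", "http://...") to a live handle */
sd_cluster_t *sd_open(const char *cluster_url);

/* action side: fetch one currently-available endpoint; the library
 * picks at random today and may later weight by load */
int sd_pick_endpoint(sd_cluster_t *c, char *host, size_t hostlen,
                     int *port);

/* input side: register/deregister a listen endpoint ("IP:port") */
int sd_register(sd_cluster_t *c, const char *host, int port);
int sd_deregister(sd_cluster_t *c, const char *host, int port);

void sd_close(sd_cluster_t *c);

#endif /* SD_CLUSTER_H */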

This can also be used at a source or sink where the producer or
consumer is aware of the service-discovery-technology-specific
protocol and rsyslog is only at one end of the pipe (as long as the
same protocol is followed, which will be easy considering the
implementation is available as a library).

Host and port, as required by a to-be-supported input/output module,
will become optional. Either host+port or the cluster parameter must
be specified; specifying both will generate a warning, and cluster
will be discarded in favour of host+port (the more specific setting).
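
As a minimal illustration of that precedence rule (names are
hypothetical, not actual module code):

#include <stdio.h>

/* exactly one of host+port or cluster should be given; if both are,
 * warn and keep the more specific host+port */
static int choose_destination(const char *host, const char *cluster,
                              const char **use_host,
                              const char **use_cluster)
{
    if (host == NULL && cluster == NULL) {
        fprintf(stderr, "error: either host+port or cluster is required\n");
        return -1;
    }
    if (host != NULL && cluster != NULL) {
        fprintf(stderr, "warning: both host and cluster given; "
                        "ignoring cluster in favour of host+port\n");
        cluster = NULL;
    }
    *use_host = host;
    *use_cluster = cluster;
    return 0;
}

int main(void)
{
    const char *h, *c;
    return choose_destination("10.0.0.1", "zk://10.20.30.40/foo_cluster",
                              &h, &c);
}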

Support for this can be built incrementally for each input/output
module as desired; I have omfwd and imptcp in mind at this time.

Thoughts?

-- 
Regards,
Janmejay
http://codehunk.wordpress.com