Re: [rsyslog] Would imhiredis make sense?
Side-note: I agree with mostolog on the advantages of componentization
for fault isolation. Just another use case...

Rainer

Sent from phone, thus brief.

On 23.11.2016 14:47, "David Lang" wrote:

> On Wed, 23 Nov 2016, mosto...@gmail.com wrote:
>
>>> However, if you really want to go this way, one thing you can do is to
>>> make use of the multicast MAC feature in Ethernet to distribute the same
>>> logs to multiple systems/containers and have each container throw away
>>> all logs except what it's configured to handle.
>>>
>>> This lets you add/remove log processing at any time and even have
>>> multiple systems processing the same logs in different ways.
>>>
>>> https://www.usenix.org/conference/lisa12/technical-sessions/presentation/lang_david
>>
>> Network traffic x2
>> Actually, we are using a similar environment for other things, but I
>> don't think that's the way to go.
>
> This doesn't need to double the network traffic in the way you are
> thinking. The IP address that the senders deliver to is shared across all
> your processing boxes. The switch replicates the traffic on its backbone
> and delivers it to each machine.
>
> With your current approach you do
>
>     sender -> rsyslog -> redis -> logstash -> ES
>
> so there are 3-4 copies of the logs (depending on whether the sender and
> rsyslog are on the same box).
>
> If instead you did
>
>     sender -> multicast MAC to rsyslog -> ES
>
> there would only be two copies of the logs on the wire at any point
> (although N copies in total go into the rsyslog boxes, that is only on
> the interfaces to those boxes).
>
> David Lang

___
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
DON'T LIKE THAT.
Re: [rsyslog] Would imhiredis make sense?
On Wed, 23 Nov 2016, mosto...@gmail.com wrote:

>> However, if you really want to go this way, one thing you can do is to
>> make use of the multicast MAC feature in Ethernet to distribute the same
>> logs to multiple systems/containers and have each container throw away
>> all logs except what it's configured to handle.
>>
>> This lets you add/remove log processing at any time and even have
>> multiple systems processing the same logs in different ways.
>>
>> https://www.usenix.org/conference/lisa12/technical-sessions/presentation/lang_david
>
> Network traffic x2
> Actually, we are using a similar environment for other things, but I
> don't think that's the way to go.

This doesn't need to double the network traffic in the way you are
thinking. The IP address that the senders deliver to is shared across all
your processing boxes. The switch replicates the traffic on its backbone
and delivers it to each machine.

With your current approach you do

    sender -> rsyslog -> redis -> logstash -> ES

so there are 3-4 copies of the logs (depending on whether the sender and
rsyslog are on the same box).

If instead you did

    sender -> multicast MAC to rsyslog -> ES

there would only be two copies of the logs on the wire at any point
(although N copies in total go into the rsyslog boxes, that is only on
the interfaces to those boxes).

David Lang
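
The copy counts in this message can be checked with a small sketch. It
assumes that every hop between two adjacent stages puts one copy of the
logs on the wire, and that a hop between stages on the same box costs
nothing. The function name and the `colocated` parameter are hypothetical
and only serve the illustration.

```python
def copies_on_wire(pipeline, colocated=()):
    """Count how many times a log message crosses the network.

    Each adjacent pair of stages is one copy on the wire, unless the pair
    is listed in `colocated` (both stages run on the same box).
    """
    hops = list(zip(pipeline, pipeline[1:]))
    return sum(1 for hop in hops if hop not in colocated)


current = ["sender", "rsyslog", "redis", "logstash", "ES"]
multicast = ["sender", "rsyslog", "ES"]

print(copies_on_wire(current))                                      # 4
print(copies_on_wire(current, colocated={("sender", "rsyslog")}))   # 3
print(copies_on_wire(multicast))                                    # 2
```

The multicast case counts the sender-to-rsyslog hop once because the
switch, not the sender, replicates the frame to each processing box.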
Re: [rsyslog] Would imhiredis make sense?
>>> Logstash needs something like redis because it can't do any queueing
>>> itself. Rsyslog is built around queues, and has the ability to create
>>> multiple queues and pipelines internally, so you don't need to run
>>> multiple instances.
>>
>> I want multiple instances in order to:
>> * Be able to process pipelines on different containers/hosts
>
> Much less needed on rsyslog due to the higher efficiency. I've had
> rsyslog handling over a hundred thousand logs/sec on a single host.

This is our current scenario (each element deployed within a docker
container):

    logs-->RELP-->rsyslog-->redis-->logstash_app_1/N...

This allows us to have multiple simpler configurations for logstash,
splitting traffic between multiple workers/containers on different hosts,
high availability, load balancing...

>> * Isolate pipelines to prevent problems on one affecting others
>
> Rulesets with queues on each ruleset solve this for you.

One segfault while processing one ruleset/action (actually, it happened a
lot with 8.22) crashes the whole process.

>>> All processing from that point on will take place in different threads
>>> working on different queues for each category.
>>
>> Will I be able to "reload" rsyslog configuration to add/delete new
>> rulesets/pipelines?
>
> You can stop/start rsyslog, but there is not a way to change the config
> on the fly.

:(

> However, if you really want to go this way, one thing you can do is to
> make use of the multicast MAC feature in Ethernet to distribute the same
> logs to multiple systems/containers and have each container throw away
> all logs except what it's configured to handle.
>
> This lets you add/remove log processing at any time and even have
> multiple systems processing the same logs in different ways.
>
> https://www.usenix.org/conference/lisa12/technical-sessions/presentation/lang_david

Network traffic x2
Actually, we are using a similar environment for other things, but I
don't think that's the way to go.

> KISS, start simple and only add complexity when you find it's actually
> needed. Have plans for how to scale out when you hit limits, but you
> usually find that you hit limits far later than expected. Yes, you may
> have to eventually do the same work, but by having a solid system now
> with less work, you can spend the time saved now to improve other things.

KISS is great, but we are looking to build a dynamic pipeline, and we
found rsyslog is close to being the right tool, with a couple of changes!

Somewhat related to Rainer's new file reader proposal, I think an rsyslog
code review/refactor will help with this.
Re: [rsyslog] Would imhiredis make sense?
On Tue, 22 Nov 2016, mosto...@gmail.com wrote:

>> What sort of log volume are you talking about here? (logs/sec type of
>> thing)
>
> From 0 to thousand-thousands/sec

>> Logstash needs something like redis because it can't do any queueing
>> itself. Rsyslog is built around queues, and has the ability to create
>> multiple queues and pipelines internally, so you don't need to run
>> multiple instances.
>
> I want multiple instances in order to:
> * Be able to process pipelines on different containers/hosts

Much less needed on rsyslog due to the higher efficiency. I've had
rsyslog handling over a hundred thousand logs/sec on a single host.

> * Isolate pipelines to prevent problems on one affecting others

Rulesets with queues on each ruleset solve this for you.

> * (others)

That's hard to answer :-)

>> What you would do is create a ruleset for each application (pipeline)
>> and give that ruleset its own queue.
>
> I know it can be done, but not what I'm looking for. Moreover, I would
> love it to be a "dynamic" configuration.

>> As new logs arrive, you then sort them by application, and for each
>> application (or application category), you call the appropriate ruleset.
>
> And, if there are a lot of evt/sec, you may have a bottleneck. I'll
> probably have a rsyslog cluster based on docker swarm mode.

This is unlikely to be a bottleneck. The overhead of receiving a log
message, parsing it, and looking up what ruleset to call is very cheap.
At anything under several hundred thousand logs/sec it's unlikely to max
out a single core.

>> All processing from that point on will take place in different threads
>> working on different queues for each category.
>
> Will I be able to "reload" rsyslog configuration to add/delete new
> rulesets/pipelines?

You can stop/start rsyslog, but there is not a way to change the config
on the fly.

However, if you really want to go this way, one thing you can do is to
make use of the multicast MAC feature in Ethernet to distribute the same
logs to multiple systems/containers and have each container throw away
all logs except what it's configured to handle.

This lets you add/remove log processing at any time and even have
multiple systems processing the same logs in different ways.

https://www.usenix.org/conference/lisa12/technical-sessions/presentation/lang_david

>> Give it a try, I'll bet that you find the result much simpler and
>> faster.
>
> I was expecting your reply ;)

KISS, start simple and only add complexity when you find it's actually
needed. Have plans for how to scale out when you hit limits, but you
usually find that you hit limits far later than expected. Yes, you may
have to eventually do the same work, but by having a solid system now
with less work, you can spend the time saved now to improve other things.

David Lang
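
The multicast idea in this message, where every container sees every log
and throws away what it is not configured to handle, can be modelled in a
few lines. This is only a sketch of the filtering logic, assuming messages
are already tagged with an application name; it does not touch real
network multicast, and the container and application names are made up.

```python
def deliver_to_all(messages, containers):
    """Hand every message to every container; each keeps only its own apps.

    `containers` maps a container name to the set of applications it is
    configured to handle. Everything else is silently discarded.
    """
    kept = {name: [] for name in containers}
    for app, line in messages:
        for name, wanted in containers.items():
            if app in wanted:
                kept[name].append((app, line))
    return kept


messages = [("app1", "login ok"), ("app2", "disk full"), ("app1", "logout")]
containers = {"c1": {"app1"}, "c2": {"app2"}, "audit": {"app1", "app2"}}

result = deliver_to_all(messages, containers)
print(result["c1"])     # [('app1', 'login ok'), ('app1', 'logout')]
print(result["c2"])     # [('app2', 'disk full')]
print(len(result["audit"]))  # 3
```

Adding or removing an entry in `containers` changes nothing for the other
entries, which is the property that allows log processing to be added or
removed at any time.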
Re: [rsyslog] Would imhiredis make sense?
> What sort of log volume are you talking about here? (logs/sec type of
> thing)

From 0 to thousand-thousands/sec

> Logstash needs something like redis because it can't do any queueing
> itself. Rsyslog is built around queues, and has the ability to create
> multiple queues and pipelines internally, so you don't need to run
> multiple instances.

I want multiple instances in order to:
* Be able to process pipelines on different containers/hosts
* Isolate pipelines to prevent problems on one affecting others
* (others)

> What you would do is create a ruleset for each application (pipeline)
> and give that ruleset its own queue.

I know it can be done, but not what I'm looking for. Moreover, I would
love it to be a "dynamic" configuration.

> As new logs arrive, you then sort them by application, and for each
> application (or application category), you call the appropriate ruleset.

And, if there are a lot of evt/sec, you may have a bottleneck. I'll
probably have a rsyslog cluster based on docker swarm mode.

> All processing from that point on will take place in different threads
> working on different queues for each category.

Will I be able to "reload" rsyslog configuration to add/delete new
rulesets/pipelines?

> Give it a try, I'll bet that you find the result much simpler and
> faster.

I was expecting your reply ;)
Re: [rsyslog] Would imhiredis make sense?
On Tue, 22 Nov 2016, mosto...@gmail.com wrote:

> We've been playing with logstash, rsyslog and redis for a while in order
> to *index into elasticsearch a bunch of application logs*.
>
> Briefly: app1-file1.log, app1-file2.log...appN-fileX.log -> pipeline ->
> elasticsearch.
>
> So far, we are using *redis queues, and _each application_ is processed
> by one logstash instance* (docker container).
>
> Of course, this works with 5-10 applications, but it doesn't when you
> plan to deploy 100 apps, because each logstash instance requires ~512MB
> of RAM.
>
> We've been thinking about rsyslog since the beginning, because it takes
> less RAM, but just noticed it doesn't have a *redis input module (aka:
> imhiredis)*.
>
> We still plan to have independent instances (one rsyslog for each
> application), but we're wondering whether you think it makes sense to
> implement this module.

What sort of log volume are you talking about here? (logs/sec type of
thing)

Logstash needs something like redis because it can't do any queueing
itself. Rsyslog is built around queues, and has the ability to create
multiple queues and pipelines internally, so you don't need to run
multiple instances.

What you would do is create a ruleset for each application (pipeline) and
give that ruleset its own queue.

As new logs arrive, you then sort them by application, and for each
application (or application category), you call the appropriate ruleset.

All processing from that point on will take place in different threads
working on different queues for each category.

Give it a try, I'll bet that you find the result much simpler and faster.

David Lang
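
The design described above (one queue per application, a cheap dispatch
step, and separate threads draining each queue) can be sketched as a
conceptual model. This is plain Python, not rsyslog configuration and not
how rsyslog is implemented internally; the function name, the handlers and
the sample messages are all hypothetical.

```python
import queue
import threading


def run_pipelines(messages, handlers):
    """Route (app, line) messages to one queue and one worker per app."""
    queues = {app: queue.Queue() for app in handlers}
    results = {app: [] for app in handlers}

    def worker(app):
        # Each worker only ever touches its own queue and result list.
        while True:
            line = queues[app].get()
            if line is None:  # sentinel: no more input for this app
                break
            results[app].append(handlers[app](line))

    threads = [threading.Thread(target=worker, args=(app,)) for app in handlers]
    for t in threads:
        t.start()

    # Dispatch step: look up the target queue and enqueue, nothing more.
    for app, line in messages:
        if app in queues:
            queues[app].put(line)

    # Tell every worker to finish, then wait for all of them.
    for q in queues.values():
        q.put(None)
    for t in threads:
        t.join()
    return results


handlers = {"app1": str.upper, "app2": len}
messages = [("app1", "started"), ("app2", "hello"), ("app1", "stopped")]
print(run_pipelines(messages, handlers))
# {'app1': ['STARTED', 'STOPPED'], 'app2': [5]}
```

A slow handler for one application only backs up that application's queue.
All workers still live in one process, though, so this model does not give
crash isolation between pipelines.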
[rsyslog] Would imhiredis make sense?
Hi

We've been playing with logstash, rsyslog and redis for a while in order
to *index into elasticsearch a bunch of application logs*.

Briefly: app1-file1.log, app1-file2.log...appN-fileX.log -> pipeline ->
elasticsearch.

So far, we are using *redis queues, and _each application_ is processed
by one logstash instance* (docker container).

Of course, this works with 5-10 applications, but it doesn't when you plan
to deploy 100 apps, because each logstash instance requires ~512MB of RAM.

We've been thinking about rsyslog since the beginning, because it takes
less RAM, but just noticed it doesn't have a *redis input module (aka:
imhiredis)*.

We still plan to have independent instances (one rsyslog for each
application), but we're wondering whether you think it makes sense to
implement this module.

Regards
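
For scale, the memory figure above works out as follows. This is a rough
sketch that assumes exactly one logstash instance per application and
treats the quoted ~512MB as 512 MiB; the function name is made up for the
example.

```python
def total_ram_gib(instances, mib_per_instance=512):
    """Total RAM in GiB for `instances` processes of the given size."""
    return instances * mib_per_instance / 1024


for apps in (5, 10, 100):
    print(apps, "apps ->", total_ram_gib(apps), "GiB")
# 5 apps -> 2.5 GiB
# 10 apps -> 5.0 GiB
# 100 apps -> 50.0 GiB
```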