Re: [rsyslog] plans for rsyslog 8.8

David Lang Thu, 15 Jan 2015 11:29:38 -0800

On Thu, 15 Jan 2015, Radu Gheorghe wrote:

On Thu, Jan 15, 2015 at 7:12 PM, Rainer Gerhards <[email protected]>
wrote:
[...]

2. provide a general infrastructure for pull models, whatever this is to be
used for

[...]

Use cases for 2 exist, but I don't know the specifics. They surface every
now and then on the ML when someone asks for pull integration. I think there
was even a discussion with Radu, but I may be wrong.


I remember some discussions, for example if rsyslog could buffer and expose
an API, one could easily implement a plugin (say, on top of Elasticsearch)
that would enable the datastore to pull in data at its own pace.

This contrasts with the current push model, where one has to tune things
like batch sizes and retries in a way that doesn't overload the destination.

I'm missing something here. If rsyslog has a queue for the destination, and
the delivery to the destination is via TCP, how is a pull any better than a
push? If the destination accepts data at a faster pace than it can really
handle, why would the pull be any better? If the destination only accepts
data at the rate it can handle, then the traffic will back up into the
rsyslog queue.

Which is not really possible, because you can't control the load generated
by queries, GC, and whatnot.

Of course the pull model has its own caveats, but it would be nice to be
able to choose what works best for every use case.

I see a case for pull in rsyslog grabbing data (sort of a remote imfile type
of thing), connecting to an existing API to fetch data, or remotely pulling
a data file rather than having to have an agent on the remote machine to
scrape and send it (this may be the right answer to getting logs out of a
bunch of Windows machines, for example). The journald input is an example of
a pull input.

But for output from rsyslog, I'm not seeing a lot of use. On the other hand,
I could see an output module being pretty straightforward.

Have the data go to a queue, and instead of the output module being invoked
by the main loop, it would sit and wait for a request from the network and
then read messages from its queue and deliver them to the network. Once the
remote endpoint signals that it's received the data, mark the messages as
delivered (removing them from the queue).
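
The flow described above could be sketched roughly as follows. This is only
an illustration in Python, not rsyslog code: the PullQueue class and its
fetch/ack/nack methods are hypothetical names, and the "network request" is
reduced to a plain method call. The point it shows is that messages handed
out on a pull stay tracked until the remote endpoint confirms them.

```python
import collections
import itertools


class PullQueue:
    """Hypothetical pull-style output queue: messages stay tracked until acked."""

    def __init__(self):
        self._pending = collections.deque()  # not yet handed to a consumer
        self._in_flight = {}                 # batch_id -> messages awaiting ack
        self._ids = itertools.count(1)

    def enqueue(self, msg):
        self._pending.append(msg)

    def fetch(self, max_messages):
        """Called when the remote endpoint asks for data."""
        batch = []
        while self._pending and len(batch) < max_messages:
            batch.append(self._pending.popleft())
        if not batch:
            return None, []
        batch_id = next(self._ids)
        self._in_flight[batch_id] = batch
        return batch_id, batch

    def ack(self, batch_id):
        """Remote endpoint confirmed receipt: drop the batch for good."""
        self._in_flight.pop(batch_id, None)

    def nack(self, batch_id):
        """No confirmation: put the batch back at the front, in original order."""
        batch = self._in_flight.pop(batch_id, [])
        self._pending.extendleft(reversed(batch))

    def size(self):
        return len(self._pending) + sum(len(b) for b in self._in_flight.values())


q = PullQueue()
for i in range(5):
    q.enqueue(f"msg {i}")

batch_id, batch = q.fetch(3)      # consumer pulls at its own pace
q.ack(batch_id)                   # delivered, so removed from the queue
remaining_after_ack = q.size()

batch_id2, batch2 = q.fetch(3)    # consumer pulls again...
q.nack(batch_id2)                 # ...but never confirms, so nothing is lost
remaining_after_nack = q.size()
```

A real module would also need timeouts for batches that are never
acknowledged and some persistence for the in-flight set; both are left out
of this sketch.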

One area that I think could use some long-term attention is the internal API
to the queues. Queue contention can be a problem, and disk queues in
particular are much slower than they should be. Batching things helps a lot
in this area, but this contention can lead to very odd situations where
performance is actually worse with less traffic (if you are tuned to handle
a lot of traffic with multiple threads, the contention with a low volume of
traffic can actually decrease throughput, as shown by the LDAP thread where
having rsyslog write data to a file lets it receive more logs than if it
throws them away).
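
To illustrate why batching helps with queue contention, here is a toy
sketch in Python. It is not how rsyslog's queues are implemented; the
CountingQueue class and the drain helper are made up for this example. It
only counts how often the queue lock is taken when draining the same 1000
messages one at a time versus in batches of 100, and it does not attempt to
model the low-traffic slowdown mentioned above.

```python
import collections
import threading


class CountingQueue:
    """Toy lock-protected queue that counts how often the lock is taken."""

    def __init__(self, items):
        self._items = collections.deque(items)
        self._lock = threading.Lock()
        self.lock_acquisitions = 0

    def dequeue_batch(self, max_batch):
        # One lock acquisition hands out up to max_batch messages.
        with self._lock:
            self.lock_acquisitions += 1
            n = min(max_batch, len(self._items))
            return [self._items.popleft() for _ in range(n)]


def drain(queue, max_batch):
    """Pull batches until the queue reports empty; return messages drained."""
    drained = 0
    while True:
        batch = queue.dequeue_batch(max_batch)
        if not batch:
            return drained
        drained += len(batch)


one_by_one = CountingQueue(range(1000))
batched = CountingQueue(range(1000))

drained_single = drain(one_by_one, 1)    # lock taken once per message
drained_batched = drain(batched, 100)    # lock taken once per 100 messages

print(one_by_one.lock_acquisitions, batched.lock_acquisitions)
```

Every acquisition is a point where worker threads can collide, so fewer
acquisitions per message means fewer chances to contend.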


David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.
