Some quick comments:
- with the hypothetical omrest, rsyslog doesn't have to keep the message
until everyone gets it. We can make it so we assume there's one consumer
only, and once the message is acknowledged, it's gone. If you need to fan
out, you can have multiple actions (with their own action queues)
- but it sounds like omrest isn't a good idea (for now), because omprog
already does most of what omrest would do (except it's push vs pull)
- about the particular case of omkafka: Kafka has a publish-subscribe
architecture. Kind of like *MQ, if I understand correctly. So you can't
make Kafka pull, you have to push. So we either have omkafka, or we use
omprog with a script that pushes to Kafka, or we have an independent
service that receives syslog and can push to Kafka. AFAIK the likes of
Logstash, Flume and Fluentd can already play that role, so we don't need to
reinvent the wheel:
https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem
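
To make the "omprog with a script that pushes to Kafka" option a bit more
concrete, here is a minimal sketch in Python. It assumes rsyslog starts the
script once and feeds it one message per line on stdin, as described further
down the thread. The names forward_lines and print_publisher are made up for
illustration, and the Kafka call itself is left as a placeholder rather than
tied to any particular client library.

```python
import sys
from typing import Callable, Iterable


def forward_lines(stream: Iterable[str], publish: Callable[[str], None]) -> int:
    """Read one message per line from `stream` and hand each to `publish`.

    Blank lines are skipped. Returns the number of messages forwarded.
    """
    count = 0
    for line in stream:
        message = line.rstrip("\n")
        if not message:
            continue
        publish(message)
        count += 1
    return count


def print_publisher(message: str) -> None:
    # Placeholder (assumption): a real script would hand the message to a
    # Kafka producer here instead of writing it to stderr.
    print("would publish:", message, file=sys.stderr)
```

To run it under omprog, the script would end with a call such as
forward_lines(sys.stdin, print_publisher). Because it reads stdin line by
line, a slow publish() makes the pipe fill up, so any buffering still has to
come from the rsyslog action queue in front of it.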


2013/12/16 David Lang <[email protected]>

> On Sun, 15 Dec 2013, Otis Gospodnetic wrote:
>
>  Hi,
>>
>> On Sun, Dec 15, 2013 at 3:36 AM, Radu Gheorghe <[email protected]>
>> wrote:
>>
>>  Just my 2 cents here:
>>> a lot earlier I came with a proposal to give a REST API that would
>>> basically enable external applications to get messages from a rsyslog
>>> queue:
>>> http://bugzilla.adiscon.com/show_bug.cgi?id=482
>>>
>>> With omrest, one should be able to use any programming language to pull
>>> messages from rsyslog. For example, one could write a Kafka publisher (in
>>> any language) that would pull messages from rsyslog and publish to Kafka.
>>>
>>> I assume this is better than omprog because AFAIK with omprog piping to
>>> the STDIN of a binary there's a tiny OS buffer (a pipe or something? this
>>> is iffy territory for me) that may get full and you may lose messages if
>>> the other app isn't fast enough. That, or you need to implement queues in
>>> your external program. Which is duplicate work, queues are already in
>>> rsyslog.
>>> With omrest (hypothetically), if you need more performance, you just need
>>> to spawn more threads/processes to pull from the queue and push wherever.
>>> Assuming you have the hardware.
>>>
>>>
>> I like the omrest idea because I like the ability for systems to pull at
>> their own speed and to upgrade more freely.
>>
>
> the problem is that rsyslog is stuck holding on to the messages until all
> possible clients have pulled the message. This is not a really good idea.
>
>
>  If the above issue were in Github I'd click the "Watch" button immediately
>> to get notified of any comments or activity around it.... but I'd have to
>> create yet another account in "somebody's Bugzilla", so I won't. (just
>> trying to illustrate the thinking that I bet many people go through in
>> similar situations).
>>
>> Re omprog, does rsyslog launch the referenced binary for each message or
>> does it launch the process once and keep feeding it via stdin?
>>
>
> once and then feeds multiple messages (and relaunches it if it dies)
>
>
>  I'm no Linux/C programmer, so maybe this makes no sense, but would using
>> sendfile be appropriate here?
>>
>
> not really. sendfile is appropriate when you have a file that you are
> sending as-is with no modification to it.
>
> David Lang
>
>
>  Thanks,
>> Otis
>> --
>> Performance Monitoring * Log Analytics * Search Analytics
>> Solr & Elasticsearch Support * http://sematext.com/
>>
>>
>>  On the input side, one can already write connectors in any language. Just
>>> make the thing push to any input rsyslog supports. For most use-cases,
>>> rsyslog should pull from that input fast enough to avoid any issues.
>>>
>>> Now the only problem with omrest is that it needs to be implemented :)
>>> Which bumps into the 24h problem of people [who can actually do it].
>>>
>>> Best regards,
>>> Radu
>>>
>>> 2013/12/15 Otis Gospodnetic <[email protected]>
>>>
>>>  Hi,
>>>>
>>>> Thanks for the info.
>>>> I was asking because having the ability to write ims and oms in
>>>> different languages would open a lot of opportunities.  This is one of
>>>> those "enablement" things.  I understand writing modules in other
>>>> languages may mean those using such modules may hurt performance, but
>>>> some people need certain functionality more than performance.
>>>>
>>>> Take omkafka example from the other day.  If there were a way to write
>>>> an om in Java it'd be trivial for a lot of Java developers out there to
>>>> contribute omkafka.
>>>>
>>>> If omprog enables development of the ecosystem, it sounds like something
>>>> to point out clearly somewhere and nurture that a bit.  I do see
>>>> http://www.rsyslog.com/doc/omprog.html because somebody shared a link,
>>>> but I don't see that on http://www.rsyslog.com/doc or on
>>>> http://www.rsyslog.com/doc/dev_oplugins.html or in the new README.
>>>>
>>>> Coincidentally, I just came across Fluentd's instructions for writing
>>>> plugins, which could serve as guidance:
>>>> http://docs.fluentd.org/articles/plugin-development .  Nice, clean, well
>>>> structured, not a lot of prose...
>>>>
>>>> Otis
>>>> --
>>>> Performance Monitoring * Log Analytics * Search Analytics
>>>> Solr & Elasticsearch Support * http://sematext.com/
>>>>
>>>>
>>>> On Sat, Dec 14, 2013 at 1:39 PM, David Lang <[email protected]> wrote:
>>>>
>>>>> On Sat, 14 Dec 2013, RB wrote:
>>>>>
>>>>>> On Sat, Dec 14, 2013 at 5:24 AM, Rainer Gerhards
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> well, technically it's for sure possible, it's just another of these
>>>>>>> 24h things. Technically, it's a question of interface, and insofar of
>>>>>>> which types of modules. Obviously, these will be slower, and how slow
>>>>>>> is another interface/effort question.
>>>>>>>
>>>>>>> Thinking about this, one could probably also claim the answer is
>>>>>>> "yes, you can write OUTPUT modules in any language", it's just a doc
>>>>>>> issue. In fact, omprog can be used as an interface here. It's
>>>>>>> actually not even a bad interface...
>>>>>>>
>>>>>>> Again, something learned ;)
>>>>>>>
>>>>>>>
>>>>>> Probably the cheapest (implementation) "binding" for rsyslog would be
>>>>>> a system() like call.  Execute the subprogram with /bin/sh -c and
>>>>>> communicate with structured messages on STDIO.
>>>>>>
>>>>>>
>>>>> a real module binding would be far more complex. It would allow the
>>>>> module (in whatever language) access to the rsyslog queues and other
>>>>> data structures. This is possible, but not easy by any means.
>>>>>
>>>>> One big problem is that currently rsyslog does all this work in a
>>>>> threaded environment. It may make sense in v9 or v10 to shift from a
>>>>> default shared-everything threading model to an explicit shared memory
>>>>> multiprocess model. At that point having one of the processes use a
>>>>> different language would not be that hard.
>>>>>
>>>>> But in the current threaded model, having one thread run a different
>>>>> language would be very, very hard.
>>>>>
>>>>> The other issue here is performance. Rsyslog goes to a LOT of effort to
>>>>> be fast. Some of the things that have made very noticeable differences
>>>>> in performance in rsyslog are things that seem like they should be very
>>>>> minor. Think about these things and then think about what would be
>>>>> involved to define interfaces in a multi-language safe way.
>>>>>
>>>>> things that have resulted in noticeable speedups have been:
>>>>>
>>>>> removing gettimeofday() calls.
>>>>>
>>>>>   it used to be that rsyslog recorded when a message arrived, when it
>>>>>   was put on the main queue, when it was moved to an action queue, when
>>>>>   it was pulled from the action queue, and when it was delivered
>>>>>
>>>>>   now, high performance users configure rsyslog so that it only does
>>>>>   one gettimeofday() call per hundred (or thousand) messages that arrive
>>>>>   and use that one time for every message
>>>>>
>>>>> string modules
>>>>>
>>>>>   it used to be that the default template (<%pri%>%timestamp%
>>>>>   %hostname% %syslogtag%%msg%) was interpreted by the rsyslog engine for
>>>>>   every message that was output
>>>>>
>>>>>   now string modules written in C create these strings rather than
>>>>>   interpreting the template. This resulted in a double-digit %
>>>>>   performance improvement
>>>>>
>>>>> With optimizations like these in use, changing things to allow for a
>>>>> module written in a different language to have access to the rsyslog
>>>>> internals as would be needed for a high-performance interface seems
>>>>> like it will probably end up hurting the rsyslog performance overall.
>>>>>
>>>>>
>>>>> That being said, I am very much in favor of multi-process with explicit
>>>>> sharing rather than multi-threaded with implicit sharing, but getting
>>>>> all the interfaces correct and fast would be a VERY hard task.
>>>>>
>>>>> David Lang
>>>>>
>>>>> _______________________________________________
>>>>> rsyslog mailing list
>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>> http://www.rsyslog.com/professional-services/
>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>>> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST
>>>>> if you DON'T LIKE THAT.
>>>>>