[rsyslog] Fwd: [logstash-users] Re: Log Drop Issue

Rainer Gerhards Mon, 19 May 2014 22:26:24 -0700

---------- Forwarded message ----------
From: Jordan Sissel <[email protected]>
Date: Tue, May 20, 2014 at 12:51 AM
Subject: Fwd: [logstash-users] Re: [rsyslog] Log Drop Issue
To: [email protected], [email protected]



This message got bounced because I am not on the list. It might be worth
forwarding to the rsyslog-users list given it should answer the user's
question as well as David's concerns about uncertainty and hearsay.

-Jordan

---------- Forwarded message ----------
From: <[email protected]>
Date: Mon, May 19, 2014 at 9:52 AM
Subject: Re: [logstash-users] Re: [rsyslog] Log Drop Issue
To: [email protected]


You are not allowed to post to this mailing list, and your message has
been automatically rejected.  If you think that your messages are
being rejected in error, contact the mailing list owner at
[email protected].



---------- Forwarded message ----------
From: Jordan Sissel <[email protected]>
To: logstash-users <[email protected]>
Cc: rsyslog-users <[email protected]>
Date: Mon, 19 May 2014 09:56:25 -0700
Subject: Re: [logstash-users] Re: [rsyslog] Log Drop Issue



On Mon, May 19, 2014 at 7:33 AM, David Lang <[email protected]> wrote:

> the pstats output should also show the deliveries to logstash (at least if
> you are using anything close to a current version of rsyslog)
>
> This isn't the logstash support mailing list :-) but we've been hearing a
> lot of people talking about how logstash will drop logs when it gets
> behind, so that could be the problem. If you use RELP instead of just TCP,
> that will guarantee all the logs get to logstash, even if the TCP
> connection gets cut and has to be reestablished. But if logstash is
> dropping them after that, we won't be able to help much.
>
>
David; this email thread is on both logstash-users and rsyslog-users, so,
in a way, this is the logstash support mailing list ;)

As for "Logstash will drop logs when it gets behind" - this is neither a
feature nor something I've observed in general. Logstash's inputs don't
have any such code that says "Are we under load? OK, drop logs!" - logstash
aims to never drop logs unless the operator has explicitly requested such
dropping.

I do agree that TCP isn't great for preventing message loss and that an
application-level acknowledgement protocol such as RELP will help in
transmission reliability.
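
A minimal sketch of what switching the rsyslog side from plain TCP to RELP
could look like. It assumes rsyslog v7+ config syntax, that the omrelp
module is installed, and that the receiver has a RELP listener on the chosen
port. The helper name relp_forward_conf and port 2514 are placeholders for
illustration, not something taken from this thread.

```shell
# relp_forward_conf HOST PORT
# Print an rsyslog (v7+ syntax) fragment that forwards local7 over RELP
# instead of plain TCP. HOST and PORT are placeholders for the receiver.
relp_forward_conf() {
  cat <<EOF
module(load="omrelp")
local7.* action(type="omrelp" target="$1" port="$2")
EOF
}
```

For example, "relp_forward_conf localhost 2514" prints a fragment that could
be dropped into the rsyslog configuration in place of an "@@host:port" rule.
The receiving end then needs a RELP-capable input rather than a plain tcp one.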

Back to the reported problem: message loss. Empirically, I cannot
reproduce this message loss under several configurations. The examples
below test with and without rsyslog, and with and without logstash filters;
in all four scenarios, as I expected, all messages are received properly. I
suspect something else is going on here.

I did 4 tests:
* seq | nc -> logstash (tcp -> tcp) -> nc -> /tmp/somefile
* seq | logger -> rsyslog -> logstash (tcp -> tcp) -> nc -> /tmp/somefile
* seq | nc -> logstash (tcp -> grok -> tcp) -> nc -> /tmp/somefile
* seq | logger -> rsyslog -> logstash (tcp -> grok -> tcp) -> nc ->
  /tmp/somefile

If logstash or rsyslog are dropping messages, I can't reproduce it. Full
details of the test runs below.

# Fun generator
% seq -f "%g laksjdf lakjsdf lakjsdf lkasjdf laksjflakwjet laksjtl kasjdl kajsdlfkja lkwje tlkasjltkaj welktj alkdsj falskdjf laksjdf lkaj lkwj elkjatslkdjt alksdjf laksjd flkasjd flkaj lkjwe;taosiejtaow;eijt" 100000 |
    nc localhost 3333
# Logstash passing tcp to tcp:
% bin/logstash -e 'input { tcp { port => 3333 } } output { tcp { codec =>
line { format => "%{message}" } host => localhost port => 3334 } }'
# Receiver:
% nc -l localhost 3334 > /tmp/logstash.logs
# Result:
% wc -l /tmp/logstash.logs
100000 /tmp/logstash.logs
% tail -1 /tmp/logstash.logs
100000 laksjdf lakjsdf lakjsdf lkasjdf laksjflakwjet laksjtl kasjdl
kajsdlfkja lkwje tlkasjltkaj welktj alkdsj falskdjf laksjdf lkaj lkwj
elkjatslkdjt alksdjf laksjd flkasjd flkaj lkjwe;taosiejtaow;eijt

Now, if I repeat the same process, but put rsyslog in between (seq ->
logger -> rsyslog -> logstash -> nc -> file), the results are exactly the
same. No events are lost.

# Fun generator (now pipes to logger)
% seq -f "%g laksjdf lakjsdf lakjsdf lkasjdf laksjflakwjet laksjtl kasjdl kajsdlfkja lkwje tlkasjltkaj welktj alkdsj falskdjf laksjdf lkaj lkwj elkjatslkdjt alksdjf laksjd flkasjd flkaj lkjwe;taosiejtaow;eijt" 100000 |
    logger -p local7.info
# In my rsyslog config:
local7.* @@localhost:3333
# Same logstash as previous test
# Result:
% wc -l /tmp/logstash.logs2
100000 /tmp/logstash.logs2
% tail -1 /tmp/logstash.logs2
<190>May 19 16:38:38 oh-my jls: 100000 laksjdf lakjsdf lakjsdf lkasjdf
laksjflakwjet laksjtl kasjdl kajsdlfkja lkwje tlkasjltkaj welktj alkdsj
falskdjf laksjdf lkaj lkwj elkjatslkdjt alksdjf laksjd flkasjd flkaj
lkjwe;taosiejtaow;eijt


If I add a simple filter in the middle (grok to parse a number), all logs
are successfully transmitted.
% wc -l /tmp/logstash.logs.grok*
  100000 /tmp/logstash.logs.grok         # netcat -> logstash
  100000 /tmp/logstash.logs.grok2      # logger -> rsyslog -> logstash
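
Line counts alone would not reveal a dropped line that happens to be offset
by a duplicate, so it can be worth checking the sequence numbers themselves.
A minimal sketch, assuming the number emitted by seq is the first
whitespace-separated field of each received line (or the sixth field in the
rsyslog-forwarded format shown earlier). missing_seq is a hypothetical
helper, not part of logstash or rsyslog.

```shell
# missing_seq FILE COUNT [FIELD]
# Print every number in 1..COUNT that never appears in column FIELD of FILE.
# FIELD defaults to 1 (raw nc input); use 6 for lines shaped like
# "<190>May 19 16:38:38 oh-my jls: 100000 ...".
missing_seq() {
  awk -v n="$2" -v f="${3:-1}" '
    { seen[$f + 0] = 1 }    # remember each sequence number that arrived
    END { for (i = 1; i <= n; i++) if (!(i in seen)) print i }
  ' "$1"
}
```

Running "missing_seq /tmp/logstash.logs 100000" (or
"missing_seq /tmp/logstash.logs2 100000 6" for the rsyslog run) prints
nothing when every message arrived, and otherwise lists the numbers that
were lost.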

-Jordan



>
>> We are generating logs from LOIC at 800 logs per second and sending them
>> to rsyslog.
>>
>> The rsyslog agent is receiving all the logs; it even shows a slightly
>> higher count. I mean, if I send 3000 logs from LOIC, it always shows that
>> it received 3020 logs. I don't know where these 20 come from (according to
>> my impstats file).
>>
>> rsyslog then passes these logs to logstash, which at the moment has 15
>> filters. The metric used to count the logs received at logstash is
>> http://logstash.net/docs/1.4.1/filters/metrics#flush_interval , but it
>> always shows a lower number than I received in rsyslog. Up to a certain
>> level, for example 3000 logs/5 sec, the rsyslog and logstash counts are the
>> same, but above 3500 or 4000 there is a difference of 30-40 logs between
>> rsyslog and logstash. Our connection from rsyslog to logstash is TCP, so
>> there should be no reason for this difference: either logstash is unable to
>> parse all the messages or rsyslog is sending fewer messages to logstash.
>>
>> Any clue?
>>
>> On Sunday, May 18, 2014, David Lang <[email protected]> wrote:
>>
>>> If you are sending the right contents in the packet, LOIC should be just
>>> fine.
>>>
>>> Now, you haven't said what version of rsyslog you are using, your
>>> configuration, or even talked about what speed network you are on (let
>>> alone system specs), so figuring out what's wrong is not possible yet :-)
>>>
>>> That said, we've had people getting around 400,000 packets/sec through
>>> rsyslog, so you are probably not hitting any fundamental limit at 10,000
>>> packets/sec. But there are a lot of things to look at to figure this out.
>>>
>>> start by configuring impstats (set it to dump stats every 10 sec for now
>>> at these traffic levels) so that we can see what's happening inside
>>> rsyslog (to see if the problem is getting the logs in, or out of rsyslog)
>>>
>>> version info (rsyslog, distro, kernel)
>>>
>>> when you say you put things on a switch, is this a 100Mb switch, Gb
>>> switch, managed (so you can get stats from the switch)????
>>>
>>> what's your rsyslog config?
>>>
>>> what do you use for name resolution (/etc/hosts, local DNS, nearby DNS,
>>> ISP DNS, LDAP, ????)
>>>
>>> UDP isn't necessarily faster than TCP, but it's a whole lot easier to
>>> set up a UDP test, so let's stick with that for now. There's nothing in
>>> the syslog protocol to add reliability to UDP
>>>
>>> David Lang
>>>
>>>
>>>  On Sun, 18 May 2014, masoom alam wrote:
>>>
>>>> Dear All,
>>>>
>>>> I hope everyone is doing fine. We were doing stress testing of Rsyslog
>>>> and found a few problems (in our setup and not in Rsyslog :)) that I
>>>> want to discuss in this forum.
>>>>
>>>>
>>>>   1. We were using LOIC (LOIC is a DDoS attack tool - your antivirus will
>>>>   delete it :) - disclaimer: handle it with care) to generate UDP
>>>>   packets. We created a customized log message. The rate of packets sent
>>>>   by LOIC is very, very high, something like 20,000 packets in 2 sec,
>>>>   for example. Everything is fine if we use a point-to-point connection
>>>>   between the Rsyslog machine and the MongoDB machine. Point-to-point
>>>>   means a connection via cross cable and not through a switch. We
>>>>   calculated that the no. of packets sent by LOIC and the no. of packets
>>>>   received by Rsyslog and then written by MongoDB, after parsing by
>>>>   Logstash, are fairly equal.
>>>>   2. However, if we connect both MongoDB and Rsyslog through a switched
>>>>   LAN, there is a packet drop at the Rsyslog end, somewhere between
>>>>   300-500 packets depending on the speed of LOIC - thus 300-500 fewer
>>>>   logs are written to MongoDB.
>>>>
>>>>
>>>> What we concluded, and I want your expert opinion on this:
>>>>
>>>>
>>>>   1. It seems LOIC is not the right choice for traffic generation for
>>>>   Rsyslog - that is, stress testing of Rsyslog. It sends packets via UDP
>>>>   514, but essentially it does not follow the Syslog protocol. Now, we
>>>>   are trying to understand: is there some sort of reliability achieved
>>>>   in the Syslog protocol even if packets are sent on UDP 514? BTW, I am
>>>>   well aware that UDP is for faster communication but at the expense of
>>>>   reliability. The reason we say there is a problem at LOIC - that is,
>>>>   the traffic generation end - is that when we selected to send traffic
>>>>   via TCP on LOIC, due to the speed it combines log packets and Rsyslog
>>>>   reports an error in its log. The net effect of this operation is that
>>>>   Rsyslog combines an arbitrary no. of logs together and then gives them
>>>>   to Logstash, which does not understand them and leaves them unparsed.
>>>>   2. What options do we have? Should we use a Python library to generate
>>>>   traffic, write it to /var/log/messages, and ask Rsyslog to send that
>>>>   traffic to another Rsyslog? Does doing it this way guarantee that
>>>>   there will be no drop of log messages even with UDP?
>>>>   3. But what I am interested in understanding is: what is the
>>>>   reliability mechanism provided by Rsyslog in general in the case of
>>>>   UDP? After all, each and every log is a very important piece of
>>>>   information and can destroy the reputation of an organization.
>>>>   4. Even if some reliability can be achieved at the Rsyslog end, how
>>>>   can we avoid - to the maximum extent - the possibility of log drop
>>>>   between Rsyslog and Logstash? Logstash is a much smaller program than
>>>>   Rsyslog. Rsyslog in our setup is used only for buffering - thus, what
>>>>   parameters in the .conf file of Rsyslog should be changed to achieve
>>>>   this reliability? Please note that our Rsyslog and Logstash are set up
>>>>   on the same system - so no network latency at this end.
>>>>   5. In all this setup, the performance of LogAnalyzer was very pathetic
>>>>   in filtering and running other queries. So we chose to write a simple
>>>>   PHP web page for displaying logs and it is running much, much faster
>>>>   than LogAnalyzer.
>>>>   6. Are we on the right path for checking reliability and stress
>>>>   testing?
>>>>
>>>> _______________________________________________
>>>> rsyslog mailing list
>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>> http://www.rsyslog.com/professional-services/
>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>> DON'T LIKE THAT.
>>
>>>
>>
>>>
>>>
>>
>>
> --
> Remember: if a new user has a bad time, it's a bug in logstash.
> --- You received this message because you are subscribed to the Google
> Groups "logstash-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
