Re: [rsyslog] stuck up with bad performance with rsyslog+mysql

Rainer Gerhards Mon, 17 Jun 2013 02:00:14 -0700

On Mon, Jun 17, 2013 at 8:57 AM, Mahesh V <[email protected]> wrote:


> The problem in my application is that it has a heartbeat every second with
> the device under test.
> If the CPU usage from other entities is high, the heartbeat loses its
> rhythm and hence the connectivity.
>
>
This is OT, but: if you just need a steady heartbeat and do not do much more
complex processing during that time, you could also consider running the
heartbeat at realtime priority. You can check imudp.c for an example of how
to do that.
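
As a rough illustration of the realtime-priority idea (this is not the code
from imudp.c), here is a minimal Python sketch for Linux. The names
try_realtime and heartbeat, the choice of the lowest SCHED_FIFO priority, and
the send callback are all assumptions made for this example. Requesting
SCHED_FIFO normally needs root or CAP_SYS_NICE, so the sketch falls back to
normal scheduling when the kernel refuses.

```python
import os
import time


def try_realtime():
    """Request SCHED_FIFO for this process; return True if the kernel agreed."""
    if not hasattr(os, "sched_setscheduler") or not hasattr(os, "SCHED_FIFO"):
        return False  # platform does not expose the POSIX scheduling calls
    try:
        # Lowest realtime priority; on Linux it still runs ahead of normal tasks.
        prio = os.sched_get_priority_min(os.SCHED_FIFO)
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(prio))
        return True
    except OSError:
        return False  # typically EPERM without root or CAP_SYS_NICE


def heartbeat(send, interval=1.0, beats=3):
    """Call send() every `interval` seconds, using absolute deadlines so
    that one late wakeup does not shift all following beats."""
    deadline = time.monotonic()
    for _ in range(beats):
        send()
        deadline += interval
        delay = deadline - time.monotonic()
        if delay > 0:
            time.sleep(delay)


if __name__ == "__main__":
    print("realtime scheduling:", try_realtime())
    heartbeat(lambda: print("beat", time.monotonic()))
```

A realtime task that never sleeps can starve everything else on its CPU, which
is why this only suits a heartbeat that does very little work per beat.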

Rainer

> thanks,
> Mahesh
>
>
> On Sun, Jun 16, 2013 at 1:50 PM, David Lang <[email protected]> wrote:
>
> > On Sun, 16 Jun 2013, Radu Gheorghe wrote:
> >
> >  2013/6/14 David Lang <[email protected]>
> >>
> >>  On Fri, 14 Jun 2013, Radu Gheorghe wrote:
> >>>
> >>>  Hi Mahesh,
> >>>
> >>>>
> >>>> If you don't need mysql for a specific reason, I'd suggest you try
> >>>> throwing your logs in Elasticsearch. Here's a tutorial:
> >>>> http://wiki.rsyslog.com/index.php/HOWTO:_rsyslog_%2B_elasticsearch
> >>>>
> >>>> I assume you'll get way better insert and query performance than you
> >>>> can with mysql (i.e., with bulks, I get 10-20K logs indexed per second
> >>>> on my $500 laptop. Then I can query in 100M-200M logs within a second.
> >>>> Depends on your settings). Plus, it's super-easy to scale Elasticsearch
> >>>> by adding new nodes.
> >>>>
> >>>> For querying, there are several tools, the most popular being Kibana:
> >>>> http://three.kibana.org/
> >>>>
> >>>>
> >>> Just to note, one of the things that makes MySQL so slow for Mahesh is
> >>> its safety features. After each insert, MySQL makes sure the data is
> >>> safe on disk before it considers the insert complete.
> >>>
> >>
> >>
> >> By that, you mean it does an fsync after every transaction? I thought it
> >> doesn't do this (at least not by default, with either MyISAM or
> >> InnoDB). But then again, at least InnoDB does it more often than ES does.
> >>
> >
> > I don't remember the table types, but the newer of the two does do fsync
> > after each transaction, which is how it actually properly supports
> > transactions. This is why it was such a big deal when MySQL changed the
> > default.
> >
> >
> >
> >>> If the system crashes, the data will be there. There are config options
> >>> to override this in MySQL.
> >>>
> >>> To get the numbers that Elasticsearch is getting on your laptop, it's
> >>> almost certainly not doing this.
> >>>
> >>>
> >> I assume you lose some data if the whole system suddenly goes down. But
> >> if just ES does (i.e., kill -9 the JVM), you shouldn't lose any data.
> >>
> >> I think ES writes stuff in a very different way than MySQL does. When
> >> you index something in ES, it does the indexing in memory and writes the
> >> raw data in the transaction log
> >> <http://www.elasticsearch.org/guide/reference/index-modules/translog/>.
> >> Only after this is done do you get a reply from ES.
> >>
> >> The transaction log is replayed on startup in case something goes wrong
> >> and you lose the data you had in memory. Every once in a while, it writes
> >> what it has to disk in the actual Lucene index
> >> <http://www.elasticsearch.org/guide/reference/glossary/#shard> where
> >> it stores data "permanently".
> >>
> >> These chunks of data that it writes are segments
> >> <https://lucene.apache.org/core/3_6_2/fileformats.html#Segments>,
> >> which consist of multiple files. The thing about segments is that
> >> they're immutable. And to make sure that you don't end up with a
> >> gazillion segments, these are asynchronously merged
> >> <http://www.elasticsearch.org/guide/reference/index-modules/merge/>
> >> from time to time.
> >>
> >
> > the thing is that if it doesn't do an fsync, you have no guarantee that
> > the data is on the disk. And it's very possible for later data to make it
> > to the disk before earlier data does.
> >
> > doing a kill -9 isn't the same as a system crash.
> >
> > when you do a kill -9 the kernel and filesystem code contain all the data
> > that the application wrote, and will present that data if asked, and will
> > eventually get it to disk.
> >
> > But if the system loses power, any data not actually written to disk is
> > lost. And (depending on lots of implementation details) it's possible to
> > end up with holes in files, or files created that have no content, or
> > even files created, with space allocated for them, but stray data from
> > the drive in that space, not what the application wrote.
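
The difference between "the kernel has the data" and "the disk has the data"
can be sketched in a few lines of Python. The function names durable_append
and buffered_append are made up for this illustration. How strong the fsync
guarantee really is still depends on the filesystem and on the drive's own
write cache, so treat this as a sketch of the idea only.

```python
import os
import tempfile


def buffered_append(path, line):
    """Append a line and return once the kernel has it.

    After close() the data sits in the page cache: it survives a kill -9 of
    this process, but may be lost if the machine loses power."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")


def durable_append(path, line):
    """Append a line and return only after fsync has completed."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
        f.flush()             # Python's buffer -> kernel page cache
        os.fsync(f.fileno())  # kernel page cache -> storage device


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "demo.log")
        buffered_append(log, "fast, but only in the page cache for now")
        durable_append(log, "slower, flushed before returning")
        with open(log, encoding="utf-8") as f:
            print(f.read(), end="")
```

Both functions look identical to a reader of the file while the system is up.
They differ only in what is guaranteed to be there after a power failure, and
in how long each call takes.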
> >
> >
> > I suspect that what ES does is that it writes the data in long sequential
> > writes, and tries to make it so that if there is power loss, logs will be
> > lost but not corrupted. It can do that at the data rates that you are
> > describing. It's writing hundreds, if not thousands of logs per
> > 'transaction'
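
To make the "many logs per transaction" point concrete, here is a small Python
sketch. It uses the standard-library sqlite3 module purely as a stand-in,
since a MySQL server is not assumed to be available. The table name logs and
the functions insert_per_row and insert_batched are hypothetical. Each commit
is the point where a database has to make data safe, so the number of commits
is what the sketch counts.

```python
import sqlite3


def insert_per_row(conn, messages):
    """One transaction per log line; returns the number of commits."""
    commits = 0
    for msg in messages:
        conn.execute("INSERT INTO logs (msg) VALUES (?)", (msg,))
        conn.commit()
        commits += 1
    return commits


def insert_batched(conn, messages, batch_size=1000):
    """One transaction per batch of log lines; returns the number of commits."""
    commits = 0
    for start in range(0, len(messages), batch_size):
        batch = [(m,) for m in messages[start:start + batch_size]]
        conn.executemany("INSERT INTO logs (msg) VALUES (?)", batch)
        conn.commit()
        commits += 1
    return commits


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logs (msg TEXT)")
    messages = ["log line %d" % i for i in range(2500)]
    print("per-row commits:", insert_per_row(conn, messages))
    print("batched commits:", insert_batched(conn, messages))
```

The trade-off is the one described above: with batching, a crash between
commits can lose the whole uncommitted batch, while per-row commits lose at
most the row in flight.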
> >
> >
> >>> this is probably acceptable, but you do need to be aware of the
> >>> tradeoff.
> >>>
> >>>
> >> Right, there are always trade-offs. I'm sorry if I came across as the
> >> "you're using the wrong technology" guy. I hate it when people do that.
> >>
> >> In this particular case, I understand it's only about aggregating logs
> >> and searching them afterwards instead of doing that with straight files.
> >> And this is exactly what ES is about, so I thought it would be
> >> easier/better to give it a shot. And I don't see write speed as being its
> >> strong point, either - that would be the search speed.
> >>
> >
> > I think that you are correct in saying that ES is better than MySQL for
> > this, but I wanted to point out that the reason MySQL is as slow as he
> > was seeing is that it's making sure that each transaction is safe
> > before proceeding.
> >
> > Relaxing this guarantee is the sort of thing that all the NoSQL
> > databases do, and most of their performance wins are possible only
> > because they do not provide the same guarantees that the traditional SQL
> > databases provide.
> >
> > David Lang
> >
> > _______________________________________________
> > rsyslog mailing list
> > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > http://www.rsyslog.com/professional-services/
> > What's up with rsyslog? Follow https://twitter.com/rgerhards
> > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> > DON'T LIKE THAT.
> >
>
