Re: Restarts without data loss

Juhani Connolly Mon, 09 Jul 2012 03:49:54 -0700

It is currently pushing only 10 events per second or so (roughly 250 bytes per event). This is with datadir/checkpoint on the same directory. Of course, the fact that there is a tail process running and that Tomcat is also writing out logs is without a doubt compounding the problem somewhat.

I haven't taken a serious look at thread dumps of the file channel since I don't have a thorough understanding of it. However, analysis has involved trying varying numbers of sinks (no throughput difference) and replacing it with the memory channel (which easily handles the 650-ish requests per second we have per server for the particular API, no problems even with a single sink).

Since you say there's heavy fsyncing: with 7200rpm disks, each seek will have an average latency of 4.16ms, so for alternating seeks between the checkpoint and the data dir, if each of those writes happens in order, you're already limited to a best case of barely more than 100 events per second. Our experience so far has shown it to be significantly less.
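For reference, the back-of-the-envelope arithmetic behind that ceiling can be written out as below. This is only a rough model: it assumes the 4.16ms figure is the average rotational latency (half a revolution at 7200rpm), that every event costs exactly two such waits (one for the data dir, one for the checkpoint), and it ignores head movement and transfer time. The function names are made up for illustration.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: half of one full revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def max_events_per_second(rpm: float, seeks_per_event: int) -> float:
    """Best-case event rate if each event waits for `seeks_per_event` seeks."""
    ms_per_event = seeks_per_event * avg_rotational_latency_ms(rpm)
    return 1000.0 / ms_per_event


latency_ms = avg_rotational_latency_ms(7200)            # about 4.17 ms
ceiling = max_events_per_second(7200, seeks_per_event=2)  # about 120 events/s
print(f"{latency_ms:.2f} ms per seek, at most {ceiling:.0f} events/s")
```

Under those assumptions the ceiling works out to about 120 events per second, which matches "barely more than 100".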

I do believe that batching a bunch of puts or takes together with a single commit, as two seeks followed by writes (or one if we can use only a single storage file), could give significant returns when paired with a batching sink/source (which many already do... requesting multiple items at a time).
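To illustrate the idea, here is a minimal sketch of a batched commit on a plain append-only file. It is not how FileChannel is implemented; the `write_events` helper, the file names and the batch sizes are all hypothetical. It only shows that the number of fsync calls drops from one per event to one per batch.

```python
import os
import tempfile


def write_events(path, events, batch_size):
    """Append events to `path`, calling fsync once per batch.

    Returns the number of fsync calls made.
    """
    fsyncs = 0
    with open(path, "ab") as log:
        for start in range(0, len(events), batch_size):
            for event in events[start:start + batch_size]:
                log.write(event)
            log.flush()             # push Python's buffer to the OS
            os.fsync(log.fileno())  # force the OS to write to disk
            fsyncs += 1
    return fsyncs


# 200 events of roughly 250 bytes each, as in the numbers above.
events = [b"x" * 250 for _ in range(200)]
with tempfile.TemporaryDirectory() as tmp:
    per_event = write_events(os.path.join(tmp, "per_event.log"), events, 1)
    batched = write_events(os.path.join(tmp, "batched.log"), events, 50)

print(per_event, batched)
```

With one commit per event every event pays for its own fsync, while a batch of 50 shares a single one, so on a disk where the fsync dominates the throughput should scale roughly with the batch size.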

If there is any specific data you would like, I would be happy to try and provide it.


On 07/09/2012 05:22 PM, Brock Noland wrote:

On Mon, Jul 9, 2012 at 8:51 AM, Juhani Connolly wrote:
     - Intended setup with flume was a file channel connected to an
    avro sink. With only a single disk available, it is extremely
    slow. JDBC channel is also extremely slow, and MemoryChannel will
    fill up and start refusing puts as soon as a network issue comes up.
Have you taken a few thread dumps or done other analysis? When you say "extremely slow", what do you mean? Configured for no data loss, FileChannel is going to be doing a lot of fsync'ing, so I am not surprised it's slow. That is a property of disks, not FileChannel. I think we should use group commit, but that shouldn't make it 10x faster.
Brock
