In this case, it's best to think of FileChannel as if it were a database. Suppose we are going to insert 1 million rows: if we committed after each row, would performance be "good"? No. Everyone knows that when you are inserting rows into a database, you want to batch 100-1000 rows into a single commit if you want "good" performance. ("Good" is quoted because it's subjective and depends on the scenario, but in this case we mean lots of MB/second.)

Part of the reason for this is that when a database commits, it does an fsync operation to ensure that all data is written to disk and that you will not lose data in a subsequent power loss. FileChannel behaves *exactly* the same way. If your "batch" is only a single event, FileChannel will do:

  write single event
  fsync
  write single event
  fsync

So if you want "good" performance with FileChannel, you must increase your batch size, just like with a database. If you have a batchSize of, say, 100, then FileChannel will do:

  write event 0
  write event 1
  ...
  write event 99
  fsync

which results in much "better" performance.
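To make that cost model concrete, here is a minimal standalone C sketch (an illustration only, not Flume code; the record count, record size, and output file names are all made up). It writes the same records once with an fsync after every record and once with a single fsync per batch of 100, and prints the elapsed time for each strategy:

#include <fcntl.h>      /* open, O_* flags */
#include <stdio.h>      /* printf */
#include <string.h>     /* memset */
#include <sys/time.h>   /* gettimeofday */
#include <unistd.h>     /* write, fsync, close */

#define NRECORDS 1000
#define RECSIZE  400    /* roughly the event size discussed in this thread */

static long run_case(const char *path, int batch)
{
    char rec[RECSIZE];
    int fd, i;
    struct timeval t0, t1;

    memset(rec, 'x', sizeof rec);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    gettimeofday(&t0, 0);
    for (i = 0; i < NRECORDS; i++) {
        write(fd, rec, sizeof rec);
        if ((i + 1) % batch == 0)
            fsync(fd);          /* one commit per batch */
    }
    fsync(fd);                  /* final commit for any tail records */
    gettimeofday(&t1, 0);
    close(fd);
    return (t1.tv_sec - t0.tv_sec) * 1000000L + (t1.tv_usec - t0.tv_usec);
}

int main(void)
{
    printf("fsync per event (batch=1):   %ld us\n", run_case("batch1.dat", 1));
    printf("fsync per batch (batch=100): %ld us\n", run_case("batch100.dat", 100));
    return 0;
}

On a disk that honors fsync, the batch=100 run should finish one to two orders of magnitude faster. A larger Flume batch size buys exactly this effect; in Flume 1.3 the exec source exposes this as a configurable batchSize property (check the 1.3 user guide for the exact name and default).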
It's worth noting that ExecSource in Flume 1.2 does not have a batchSize, so each event is written and then committed. ExecSource in Flume 1.3, which we will release soon, does have a configurable batchSize. If you want to try that out, you can build it from the flume-1.3.0 branch.

Brock

On Mon, Oct 22, 2012 at 8:59 AM, Brock Noland <[email protected]> wrote:
> Which version? 1.2 or trunk?
>
> On Monday, October 22, 2012 at 8:18 AM, Jagadish Bihani wrote:
>
> Hi
>
> This is the simplistic configuration with which I am getting lower
> performance. Even with a 2-tier architecture (cat source - avro sinks -
> avro source - HDFS sink) I get similar performance with the file channel.
>
> Configuration:
> =========
> adServerAgent.sources = avro-collection-source
> adServerAgent.channels = fileChannel
> adServerAgent.sinks = hdfsSink fileSink
>
> # For each one of the sources, the type is defined
> adServerAgent.sources.avro-collection-source.type=exec
> adServerAgent.sources.avro-collection-source.command= cat /home/hadoop/file.tsf
>
> # The channel can be defined as follows.
> adServerAgent.sources.avro-collection-source.channels = fileChannel
>
> # Define file sink
> adServerAgent.sinks.fileSink.type = file_roll
> adServerAgent.sinks.fileSink.sink.directory = /home/hadoop/flume_sink
> adServerAgent.sinks.fileSink.channel = fileChannel
>
> adServerAgent.channels.fileChannel.type=file
> adServerAgent.channels.fileChannel.dataDirs=/home/hadoop/flume/channel/dataDir5
> adServerAgent.channels.fileChannel.checkpointDir=/home/hadoop/flume/channel/checkpointDir5
> adServerAgent.channels.fileChannel.maxFileSize=4000000000
>
> And it is run with:
> JAVA_OPTS = -Xms500m -Xmx700m -Dcom.sun.management.jmxremote -XX:MaxDirectMemorySize=2g
>
> Regards,
> Jagadish
>
> On 10/22/2012 05:42 PM, Brock Noland wrote:
>
> Hi,
>
> I'll respond in more depth later, but it would help if you posted your
> configuration file and the version of Flume you are using.
>
> Brock
>
> On Mon, Oct 22, 2012 at 6:48 AM, Jagadish Bihani <[email protected]> wrote:
>
> Hi
>
> I am writing this on top of another thread, where there was a discussion of
> "fsync lies" and of the fact that only the file channel uses fsync, not the
> file sink:
>
> -- I tested fsync performance on 2 machines (on one machine I was getting
> very good throughput using the file channel, and on the other almost 100
> times slower with almost the same hardware configuration) using the
> following code:
>
> #include <fcntl.h>      /* open, O_* flags */
> #include <stdio.h>      /* printf */
> #include <sys/time.h>   /* gettimeofday */
> #include <unistd.h>     /* read, write, lseek, fsync, close */
>
> #define PAGESIZE 4096
>
> int main(int argc, char *argv[])
> {
>     char my_read_str[PAGESIZE];
>     char *read_filename = argv[1];
>     int readfd, writefd;
>     int len, iterations, i;
>     struct timeval t0, t1;
>
>     readfd = open(read_filename, O_RDONLY);
>     writefd = open("written_file", O_WRONLY | O_CREAT, 0644);
>
>     /* Find the input size, then rewind to the start. */
>     len = lseek(readfd, 0, SEEK_END);
>     lseek(readfd, 0, SEEK_SET);
>     iterations = len / PAGESIZE;
>
>     for (i = 0; i < iterations; i++)
>     {
>         read(readfd, my_read_str, PAGESIZE);
>         write(writefd, my_read_str, PAGESIZE);
>         /* Time only the fsync itself. */
>         gettimeofday(&t0, 0);
>         fsync(writefd);
>         gettimeofday(&t1, 0);
>         long elapsed = (t1.tv_sec - t0.tv_sec) * 1000000 +
>                        t1.tv_usec - t0.tv_usec;
>         printf("Elapsed time is = %ld\n", elapsed);
>     }
>     close(readfd);
>     close(writefd);
>     return 0;
> }
>
> -- As expected, fsync typically takes around 50000 microseconds to complete
> on one machine, while on the other machine it takes around 200-290
> microseconds on average. So is the machine with higher performance doing an
> 'fsync lie'?
>
> -- If I have understood correctly, an "fsync lie" means the data is not
> actually written to disk but is sitting in some disk/controller buffer.
> (I) If the disk then loses power due to a shutdown or some other disaster,
> data will be lost. (II) Can data be lost even without a power loss? (e.g. if
> the data is kept in some disk buffer and fsync is invoked continuously, can
> that data also be lost?) If only part I is true, that may be acceptable,
> because the probability of power loss is usually low in a production
> environment. But if II is also true, then there is a problem.
>
> -- But on the machine where the disk doesn't lie, the performance of Flume
> using the file channel is very low (I have seen at most 100 KB/sec, even
> with sufficient direct memory allocated). Does anybody have stats on file
> channel throughput? Is anybody getting better performance with the file
> channel (without fsync lies)? What is the recommended usage for an average
> scenario, i.e. transferring files of a few MB to an HDFS sink continuously
> on typical hardware (16-core processors, 16 GB RAM, etc.)?
>
> Regards,
> Jagadish
>
> On 10/10/2012 11:30 PM, Brock Noland wrote:
>
> Hi,
>
> On Wed, Oct 10, 2012 at 11:22 AM, Jagadish Bihani <[email protected]> wrote:
>
> Hi Brock
>
> I will surely look into 'fsync lies'.
>
> But as per my experiments I think the file channel is causing the issue,
> because on those 2 machines (one with higher throughput and the other with
> lower) I did the following experiment:
>
> cat source - memory channel - file sink
>
> With this setup I got the same throughput on both machines (around 3
> MB/sec). Since I used a file sink, it should also do an fsync at some
> point. Both 'file sink' and 'file channel' do disk writes, so if there were
> a difference in disk behaviour, it should be visible with the file sink too.
>
> Am I missing something here?
>
> File sink does not call fsync.
>
> Regards,
> Jagadish
>
> On 10/10/2012 09:35 PM, Brock Noland wrote:
>
> OK, your disk that is giving you 40KB/second is telling you the truth, and
> the faster disk is lying to you. Look up "fsync lies" to see what I am
> referring to.
>
> A spinning disk can do about 100 fsync operations per second (an fsync is
> done at the end of every batch). That is how I estimated your event size:
> at 100 single-event batches per second, 40KB/second is 40KB / 100 = 409
> bytes per event.
>
> Once again, if you want increased performance, you should increase the
> batch size.
>
> Brock
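A quick sanity check of that estimate, using only the numbers from this thread (roughly 100 fsyncs/second on an honest spinning disk, and events of about 400 bytes):

  batch size 1:   100 fsyncs/sec x 1 event    x ~400 bytes = ~40 KB/sec
  batch size 100: 100 fsyncs/sec x 100 events x ~400 bytes = ~4 MB/sec

In other words, on a disk that honors fsync, throughput should scale roughly linearly with batch size until the disk's sequential bandwidth, the source, or the sink becomes the new bottleneck.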
> On Wed, Oct 10, 2012 at 11:00 AM, Jagadish Bihani <[email protected]> wrote:
>
> Hi
>
> Yes. It is around 480 - 500 bytes.
>
> On 10/10/2012 09:24 PM, Brock Noland wrote:
>
> How big are your events? Average about 400 bytes?
>
> Brock
>
> On Wed, Oct 10, 2012 at 5:11 AM, Jagadish Bihani <[email protected]> wrote:
>
> Hi
>
> Thanks for the inputs, Brock. After doing several experiments, the problem
> eventually boiled down to the disks.
>
> -- But I had used the same configuration (so all software components are
> the same) on all 3 machines.
> -- The User Guide says that if multiple file channel instances are active
> on the same agent, then different disks are preferable. But in my case only
> one file channel is active per agent.
> -- The only pattern I observed is that the machines where I got better
> performance have multiple disks. But I don't understand how that helps if
> I have only 1 active file channel.
> -- What is the impact of the type of disk / disk device driver on
> performance? I don't understand why I get 40 KB/sec with one disk and
> 2 MB/sec with another.
>
> Could you please elaborate on the correlation between the file channel
> and disks?
>
> Regards,
> Jagadish
>
> On 10/09/2012 08:01 PM, Brock Noland wrote:
>
> Hi,
>
> Using the file channel, the number and type of disks is going to be much
> more predictive of performance than CPU or RAM. Note that consumer-level
> drives/controllers will give you much "better" performance because they
> lie to you about when your data is actually written to the drive. If you
> search for "fsync lies" you'll find more information on this.
>
> You probably want to increase the batch size to get better performance.
>
> Brock
>
> On Tue, Oct 9, 2012 at 2:46 AM, Jagadish Bihani <[email protected]> wrote:
>
> Hi
>
> My Flume setup is:
>
> Source agent: cat source - file channel - avro sink
> Dest agent: avro source - file channel - HDFS sink
>
> There is only 1 source agent and 1 destination agent.
>
> I measure throughput as the amount of data written to HDFS per second.
> (I have a rolling interval of 30 sec, so if a 60 MB file is generated in
> 30 sec, the throughput is 2 MB/sec.)
>
> I have run the source agent on various machines with different hardware
> configurations. (In all cases I run the Flume agent with
> JAVA_OPTS="-Xms500m -Xmx1g -Dcom.sun.management.jmxremote
> -XX:MaxDirectMemorySize=2g".)
>
> JDK is 32 bit.
>
> Experiment 1:
> =====
> RAM: 16 GB
> Processor: Intel Xeon E5620 @ 2.40 GHz (16 cores).
> 64-bit processor with 64-bit kernel.
> Throughput: 2 MB/sec
>
> Experiment 2:
> ======
> RAM: 4 GB
> Processor: Intel Xeon E5504 @ 2.00 GHz (4 cores).
> 64-bit processor with 32-bit kernel.
> Throughput: 30 KB/sec
>
> Experiment 3:
> ======
> RAM: 8 GB
> Processor: Intel Xeon E5520 @ 2.27 GHz (16 cores).
> 64-bit processor with 32-bit kernel.
> Throughput: 80 KB/sec
>
> -- So, as can be seen, there is a huge difference in throughput with the
> same configuration but different hardware.
> -- In the first case, where throughput is higher, RES is around 160 MB; in
> the other cases it is in the range of 40-50 MB.
>
> Can anybody give insight into why there is such a huge difference in
> throughput? What is the correlation between RAM and file channel / HDFS
> sink performance, and with a 32-bit vs 64-bit kernel?
>
> Regards,
> Jagadish
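For what it's worth, the fsync latencies measured earlier in this thread already predict a gap of this size: at ~50000 microseconds per fsync, a disk can complete only about 1,000,000 / 50,000 = 20 commits per second, while at ~290 microseconds it can complete roughly 3,400 per second. That is more than a 100x difference in commit rate, which lines up with the throughput spread above far better than differences in CPU, RAM, or kernel bitness do.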
>
> --
> Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/


--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
