Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

Cameron Bahar Thu, 13 Dec 2012 12:24:19 -0800

Is the chunk size tunable in a Ceph cluster? I don't mean dynamically, but is it
at least statically configurable when a cluster is first installed?


Thanks,
Cameron


On Dec 13, 2012, at 9:41 AM, Gregory Farnum <[email protected]> wrote:

> On Thu, Dec 13, 2012 at 9:27 AM, Sage Weil <[email protected]> wrote:
>> Hi Jutta,
>> 
>> On Thu, 13 Dec 2012, Lachfeld, Jutta wrote:
>>> Hi all,
>>> 
>>> I am currently doing some comparisons between CEPH FS and HDFS as a file 
>>> system for Hadoop using Hadoop's integrated benchmark TeraSort. This 
>>> benchmark first generates the specified amount of data in the file system 
>>> used by Hadoop, e.g. 1TB of data, and then sorts the data via the MapReduce 
>>> framework of Hadoop, sending the sorted output again to the file system 
>>> used by Hadoop.  The benchmark measures the elapsed time of a sort run.
>>> 
>>> I am puzzled by my best result with CEPH FS compared with the results 
>>> achieved with HDFS. With CEPH, the benchmark runtime is somewhat longer: 
>>> about 1.2 times that of an HDFS run using the default HDFS block size of 
>>> 64MB, and about 1.5 times that of an HDFS run using an HDFS block size of 
>>> 512MB.
>>> 
>>> Could you please take a look at the configuration? Perhaps some key factor, 
>>> e.g. the CEPH version, already catches your eye.
>>> 
>>> OS: SLES 11 SP2
>>> 
>>> CEPH:
>>> OSDs are distributed over several machines.
>>> There is 1 MON and 1 MDS process on yet another machine.
>>> 
>>> Replication of the data pool is set to 1.
>>> Underlying file systems for data are btrfs.
>>> Mount options  are only "rw,noatime".
>>> For each CEPH OSD, we use a RAM disk of 256MB for the journal.
>>> Package ceph has version 0.48-13.1, package ceph-fuse has version 0.48-13.1.
>>> 
>>> HDFS:
>>> HDFS is distributed over the same machines.
>>> HDFS name node on yet another machine.
>>> 
>>> Replication level is set to 1.
>>> HDFS block size is set to 64MB or 512MB.
>> 
>> I suspect that this is part of it, especially since the differential
>> increases with larger blocks.  The default ceph block size is only 4MB.
>> I'm not sure if the setting of block sizes is properly wired up; it
>> depends on what version of the hadoop bindings you are using.  Noah would
>> know more.
>> 
>> You can adjust the default block/object size for the fs with the cephfs
>> utility from a kernel mount.  There isn't yet a convenient way to do this
>> via ceph-fuse.
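
To put the block-size point above in numbers, here is a rough sketch of how
many blocks (or objects) a 1TB TeraSort data set splits into at each of the
sizes mentioned in this thread. It assumes binary units (1TB = 2^40 bytes,
1MB = 2^20 bytes); block_count is a name made up for this illustration, and
nothing here queries a real cluster.

```python
# Rough arithmetic only: how many blocks/objects hold a data set
# at a given block size. Binary units are an assumption.
MB = 1024 ** 2
TB = 1024 ** 4


def block_count(total_bytes, block_bytes):
    """Number of blocks needed to hold total_bytes, rounding up."""
    if block_bytes <= 0:
        raise ValueError("block_bytes must be positive")
    return -(-total_bytes // block_bytes)  # ceiling division


if __name__ == "__main__":
    for size_mb in (4, 64, 512):
        n = block_count(TB, size_mb * MB)
        print(f"{size_mb:>3} MB blocks: {n:,} per TB")
```

At 4MB the same data is spread over 16 times as many pieces as at 64MB and
128 times as many as at 512MB, which may be part of the gap described above.
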
> 
> If Jutta is using the *old* bindings I last worked on in 2009, then this
> is already wired up for 64MB blocks. A "ceph pg dump" would let us get
> a rough estimate of the block sizes in use.
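
One way to turn such a dump into a rough estimate, sketched under the
assumption that you have read a total byte count and a total object count
off the output by hand: divide one by the other. The function name and the
sample figures below are hypothetical, and the code does not parse any
"ceph pg dump" output itself.

```python
# Average object size from two totals read manually from a dump.
# Both inputs are placeholders, not values from a real cluster.
MB = 1024 ** 2


def avg_object_size_mb(total_bytes, object_count):
    """Mean object size in MB; raises if there are no objects."""
    if object_count <= 0:
        raise ValueError("object_count must be positive")
    return total_bytes / object_count / MB


if __name__ == "__main__":
    # Hypothetical totals: 2**40 bytes stored in 262144 objects.
    print(avg_object_size_mb(1024 ** 4, 262144))  # 4.0
```

An average near 4 would point to the default object size, while one near 64
would suggest the bindings' setting is taking effect. Small files and
partially filled final objects pull the average down, so treat the result
as a lower bound on the configured size.
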
> 
> "ceph -s" would also be useful to check that everything is set up reasonably.
> 
> Other than that, it would be fair to describe these bindings as
> little-used — minimal performance tests indicated rough parity back in
> 2009, but those were only a couple minutes long and on very small
> clusters, so 1.2x might be normal. Noah and Joe are working on new
> bindings now, and those will be tuned and accompany some backend
> changes if necessary. They might also have a better eye for typical
> results.
> -Greg
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to [email protected]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html