Is the chunk size tunable in a Ceph cluster? I don't mean dynamically, but is it at least statically configurable when a cluster is first installed?
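
As a back-of-the-envelope illustration of the sizes mentioned in the quoted thread below (4MB Ceph default, 64MB and 512MB HDFS blocks, 1TB of TeraSort data), here is a minimal Python sketch of how many chunks the data set splits into at each size. The helper name `chunk_count` and the use of binary units (1MB = 2**20 bytes) are assumptions made for illustration; nothing here comes from Ceph or Hadoop themselves.

```python
# Rough arithmetic only: how many objects/blocks a data set splits into
# at a given chunk size. Binary units are an assumption for illustration.
MB = 2**20
TB = 2**40


def chunk_count(total_bytes: int, chunk_bytes: int) -> int:
    """Return the number of chunks needed to hold total_bytes (ceiling division)."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    return -(-total_bytes // chunk_bytes)


if __name__ == "__main__":
    # Sizes taken from the thread: Ceph default vs. the two HDFS settings.
    for label, size in [("4MB", 4 * MB), ("64MB", 64 * MB), ("512MB", 512 * MB)]:
        print(f"{label:>6}: {chunk_count(TB, size)} chunks per 1TB")
```

Under these assumptions, 4MB chunks produce 16 times as many pieces as 64MB chunks and 128 times as many as 512MB chunks for the same data. Whether that accounts for the benchmark difference is not something this arithmetic can establish.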

Thanks,
Cameron

Sent from my iPhone

On Dec 13, 2012, at 9:41 AM, Gregory Farnum <[email protected]> wrote:

> On Thu, Dec 13, 2012 at 9:27 AM, Sage Weil <[email protected]> wrote:
>> Hi Jutta,
>>
>> On Thu, 13 Dec 2012, Lachfeld, Jutta wrote:
>>> Hi all,
>>>
>>> I am currently doing some comparisons between CEPH FS and HDFS as a file
>>> system for Hadoop using Hadoop's integrated benchmark TeraSort. This
>>> benchmark first generates the specified amount of data in the file system
>>> used by Hadoop, e.g. 1TB of data, and then sorts the data via the MapReduce
>>> framework of Hadoop, sending the sorted output again to the file system
>>> used by Hadoop. The benchmark measures the elapsed time of a sort run.
>>>
>>> I am wondering about my best result achieved with CEPH FS in comparison to
>>> the ones achieved with HDFS. With CEPH, the runtime of the benchmark is
>>> somewhat longer, the factor is about 1.2 when comparing with an HDFS run
>>> using the default HDFS block size of 64MB. When comparing with an HDFS run
>>> using an HDFS block size of 512MB the factor is even 1.5.
>>>
>>> Could you please take a look at the configuration, perhaps some key factor
>>> already catches your eye, e.g. CEPH version.
>>>
>>> OS: SLES 11 SP2
>>>
>>> CEPH:
>>> OSDs are distributed over several machines.
>>> There is 1 MON and 1 MDS process on yet another machine.
>>>
>>> Replication of the data pool is set to 1.
>>> Underlying file systems for data are btrfs.
>>> Mount options are only "rw,noatime".
>>> For each CEPH OSD, we use a RAM disk of 256MB for the journal.
>>> Package ceph has version 0.48-13.1, package ceph-fuse has version 0.48-13.1.
>>>
>>> HDFS:
>>> HDFS is distributed over the same machines.
>>> HDFS name node on yet another machine.
>>>
>>> Replication level is set to 1.
>>> HDFS block size is set to 64MB or even 512MB.
>>
>> I suspect that this is part of it. The default ceph block size is only
>> 4MB. Especially since the differential increases with larger blocks.
>> I'm not sure if the setting of block sizes is properly wired up; it
>> depends on what version of the hadoop bindings you are using. Noah would
>> know more.
>>
>> You can adjust the default block/object size for the fs with the cephfs
>> utility from a kernel mount. There isn't yet a convenient way to do this
>> via ceph-fuse.
>
> If Jutta is using the *old* ones I last worked on in 2009, then this
> is already wired up for 64MB blocks. A "ceph pg dump" would let us get
> a rough estimate of the block sizes in use.
>
> "ceph -s" would also be useful to check that everything is set up reasonably.
>
> Other than that, it would be fair to describe these bindings as
> little-used; minimal performance tests indicated rough parity back in
> 2009, but those were only a couple minutes long and on very small
> clusters, so 1.2x might be normal. Noah and Joe are working on new
> bindings now, and those will be tuned and accompany some backend
> changes if necessary. They might also have a better eye for typical
> results.
> -Greg
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
