Hi,

Greg has spent some time fixing up a Hadoop FileSystem module that allows a 
Hadoop cluster to use Ceph in place of HDFS.  It hasn't seen extensive 
testing or benchmarking (we don't use Hadoop internally), but it passes 
our basic tests and seems to have similar performance to HDFS.

The main reason Hadoop users might be interested is the scaling problems 
people are having with HDFS's namenode.  Ceph's MDS maintains minimal 
per-inode metadata (no block lists), doesn't require that it all be in 
memory, and (perhaps most importantly) has a clustered MDS architecture, 
allowing metadata to be spread across tens or possibly hundreds of nodes.

Anyway, we're very much interested in seeing Ceph perform well for Hadoop.  

The Hadoop module can be found in src/client/hadoop, and has been 
submitted for inclusion in the next Hadoop release.  It relies on libceph, 
which can be built and installed from source or installed from a .deb.

sage

_______________________________________________
Ceph-devel mailing list
Ceph-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ceph-devel
