Ceph and GlusterFS are NOT centralized file systems. GlusterFS can be used with Hadoop MapReduce, but it requires a special plug-in, and HDFS 2 supports high availability (HA), so it's probably not worth switching. YMMV.

On Dec 31, 2013 4:01 PM, "Jiayu Ji" <[email protected]> wrote:
> I am not very familiar with Ceph and GlusterFS, but I know they are
> centralized file systems. In this kind of FS, the compute nodes and the
> storage nodes are separated. If the size of your data increases, the
> network may eventually become the bottleneck.
>
> Hadoop is a framework that includes storage (HDFS) and computation
> (MapReduce). It aims to bring the computation power to the storage node.
> In this case, it assigns tasks to where the data is stored due to its
> awareness of the data locality. Also, if the size of data increases, you
> can add more nodes to the cluster. By doing that, you achieve almost
> linear scalability.
>
> On Sat, Dec 28, 2013 at 1:26 PM, Kurt Moesky <[email protected]> wrote:
>
>> Hi Charles,
>>
>> That is actually what we're doing, comparing the Hadoop file system to
>> Ceph and GlusterFS. Just looking for some input from the field as to
>> what you experts see as the strengths of HDFS over Ceph and GlusterFS.
>>
>> Thanks,
>> Kurt
>>
>> On Sat, Dec 28, 2013 at 11:42 AM, Charles Earl <[email protected]> wrote:
>>
>>> Would it not be better to compare HDFS as the others are distributed
>>> file systems?
>>> Charles
>>>
>>> On Dec 28, 2013, at 1:40 PM, Kurt Moesky <[email protected]> wrote:
>>>
>>> > Hi guys,
>>> >
>>> > I am working on a write-up of Hadoop, Ceph and GlusterFS and was
>>> > wondering if you could chime in with some benefits of Hadoop over
>>> > the other two?
>>> >
>>> > I know Hadoop is widely used by the likes of Yahoo and Facebook.
>>> >
>>> > Are there benefits in scaling, management (I like the Ambari
>>> > interface) etc?
>>> >
>>> > Thanks.
>
> --
> Jiayu (James) Ji,
>
> Cell: (312)823-7393
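
For what it's worth, the data-locality idea in the quoted message (send the task to a node that already holds the data, and only go over the network when no such node is free) can be sketched roughly as below. This is only a toy illustration, not Hadoop's actual scheduler: the function name `assign_tasks`, the block and node names, and the slot counts are all made up for the example.

```python
def assign_tasks(block_locations, free_slots):
    """Greedy, locality-aware task placement (toy model, not Hadoop's scheduler).

    block_locations: dict mapping block id -> list of nodes holding a replica
    free_slots:      dict mapping node name -> number of free task slots
    Returns a dict mapping block id -> (node, is_local).
    """
    slots = dict(free_slots)  # work on a copy so the caller's dict is untouched
    assignment = {}
    for block, replicas in block_locations.items():
        # First choice: a node that stores a replica and still has a free slot.
        local = [n for n in replicas if slots.get(n, 0) > 0]
        if local:
            node = max(local, key=lambda n: slots[n])
            is_local = True
        else:
            # Fallback: any node with a free slot; the block must then be
            # read over the network.
            remote = [n for n, s in slots.items() if s > 0]
            if not remote:
                raise RuntimeError("no free task slots left")
            node = max(remote, key=lambda n: slots[n])
            is_local = False
        slots[node] -= 1
        assignment[block] = (node, is_local)
    return assignment


# Hypothetical cluster: three nodes, three blocks.
blocks = {"b1": ["n1", "n2"], "b2": ["n1"], "b3": ["n1"]}
slots = {"n1": 2, "n2": 1, "n3": 1}
for block, (node, is_local) in assign_tasks(blocks, slots).items():
    print(block, node, "local" if is_local else "remote")
```

In this made-up cluster the first two tasks run where their data lives, and the third has to read its block remotely because n1 has run out of slots.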
