Thanks, Cos!

> from Ignite standpoint replacing one with another doesn't give much advantage

Agreed. From the standpoint of Ignite, Hadoop, or Spark, Gluster works no
differently than HDFS. If Ignite doesn't have an object store available
already, then Ceph could add that capability.

From the standpoint of the user and integration with a larger IT
infrastructure, Gluster offers advantages over HDFS. As you say, Gluster is a
POSIX-compatible native filesystem -- it provides a FUSE module for mounting
remote Gluster volumes. This means non-Hadoop applications can store data in
the same file system as Hadoop.

I come from a scientific computing background where pretty much every
simulation or analysis tool expected access to a POSIX file system. We
evaluated Hadoop at one point but chose not to use it because we would have to
copy all of our data into HDFS. Gluster is a much better POSIX distributed
file system than what my university's cluster used, and I wish I had known
about it while doing my Ph.D. :)

For my work at Red Hat, we run Spark on Gluster. We don't use any special
plugins -- since Spark uses the Hadoop file system libraries, Spark can read
off native file systems. Same advantages mentioned above -- nice to be able to
use grep, cat, etc. alongside Spark :)

On Mon, Jul 13, 2015 at 8:55 PM, Konstantin Boudnik <[email protected]> wrote:

> On Mon, Jul 13, 2015 at 07:00PM, RJ Nowling wrote:
> > Cos,
> >
> > Can you expand on what you mean by "native to Linux" for Ceph?
>
> I meant that the file system is presented in a Linux distro as kernel module.
> HDFS, as you know, is an alien Java process that creates a layer indirection
> on top of say ext4 or jfs to provide a distributed storage; Ceph does this
> similarly to other _native_ file systems.
>
> > And can you elaborate on why Gluster doesn't make sense as a HDFS
> > replacement to you?
>
> What I wanted to express, perhaps a bit clumsy, is that HDFS and Gluster are
> two instances of HCFS.
> from Ignite standpoint replacing one with another doesn't give much advantage
> (unless I am missing something about the Gluster).
> Hopefully it makes sense?
>
> > Not trying to argue -- just generally curious. :)
>
> Not trying to cast a shadow on Gluster nor whitewash HDFS (far from it) ;)
>
> Cos
>
> > Thanks!
> >
> > On Mon, Jul 13, 2015 at 5:06 PM, Konstantin Boudnik <[email protected]> wrote:
> >
> > > I think file system is more universally used. However, one can build an FS
> > > on top of a good object storage - just need to provide some metadata
> > > abstraction/concept.
> > >
> > > Replacing HDFS w/ Gluster doesn't make much sense to me (if ever be
> > > considered). What I like about Ceph is that it is native to Linux, unlike
> > > all other artificial HCFS contraptions. Hence my initial question.
> > >
> > > Cos
> > >
> > > On Thu, Jul 09, 2015 at 01:53AM, Dmitriy Setrakyan wrote:
> > > > Hm... I would think that file system would be more beneficial, although
> > > > object store on disk can also be valuable.
> > > >
> > > > Cos, what is your thinking?
> > > >
> > > > D.
> > > >
> > > > On Wed, Jul 8, 2015 at 8:40 PM, RJ Nowling <[email protected]> wrote:
> > > >
> > > > > Ceph makes a better object store while Gluster makes a better file
> > > > > system. That's why Ceph is a popular backend for OpenStack Swift.
> > > > >
> > > > > Does Ignite want a FS or Object backend?
> > > > >
> > > > > On Wed, Jul 8, 2015 at 5:57 PM, Konstantin Boudnik <[email protected]> wrote:
> > > > >
> > > > > > Good point... although I was curious about Ignite's take on that first
> > > > > > and foremost. Yet, cross-posting to [email protected]
> > > > > >
> > > > > > Jay et all: any thoughts about the combination?
> > > > > > Cos
> > > > > >
> > > > > > On Wed, Jul 08, 2015 at 03:14PM, Roman Shaposhnik wrote:
> > > > > > > I'm sure our RH brethren have something to say about Ceph.
> > > > > > > Re-post on dev@bigtop?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Roman.
> > > > > > >
> > > > > > > On Wed, Jul 8, 2015 at 3:17 PM, Konstantin Boudnik <[email protected]> wrote:
> > > > > > > > Guys,
> > > > > > > >
> > > > > > > > I was looking at the Hadoop accelerator the other day and been
> > > > > > > > thinking if anyone has tried to use IGFS on top of a real distributed
> > > > > > > > file storage. The case in point is Ceph (ceph.com) - a Linux file
> > > > > > > > system available from any major Linux distribution as a kernel module.
> > > > > > > >
> > > > > > > > HDFS has its share in the world, but it isn't the fastest, simplest,
> > > > > > > > nor most advantageous distributed storage on the planet. Hence I am
> > > > > > > > wondering if this would be a good call to provide Ignite on CEPH as a
> > > > > > > > 2nd FS capabilities.
> > > > > > > >
> > > > > > > > Thoughts?
> > > > > > > > Cos
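
P.S. To make the "grep, cat, etc. alongside Spark" point at the top of this message concrete, here is a minimal Python sketch. It assumes a Gluster volume is FUSE-mounted at some ordinary directory (a hypothetical path such as /mnt/gluster/data); the function name grep_tree and the sample file name are made up for illustration. Nothing in it is Gluster-specific, which is the point: it uses only plain POSIX file calls, and a temporary directory stands in for the mount so the sketch runs anywhere.

```python
import os
import re
import tempfile


def grep_tree(root, pattern):
    """Yield (path, line_number, line) for each line under root matching pattern.

    Only ordinary POSIX file calls are used, so this works on any mounted
    file system, including (by assumption) a FUSE-mounted Gluster volume.
    """
    regex = re.compile(pattern)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "r", encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, number, line.rstrip("\n")


if __name__ == "__main__":
    # A temporary directory stands in for a hypothetical mount point
    # such as /mnt/gluster/data.
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "part-00000"), "w", encoding="utf-8") as out:
            out.write("alpha\nbeta\nalphabet\n")
        for path, number, line in grep_tree(root, r"^alpha"):
            print(f"{os.path.basename(path)}:{number}:{line}")
```

A Spark job would be pointed at the same directory with a file:// URI, for example sc.textFile("file:///mnt/gluster/data") (the path is again an assumption), provided the volume is mounted at the same location on every worker node.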
