We use Lustre and a couple of internal data storage services. I wouldn't particularly recommend Lustre; it has a single point of failure (SPOF), which is a problem at scale. I just wanted to point out that you can skip HDFS if you so choose.

On Wednesday, October 22, 2014, Dick Davies <[email protected]> wrote:
> Be interested to know what that is, if you don't mind sharing.
>
> We're thinking of deploying a Ceph cluster for another project anyway,
> it seems to remove some of the chokepoints/points of failure HDFS suffers from
> but I've no idea how well it can interoperate with the usual HDFS clients
> (Spark in my particular case but I'm trying to keep this general).
>
> On 21 October 2014 13:16, David Greenberg <[email protected]> wrote:
> > We use spark without HDFS--in our case, we just use ansible to copy the
> > spark executors onto all hosts at the same path. We also load and store our
> > spark data from non-HDFS sources.
> >
> > On Tue, Oct 21, 2014 at 4:57 AM, Dick Davies <[email protected]> wrote:
> >>
> >> I think Spark needs a way to send jobs to/from the workers - the Spark
> >> distro itself will pull down the executor ok, but in my (very basic)
> >> tests I got stuck without HDFS.
> >>
> >> So basically it depends on the framework. I think in Sparks case they
> >> assume most users are migrating from an existing Hadoop deployment,
> >> so HDFS is sort of assumed.
> >>
> >> On 20 October 2014 23:18, CCAAT <[email protected]> wrote:
> >> > On 10/20/14 11:46, Steven Schlansker wrote:
> >> >
> >> >> We are running Mesos entirely without HDFS with no problems. We use
> >> >> Docker to distribute our application to slave nodes, and keep no
> >> >> state on individual nodes.
> >> >
> >> > Background: I'm building up a 3 node cluster to run mesos and spark.
> >> > No legacy Hadoop needed or wanted. I am using btrfs for the local
> >> > file system, with (2) drives set up for raid1 on each system.
> >> >
> >> > So you are suggesting that I can install mesos + spark + docker
> >> > and not a DFS on these (3) machines?
> >> >
> >> > Will I need any other softwares? My application is a geophysical
> >> > fluid simulator, so scala, R, and all sorts of advanced math will
> >> > be required on the cluster for the Finite Element Methods.
> >> >
> >> > James

