This is what I also intend to do. Is an S3 path considered non-HDFS? If so, how does Spark know which credentials to use to fetch the file?
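
For reference, the S3 connectors go through Hadoop's filesystem layer but need no HDFS cluster at all: the s3n connector reads its keys from the Hadoop configuration (or from AWS environment variables). A minimal Scala sketch, assuming the s3n connector on Spark 1.x; the bucket name, path, and env-var handling are illustrative, not from this thread:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("s3-no-hdfs"))

    // The s3n connector picks its credentials up from these Hadoop
    // configuration properties -- no HDFS or Hadoop cluster involved.
    sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId",
                               sys.env("AWS_ACCESS_KEY_ID"))
    sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey",
                               sys.env("AWS_SECRET_ACCESS_KEY"))

    // "my-bucket" and the key below are placeholders.
    val lines = sc.textFile("s3n://my-bucket/input/part-00000")
    println(lines.count())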
Sent from my iPhone

> On Oct 21, 2014, at 5:16 AM, David Greenberg <[email protected]> wrote:
>
> We use spark without HDFS--in our case, we just use ansible to copy the
> spark executors onto all hosts at the same path. We also load and store
> our spark data from non-HDFS sources.
>
>> On Tue, Oct 21, 2014 at 4:57 AM, Dick Davies <[email protected]> wrote:
>> I think Spark needs a way to send jobs to/from the workers - the Spark
>> distro itself will pull down the executor ok, but in my (very basic)
>> tests I got stuck without HDFS.
>>
>> So basically it depends on the framework. I think in Spark's case they
>> assume most users are migrating from an existing Hadoop deployment, so
>> HDFS is sort of assumed.
>>
>> On 20 October 2014 23:18, CCAAT <[email protected]> wrote:
>> > On 10/20/14 11:46, Steven Schlansker wrote:
>> >
>> >> We are running Mesos entirely without HDFS with no problems. We use
>> >> Docker to distribute our application to slave nodes, and keep no
>> >> state on individual nodes.
>> >
>> > Background: I'm building up a 3-node cluster to run Mesos and Spark.
>> > No legacy Hadoop needed or wanted. I am using btrfs for the local
>> > file system, with (2) drives set up for RAID1 on each system.
>> >
>> > So you are suggesting that I can install Mesos + Spark + Docker
>> > and not a DFS on these (3) machines?
>> >
>> > Will I need any other software? My application is a geophysical
>> > fluid simulator, so Scala, R, and all sorts of advanced math will
>> > be required on the cluster for the Finite Element Methods.
>> >
>> > James
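
On the "way to send jobs to/from the workers" point: on Mesos the executor tarball does not have to live in HDFS either; spark.executor.uri accepts any URI the slaves can fetch, e.g. plain HTTP. A sketch under that assumption; the ZooKeeper address, file server hostname, and tarball name below are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}

    // Serve the Spark distribution over HTTP instead of HDFS; each
    // Mesos slave downloads and unpacks it when an executor launches.
    val conf = new SparkConf()
      .setMaster("mesos://zk://zk1:2181,zk2:2181/mesos")
      .setAppName("no-hdfs-on-mesos")
      .set("spark.executor.uri",
           "http://fileserver.internal/spark-1.1.0-bin-hadoop2.4.tgz")
    val sc = new SparkContext(conf)

This is the same idea as the Ansible approach above (identical bits at a known location on every host), just pulled on demand instead of pushed ahead of time.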

