Finally: how big do the "multiple disks configured as separate filesystems" that are used for temporary Spark storage need to be?
Thanks,
Craig

On Tue, Oct 15, 2013 at 1:12 PM, Craig Vanderborgh <[email protected]> wrote:

> In particular: if I make the "SPARK_WORKER_INSTANCES" env variable setting
> in spark-env.sh, will this propagate through Mesos and result in (say) two
> workers per cluster node?
>
> Thanks,
> Craig
>
>
> On Tue, Oct 15, 2013 at 1:07 PM, Craig Vanderborgh <[email protected]> wrote:
>
>> Hi Matei,
>>
>> This is helpful, but it would be even more so if this documentation could
>> describe how to make these settings correctly in a Spark-on-Mesos
>> environment. Can you describe the differences for Mesos?
>>
>> Thanks again,
>> Craig
>>
>>
>> On Mon, Oct 14, 2013 at 6:15 PM, Matei Zaharia <[email protected]> wrote:
>>
>>> Hi Craig,
>>>
>>> The best configuration is to have multiple disks configured as separate
>>> filesystems (so no RAID), and to set the spark.local.dir property, which
>>> configures Spark's scratch-space directories, to a comma-separated list
>>> of directories, one per disk. In 0.8 we've written a bit on how to
>>> configure machines for Spark here:
>>> http://spark.incubator.apache.org/docs/latest/hardware-provisioning.html
>>> For the filesystem I'd suggest ext3 with noatime set.
>>>
>>> Matei
>>>
>>> On Oct 14, 2013, at 11:28 AM, Craig Vanderborgh <[email protected]> wrote:
>>>
>>> > Hi All,
>>> >
>>> > We're setting up a new Spark-on-Mesos cluster. I'd like anyone who has
>>> already done this to suggest a disk partitioning/filesystem layout that has
>>> worked well for them in their cluster deployment.
>>> >
>>> > We are running MapR M3 on the cluster, but only for maprfs. Our jobs
>>> will be programmed for and run on Spark.
>>> >
>>> > Thanks in advance,
>>> > Craig Vanderborgh
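For what it's worth, Matei's advice above (one scratch directory per disk, no RAID, ext3 with noatime) could be sketched in spark-env.sh roughly like this. This is only a sketch under assumptions: the mount points /data1 through /data4 are hypothetical, and it assumes four data disks each formatted as its own filesystem. In the 0.8-era scripts, a system property like spark.local.dir can be passed to the JVM via SPARK_JAVA_OPTS:

```shell
# spark-env.sh -- sketch only; /data1../data4 are hypothetical mount
# points, one per physical disk, each its own ext3 filesystem (no RAID),
# mounted with noatime, e.g. in /etc/fstab:
#   /dev/sdb1  /data1  ext3  defaults,noatime  0  2

# Spark scratch space: a comma-separated list, one directory per disk,
# so shuffle and spill I/O is spread across spindles.
export SPARK_JAVA_OPTS="-Dspark.local.dir=/data1/spark,/data2/spark,/data3/spark,/data4/spark $SPARK_JAVA_OPTS"
```

The directories themselves would need to exist and be writable by the user running the Spark workers or Mesos executors.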
