Re: Configuring amount of disk space available to spark executors in mesos?

2015-04-13 Thread Patrick Wendell
Hey Jonathan,

Are you referring to disk space used for storing persisted RDDs? For
that, Spark does not bound the amount of data persisted to disk. It's
a similar story to how Spark's shuffle disk output works; Hadoop and
other frameworks make the same assumption for their shuffle data,
AFAIK.

We could (in theory) add a storage level that bounds the amount of
data persisted to disk and forces re-computation if a partition does
not fit. I'd be interested to hear more about a workload where that's
relevant, though, before going that route. Maybe it would make sense
if people are using SSDs.

- Patrick
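The bounded storage level described above does not exist in Spark; as a rough illustration of the behavior Patrick sketches (persist to disk only if it fits a quota, otherwise fall back to recomputation on each access), here is a toy in plain Python. Everything here is hypothetical and is not part of any Spark API:

```python
# Toy sketch (NOT Spark API): a disk cache with a byte quota that
# refuses to persist values that don't fit, forcing recomputation.
import os
import pickle
import tempfile


class BoundedDiskCache:
    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.dir = tempfile.mkdtemp()
        self.used = 0
        self.paths = {}  # key -> on-disk path for persisted values

    def get_or_compute(self, key, compute):
        # Serve from disk if this key was previously persisted.
        if key in self.paths:
            with open(self.paths[key], "rb") as f:
                return pickle.load(f)
        value = compute()
        blob = pickle.dumps(value)
        # Persist only if it fits under the remaining quota; otherwise
        # the next access falls through to recomputation, which is the
        # trade-off described in the message above.
        if self.used + len(blob) <= self.max_bytes:
            path = os.path.join(self.dir, str(key))
            with open(path, "wb") as f:
                f.write(blob)
            self.used += len(blob)
            self.paths[key] = path
        return value
```

A small partition is computed once and then served from disk; one that exceeds the quota is recomputed on every access instead of being persisted.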

On Mon, Apr 13, 2015 at 8:19 AM, Jonathan Coveney jcove...@gmail.com wrote:
 I'm surprised that I haven't been able to find this via google, but I
 haven't...

 What is the setting that requests some amount of disk space for the
 executors? Maybe I'm misunderstanding how this is configured...

 Thanks for any help!




Re: Configuring amount of disk space available to spark executors in mesos?

2015-04-13 Thread Jonathan Coveney
Nothing so complicated... we are seeing Mesos kill off our executors
immediately. When I reroute logging to an NFS directory we have available,
the executors survive fine. As such, I am wondering whether the Spark
executors are getting killed by Mesos for exceeding their disk quota
(which at the moment is 0). This could be a red herring, however.
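A related knob worth ruling out here, offered as an assumption rather than a confirmed fix for the Mesos quota issue: spark.local.dir controls where executors write shuffle and spill files (it defaults to /tmp), so pointing it at a volume with headroom can keep an executor's sandbox footprint small. A sketch, with hypothetical paths and host names:

```shell
# Hypothetical master URL and paths; spark.local.dir redirects
# executor scratch space (shuffle/spill files) off the sandbox.
spark-submit \
  --master mesos://zk://zk-host:2181/mesos \
  --conf spark.local.dir=/mnt/bigdisk/spark-scratch \
  --conf spark.executor.memory=4g \
  my_job.py
```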

2015-04-13 15:41 GMT-04:00 Patrick Wendell pwend...@gmail.com:

 Hey Jonathan,

 Are you referring to disk space used for storing persisted RDDs? For
 that, Spark does not bound the amount of data persisted to disk. It's
 a similar story to how Spark's shuffle disk output works; Hadoop and
 other frameworks make the same assumption for their shuffle data,
 AFAIK.

 We could (in theory) add a storage level that bounds the amount of
 data persisted to disk and forces re-computation if a partition does
 not fit. I'd be interested to hear more about a workload where that's
 relevant, though, before going that route. Maybe it would make sense
 if people are using SSDs.

 - Patrick

 On Mon, Apr 13, 2015 at 8:19 AM, Jonathan Coveney jcove...@gmail.com
 wrote:
  I'm surprised that I haven't been able to find this via google, but I
  haven't...
 
  What is the setting that requests some amount of disk space for the
  executors? Maybe I'm misunderstanding how this is configured...
 
  Thanks for any help!



Configuring amount of disk space available to spark executors in mesos?

2015-04-13 Thread Jonathan Coveney
I'm surprised that I haven't been able to find this via Google, but I
haven't...

What is the setting that requests some amount of disk space for the
executors? Maybe I'm misunderstanding how this is configured...

Thanks for any help!
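For reference, the executor resources Spark does expose as settings are memory and cores; there is no disk-space equivalent, which is why this search comes up empty. A typical submission on Mesos (hypothetical values) looks like:

```shell
# Memory and cores are configurable per application; there is no
# analogous spark.executor.disk setting to request disk space.
spark-submit \
  --conf spark.executor.memory=8g \
  --conf spark.cores.max=16 \
  my_job.py
```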