Hi,

By default, 60% of the JVM heap is reserved for RDD caching (the
spark.storage.memoryFraction setting in Spark 1.x), so in your case only
about 72 GB is available for RDDs, which means your 200 GB of data will
not fit in memory entirely. You can check the RDD memory statistics via
the Storage tab in the web UI.
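
If it helps, here is a minimal sketch (Spark 1.x Scala API; the app name
and input path are placeholders, not specific to your job) showing where
that fraction is configured and how to get an RDD to show up under the
Storage tab:

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.storage.StorageLevel

  // The 60% figure is the Spark 1.x default for spark.storage.memoryFraction.
  // Raising it leaves less heap for shuffles and task execution, so tune with care.
  val conf = new SparkConf()
    .setAppName("memory-check")                 // placeholder app name
    .set("spark.storage.memoryFraction", "0.6") // the default, shown explicitly

  val sc = new SparkContext(conf)

  // Persisting an RDD makes it appear on the web UI's Storage tab,
  // including how many partitions are cached and their in-memory size.
  val data = sc.textFile("hdfs:///path/to/data") // placeholder path
  data.persist(StorageLevel.MEMORY_ONLY)         // equivalent to .cache()
  println(data.count())                          // an action forces the cache to fill

Partitions that don't fit under MEMORY_ONLY are simply recomputed from
the lineage when needed, so the job still runs, just more slowly; the
Storage tab will show what fraction actually got cached.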

Hope this helps!
Liquan



On Tue, Sep 30, 2014 at 4:11 PM, anny9699 <anny9...@gmail.com> wrote:

> Hi,
>
> Is there any guidance on how much total memory is needed to process a
> dataset of a given size at a reasonably good speed?
>
> I have around 200 GB of data, and the total memory across my 8 machines
> is about 120 GB. Is that too little for data this big? Even reading it
> in and doing simple initial processing seems to take forever.
>
> Thanks a lot!
>


-- 
Liquan Pei
Department of Physics
University of Massachusetts Amherst
