Grega, the way to think about this setting is that it caps the amount of memory Spark is allowed to use for caching RDDs before it must evict or spill them to disk. Spark in principle knows at all times which RDDs are kept in memory and their total size, so when this limit is hit while it's allocating space for a new RDD, it can, for example, spill older RDDs to disk and then free them.
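
For illustration, a minimal sketch, assuming the 0.8-era practice of configuring Spark through JVM system properties before the SparkContext is created (the master URL and app name below are placeholders):

    import org.apache.spark.SparkContext

    // Sketch: if the job never calls cache()/persist(), shrinking the
    // cache budget is harmless -- it only lowers the cap on Spark's
    // cache accounting, it does not carve up the heap. Must be set
    // before the SparkContext is constructed.
    System.setProperty("spark.storage.memoryFraction", "0.1")

    val sc = new SparkContext("local[2]", "no-cache-example")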
There's otherwise no partitioning of the heap to reserve space for RDDs vs. "normal" objects. The entire heap is still managed by the JVM and accessible to your client code; the fraction is only an accounting limit on Spark's cache. Of course, you do want to be careful with and minimize your own memory use to avoid OOMEs.

Sent while mobile. Pls excuse typos etc.

On Nov 8, 2013 8:02 AM, "Grega Kešpret" <[email protected]> wrote:

> Hi,
>
> The docs say: Fraction of Java heap to use for Spark's memory cache. This
> should not be larger than the "old" generation of objects in the JVM, which
> by default is given 2/3 of the heap, but you can increase it if you
> configure your own old generation size.
>
> If we are not caching any RDDs, does it mean that we only have
> 1 - memoryFraction of the heap available for "normal" JVM objects? Would it
> make sense then to set memoryFraction to 0?
>
> Thanks,
> Grega
>
> --
> Grega Kešpret
> Analytics engineer
>
> Celtra — Rich Media Mobile Advertising
> celtra.com <http://www.celtra.com/> | @celtramobile <http://www.twitter.com/celtramobile>
