Since Spark holds data structures on the heap (and by default tries to work
with all data in memory), and it's written in Scala, seeing lots of Scala
Tuple2 instances is not unexpected. How do these numbers relate to your data size?
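To sanity-check whether the Tuple2 counts line up with the data size, a rough back-of-envelope estimate of the heap consumed by the Tuple2 "shells" alone (not the keys and values they point to) can help. The layout numbers below are assumptions for a 64-bit HotSpot JVM with compressed oops (12-byte object header, two 4-byte reference fields, 8-byte alignment), and the 100M live-pair count is hypothetical:

```java
public class Tuple2Overhead {
    public static void main(String[] args) {
        long header = 12;                  // assumed object header size (compressed oops)
        long fields = 2 * 4;               // the _1 and _2 reference slots
        long perInstance = ((header + fields + 7) / 8) * 8; // round up to 8-byte alignment

        long liveInstances = 100_000_000L; // hypothetical: 100M live pairs
        long totalMb = perInstance * liveInstances / (1024 * 1024);
        System.out.println(perInstance + " bytes/instance, ~" + totalMb + " MB total");
        // prints: 24 bytes/instance, ~2288 MB total
    }
}
```

If the estimate is far below what the heap histogram shows, the extra Tuple2 instances are likely being retained longer than expected (e.g. by caching or wide shuffles) rather than being inherent to the data volume.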
On Oct 27, 2014 2:26 PM, Sonal Goyal sonalgoy...@gmail.com wrote:
Thanks Koert. These numbers indeed tie back to our data and algorithms.
Would going the Scala route save some memory, since the Java API creates
a wrapper Tuple2 for all pair functions?
On Wednesday, October 29, 2014, Koert Kuipers ko...@tresata.com wrote:
Hi,
I wanted to understand what kind of memory overheads are expected, if at all,
while using the Java API. My application seems to have a lot of live Tuple2
instances and I am hitting a lot of GC, so I am wondering if I am doing
something fundamentally wrong. Here is what the top of my heap looks like: