Hi Patrick, do you know the status of this issue? Is there a JIRA ticket tracking it?
Thanks,
Asim

Patrick Wendell writes:

> Within a partition things will spill - so the current documentation is
> correct. This spilling can only occur *across keys* at the moment.
> Spilling cannot occur within a key at present. [...] Spilling within one
> key for GroupBy's is likely to end up in the next release of Spark,
> Spark 1.2. In most cases we see when users hit this, they are actually
> trying to just do aggregations which would be more efficiently
> implemented without the groupBy operator.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Understanding-RDD-GroupBy-OutOfMemory-Exceptions-tp11427p21016.html
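For anyone else hitting this: the core of Patrick's point is that groupByKey must materialize every value for a key in memory at once, while a reduceByKey-style aggregation folds values into one running accumulator per key. Here is a minimal plain-Python sketch of that difference (not Spark code; the data and the sum aggregation are just illustrative):

```python
from collections import defaultdict

# Illustrative (key, value) pairs standing in for records in one partition.
pairs = [("a", 1), ("b", 2), ("a", 3), ("a", 4), ("b", 5)]

# groupByKey-style: collect the full list of values per key, then aggregate.
# Peak memory per key grows with the number of values for that key - this
# is what blows up when one key has millions of values.
grouped = defaultdict(list)
for k, v in pairs:
    grouped[k].append(v)
sums_via_group = {k: sum(vs) for k, vs in grouped.items()}

# reduceByKey-style: fold each value into a single accumulator per key.
# Memory per key is constant no matter how many values the key has.
sums_via_reduce = {}
for k, v in pairs:
    sums_via_reduce[k] = sums_via_reduce.get(k, 0) + v

assert sums_via_group == sums_via_reduce == {"a": 8, "b": 7}
```

In Spark terms, if the end goal is a per-key summary, replacing groupByKey with reduceByKey or aggregateByKey lets the combining happen incrementally (including map-side), so no single key's full value list ever has to fit in memory.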