It is hard to say what the reason could be without more detailed information. If you
provide some more detail, people here may be able to help you better.
1) What is your worker's memory setting? It looks like your nodes have
128G of physical memory each, but what do you specify for the worker's memory?
Using very large memory for the executors (*--executor-memory 120g*) is
not really good advice.
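As a sketch of the point above: rather than giving nearly all 128G of a node to one executor, it is usually better to leave headroom for the OS, the worker daemon, and off-heap allocations. The flags below are real spark-submit options, but the master URL, memory value, and jar name are placeholders, not tuned recommendations:

```shell
# Hypothetical sizing for a 128G node: claim only part of physical
# memory with --executor-memory, leaving headroom for the OS and
# other daemons, instead of --executor-memory 120g.
spark-submit \
  --master spark://your-master:7077 \
  --executor-memory 96g \
  your-app.jar
```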
On Thu, Apr 2, 2015 at 9:17 AM, Cheng, Hao hao.ch...@intel.com wrote:
Spark SQL tries to load the entire partition's data and organize it as in-memory
HashMaps; this does eat a large amount of memory if there are few duplicated
group-by keys across a large number of records.
A couple of things you can try, case by case:
· Increasing the number of partitions (the records count
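One concrete way to increase the partition count for SQL aggregations is the shuffle-partition setting. The property name is a real Spark SQL configuration (its default is 200 in Spark 1.x); the value 400 and the query are only illustrative:

```shell
# Hypothetical: raise the number of shuffle partitions so each
# group-by partition holds fewer records and its in-memory HashMap
# stays smaller.
spark-sql \
  --conf spark.sql.shuffle.partitions=400 \
  -e "SELECT key, COUNT(*) FROM your_table GROUP BY key"
```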