Hi Yuming - I ran into the same issue with large worker nodes a few
weeks ago.

The way I got around the high GC time, following the suggestion of some
others here, was to break each worker node up into multiple smaller
workers of around 10G each, dividing the cores between them accordingly.
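
In standalone mode that split is just a spark-env.sh change. A minimal
sketch, assuming a 64G / 24-core box - the instance count, memory, and
core values here are only illustrative, so plug in your own:

    # conf/spark-env.sh on each worker machine
    # Several small worker JVMs instead of one big one keeps each heap
    # around 10G, so full GC pauses stay short.
    export SPARK_WORKER_INSTANCES=6   # worker JVMs per node
    export SPARK_WORKER_MEMORY=10g    # heap per worker JVM
    export SPARK_WORKER_CORES=4       # total cores / instances

Restart the workers (sbin/stop-all.sh then sbin/start-all.sh) after
changing it.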

The other thing that brought GC time down was switching to a more
efficient serialiser, which keeps the number of objects held in memory
low.
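
In Spark that usually means Kryo. A minimal sketch in Scala, where
MyRecord is a hypothetical stand-in for your own classes:

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical record type; register whatever you actually
    // shuffle or cache.
    case class MyRecord(id: Long, name: String)

    val conf = new SparkConf()
      .setAppName("gc-friendly-job")
      // Kryo's serialised form is far more compact than Java
      // serialisation, cutting memory pressure during shuffles
      // and serialised caching.
      .set("spark.serializer",
        "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[MyRecord]))
    val sc = new SparkContext(conf)

If you cache RDDs, persisting them with StorageLevel.MEMORY_ONLY_SER
compounds the effect, since each cached partition is stored as one byte
array instead of millions of small objects.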

Hope that helps!
- nick


