We have ~120 executors with 5 cores each, for a very long-running job which
crunches ~2.5 TB of data with has too many filters to query. Currently, we
have ~30k partitions which make ~90MB per partition.

We are using Spark v2.2.2 as of now. The major problem we are facing is due
to GC on the driver. All of the driver memory (30G) is getting filled and GC
is very active, which is taking more than 50% of the runtime for Full GC
Evacuation. The heap dump indicates that 80% of the memory is being occupied
by LiveListenerBus and it's not being cleared by GC. Frequent GC runs are
clearing newly created objects only.

>From the Jira tickets, I got to know that Memory consumption by
LiveListenerBus has been addressed in v2.3 (not sure of the specifics). But
until we evaluate migrating to v2.3, is there any quick fix or workaround
either to prevent various listerner events bulking up in driver's memory or
to identify and disable the Listener which is causing the delay in
processing events.



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Reply via email to