Have you tried running it as a job (submitted with spark-submit) instead of in spark-shell?

How many executors and cores are you using? Can you share the RDD graph and
the event timeline from the UI? Did you find which tasks are taking more
time, and was there any GC during them?

Please look at the UI if you have not already; it can provide a lot of information.
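
If it helps, here is a rough sketch of the kind of check I mean: given
per-task metrics, flag the tasks that run much longer than the median and
the ones that spend a large share of their run time in GC. The field names
(taskId, taskMetrics, executorRunTime, jvmGcTime) are my assumption of how
the task list from the Spark monitoring REST API is shaped, the thresholds
are arbitrary, and the sample records are made up for illustration.

```python
# Sketch only: flag slow tasks and GC-heavy tasks from per-task metrics.
# Field names are assumed to match the Spark REST API task list output;
# the sample records below are invented, not real measurements.

def summarize_tasks(tasks, slow_factor=2.0, gc_threshold=0.1):
    """Return (slow_task_ids, gc_heavy_task_ids).

    A task is "slow" if its run time exceeds slow_factor times the median
    run time, and "GC-heavy" if GC time exceeds gc_threshold of run time.
    """
    run_times = sorted(t["taskMetrics"]["executorRunTime"] for t in tasks)
    median = run_times[len(run_times) // 2]  # upper median for even counts
    slow, gc_heavy = [], []
    for t in tasks:
        metrics = t["taskMetrics"]
        run_ms = metrics["executorRunTime"]
        if run_ms > slow_factor * median:
            slow.append(t["taskId"])
        if run_ms > 0 and metrics["jvmGcTime"] / run_ms > gc_threshold:
            gc_heavy.append(t["taskId"])
    return slow, gc_heavy


# Hypothetical sample data (times in milliseconds).
sample = [
    {"taskId": 0, "taskMetrics": {"executorRunTime": 1000, "jvmGcTime": 20}},
    {"taskId": 1, "taskMetrics": {"executorRunTime": 1100, "jvmGcTime": 30}},
    {"taskId": 2, "taskMetrics": {"executorRunTime": 9000, "jvmGcTime": 4000}},
    {"taskId": 3, "taskMetrics": {"executorRunTime": 1200, "jvmGcTime": 300}},
]

slow, gc_heavy = summarize_tasks(sample)
print("slow tasks:", slow)
print("GC-heavy tasks:", gc_heavy)
```

The same numbers are visible per task in the stage page of the UI, so this
is only useful if you want to script the check.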
