Is this repeatable? Do you always get one or two executors that are 6 times as slow? It could be that some of your tasks have more work to do (maybe you are filtering some records out?). If it's always one particular worker node, is there something about the machine configuration (e.g. CPU speed) that means the processing takes longer?
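As a first check, a quick way to spot stragglers is to compare each task's duration against the stage median. A minimal sketch (the durations below are made-up illustrative values; in practice you would pull them from the Spark UI stage page or the monitoring REST API):

```python
# Flag straggler tasks whose duration far exceeds the stage median.
# The durations (ms) here are invented sample data standing in for
# per-task metrics from the Spark UI or its REST API.
from statistics import median

def find_stragglers(durations, factor=6.0):
    """Return (task_index, duration) pairs slower than factor * median."""
    med = median(durations)
    return [(i, d) for i, d in enumerate(durations) if d > factor * med]

task_durations = [410, 395, 420, 405, 2600, 400, 415, 398]
print(find_stragglers(task_durations))  # -> [(4, 2600)]
```

If the flagged tasks always land on the same host, that points at the machine; if they move around, it is more likely data skew (some partitions carrying more records than others).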
—————————————————————————————
Robin East
Spark GraphX in Action Michael S Malak and Robin East
http://www.manning.com/books/spark-graphx-in-action

> On 15 Sep 2015, at 12:35, patcharee <patcharee.thong...@uni.no> wrote:
>
> Hi,
>
> I was running a job (on Spark 1.5 + Yarn + Java 8). In a stage that does a lookup
> (org.apache.spark.rdd.PairRDDFunctions.lookup(PairRDDFunctions.scala:873)),
> there was an executor whose computing time was > 6 times the median.
> This executor had almost the same shuffle read size and low GC time
> as the others.
>
> What can impact the executor computing time? Any suggestions what parameters
> I should monitor/configure?
>
> BR,
> Patcharee
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org