I ran a series of TeraGen jobs via spark-submit (each job generated a 
dataset of the same size into a different S3 bucket).
I noticed that some jobs were fast and some were slow.

Slow jobs always had many log lines like
DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_1.0, runningTasks: 1 
(or 2, etc.)

Fast jobs always had only a few of those lines.

Can someone explain why the number of those debug lines varies between 
different executions of the same job? The more of those lines I see, the 
slower the job is.
Has anyone experienced the same behavior?
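
One way to put numbers on this would be to count those lines in each job's log. Below is a minimal sketch in Python, assuming each job's driver output (with DEBUG logging enabled) was saved to a text file. The function name and the regular expression are illustrative, written against the line quoted above; they are not part of Spark.

```python
import re
from collections import Counter

# Matches the DEBUG line quoted above and captures the TaskSet name and the
# runningTasks value. The pattern is an assumption based on that one line.
PATTERN = re.compile(
    r"DEBUG TaskSchedulerImpl: parentName: .*, name: (TaskSet_[\d.]+), "
    r"runningTasks: (\d+)"
)


def count_scheduler_prints(lines):
    """Return (total matching lines, Counter keyed by runningTasks value)."""
    total = 0
    by_running = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            total += 1
            by_running[int(match.group(2))] += 1
    return total, by_running
```

For example, `count_scheduler_prints(open("job1-driver.log"))` (a hypothetical file name) could be run on the log of a fast job and of a slow job, and the totals compared against each job's wall-clock time.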

Thanks
Gil.



