I don't even think task stealing / speculative execution is turned on by default. Do you know what snapshot version you used for 0.8 previously?
On Mon, Nov 4, 2013 at 12:03 PM, Wenlei Xie <[email protected]> wrote: > Hi, > > I have some iterative program written in Spark and have been tested under > a snapshot version of spark 0.8 before. After I ported it to the 0.8 > release version, I see performance drops in large datasets. I am wondering > if there is any clue? > > I monitored the number of partitions on each machine (by looking at > DAGScheduler.getCacheLocs). I observed that some machine may have 30 > partitions in the previous iteration while only have < 10 partitions in the > next iterations. This is something I didn't observed in the older version. > Thus I am wondering if the release version would do task stealing > more aggressively (for a better dynamic load balance?) > > Thank you! > > Best Regards, > Wenlei > >
