Hello all, we are testing a semi-realtime application (it should return results from cached RDDs in less than 20 seconds) on Spark 1.6.0. Before this it ran on Spark 1.5.1.
In Spark 1.6.0 the performance is similar to 1.5.1 if I set spark.memory.useLegacyMode = true. However, if I switch to spark.memory.useLegacyMode = false, the queries take about 50% to 100% more time.

The issue becomes clear when I focus on a single stage: the individual tasks are not slower at all, but they run on fewer executors. In my test query I have 50 tasks and 10 executors. With both useLegacyMode = true and useLegacyMode = false, each task finishes in about 3 seconds and is reported with locality level PROCESS_LOCAL. However, with useLegacyMode = false the tasks run on just 3 of the 10 executors, while with useLegacyMode = true they spread out across all 10. Having all the tasks packed onto a few executors is what makes the stage slower.

Any idea why this would happen?

Thanks!
Koert
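
To make the skew described above easier to quantify, here is a minimal sketch of how the per-executor task counts for one stage could be tallied. It uses plain Python and does not call any Spark API. The function names (`tasks_per_executor`, `summarize`), the executor ids, and the sample assignments are all hypothetical: the real (task, executor) pairs would have to be copied from the stage page of the Spark UI or from the event log, and the exact split across the 3 busy executors is an assumption.

```python
from collections import Counter


def tasks_per_executor(assignments):
    """Count how many tasks each executor ran.

    `assignments` is an iterable of (task_id, executor_id) pairs.
    """
    return Counter(executor_id for _task_id, executor_id in assignments)


def summarize(assignments, total_executors):
    """Summarize how evenly a stage's tasks were spread over executors."""
    counts = tasks_per_executor(assignments)
    return {
        "executors_used": len(counts),
        "executors_idle": total_executors - len(counts),
        "max_tasks_on_one_executor": max(counts.values()) if counts else 0,
    }


# Hypothetical assignments for a 50-task stage on 10 executors.
# "spread" mimics the useLegacyMode = true run (all 10 executors busy);
# "skewed" mimics the useLegacyMode = false run (only 3 executors busy).
spread = [(t, f"exec-{t % 10}") for t in range(50)]
skewed = [(t, f"exec-{t % 3}") for t in range(50)]

print(summarize(spread, total_executors=10))
print(summarize(skewed, total_executors=10))
```

With this made-up data, the busiest executor handles 5 tasks in the spread case and 17 in the skewed case, which is the kind of imbalance that would lengthen the stage even though each task still takes about 3 seconds.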