Github user kayousterhout commented on the issue:
https://github.com/apache/spark/pull/16053
Thanks for the review @squito. I got sidetracked from this at the end of
last week and forgot to post the results of some benchmarks @shivaram and I did
on 20 m2.4xlarge EC2 machines (160 cores). We ran ~30 trials of code
[1] (a very simple job with 10K tasks per stage) and measured the average time
per stage:
Before this change: 2490 ms
With this change: 2345 ms (so ~6% improvement over the baseline)
With @witgo's approach in #15505: 2046 ms (~18% improvement over baseline)
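As a quick check of the percentages above, here is a minimal sketch of the arithmetic. The helper name `improvement_pct` and the dictionary labels are made up for illustration; only the millisecond figures come from the benchmark.

```python
# Average time per stage, in milliseconds, as reported above.
BASELINE_MS = 2490
RESULTS_MS = {
    "this change": 2345,
    "#15505": 2046,
}


def improvement_pct(baseline_ms, new_ms):
    """Relative reduction in stage time versus the baseline, in percent."""
    return 100.0 * (baseline_ms - new_ms) / baseline_ms


# Hypothetical helper applied to each measured configuration.
improvements = {
    name: improvement_pct(BASELINE_MS, ms) for name, ms in RESULTS_MS.items()
}

for name, pct in improvements.items():
    print(f"{name}: {pct:.1f}% faster than baseline")
```

This gives roughly 5.8% and 17.8%, which round to the ~6% and ~18% quoted above.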
The reason that #15505 has a more significant improvement is that it also
moves the serialization from the TaskSchedulerImpl thread to the
CoarseGrainedSchedulerBackend thread. I added that functionality on top of
this change, and got almost the same improvement as #15505 (average of 2103 ms).
I think we should decouple these two changes, both so we have some record of
the improvement from each individual change, and because this change is
more about simplifying the code base (its performance improvement is negligible)
while the other is about performance. I filed a separate JIRA for that issue
here: https://issues.apache.org/jira/browse/SPARK-18890