Github user colorant commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50301634
@mateiz, yep, the map tasks did spill, and that seems to contribute most of
the increased processing time, even though in my case only about 400K of data
was spilled to disk per task. That still added about 15 seconds of processing
time: the task time increased from roughly 15 seconds to 30 seconds.