Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/5635#issuecomment-95287848
/cc @kayousterhout @rxin.
I noticed this in some benchmarking work that I'm doing (more details on
the JIRA: https://issues.apache.org/jira/browse/SPARK-7058). Before this
patch, we would almost always report 1 or 2 millisecond deserialization times.
Now that this metric captures all of the deserialization costs, I'm seeing
tasks that spend between 70 and 150ms in deserialization.
I have several ideas for how to optimize this deserialization and reduce this
time, and I'll address them in later patches.