Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/1083#issuecomment-46235253
It's worth noting that the special handling of the memory serialized
storage level actually introduce a regression. In particular, we first
serialize the values, put it in block manager, and then deserialize the bytes
to return the original values to the user. The tradeoff is that this adds an
extra step to deserialize the bytes in the end, which could be slow for large
partitions.
This special handling will most likely be superseded by a more general
solution for [SPARK-1777](https://issues.apache.org/jira/browse/SPARK-1777),
which avoids unrolling an entire partition if there is not enough space for it,
regardless of the storage level. For now, I will put this PR on hold.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---