Github user kanzhang commented on the pull request:

    https://github.com/apache/spark/pull/1338#issuecomment-50093417
  
    There is one additional issue, though. Currently, when reading, the Python
RDD is not batch serialized. This triggers re-serialization when the same RDD
is written back out, which should be avoided. I will upload a fix.
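
To illustrate the cost described above, here is a minimal sketch in plain Python. It uses only the standard `pickle` module and is not Spark's actual serializer code. The function names (`serialize_per_item`, `serialize_batched`, `rebatch`) and the batch size of 10 are hypothetical, chosen only for this example. It shows that data serialized one element at a time needs a full deserialize-and-reserialize pass before a writer that expects batches can use it.

```python
import pickle
from itertools import islice


def batched(items, batch_size):
    """Yield lists of up to batch_size items from any iterable."""
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def serialize_per_item(items):
    # One pickle payload per element (the "not batch serialized" case).
    return [pickle.dumps(x) for x in items]


def serialize_batched(items, batch_size=10):
    # One pickle payload per batch of elements.
    return [pickle.dumps(b) for b in batched(items, batch_size)]


def rebatch(per_item_payloads, batch_size=10):
    # The extra pass a batch-expecting writer would need:
    # deserialize every element, then serialize again in batches.
    items = (pickle.loads(p) for p in per_item_payloads)
    return serialize_batched(items, batch_size)


def deserialize_batched(payloads):
    # Flatten batched payloads back into a list of elements.
    return [x for p in payloads for x in pickle.loads(p)]


records = [(i, str(i)) for i in range(25)]
unbatched = serialize_per_item(records)        # 25 payloads
already_batched = serialize_batched(records)   # 3 payloads
converted = rebatch(unbatched)                 # extra work, same content
```

If the data is batch serialized at read time, the `rebatch` step is unnecessary, which is the saving the fix is aiming for.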


