GitHub user davies commented on the pull request:

    https://github.com/apache/spark/pull/2838#issuecomment-59635035
  
    take() is not the only call that causes this problem: a user could also call
mapPartitions() and read only some of the items from infile.
    
    We want to re-use not only the worker but also the socket.
    
    Trying to call next() on infile may be better than the current approach. We
would still need a special code to tell the JVM whether the socket can be
re-used, and any special code could collide with arbitrary data. For example,
if the serialized data is broken, we might read the special code from it by
mistake. To reduce the chance of a collision, we could use a VERY long special
code, such as 64 bits, 128 bits, or even more.
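
A rough sketch of the idea, not the actual PySpark worker code. It assumes
length-prefixed frames with a negative length as the end-of-data signal, and
the names REUSE_MARKER, read_frames and finish_task are hypothetical. Calling
next() on the input shows whether the user consumed everything; only then is
the long special code written to tell the other side the socket is re-usable.

```python
import io
import struct

# Hypothetical 128-bit special code; both sides would share the same constant.
REUSE_MARKER = bytes.fromhex("9f3c1a7e5b2d4c8890ab12cd34ef5671")


def frame(payload):
    """Encode one item as a 4-byte big-endian length followed by the payload."""
    return struct.pack("!i", len(payload)) + payload


# Assumed end-of-data signal: a negative length with no payload.
END_OF_DATA = struct.pack("!i", -1)


def read_frames(infile):
    """Yield payloads from infile until the end-of-data signal or EOF."""
    while True:
        header = infile.read(4)
        if len(header) < 4:
            return
        (length,) = struct.unpack("!i", header)
        if length < 0:
            return
        yield infile.read(length)


def finish_task(frames, outfile):
    """Probe the input with next(); write the marker only if it is exhausted.

    Returns True when the socket is considered re-usable.
    """
    try:
        next(frames)
    except StopIteration:
        outfile.write(REUSE_MARKER)
        return True
    # Items were left unread (e.g. after take() or a partial mapPartitions()).
    return False


if __name__ == "__main__":
    data = frame(b"a") + frame(b"b") + END_OF_DATA

    partial = read_frames(io.BytesIO(data))
    next(partial)  # the user reads only the first item
    print(finish_task(partial, io.BytesIO()))

    full = read_frames(io.BytesIO(data))
    list(full)  # the user reads every item
    print(finish_task(full, io.BytesIO()))
```

With an n-bit marker, a given n-bit window of random data matches it with
probability 2^-n, which is why a longer code makes an accidental match less
likely, though never impossible.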



