Github user kalpit commented on the pull request:

    https://github.com/apache/spark/pull/554#issuecomment-41595891
  
    I see your point. I don't have a Python-only use case that can trigger the NPE.
    
    My custom RDD implementation had a corner case in which the RDD's compute() method returned a null in the iterator stream. I have fixed my custom RDD implementation so it no longer does that, and I don't run into this NPE anymore. However, should anyone else ever implement a custom RDD of a similar nature (one that yields nulls for some elements in a partition's iterator stream) and try to access it from PySpark, they would run into the NPE, so I thought it would be nicer if we handled nulls in the stream gracefully.
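
    For illustration only, here is a minimal sketch of the kind of custom RDD that can produce this situation (the class name, element type, and values are hypothetical, not taken from my actual implementation):

        import org.apache.spark.{Partition, SparkContext, TaskContext}
        import org.apache.spark.rdd.RDD

        // Hypothetical single-partition RDD whose compute() yields a null element.
        class NullElementRDD(sc: SparkContext) extends RDD[String](sc, Nil) {

          // One partition is enough to demonstrate the corner case.
          override protected def getPartitions: Array[Partition] =
            Array(new Partition { override def index: Int = 0 })

          // The null in the middle of the iterator is what PySpark trips over
          // when the partition's elements are serialized for the Python worker.
          override def compute(split: Partition, context: TaskContext): Iterator[String] =
            Iterator("a", null, "b")
        }

    Accessing such an RDD from PySpark (e.g. collecting it on the Python side) is what surfaces the NPE on the JVM side.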

