Github user tgravescs commented on the pull request:

    https://github.com/apache/spark/pull/30#issuecomment-41798885
  
    Maybe I have something configured wrong, but I'm still getting a lot of 
EOFExceptions.  Certain actions work fine, but anything that actually runs on 
the executors hits EOFExceptions again, along with /usr/bin/python: No module 
named pyspark.  I'm just using what's checked into master.
    
    # this works
    >>> words = sc.textFile("README.md")
    >>> words.filter(lambda w: w.startswith("spar")).take(5)
    >>> words.collect()
    
    # this doesn't
    >>> words = sc.textFile("README.md")
    >>> words.filter(lambda w: w.startswith("spar")).collect()
    >>> words.count()
    
    ideas?
    
    I checked, and PYTHONPATH is set on the executor (PYTHONPATH=spark.jar), 
and py4j is in the assembly jar.  I'm launching with MASTER=yarn-client 
./bin/pyspark
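
    In case it helps narrow things down, here's a minimal sketch of the check 
I'd run with the executor host's Python directly (plain Python, no Spark 
required; the module names are taken from the error above, and the script 
itself is just an illustration, not something in the repo):

    ```python
    # Sketch: check whether the worker-side Python can actually import the
    # modules Spark needs, given the current PYTHONPATH / sys.path.
    import importlib
    import sys

    def module_visible(name):
        """Return True if `name` is importable with the current sys.path."""
        try:
            importlib.import_module(name)
            return True
        except ImportError:
            return False

    if __name__ == "__main__":
        for mod in ("pyspark", "py4j"):
            print("%s importable: %s" % (mod, module_visible(mod)))
        # Dump the search path so a missing assembly-jar entry is obvious.
        for entry in sys.path:
            print(entry)
    ```

    If pyspark shows up as not importable there, that would point at the 
PYTHONPATH the executor's Python actually sees rather than at the job itself.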

