I got another report about this recently and figured out that it is caused by having different versions of Python on the driver and on the YARN nodes:
http://stackoverflow.com/questions/28879803/spark-runs-in-local-but-not-in-yarn/28931934#28931934

Created JIRA: https://issues.apache.org/jira/browse/SPARK-6216?filter=-1

On Tue, Aug 19, 2014 at 12:12 PM, Davies Liu <dav...@databricks.com> wrote:
> This script runs fine without your CSV file. Could you download your
> CSV file to local disk and narrow it down to the lines that trigger
> this issue?
>
> On Tue, Aug 19, 2014 at 12:02 PM, Aaron <aaron.doss...@target.com> wrote:
>> These three lines of Python code cause the error for me:
>>
>> sc = SparkContext(appName="foo")
>> input = sc.textFile("hdfs://[valid hdfs path]")
>> mappedToLines = input.map(lambda myline: myline.split(","))
>>
>> The file I'm loading is a simple CSV.
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Python-script-runs-fine-in-local-mode-errors-in-other-modes-tp12390p12398.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
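
For anyone who wants to check whether they are hitting the version mismatch described at the top of this message, here is a minimal sketch. The helper names (`python_version`, `find_mismatches`, `worker_python_versions`) and the `num_tasks` default are my own illustrative choices, not part of Spark. It assumes you already have a working SparkContext `sc`. Note that 16 tasks will not necessarily touch every executor, and that the job itself may fail if the versions are incompatible enough, which is a hint in the same direction.

```python
import sys


def python_version():
    """Return (major, minor) of the interpreter running this function."""
    return tuple(sys.version_info[:2])


def find_mismatches(driver_version, worker_versions):
    """Return the sorted, distinct worker versions that differ from the driver's."""
    driver = tuple(driver_version)
    return sorted({tuple(v) for v in worker_versions} - {driver})


def worker_python_versions(sc, num_tasks=16):
    """Collect the distinct Python versions reported by Spark tasks.

    `sc` is an existing SparkContext (assumption: created by the caller).
    `num_tasks` is an arbitrary sample size, not a guarantee of coverage.
    """
    def task_version(_):
        import sys  # imported inside the task so it runs on the worker
        return tuple(sys.version_info[:2])

    return (sc.parallelize(range(num_tasks), num_tasks)
              .map(task_version)
              .distinct()
              .collect())


# Example use on the driver (requires a live SparkContext, so left commented):
# mismatches = find_mismatches(python_version(), worker_python_versions(sc))
# if mismatches:
#     print("Workers run a different Python than the driver:", mismatches)
```

If `find_mismatches` returns a non-empty list, the driver and at least one worker are running different Python versions.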
