Hi, thanks for your replies. I'm out of office now, so I will check it out on Monday morning, but guess about serialization/deserialization looks plausible.
Thanks, Andrei On Sat, Nov 16, 2013 at 11:11 AM, Jey Kottalam <[email protected]> wrote: > Hi Andrei, > > Could you please post the stderr logfile from the failed executor? You can > find this in the "work" subdirectory of the worker that had the failed > task. You'll need the executor id to find the corresonding stderr file. > > Thanks, > -Jey > > > On Friday, November 15, 2013, Andrei wrote: > >> I have 2 Python modules/scripts - task.py and runner.py. First one >> (task.py) is a little Spark job and works perfectly well by itself. >> However, when called from runner.py with exactly the same arguments, it >> fails with only useless message (both - in terminal and worker logs). >> >> org.apache.spark.SparkException: Python worker exited unexpectedly >> (crashed) >> >> Below there's code for both - task.py and runner.py: >> >> task.py >> ----------- >> >> #!/usr/bin/env pyspark >> from __future__ import print_function >> from pyspark import SparkContext >> >> def process(line): >> return line.strip() >> >> def main(spark_master, path): >> sc = SparkContext(spark_master, 'My Job') >> rdd = sc.textFile(path) >> rdd = rdd.map(process) # this line causes troubles when called >> from runner.py >> count = rdd.count() >> print(count) >> >> if __name__ == '__main__': >> main('spark://spark-master-host:7077', >> 'hdfs://hdfs-namenode-host:8020/path/to/file.log') >> >> >> runner.py >> ------------- >> >> #!/usr/bin/env pyspark >> >> import task >> >> if __name__ == '__main__': >> task.main('spark://spark-master-host:7077', >> 'hdfs://hdfs-namenode-host:8020/path/to/file.log') >> >> >> ------------------------------------------------------------------------------------------- >> >> So, what's the difference between calling PySpark-enabled script directly >> and as Python module? What are good rules for writing multi-module Python >> programs with Spark? >> >> Thanks, >> Andrei >> >
