Hi, I'm trying to run a Python job on Spark that keeps failing. It's a map followed by a collect, and it works fine if I use Python's built-in map instead of Spark.
I tried replacing the mapping function with an identity (lambda x: x), and that works fine with Spark, so Spark itself seems to be configured correctly. The error I get is:

org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)

The problem is that I can't see what went wrong in the Python code. Any ideas on how to debug this?

Thanks,
Michal
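P.S. One idea I'm considering, assuming the crash comes from an exception raised inside my mapping function: wrap the function so each worker writes the real Python traceback to a local file before re-raising. This is just a sketch; the function name `traced` and the log path are my own invention, not anything from Spark.

```python
import traceback

def traced(f, log_path="/tmp/worker_errors.log"):
    """Wrap a mapping function so any exception it raises is appended
    to a log file on the worker before being re-raised."""
    def wrapper(x):
        try:
            return f(x)
        except Exception:
            with open(log_path, "a") as log:
                log.write(traceback.format_exc())
            raise
    return wrapper

# With Spark, the job would then be run as:
#   rdd.map(traced(f)).collect()
# and the traceback should show up in /tmp/worker_errors.log on the
# worker node, even if the worker process itself dies afterwards.
```

Would something like this be a reasonable way to surface the error, or is there a more standard place where the worker's stderr ends up?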
