This appears to be an issue around using pandas. Even if we just
instantiate a DataFrame and do nothing with it, the Python worker process
exits. But if we remove all pandas references, the same job runs to
completion.
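
For reference, the minimal reproduction we're describing looks roughly like
this (a sketch, not our actual script; the job name and data are made up).
The same shape of job succeeds once the pd.DataFrame line is removed:

```python
# Hedged sketch of the symptom: a map task that merely creates a
# pandas DataFrame and discards it. This stdlib multiprocessing pool
# stands in for the PySpark python worker processes, since the real
# failure needs a Spark cluster to reproduce.
import multiprocessing as mp

import pandas as pd


def touch_pandas(x):
    # Instantiate a DataFrame and do nothing with it -- on our
    # cluster, this alone is enough to make the worker exit.
    pd.DataFrame({"col": [x]})
    return x


if __name__ == "__main__":
    with mp.Pool(2) as pool:
        print(pool.map(touch_pandas, range(4)))
```

In the real job the function is passed to `rdd.map(...)` and the job is
submitted with the master's URL instead of "local".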
Has anyone run into this before?
-Suren
On Mon, Apr 7, 2014 at 1:10 PM
Hi,
We have a situation where a PySpark script works fine as a local process
("local" URL) on the Master and the Worker nodes, which would indicate that
all Python dependencies are set up properly on each machine.
But when we try to run the script at the cluster level (using the master's
url), if