Hey all,

Is there any notion of a lightweight Python client for submitting jobs to a
Spark cluster remotely? If I do a full Spark install on the client machine,
and that machine has the same OS, same version of Python, etc. as the
cluster, then I'm able to communicate with the cluster just fine. But if the
Python versions differ even slightly, I start to see a lot of opaque errors
that often bubble up as EOFExceptions. Beyond that, a full install just
seems like a very heavyweight way to set up a client.

Does anyone have any suggestions for setting up a thin pyspark client on a
node which doesn't necessarily conform to the homogeneity of the target
Spark cluster?

Best,
Chris
