Adam Binford created ZEPPELIN-5276:
--------------------------------------

             Summary: Pyspark interpreter doesn't add jars to PYTHONPATH for yarn cluster mode
                 Key: ZEPPELIN-5276
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-5276
             Project: Zeppelin
          Issue Type: Bug
            Reporter: Adam Binford
When using native spark-submit to run a Python script directly, Spark adds all the jars resolved from --jars and --packages to the PYTHONPATH. This lets some packages (like delta.io) automagically add their Python packages to your session. Because the Pyspark interpreter is launched from a jar during the spark-submit, you don't get that behavior automatically. The PysparkInterpreter should add the jars to the Python path for you when bootstrapping the Python session. I don't know whether this affects only yarn cluster mode or other modes as well, since that's the only one we use.

Currently, you can manually work around this by setting your Python path directly when creating your session; you just need to know the naming format Spark uses when it saves the jars:

PYTHONPATH=./io.delta_delta-core_2.12-0.8.0.jar

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
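As a sketch of the workaround from inside the Python session itself (rather than via the PYTHONPATH environment variable), the localized jars can be prepended to sys.path at the top of the first paragraph. This is a hedged example, not Zeppelin's actual fix: it assumes the jars are localized into the container's working directory under Spark's <group>_<artifact>-<version>.jar naming, and the helper name `add_local_jars_to_pythonpath` is made up for illustration:

```python
import glob
import os
import sys


def add_local_jars_to_pythonpath(work_dir="."):
    """Prepend every jar found in work_dir to sys.path.

    In yarn cluster mode, jars from --jars/--packages are localized into
    the container's working directory (e.g. ./io.delta_delta-core_2.12-0.8.0.jar).
    Python's import machinery can import modules from zip/jar archives on
    sys.path, so this mirrors what spark-submit does for plain python scripts.
    Returns the list of paths that were added.
    """
    added = []
    for jar in sorted(glob.glob(os.path.join(work_dir, "*.jar"))):
        path = os.path.abspath(jar)
        if path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added
```

Calling this before the first import (e.g. before `from delta.tables import DeltaTable`) makes Python packages bundled inside the jars importable, which is the behavior the issue asks the interpreter to provide automatically.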