squito commented on a change in pull request #23337: [SPARK-26019][PYSPARK]
Allow insecure py4j gateways
URL: https://github.com/apache/spark/pull/23337#discussion_r242988593
##########
File path: python/pyspark/context.py
##########
@@ -112,6 +112,18 @@ def __init__(self, master=None, appName=None, sparkHome=None, pyFiles=None,
ValueError:...
"""
self._callsite = first_spark_call() or CallSite(None, None, None)
+ if gateway is not None and gateway.gateway_parameters.auth_token is None:
+ if conf and conf.get("spark.python.allowInsecurePy4j", "false") == "true":
Review comment:
Wait, let's be clear on the end goal here. The point is for an end user to
be able to enable the opt-in *without* changing the code in Zeppelin at
all. If we're going to change the code in Zeppelin, it should just be changed
to do the right thing and create a secure gateway (as Zeppelin has already
done in master, and I think even in v0.8.0 now that I look more closely).

So, looking at an old version of Zeppelin, e.g.:
https://github.com/apache/zeppelin/blob/v0.7.3/spark/src/main/java/org/apache/zeppelin/spark/PySparkInterpreter.java#L205-L229
If I'm reading that correctly, environment variables the user has set when
starting Zeppelin get passed through to the python command (I think that is
what `EnvironmentUtils.getProcEnvironment()` does). But there isn't any way
for the user to add additional confs to that command. I assume things other
than Zeppelin would work similarly.
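
To make the suggestion concrete, here is a rough sketch of what an environment-variable opt-in could look like, since the environment is the one thing the user can control without touching the launcher. The variable name `PYSPARK_ALLOW_INSECURE_GATEWAY` and the helper `insecure_gateway_allowed` are hypothetical names picked for illustration, not something this PR defines.

```python
import os
import warnings

# Hypothetical variable name, chosen for illustration only.
ALLOW_INSECURE_ENV = "PYSPARK_ALLOW_INSECURE_GATEWAY"


def insecure_gateway_allowed(auth_token, environ=None):
    """Return True if a gateway with this auth_token may be used.

    A gateway that has an auth token is always accepted. A gateway
    without one is accepted only if the user opted in through the
    environment of the process that launched Python.
    """
    env = os.environ if environ is None else environ
    if auth_token is not None:
        return True
    opted_in = env.get(ALLOW_INSECURE_ENV, "false").lower() == "true"
    if opted_in:
        warnings.warn(
            "Using an insecure py4j gateway; opt-in was set via "
            + ALLOW_INSECURE_ENV
        )
    return opted_in
```

With something like this, a user running an old Zeppelin would only need to export the variable before starting it; the check in `context.py` would raise when the helper returns False.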