stczwd commented on a change in pull request #28048:
[SPARK-31142][PYSPARK]Remove useless conf set in pyspark context
URL: https://github.com/apache/spark/pull/28048#discussion_r399929319
##########
File path: python/pyspark/context.py
##########
@@ -181,11 +181,6 @@ def _do_init(self, master, appName, sparkHome, pyFiles,
environment, batchSize,
self.appName = self._conf.get("spark.app.name")
self.sparkHome = self._conf.get("spark.home", None)
- for (k, v) in self._conf.getAll():
Review comment:
> Can you describe which issue you faced? It does not seem to be an issue
> in the actual environment.
We did hit this problem in a real environment. When a user submits basic
environment configuration, such as LD_LIBRARY_PATH, we automatically load the
current or related environment before the JVM starts on YARN or k8s.
For example, we have already put the Hadoop libraries on the cluster and set
`LD_LIBRARY_PATH='/usr/native_lib'`. Users then use the following configuration
to load both the Hadoop libraries and the Python libraries.
```
spark.executorEnv.PYTHONHOME ./python
spark.executorEnv.LD_LIBRARY_PATH $PYTHONHOME/lib/python2.7/site-packages:$LD_LIBRARY_PATH
```
As pyspark always overwrites the JVM environment, the environment passed to
the Python worker is invalidated. LD_LIBRARY_PATH ends up set to the literal
`$PYTHONHOME/lib/python2.7/site-packages:$LD_LIBRARY_PATH`, and none of the
configuration takes effect.
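
The overwrite can be reproduced without Spark. This is only a minimal sketch, assuming the loop removed by this PR copies every `spark.executorEnv.*` entry verbatim into the worker environment. The helper `collect_executor_env`, the `jvm_env` dict and its resolved values are hypothetical and are not taken from `context.py`.

```python
EXECUTOR_ENV_PREFIX = "spark.executorEnv."


def collect_executor_env(conf_items):
    """Copy spark.executorEnv.* entries as-is, without expanding $VAR references."""
    env = {}
    for key, value in conf_items:
        if key.startswith(EXECUTOR_ENV_PREFIX):
            env[key[len(EXECUTOR_ENV_PREFIX):]] = value
    return env


# Hypothetical environment that was already resolved before the JVM started.
jvm_env = {
    "PYTHONHOME": "./python",
    "LD_LIBRARY_PATH": "./python/lib/python2.7/site-packages:/usr/native_lib",
}

# Configuration as submitted by the user (same values as in the comment above).
conf_items = [
    ("spark.app.name", "demo"),
    ("spark.executorEnv.PYTHONHOME", "./python"),
    ("spark.executorEnv.LD_LIBRARY_PATH",
     "$PYTHONHOME/lib/python2.7/site-packages:$LD_LIBRARY_PATH"),
]

# The Python side overwrites the resolved value with the raw configuration string.
worker_env = dict(jvm_env)
worker_env.update(collect_executor_env(conf_items))

print(worker_env["LD_LIBRARY_PATH"])
# prints: $PYTHONHOME/lib/python2.7/site-packages:$LD_LIBRARY_PATH
```

In this sketch `/usr/native_lib` is lost from the worker's LD_LIBRARY_PATH, which matches what we observed.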