HyukjinKwon commented on pull request #30389:
URL: https://github.com/apache/spark/pull/30389#issuecomment-730281020


   @gaborgsomogyi, I locally tried some other standard ways, such as passing an 
argument properly, but the changes became rather big and invasive. What do you 
think about just passing the value via the environment in the SparkContext for now?
   
   ```diff
   diff --git a/core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala b/core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala
   index 527d0d6d3a4..33849f6fcb6 100644
   --- a/core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala
   +++ b/core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala
   @@ -85,4 +85,8 @@ private[spark] object PythonUtils {
      def getBroadcastThreshold(sc: JavaSparkContext): Long = {
        sc.conf.get(org.apache.spark.internal.config.BROADCAST_FOR_UDF_COMPRESSION_THRESHOLD)
      }
   +
   +  def getPythonAuthSocketTimeout(sc: JavaSparkContext): Long = {
   +    sc.conf.get(org.apache.spark.internal.config.Python.PYTHON_AUTH_SOCKET_TIMEOUT)
   +  }
    }
   diff --git a/python/pyspark/context.py b/python/pyspark/context.py
   index 9c9e3f4b3c8..8956e163000 100644
   --- a/python/pyspark/context.py
   +++ b/python/pyspark/context.py
   @@ -222,6 +222,7 @@ class SparkContext(object):
            # data via a socket.
            # scala's mangled names w/ $ in them require special treatment.
            self._encryption_enabled = self._jvm.PythonUtils.isEncryptionEnabled(self._jsc)
   +        os.environ["SPARK_AUTH_SOCKET_TIMEOUT"] = str(self._jvm.PythonUtils.getPythonAuthSocketTimeout(self._jsc))
   
            self.pythonExec = os.environ.get("PYSPARK_PYTHON", 'python')
            self.pythonVer = "%d.%d" % sys.version_info[:2]
   ```
   
   This auth will be called only for RDD and DataFrame APIs such as `collect`, 
and for broadcasting on the driver side. So I think it's safe to assume there is 
always a SparkContext running when we need to know the timeout.
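
For illustration, here is a minimal sketch of how the environment-based handoff could look on the Python side. Only the variable name `SPARK_AUTH_SOCKET_TIMEOUT` comes from the diff above; the helper names and the 15-second fallback are hypothetical and not part of the patch. One detail it makes explicit: `os.environ` only accepts `str` values, so the `Long` returned through Py4J has to be converted before it is stored.

```python
import os
import socket

# Only the environment variable name is taken from the diff above.
ENV_KEY = "SPARK_AUTH_SOCKET_TIMEOUT"
# Assumed fallback for this sketch; not taken from the patch.
DEFAULT_TIMEOUT_SECONDS = 15


def publish_timeout(timeout_seconds):
    """Driver side: store the timeout. os.environ rejects non-str values."""
    os.environ[ENV_KEY] = str(timeout_seconds)


def read_timeout():
    """Consumer side: parse the timeout, using the assumed default if unset."""
    return int(os.environ.get(ENV_KEY, DEFAULT_TIMEOUT_SECONDS))


def make_socket_with_timeout():
    """Create a TCP socket whose blocking calls use the configured timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(read_timeout())
    return sock
```

With this shape, any code that opens the auth socket only needs to call `read_timeout()`, so no extra argument has to be threaded through the call chain.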

