AngersZhuuuu opened a new pull request #34559:
URL: https://github.com/apache/spark/pull/34559
### What changes were proposed in this pull request?
When writing a PySpark script like
```
conf = SparkConf().setAppName("test")
sc = SparkContext(conf=conf)
session = SparkSession.builder.enableHiveSupport().getOrCreate()
```
it builds a session without Hive support, because the existing
SparkContext is reused and the SparkSession is created with
```
SparkSession(sc)
```
As a result, any configuration added via `config()` is lost, such as the
catalog implementation (`spark.sql.catalogImplementation`).
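For illustration, here is a minimal self-contained sketch (plain Python stand-ins, not the actual PySpark internals) of how a builder option such as `spark.sql.catalogImplementation=hive` gets dropped when the session is built from a pre-existing SparkContext:

```python
class SessionState:
    """Stand-in for the per-session state; reads the catalog setting once,
    at construction time."""
    def __init__(self, conf):
        self.catalog = conf.get("spark.sql.catalogImplementation", "in-memory")

class Session:
    """Stand-in for SparkSession(sc): SessionState is initialized from the
    SparkContext conf alone, before any builder options can be applied."""
    def __init__(self, sc_conf):
        self.session_state = SessionState(dict(sc_conf))

# The builder recorded enableHiveSupport() as an option...
options = {"spark.sql.catalogImplementation": "hive"}

# ...but the session is created from the existing SparkContext conf only,
# so the option never reaches SessionState.
sc_conf = {"spark.app.name": "test"}
session = Session(sc_conf)
print(session.session_state.catalog)  # "in-memory": Hive support was lost
```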
In the Scala SparkSession, the session is created from a SparkContext plus
the option configuration; the options are passed to SharedState, and
SharedState is then used to create SessionState. In PySpark, however, the
options are not passed to SharedState but directly to SessionState, and by
that point SessionState has already been initialized, so Hive support is
not enabled.
In this PR, the option configurations are passed to SharedState, so that
when SessionState is initialized, the options reach it via SharedState as
well.
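The shape of the fix can be sketched as follows (hypothetical stand-in classes mirroring the PR's description, not the real PySpark code): the builder options are merged into SharedState first, and SessionState is initialized from SharedState, so the Hive setting survives.

```python
class SharedState:
    """Stand-in for the shared state: builder options are merged in here,
    before any SessionState is created."""
    def __init__(self, sc_conf, options):
        self.conf = dict(sc_conf)
        self.conf.update(options)  # options applied ahead of SessionState init

class SessionState:
    """Stand-in for the session state, now built from SharedState."""
    def __init__(self, shared_state):
        self.catalog = shared_state.conf.get(
            "spark.sql.catalogImplementation", "in-memory")

options = {"spark.sql.catalogImplementation": "hive"}
shared = SharedState({"spark.app.name": "test"}, options)
state = SessionState(shared)
print(state.catalog)  # "hive": the catalog implementation is preserved
```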
### Why are the changes needed?
Avoid losing configuration when building a SparkSession in PySpark.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Manually tested.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]