HyukjinKwon commented on a change in pull request #34757:
URL: https://github.com/apache/spark/pull/34757#discussion_r769245766



##########
File path: python/pyspark/sql/session.py
##########
@@ -304,8 +306,15 @@ def __init__(
                 and not self._jvm.SparkSession.getDefaultSession().get().sparkContext().isStopped()
             ):
                 jsparkSession = self._jvm.SparkSession.getDefaultSession().get()
+                getattr(getattr(self._jvm, "SparkSession$"), "MODULE$").applyModifiableSettings(

Review comment:
       @AngersZhuuuu, this actually shows a lot of new warnings (see also https://github.com/apache/spark/pull/34893). Another reproducer:
   
   ```bash
   ./bin/pyspark --conf spark.executor.memory=8g --conf spark.driver.memory=8g
   ```
   
   ```python
   >>> from pyspark.sql.functions import udf
   >>> udf(lambda x: x)("a")
   21/12/15 14:03:15 WARN SparkSession: Using an existing SparkSession; the static sql configurations will not take effect.
   Column<'<lambda>(a)'>
   ```
   
   There are more places to fix like this:
   
   ```python
   ml/util.py:            self._sparkSession = SparkSession.builder.getOrCreate()
   sql/column.py:            spark = SparkSession.builder.getOrCreate()
   sql/context.py:            sparkSession = SparkSession.builder.getOrCreate()
   sql/readwriter.py:        spark = SparkSession.builder.getOrCreate()
   sql/readwriter.py:        spark = SparkSession.builder.getOrCreate()
   sql/session.py:                return SparkSession.builder.getOrCreate()
   sql/session.py:        return SparkSession.builder.getOrCreate()
   sql/streaming.py:        spark = SparkSession.builder.getOrCreate()
   sql/streaming.py:        spark = SparkSession.builder.getOrCreate()
   sql/udf.py:        spark = SparkSession.builder.getOrCreate()
   ```
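
One way the call sites listed above could avoid re-running the builder is to look for an already-active session first. The following is only a minimal sketch of that pattern, not the fix adopted in Spark: `active_or_create` is a hypothetical helper name, and it takes the two lookups as plain callables so that it runs without Spark installed. Whether this silences the warning in a given Spark version is an assumption that would need checking.

```python
from typing import Callable, Optional, TypeVar

S = TypeVar("S")


def active_or_create(
    get_active: Callable[[], Optional[S]],
    create: Callable[[], S],
) -> S:
    """Return the already-active session if there is one, else create one.

    `create` is only invoked when `get_active` returns None, so the
    builder path (and whatever it logs) is skipped for existing sessions.
    """
    session = get_active()
    return session if session is not None else create()


# With PySpark available, a call site might look like this (left commented
# out so the sketch stays runnable without Spark):
#
#   from pyspark.sql import SparkSession
#   spark = active_or_create(
#       SparkSession.getActiveSession,
#       SparkSession.builder.getOrCreate,
#   )
```

Passing the lookups in as callables keeps the helper testable in isolation; in PySpark itself the two calls would more likely be inlined.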

##########
File path: python/pyspark/sql/session.py
##########
@@ -304,8 +306,15 @@ def __init__(
                 and not 
self._jvm.SparkSession.getDefaultSession().get().sparkContext().isStopped()
             ):
                 jsparkSession = 
self._jvm.SparkSession.getDefaultSession().get()
+                getattr(getattr(self._jvm, "SparkSession$"), 
"MODULE$").applyModifiableSettings(

Review comment:
       Would you mind fixing these, please?
   
   If we can't make it in Spark 3.3, I think it's safer to revert https://github.com/apache/spark/pull/34757, https://github.com/apache/spark/pull/34732, and https://github.com/apache/spark/pull/34559 for now, because each of these patches introduces either:
   1. Unexpected propagation of static SQL configurations, or
   2. Too many warnings
   
   Separately, I still feel https://github.com/apache/spark/commit/8424f552293677717da7411ed43e68e73aa7f0d6 is inefficient. We don't know which configurations don't take effect, or which configuration it keeps complaining about (see the example above). We should probably at least print out the keys or lower the log level.
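
As a rough illustration of "print out the keys", the sketch below names the ignored static configurations in the warning and stays silent when there are none. It is plain Python, not Spark's actual implementation: `warn_ignored_static_confs`, its parameters, and the logger name are hypothetical, and the caller is assumed to supply the set of static keys.

```python
import logging
from typing import Iterable, List, Mapping

logger = logging.getLogger("SparkSession")


def warn_ignored_static_confs(
    requested: Mapping[str, str],
    static_keys: Iterable[str],
) -> List[str]:
    """Warn once, naming each requested key that is a static configuration.

    Returns the sorted list of ignored keys; logs nothing if it is empty.
    """
    static = set(static_keys)
    ignored = sorted(key for key in requested if key in static)
    if ignored:
        logger.warning(
            "Using an existing SparkSession; these static SQL "
            "configurations will not take effect: %s",
            ", ".join(ignored),
        )
    return ignored


# Illustrative call: only the static key is reported.
ignored = warn_ignored_static_confs(
    {"spark.sql.warehouse.dir": "/tmp/wh", "spark.sql.shuffle.partitions": "8"},
    static_keys={"spark.sql.warehouse.dir", "spark.sql.extensions"},
)
```

With this shape, a call such as the `udf` reproducer, which passes no static keys, would produce no warning at all.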
   
   cc @AngersZhuuuu @yaooqinn @maropu @dongjoon-hyun FYI



