This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch branch-3.5 in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.5 by this push:
     new bbaeb35d9bd [SPARK-43509][PYTHON][CONNECT][FOLLOW-UP] Check SPARK_CONNECT_MODE_ENABLED when creating a session
bbaeb35d9bd is described below

commit bbaeb35d9bdba045a25b19f67618beb5b25f54d4
Author: Hyukjin Kwon <gurwls...@apache.org>
AuthorDate: Wed Aug 16 19:48:30 2023 +0200

    [SPARK-43509][PYTHON][CONNECT][FOLLOW-UP] Check SPARK_CONNECT_MODE_ENABLED when creating a session

    ### What changes were proposed in this pull request?

    This PR is a followup of https://github.com/apache/spark/pull/41013 that adds a check for `SPARK_CONNECT_MODE_ENABLED` when we create a Spark session via `pyspark.sql.SparkSession.getOrCreate()`, so that the next session is returned consistently.

    ### Why are the changes needed?

    Currently, calling `SparkSession.builder.getOrCreate` after creating a Spark Connect session returns a non-Spark Connect session if `SPARK_REMOTE` is not set globally (it is currently set automatically if you launch `./bin/pyspark --remote local`). So this is only reproducible when you use Spark Connect as a library. See how it's tested below.

    ### Does this PR introduce _any_ user-facing change?

    The change has not been released yet, so there are no behaviour changes for end users.

    ### How was this patch tested?

    Manually tested as below:

    ```bash
    cd python
    python
    ```

    ```python
    from pyspark.sql import SparkSession
    SparkSession.builder.remote("local").getOrCreate()
    SparkSession.builder.getOrCreate()
    ```

    Before:

    ```
    <pyspark.sql.connect.session.SparkSession object at 0x7fa9807fd790>
    <pyspark.sql.session.SparkSession object at 0x7fd97890e730>
    ```

    After:

    ```
    <pyspark.sql.connect.session.SparkSession object at 0x7fa9807fd790>
    <pyspark.sql.connect.session.SparkSession object at 0x7fa9807fd790>
    ```

    Closes #42464 from HyukjinKwon/SPARK-43509.

    Authored-by: Hyukjin Kwon <gurwls...@apache.org>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
    (cherry picked from commit 11cbdc291c96926820419b97559f25955d7791d6)
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 python/pyspark/sql/session.py | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/python/pyspark/sql/session.py b/python/pyspark/sql/session.py
index 9141051fdf8..ce197319977 100644
--- a/python/pyspark/sql/session.py
+++ b/python/pyspark/sql/session.py
@@ -455,7 +455,11 @@ class SparkSession(SparkConversionMixin):
             opts = dict(self._options)
 
             with self._lock:
-                if "SPARK_REMOTE" in os.environ or "spark.remote" in opts:
+                if (
+                    "SPARK_CONNECT_MODE_ENABLED" in os.environ
+                    or "SPARK_REMOTE" in os.environ
+                    or "spark.remote" in opts
+                ):
                     with SparkContext._lock:
                         from pyspark.sql.connect.session import SparkSession as RemoteSparkSession
 
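
For readers who want to reason about the new condition without a Spark installation, here is a minimal standalone sketch of the three-way check from the diff. The helper name `wants_connect_session` and its `environ` parameter are hypothetical and exist only for illustration; they are not part of pyspark. The sketch also assumes that `SPARK_CONNECT_MODE_ENABLED` is present in the environment once a Spark Connect session has been created in the process, which the commit description implies but this diff does not show.

```python
import os
from typing import Mapping, Optional


def wants_connect_session(
    opts: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Return True when getOrCreate() would take the Spark Connect branch.

    Hypothetical helper mirroring the condition in the diff; not a pyspark API.
    """
    # Fall back to the real process environment when none is injected.
    env = os.environ if environ is None else environ
    return (
        "SPARK_CONNECT_MODE_ENABLED" in env  # new check added by this commit
        or "SPARK_REMOTE" in env             # previously checked
        or "spark.remote" in opts            # previously checked
    )


# Old behaviour, roughly: no env vars and no builder option means a classic session.
print(wants_connect_session({}, {}))
# New behaviour: the marker variable alone is enough to select Spark Connect.
print(wants_connect_session({}, {"SPARK_CONNECT_MODE_ENABLED": "1"}))
```

Note that the check tests only for the presence of the key, not its value, which matches the `in os.environ` test in the patched code.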