HyukjinKwon opened a new pull request, #50575:
URL: https://github.com/apache/spark/pull/50575
### What changes were proposed in this pull request?
This PR proposes to properly respect `spark.api.mode` and `spark.remote`
when parsing arguments in Spark submission. Currently, `isRemote` is set
too early, before the configuration arguments are parsed at `getApiMode`.
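The ordering problem can be illustrated with a minimal sketch. This is not the launcher's actual code (which is not shown in this description); `parse_confs` and `is_remote` are hypothetical helpers, and the fallback to Connect mode when `SPARK_CONNECT_MODE=1` is an assumption inferred from the test commands in this PR. The point is only that the remote/classic decision is made after every argument has been collected, so the position of `--conf` and `--master` does not matter.

```python
from typing import Dict, List, Mapping


def parse_confs(args: List[str]) -> Dict[str, str]:
    """Collect --conf key=value pairs and the --master/--remote shortcuts.

    Hypothetical helper for illustration; it ignores all other options.
    """
    conf: Dict[str, str] = {}
    it = iter(args)
    for arg in it:
        if arg == "--conf":
            key, _, value = next(it, "").partition("=")
            if key:
                conf[key] = value
        elif arg == "--master":
            conf["spark.master"] = next(it, "")
        elif arg == "--remote":
            conf["spark.remote"] = next(it, "")
    return conf


def is_remote(args: List[str], env: Mapping[str, str]) -> bool:
    """Decide remote vs. classic only after all arguments are parsed."""
    conf = parse_confs(args)  # parse everything first
    if "spark.remote" in conf:
        return True
    # Assumption: SPARK_CONNECT_MODE=1 only changes the default mode,
    # and an explicit spark.api.mode always wins over that default.
    default_mode = "connect" if env.get("SPARK_CONNECT_MODE") == "1" else "classic"
    return conf.get("spark.api.mode", default_mode).lower() == "connect"
```

With this ordering, `--conf spark.api.mode=classic --master local` resolves to classic mode even when `SPARK_CONNECT_MODE=1` is set, which is the case that fails in the report below.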
### Why are the changes needed?
In the Spark 4.0 release (Spark Connect distribution), run:
```bash
SPARK_CONNECT_MODE=1 ./bin/pyspark --conf spark.api.mode=classic --master local
```
It fails as below:
```
/.../spark/python/pyspark/shell.py:94: UserWarning: Failed to initialize Spark session.
  warnings.warn("Failed to initialize Spark session.")
Traceback (most recent call last):
  File "/.../spark/python/pyspark/shell.py", line 89, in <module>
    spark = SparkSession._create_shell_session()
  File "/.../spark/python/pyspark/sql/session.py", line 1249, in _create_shell_session
    return SparkSession._getActiveSessionOrCreate()
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/.../spark/python/pyspark/sql/session.py", line 1260, in _getActiveSessionOrCreate
    spark = SparkSession.getActiveSession()
  File "/.../spark/python/pyspark/sql/utils.py", line 357, in wrapped
    return f(*args, **kwargs)
  File "/.../spark/python/pyspark/sql/session.py", line 747, in getActiveSession
    if jSparkSessionClass.getActiveSession().isDefined():
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
TypeError: 'JavaPackage' object is not callable
```
We should fix this.
### Does this PR introduce _any_ user-facing change?
No. There is no change for end users because the main change has not been released yet.
### How was this patch tested?
Manually tested with some combinations below:
Positive cases:
```
SPARK_CONNECT_MODE=1 ./bin/pyspark --conf spark.api.mode=classic --master local
SPARK_CONNECT_MODE=1 ./bin/pyspark --master local --conf spark.api.mode=connect
SPARK_CONNECT_MODE=1 ./bin/pyspark --conf spark.api.mode=classic
SPARK_CONNECT_MODE=1 ./bin/pyspark --conf spark.api.mode=connect
SPARK_CONNECT_MODE=1 ./bin/pyspark
```
Negative cases:
```
SPARK_CONNECT_MODE=1 ./bin/pyspark --conf spark.remote="local[*]" --conf spark.api.mode=connect --conf spark.master="local[*]"
SPARK_CONNECT_MODE=1 ./bin/pyspark --master "local[*]" --remote "local[*]"
SPARK_CONNECT_MODE=1 ./bin/pyspark --conf spark.remote="local[*]" --conf spark.api.mode=connect --master "local[*]"
```
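All three negative cases above set both a master and a remote. A rough sketch of the kind of check they exercise follows. It assumes the arguments have already been collected into a dictionary keyed by configuration name; `check_master_remote_conflict` is a hypothetical name and the error text is illustrative, not Spark's actual message.

```python
from typing import Mapping


def check_master_remote_conflict(conf: Mapping[str, str]) -> None:
    """Reject configurations that set both spark.master and spark.remote.

    Hypothetical helper for illustration only. It is meant to run after all
    arguments are parsed, so that --master/--remote and their --conf
    equivalents are treated the same way regardless of order.
    """
    if conf.get("spark.master") and conf.get("spark.remote"):
        raise ValueError(
            "spark.master and spark.remote cannot both be set "
            f"(master={conf['spark.master']!r}, remote={conf['spark.remote']!r})"
        )
```

Running the check on the fully parsed configuration, rather than while arguments are still being read, is what lets `--conf spark.master=...` and `--master ...` fail in the same way.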
### Was this patch authored or co-authored using generative AI tooling?
No.