[
https://issues.apache.org/jira/browse/SPARK-44750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ruifeng Zheng updated SPARK-44750:
----------------------------------
Description:
In the Spark Connect session builder, users call the {{config}} method to set options.
However, these options are ignored when a new session is created:
{code}
def create(self) -> "SparkSession":
    has_channel_builder = self._channel_builder is not None
    has_spark_remote = "spark.remote" in self._options
    if has_channel_builder and has_spark_remote:
        raise ValueError(
            "Only one of connection string or channelBuilder "
            "can be used to create a new SparkSession."
        )
    if not has_channel_builder and not has_spark_remote:
        raise ValueError(
            "Needs either connection string or channelBuilder to create a new SparkSession."
        )
    if has_channel_builder:
        assert self._channel_builder is not None
        session = SparkSession(connection=self._channel_builder)
    else:
        spark_remote = to_str(self._options.get("spark.remote"))
        assert spark_remote is not None
        session = SparkSession(connection=spark_remote)
    SparkSession._set_default_and_active_session(session)
    return session
{code}
We should respect the options by invoking {{session.conf.set}} for each of them after the session is created.
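
A minimal sketch of the proposed behaviour, for illustration only. It does not use PySpark: {{FakeBuilder}}, {{FakeSession}} and {{FakeRuntimeConf}} are hypothetical stand-ins for the Connect builder, the session and {{session.conf}}. Skipping {{spark.remote}} when forwarding options is an assumption, not something stated in this issue.

```python
from typing import Any, Dict, Optional


class FakeRuntimeConf:
    """Hypothetical stand-in for the object behind session.conf."""

    def __init__(self) -> None:
        self._values: Dict[str, str] = {}

    def set(self, key: str, value: Any) -> None:
        self._values[key] = str(value)

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        return self._values.get(key, default)


class FakeSession:
    """Hypothetical stand-in for a Connect SparkSession."""

    def __init__(self, connection: str) -> None:
        self.connection = connection
        self.conf = FakeRuntimeConf()


class FakeBuilder:
    """Hypothetical stand-in for the Connect session builder."""

    def __init__(self) -> None:
        self._options: Dict[str, Any] = {}

    def config(self, key: str, value: Any) -> "FakeBuilder":
        self._options[key] = value
        return self

    def remote(self, url: str) -> "FakeBuilder":
        return self.config("spark.remote", url)

    def create(self) -> FakeSession:
        if "spark.remote" not in self._options:
            raise ValueError("Needs a connection string to create a new session.")
        session = FakeSession(connection=str(self._options["spark.remote"]))
        # Proposed step: forward the builder options instead of dropping them.
        for key, value in self._options.items():
            # Assumption: the connection string is not a runtime conf, so skip it.
            if key != "spark.remote":
                session.conf.set(key, value)
        return session
```

With this sketch, an option passed through {{config}} before {{create}} can be read back from {{session.conf}} afterwards, which is the behaviour the issue asks for.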
was:
In connect session builder, we use {{config}} method to set options.
However, the options are actually ignored.
{code}
def create(self) -> "SparkSession":
    has_channel_builder = self._channel_builder is not None
    has_spark_remote = "spark.remote" in self._options
    if has_channel_builder and has_spark_remote:
        raise ValueError(
            "Only one of connection string or channelBuilder "
            "can be used to create a new SparkSession."
        )
    if not has_channel_builder and not has_spark_remote:
        raise ValueError(
            "Needs either connection string or channelBuilder to create a new SparkSession."
        )
    if has_channel_builder:
        assert self._channel_builder is not None
        session = SparkSession(connection=self._channel_builder)
    else:
        spark_remote = to_str(self._options.get("spark.remote"))
        assert spark_remote is not None
        session = SparkSession(connection=spark_remote)
    SparkSession._set_default_and_active_session(session)
    return session
{code}
we should respect the options by invoking {{session.conf.set}} after creation.
> SparkSession.Builder should respect the options
> -----------------------------------------------
>
> Key: SPARK-44750
> URL: https://issues.apache.org/jira/browse/SPARK-44750
> Project: Spark
> Issue Type: Improvement
> Components: Connect, PySpark
> Affects Versions: 3.5.0, 4.0.0
> Reporter: Ruifeng Zheng
> Assignee: Ruifeng Zheng
> Priority: Major