[ https://issues.apache.org/jira/browse/SPARK-33419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229892#comment-17229892 ]
Apache Spark commented on SPARK-33419:
--------------------------------------
User 'yaooqinn' has created a pull request for this issue:
https://github.com/apache/spark/pull/30332
> Unexpected behavior when using SET commands before a query in SparkSession.sql
> ------------------------------------------------------------------------------
>
> Key: SPARK-33419
> URL: https://issues.apache.org/jira/browse/SPARK-33419
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.7, 3.0.2, 3.1.0
> Reporter: Kent Yao
> Priority: Major
>
> SparkSession.sql converts a string value to a DataFrame. The string should
> contain a single SQL statement, optionally followed by one or more
> semicolons, e.g.
> {code:sql}
> scala> spark.sql(" select 2").show
> +---+
> | 2|
> +---+
> | 2|
> +---+
> scala> spark.sql(" select 2;").show
> +---+
> | 2|
> +---+
> | 2|
> +---+
> scala> spark.sql(" select 2;;;;").show
> +---+
> | 2|
> +---+
> | 2|
> +---+
> {code}
> If the string contains two or more statements, parsing fails, e.g.
> {code:java}
> scala> spark.sql(" select 2; select 1;").show
> org.apache.spark.sql.catalyst.parser.ParseException:
> extraneous input 'select' expecting {<EOF>, ';'}(line 1, pos 11)
> == SQL ==
> select 2; select 1;
> -----------^^^
> at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:263)
> at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:130)
> at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:51)
> at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:81)
> at org.apache.spark.sql.SparkSession.$anonfun$sql$2(SparkSession.scala:610)
> at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
> at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:610)
> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:769)
> at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:607)
> ... 47 elided
> {code}
> A very common scenario is that users want to change some settings before
> executing a query. They may pass a string such as
> `set spark.sql.abc=2; select 1;` to this API, which creates a confusing gap
> between the actual effect and the user's expectation.
> The user expects the query to run with spark.sql.abc=2, but Spark actually
> treats everything after the `=`, i.e. `2; select 1;`, as the value of the
> property 'spark.sql.abc', e.g.
> {code:java}
> scala> spark.sql("set spark.sql.abc=2; select 1;").show
> +-------------+------------+
> | key| value|
> +-------------+------------+
> |spark.sql.abc|2; select 1;|
> +-------------+------------+
> {code}
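
Until the behavior is changed, one possible caller-side workaround is to split the string into single statements and pass each one to SparkSession.sql separately. The sketch below is only an illustration: `StatementSplitter` and `splitStatements` are hypothetical names, not part of Spark, and the splitting is deliberately simple (it respects quoted regions but ignores SQL comments and backslash escapes).

```scala
// Hypothetical helper, not part of Spark: breaks a script into single
// statements so that each can be handed to SparkSession.sql on its own.
object StatementSplitter {

  /** Splits on ';' outside single-quoted, double-quoted, and backquoted
    * regions. Simplification: SQL comments and backslash escapes are not
    * handled. Empty statements (e.g. from ";;;;") are dropped. */
  def splitStatements(script: String): Seq[String] = {
    val statements = Seq.newBuilder[String]
    val current = new StringBuilder
    var quote: Option[Char] = None // the quote character we are inside, if any

    for (c <- script) {
      quote match {
        case Some(q) =>
          // Inside a quoted region: keep everything, close on the same quote.
          current += c
          if (c == q) quote = None
        case None if c == ';' =>
          // Statement boundary: emit what has been collected so far.
          statements += current.toString
          current.clear()
        case None =>
          if (c == '\'' || c == '"' || c == '`') quote = Some(c)
          current += c
      }
    }
    statements += current.toString
    statements.result().map(_.trim).filter(_.nonEmpty)
  }
}

// Usage sketch, assuming a SparkSession named `spark` as in the sessions above:
//   StatementSplitter.splitStatements("set spark.sql.abc=2; select 1;")
//     .map(s => spark.sql(s)) // runs the SET first, then the query
//     .last.show()
```

With this approach the SET statement and the query reach the parser as two separate strings, so the value of 'spark.sql.abc' would be `2` rather than `2; select 1;`.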