[
https://issues.apache.org/jira/browse/SPARK-43491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
KuijianLiu updated SPARK-43491:
-------------------------------
Description:
The same SQL returns different results in Spark SQL 3.1.1 and Hive SQL 3.1.0.
Spark SQL evaluates {{0 in ('00')}} as false, which differs from how it
evaluates the {{=}} operator, whereas Hive evaluates it as true. Hive 3.1.0
treats the {{in}} keyword consistently with {{=}}, but Spark SQL does not.
When the elements of an {{In}} expression all have the same data type, the
expression should behave the same as a BinaryComparison such as {{EqualTo}}.
Test SQL:
{code:java}
scala> spark.sql("select 1 as test where 0 = '00'").show
+----+
|test|
+----+
|   1|
+----+
scala> spark.sql("select 1 as test where 0 in ('00')").show
+----+
|test|
+----+
+----+
scala> spark.sql("select 1 as test where 0 = '00'").explain(true)
== Parsed Logical Plan ==
'Project [1 AS test#23]
+- 'Filter (0 = 00)
   +- OneRowRelation

== Analyzed Logical Plan ==
test: int
Project [1 AS test#23]
+- Filter (0 = cast(00 as int))
   +- OneRowRelation

== Optimized Logical Plan ==
Project [1 AS test#23]
+- OneRowRelation

== Physical Plan ==
*(1) Project [1 AS test#23]
+- *(1) Scan OneRowRelation[]

scala> spark.sql("select 1 as test where 0 in ('00')").explain(true)
== Parsed Logical Plan ==
'Project [1 AS test#25]
+- 'Filter 0 IN (00)
   +- OneRowRelation

== Analyzed Logical Plan ==
test: int
Project [1 AS test#25]
+- Filter cast(0 as string) IN (cast(00 as string))
   +- OneRowRelation

== Optimized Logical Plan ==
LocalRelation <empty>, [test#25]

== Physical Plan ==
LocalTableScan <empty>, [test#25]
{code}
!image-2023-05-13-13-14-55-853.png!
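The two analyzed plans above differ only in the direction of the implicit cast: {{EqualTo}} casts the string to int, while {{In}} casts the int to string. The following is a minimal plain-Scala sketch of that difference. It does not use Spark, and {{CoercionSketch}}, {{equalToStyle}} and {{inStyle}} are hypothetical names chosen for illustration, not Spark APIs; the model only mirrors the casts shown in the plans and is not Spark's actual implementation.

```scala
// Plain-Scala model of the two analyzed plans; no Spark dependency.
// All names here are hypothetical and exist only for illustration.
object CoercionSketch {
  // Mirrors `0 = cast(00 as int)`: the string side is cast to int first.
  // A string that is not a valid integer yields None in this model.
  def equalToStyle(left: Int, right: String): Option[Boolean] =
    scala.util.Try(right.toInt).toOption.map(_ == left)

  // Mirrors `cast(0 as string) IN (cast(00 as string))`:
  // the int side is cast to string and compared as text.
  def inStyle(left: Int, candidates: Seq[String]): Boolean =
    candidates.contains(left.toString)

  def main(args: Array[String]): Unit = {
    println(equalToStyle(0, "00")) // Some(true): "00" parses to 0
    println(inStyle(0, Seq("00"))) // false: "0" is not the string "00"
  }
}
```

Under this model the two predicates disagree for exactly the input in the report, which is consistent with the empty {{LocalRelation}} in the optimized plan of the {{in}} query.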
> In expression not compatible with EqualTo Expression
> ----------------------------------------------------
>
> Key: SPARK-43491
> URL: https://issues.apache.org/jira/browse/SPARK-43491
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.1
> Reporter: KuijianLiu
> Priority: Minor
> Attachments: image-2023-05-13-13-14-55-853.png
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)