[ https://issues.apache.org/jira/browse/SPARK-30872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yuming Wang updated SPARK-30872:
--------------------------------
Description:
{code:scala}
scala> spark.range(20).selectExpr("id as a", "id as b", "id as c").write.saveAsTable("t1")
scala> spark.sql("select count(*) from t1 where a = b and b = c and (c = 3 or c = 13)").explain(false)
== Physical Plan ==
*(2) HashAggregate(keys=[], functions=[count(1)])
+- Exchange SinglePartition, true, [id=#76]
   +- *(1) HashAggregate(keys=[], functions=[partial_count(1)])
      +- *(1) Project
         +- *(1) Filter (((((((isnotnull(c#36L) AND ((b#35L = 3) OR (b#35L = 13))) AND isnotnull(b#35L)) AND (a#34L = c#36L)) AND isnotnull(a#34L)) AND (a#34L = b#35L)) AND (b#35L = c#36L)) AND ((c#36L = 3) OR (c#36L = 13)))
            +- *(1) ColumnarToRow
               +- FileScan parquet default.t1[a#34L,b#35L,c#36L] Batched: true, DataFilters: [isnotnull(c#36L), ((b#35L = 3) OR (b#35L = 13)), isnotnull(b#35L), (a#34L = c#36L), isnotnull(a#..., Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/Downloads/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehous..., PartitionFilters: [], PushedFilters: [IsNotNull(c), Or(EqualTo(b,3),EqualTo(b,13)), IsNotNull(b), IsNotNull(a), Or(EqualTo(c,3),EqualT..., ReadSchema: struct<a:bigint,b:bigint,c:bigint>
{code}
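The plan already contains the inferred constraints {{(b#35L = 3) OR (b#35L = 13)}} and {{(a#34L = c#36L)}}, but no corresponding constraint on {{a}}. Since {{a = b}} and {{b = c}}, we can infer one more constraint: {{(a#34L = 3) OR (a#34L = 13)}}.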
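To illustrate why the constraint on {{a}} is missed, here is a minimal, self-contained sketch of substitution-based constraint inference. It is not Catalyst code: {{Expr}}, {{Attr}}, {{Lit}}, {{EqualTo}}, {{Or}}, {{inferOnce}} and {{inferAll}} are hypothetical names defined only for this example. Under these assumptions, a single inference pass derives the constraint on {{b}} but not the one on {{a}}; repeating the pass until nothing new appears derives both.

{code:scala}
object ConstraintInferenceSketch {
  // Toy expression model (hypothetical; these are not Catalyst classes).
  sealed trait Expr
  case class Attr(name: String) extends Expr
  case class Lit(value: Long) extends Expr
  case class EqualTo(left: Expr, right: Expr) extends Expr
  case class Or(left: Expr, right: Expr) extends Expr

  // Replace every occurrence of attribute `from` with `to`.
  def substitute(e: Expr, from: Attr, to: Attr): Expr = e match {
    case a: Attr if a == from => to
    case EqualTo(l, r) => EqualTo(substitute(l, from, to), substitute(r, from, to))
    case Or(l, r) => Or(substitute(l, from, to), substitute(r, from, to))
    case other => other
  }

  // One pass: for each attribute equality x = y, rewrite every constraint
  // with x -> y and with y -> x, then drop trivial results such as b = b.
  def inferOnce(constraints: Set[Expr]): Set[Expr] = {
    val equalities = constraints.collect { case EqualTo(x: Attr, y: Attr) => (x, y) }
    val derived = for {
      (x, y) <- equalities
      c <- constraints
      rewritten <- Seq(substitute(c, x, y), substitute(c, y, x))
    } yield rewritten
    val nonTrivial = derived.filter {
      case EqualTo(l, r) => l != r
      case _ => true
    }
    constraints ++ nonTrivial
  }

  // Repeat the pass until a fixpoint is reached. This terminates because
  // substitution never grows an expression and the attribute set is finite.
  @scala.annotation.tailrec
  def inferAll(constraints: Set[Expr]): Set[Expr] = {
    val next = inferOnce(constraints)
    if (next == constraints) constraints else inferAll(next)
  }

  def main(args: Array[String]): Unit = {
    val (a, b, c) = (Attr("a"), Attr("b"), Attr("c"))
    // where a = b and b = c and (c = 3 or c = 13)
    val initial: Set[Expr] = Set(
      EqualTo(a, b),
      EqualTo(b, c),
      Or(EqualTo(c, Lit(3)), EqualTo(c, Lit(13))))
    val onA = Or(EqualTo(a, Lit(3)), EqualTo(a, Lit(13)))
    println(inferOnce(initial).contains(onA)) // prints: false
    println(inferAll(initial).contains(onA))  // prints: true
  }
}
{code}

In this sketch the first pass produces {{a = c}} and the constraint on {{b}}, and only a later pass can use those newly inferred facts to reach {{a}}, which matches the situation described by the issue title.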
> Constraints inferred from inferred attributes
> ---------------------------------------------
>
> Key: SPARK-30872
> URL: https://issues.apache.org/jira/browse/SPARK-30872
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Yuming Wang
> Priority: Major
>