[ https://issues.apache.org/jira/browse/SPARK-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15531051#comment-15531051 ]
Josh Rosen commented on SPARK-17712:
------------------------------------
This appears to be an optimizer bug:
{code}
16/09/28 15:18:57 TRACE SparkOptimizer:
=== Applying Rule org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases ===
 Project [1 AS 1#10]                                    Project [1 AS 1#10]
 +- Filter false                                        +- Filter false
!   +- SubqueryAlias t1                                    +- Aggregate [count(1) AS count(1)#9L]
!      +- Aggregate [count(1) AS count(1)#9L]                 +- Range (1, 10, step=1, splits=Some(8))
!         +- SubqueryAlias diamonds
!            +- Range (1, 10, step=1, splits=Some(8))

16/09/28 15:18:57 DEBUG SparkOptimizer:
=== Result of Batch Finish Analysis ===
 Project [1 AS 1#10]                                    Project [1 AS 1#10]
 +- Filter false                                        +- Filter false
!   +- SubqueryAlias t1                                    +- Aggregate [count(1) AS count(1)#9L]
!      +- Aggregate [count(1) AS count(1)#9L]                 +- Range (1, 10, step=1, splits=Some(8))
!         +- SubqueryAlias diamonds
!            +- Range (1, 10, step=1, splits=Some(8))

16/09/28 15:18:57 TRACE SparkOptimizer: Fixed point reached for batch Union after 1 iterations.
16/09/28 15:18:57 TRACE SparkOptimizer: Batch Union has no effect.
16/09/28 15:18:57 TRACE SparkOptimizer: Fixed point reached for batch Subquery after 1 iterations.
16/09/28 15:18:57 TRACE SparkOptimizer: Batch Subquery has no effect.
16/09/28 15:18:57 TRACE SparkOptimizer: Fixed point reached for batch Replace Operators after 1 iterations.
16/09/28 15:18:57 TRACE SparkOptimizer: Batch Replace Operators has no effect.
16/09/28 15:18:57 TRACE SparkOptimizer: Fixed point reached for batch Aggregate after 1 iterations.
16/09/28 15:18:57 TRACE SparkOptimizer: Batch Aggregate has no effect.
16/09/28 15:18:57 TRACE SparkOptimizer:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.PushDownPredicate ===
 Project [1 AS 1#10]                                    Project [1 AS 1#10]
!+- Filter false                                        +- Aggregate [count(1) AS count(1)#9L]
!   +- Aggregate [count(1) AS count(1)#9L]                 +- Filter false
       +- Range (1, 10, step=1, splits=Some(8))               +- Range (1, 10, step=1, splits=Some(8))

16/09/28 15:18:57 TRACE SparkOptimizer:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.ColumnPruning ===
 Project [1 AS 1#10]                                    Project [1 AS 1#10]
!+- Aggregate [count(1) AS count(1)#9L]                 +- Aggregate
!   +- Filter false                                        +- Project
!      +- Range (1, 10, step=1, splits=Some(8))               +- Filter false
!                                                                +- Range (1, 10, step=1, splits=Some(8))

16/09/28 15:18:57 TRACE SparkOptimizer:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.CollapseProject ===
!Project [1 AS 1#10]                                    Aggregate [1 AS 1#10]
!+- Aggregate                                           +- Project
!   +- Project                                             +- Filter false
!      +- Filter false                                        +- Range (1, 10, step=1, splits=Some(8))
!         +- Range (1, 10, step=1, splits=Some(8))

16/09/28 15:18:57 TRACE SparkOptimizer:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.PruneFilters ===
 Aggregate [1 AS 1#10]                                  Aggregate [1 AS 1#10]
 +- Project                                             +- Project
!   +- Filter false                                        +- LocalRelation <empty>, [id#0L]
!      +- Range (1, 10, step=1, splits=Some(8))

16/09/28 15:18:57 TRACE SparkOptimizer: Fixed point reached for batch Operator Optimizations after 2 iterations.
{code}
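It looks like the {{PushDownPredicate}} rule is pushing the {{Filter false}}
beneath the {{Aggregate}}, which is unsound: an aggregate with no grouping
expressions emits one row even when its input is empty, so filtering its input
is not equivalent to filtering its output.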
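
To make the difference between the two plan shapes concrete, here is a minimal sketch in plain Python. It does not use Spark; {{filter_rows}}, {{global_count}} and {{project_one}} are hypothetical stand-ins for the {{Filter}}, {{Aggregate}} (without grouping keys) and {{Project}} operators, and {{diamonds}} is modeled on the {{Range (1, 10)}} relation from the trace above.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]


def filter_rows(rows: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Stand-in for Filter: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def global_count(rows: List[Row]) -> List[Row]:
    """Stand-in for Aggregate [count(1)] with no grouping keys.

    A global aggregate always emits exactly one row, even for empty input.
    """
    return [{"count": len(rows)}]


def project_one(rows: List[Row]) -> List[Row]:
    """Stand-in for Project [1 AS 1]: one output row per input row."""
    return [{"1": 1} for _ in rows]


def always_false(row: Row) -> bool:
    """The constant-folded WHERE false predicate."""
    return False


# Assumed stand-in for the non-empty diamonds table (ids 1 through 9).
diamonds = [{"id": i} for i in range(1, 10)]

# Plan before PushDownPredicate: Project <- Filter false <- Aggregate <- Range
before_pushdown = project_one(filter_rows(global_count(diamonds), always_false))

# Plan after PushDownPredicate: Project <- Aggregate <- Filter false <- Range
after_pushdown = project_one(global_count(filter_rows(diamonds, always_false)))

print(before_pushdown)  # []
print(after_pushdown)   # [{'1': 1}]
```

Under these assumptions the filter-above-aggregate plan returns no rows, while the filter-below-aggregate plan returns the single spurious row described in the issue.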
> Incorrect result when selecting from aggregate subquery where outer WHERE
> clause constant-folds to false
> --------------------------------------------------------------------------------------------------------
>
> Key: SPARK-17712
> URL: https://issues.apache.org/jira/browse/SPARK-17712
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.2, 2.0.0, 2.0.2
> Reporter: Josh Rosen
> Labels: correctness
>
> Let {{diamonds}} be a non-empty table. The following two queries should both
> return no rows, but the first returns a single row:
> {code}
> SELECT
>   1
> FROM (
>   SELECT
>     count(*)
>   FROM diamonds
> ) t1
> WHERE
>   false
> {code}
> {code}
> SELECT
>   1
> FROM (
>   SELECT
>     *
>   FROM diamonds
> ) t1
> WHERE
>   false
> {code}