[ https://issues.apache.org/jira/browse/SPARK-33954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yuming Wang updated SPARK-33954:
--------------------------------
Parent: SPARK-34120
Issue Type: Sub-task (was: Improvement)
> Some operator missing rowCount when enable CBO
> ----------------------------------------------
>
> Key: SPARK-33954
> URL: https://issues.apache.org/jira/browse/SPARK-33954
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: Yuming Wang
> Assignee: Yuming Wang
> Priority: Major
> Fix For: 3.2.0
>
>
> Some operators are missing rowCount in their statistics when CBO is enabled, for example:
> {code:scala}
> spark.range(1000).selectExpr("id as a", "id as b").write.saveAsTable("t1")
> spark.sql("ANALYZE TABLE t1 COMPUTE STATISTICS FOR ALL COLUMNS")
> spark.sql("set spark.sql.cbo.enabled=true")
> spark.sql("set spark.sql.cbo.planStats.enabled=true")
> spark.sql("select * from (select * from t1 distribute by a limit 100) distribute by b").explain("cost")
> {code}
> Current:
> {noformat}
> == Optimized Logical Plan ==
> RepartitionByExpression [b#2129L], Statistics(sizeInBytes=2.3 KiB)
> +- GlobalLimit 100, Statistics(sizeInBytes=2.3 KiB, rowCount=100)
>    +- LocalLimit 100, Statistics(sizeInBytes=23.4 KiB)
>       +- RepartitionByExpression [a#2128L], Statistics(sizeInBytes=23.4 KiB)
>          +- Relation[a#2128L,b#2129L] parquet, Statistics(sizeInBytes=23.4 KiB, rowCount=1.00E+3)
> {noformat}
> Expected:
> {noformat}
> == Optimized Logical Plan ==
> RepartitionByExpression [b#2129L], Statistics(sizeInBytes=2.3 KiB, rowCount=100)
> +- GlobalLimit 100, Statistics(sizeInBytes=2.3 KiB, rowCount=100)
>    +- LocalLimit 100, Statistics(sizeInBytes=23.4 KiB, rowCount=1.00E+3)
>       +- RepartitionByExpression [a#2128L], Statistics(sizeInBytes=23.4 KiB, rowCount=1.00E+3)
>          +- Relation[a#2128L,b#2129L] parquet, Statistics(sizeInBytes=23.4 KiB, rowCount=1.00E+3)
> {noformat}
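> The same check can probably be done without reading the explain output. The following is only a sketch, assuming a spark-shell session where t1 and the CBO settings from the example above already exist; it relies on Catalyst's internal plan API (queryExecution.optimizedPlan and stats), which is not a stable public interface:
> {code:scala}
> // Sketch only: assumes `spark`, table t1 and the CBO settings from the example above.
> val df = spark.sql(
>   "select * from (select * from t1 distribute by a limit 100) distribute by b")
> // Visit every operator in the optimized logical plan and print its estimated
> // row count; stats.rowCount is an Option, so None is reported as "missing".
> df.queryExecution.optimizedPlan.foreach { op =>
>   val rows = op.stats.rowCount.map(_.toString).getOrElse("missing")
>   println(s"${op.nodeName}: rowCount=$rows")
> }
> {code}
> Any operator reported as "missing" is one whose statistics carry no rowCount.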