924060929 commented on code in PR #38089:
URL: https://github.com/apache/doris/pull/38089#discussion_r1682897843
##########
regression-test/suites/external_table_p0/iceberg/test_iceberg_optimize_count.groovy:
##########
@@ -46,14 +47,16 @@ suite("test_iceberg_optimize_count",
"p0,external,doris,external_docker,external
sqlstr2 = """ select count(*) from sample_cow_parquet; """
sqlstr3 = """ select count(*) from sample_mor_orc; """
sqlstr4 = """ select count(*) from sample_mor_parquet; """
+ sqlstr4_limit = """ select count(*) from sample_mor_parquet limit 1; """
Review Comment:
You should add an RBO rule in the rewrite stage that rewrites
`limit 1 + count(*)` to `count(*)` when the aggregate has no group-by key.
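A minimal sketch of such a rule, using toy plan classes rather than the real Nereids ones. The names `LimitOverScalarAggSketch`, `Scan`, `Aggregate`, `Limit`, and `eliminateLimit` are all hypothetical. The sketch relies on the fact that an aggregate without group-by keys always returns exactly one row, so a limit of at least 1 with offset 0 above it changes nothing.

```java
import java.util.Collections;
import java.util.List;

// Toy plan nodes for illustration only; these are NOT the Nereids classes.
class LimitOverScalarAggSketch {
    interface Plan {}

    static final class Scan implements Plan {
        final String table;
        Scan(String table) { this.table = table; }
    }

    static final class Aggregate implements Plan {
        final List<String> groupByKeys;
        final String function;
        final Plan child;
        Aggregate(List<String> groupByKeys, String function, Plan child) {
            this.groupByKeys = groupByKeys;
            this.function = function;
            this.child = child;
        }
    }

    static final class Limit implements Plan {
        final long limit;
        final long offset;
        final Plan child;
        Limit(long limit, long offset, Plan child) {
            this.limit = limit;
            this.offset = offset;
            this.child = child;
        }
    }

    // Drops a Limit that sits directly on an aggregate without group-by keys.
    // Such an aggregate produces exactly one row, so limit >= 1 with offset 0
    // is a no-op. Any other plan is returned unchanged.
    static Plan eliminateLimit(Plan plan) {
        if (!(plan instanceof Limit)) {
            return plan;
        }
        Limit limit = (Limit) plan;
        boolean keepsTheSingleRow = limit.limit >= 1 && limit.offset == 0;
        if (keepsTheSingleRow && limit.child instanceof Aggregate
                && ((Aggregate) limit.child).groupByKeys.isEmpty()) {
            return limit.child;
        }
        return plan;
    }

    public static void main(String[] args) {
        Plan agg = new Aggregate(Collections.<String>emptyList(), "count(*)",
                new Scan("sample_mor_parquet"));
        Plan rewritten = eliminateLimit(new Limit(1, 0, agg));
        System.out.println("limit removed: " + (rewritten == agg));
    }
}
```

With a group-by key, or with `limit 0`, the sketch leaves the plan untouched, because the limit can then change the result.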
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java:
##########
@@ -610,8 +610,8 @@ public Optional<ResultSet> handleQueryInFe(StatementBase parsedStmt) {
}
if (physicalPlan instanceof PhysicalResultSink
- && physicalPlan.child(0) instanceof PhysicalHashAggregate && !getScanNodes().isEmpty()
- && getScanNodes().get(0) instanceof IcebergScanNode) {
+ && !getScanNodes().isEmpty() && getScanNodes().get(0) instanceof IcebergScanNode
+ && getScanNodes().get(0).getPushDownAggNoGroupingOp().equals(TPushAggOp.COUNT)) {
Review Comment:
This condition is tricky because it can only handle this plan shape:
```
PhysicalResultSink
|
PhysicalHashAggregate
|
PhysicalFileScan(table=IcebergScan)
```
If there is another PhysicalPlan between PhysicalHashAggregate and
PhysicalFileScan, it will compute a wrong result:
```
PhysicalResultSink
|
PhysicalHashAggregate
|
PhysicalHashAggregate
|
PhysicalFileScan(table=IcebergScan)
```
For example:
```sql
select max(cnt) from (select count(*) as cnt from iceberg_table)a
```
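To make the failure mode concrete, here is a small sketch with toy nodes, not the real Nereids classes. `PlanShapeCheckSketch`, `Node`, `scanOnlyCheck`, and `exactShapeCheck` are hypothetical names, and `countPushedDown` is an assumed stand-in for the scan reporting `TPushAggOp.COUNT`. A check that inspects only the sink and the scan node accepts both shapes above, so the nested query would be answered with the raw count instead of `max(cnt)` over it.

```java
// Toy plan nodes for illustration only; these are NOT the Nereids classes.
class PlanShapeCheckSketch {
    static final class Node {
        final String kind;             // "ResultSink", "HashAggregate", "IcebergScan"
        final boolean countPushedDown; // only meaningful on a scan node
        final Node child;              // null for a leaf
        Node(String kind, boolean countPushedDown, Node child) {
            this.kind = kind;
            this.countPushedDown = countPushedDown;
            this.child = child;
        }
    }

    static Node leaf(Node node) {
        while (node.child != null) {
            node = node.child;
        }
        return node;
    }

    // Looks only at the sink and at the scan, ignoring everything in between.
    static boolean scanOnlyCheck(Node root) {
        Node scan = leaf(root);
        return root.kind.equals("ResultSink")
                && scan.kind.equals("IcebergScan") && scan.countPushedDown;
    }

    // Accepts exactly ResultSink -> HashAggregate -> IcebergScan, nothing else.
    static boolean exactShapeCheck(Node root) {
        if (!root.kind.equals("ResultSink") || root.child == null) {
            return false;
        }
        Node agg = root.child;
        if (!agg.kind.equals("HashAggregate") || agg.child == null) {
            return false;
        }
        Node scan = agg.child;
        return scan.kind.equals("IcebergScan") && scan.child == null
                && scan.countPushedDown;
    }

    public static void main(String[] args) {
        Node scan = new Node("IcebergScan", true, null);
        Node simple = new Node("ResultSink", false,
                new Node("HashAggregate", false, scan));
        Node nested = new Node("ResultSink", false,
                new Node("HashAggregate", false,
                        new Node("HashAggregate", false, scan)));
        System.out.println("scan-only check, simple: " + scanOnlyCheck(simple));
        System.out.println("scan-only check, nested: " + scanOnlyCheck(nested));
        System.out.println("exact check, nested: " + exactShapeCheck(nested));
    }
}
```

The exact-shape check is included only for contrast; matching the pattern in a rewrite rule avoids hand-written shape checks in the planner altogether.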
I think you should remove this code from NereidsPlanner. The correct approach
is:
1. Add a `PhysicalCountIcebergTable` plan.
2. Add a rule in Rewrite that rewrites
`LogicalAggregate(LogicalFileScan().when(scan -> scan.getTable() instanceof IcebergTable))`
to `PhysicalCountIcebergTable`.
3. Translate `PhysicalCountIcebergTable` like `PhysicalOneRowRelation`.
4. Finally, `PhysicalCountIcebergTable` can be computed as a constant value.
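
The steps above can be sketched roughly as follows. This uses toy classes, not the real Nereids pattern API; `CountIcebergRewriteSketch`, `FileScan`, `Aggregate`, `CountIcebergTable`, and `rewrite` are hypothetical names for illustration. Because the rule matches only an aggregate sitting directly on an Iceberg scan, the outer aggregate in the nested query is left in place and still runs over the constant count.

```java
import java.util.Collections;
import java.util.List;

// Toy plan nodes for illustration only; these are NOT the Nereids classes.
class CountIcebergRewriteSketch {
    interface Plan {}

    static final class FileScan implements Plan {
        final String table;
        final boolean iceberg;
        FileScan(String table, boolean iceberg) {
            this.table = table;
            this.iceberg = iceberg;
        }
    }

    static final class Aggregate implements Plan {
        final List<String> groupByKeys;
        final String function;
        final Plan child;
        Aggregate(List<String> groupByKeys, String function, Plan child) {
            this.groupByKeys = groupByKeys;
            this.function = function;
            this.child = child;
        }
    }

    // Stand-in for the proposed plan node: a leaf that yields one constant row.
    static final class CountIcebergTable implements Plan {
        final String table;
        CountIcebergTable(String table) { this.table = table; }
    }

    // Rewrites bottom-up. Only count(*) without group-by keys, directly above
    // an Iceberg scan, is replaced; every other aggregate is kept.
    static Plan rewrite(Plan plan) {
        if (!(plan instanceof Aggregate)) {
            return plan;
        }
        Aggregate agg = (Aggregate) plan;
        Plan child = rewrite(agg.child);
        boolean scalarCount = agg.groupByKeys.isEmpty()
                && agg.function.equals("count(*)");
        if (scalarCount && child instanceof FileScan && ((FileScan) child).iceberg) {
            return new CountIcebergTable(((FileScan) child).table);
        }
        return new Aggregate(agg.groupByKeys, agg.function, child);
    }

    public static void main(String[] args) {
        List<String> noKeys = Collections.<String>emptyList();
        // select max(cnt) from (select count(*) as cnt from iceberg_table) a
        Plan nested = new Aggregate(noKeys, "max(cnt)",
                new Aggregate(noKeys, "count(*)", new FileScan("iceberg_table", true)));
        Aggregate outer = (Aggregate) rewrite(nested);
        System.out.println(outer.function + " over "
                + outer.child.getClass().getSimpleName());
    }
}
```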
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]