mongo360 opened a new pull request, #39446:
URL: https://github.com/apache/doris/pull/39446
## Proposed changes
**Problem**
In legacy planner, the predicate like create_day <= '2024-08-08 23:59:59'
will modify as cast (create_day as DATETIMEV2) <= '2024-08-08 23:59:59' when
the column _create_day_ is DATE; and this with cause the scan keys range not
right in olap scan node.
**Example**
1. Test Table
```
CREATE TABLE `ad_effects` (
`pin_id` BIGINT NOT NULL COMMENT '广告主pin_id',
`day` date NOT NULL COMMENT '点击日期(天)',
`date_time` datetime NOT NULL COMMENT '点击时间',
`impressions` BIGINT SUM NULL DEFAULT "0" COMMENT '用户总展现',
`clicks` BIGINT SUM NULL DEFAULT "0" COMMENT '用户总点击',
`cost` BIGINT SUM NULL DEFAULT "0" COMMENT '广告花费'
) ENGINE=OLAP
AGGREGATE KEY(`pin_id`, `day`, `date_time`)
COMMENT 'OLAP'
PARTITION BY RANGE(`day`)
(PARTITION p20240809 VALUES [('2024-08-09'), ('2024-08-10')),
PARTITION p20240810 VALUES [('2024-08-10'), ('2024-08-11')),
PARTITION p20240811 VALUES [('2024-08-11'), ('2024-08-12')),
PARTITION p20240812 VALUES [('2024-08-12'), ('2024-08-13')),
PARTITION p20240813 VALUES [('2024-08-13'), ('2024-08-14')),
PARTITION p20240814 VALUES [('2024-08-14'), ('2024-08-15')))
DISTRIBUTED BY HASH(`pin_id`) BUCKETS 16
PROPERTIES (
"replication_allocation" = "tag.location.default: 2",
"is_being_synced" = "false",
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "DAY",
"dynamic_partition.time_zone" = "Asia/Shanghai",
"dynamic_partition.start" = "-2147483648",
"dynamic_partition.end" = "3",
"dynamic_partition.prefix" = "p",
"dynamic_partition.replication_allocation" = "tag.location.default: 2",
"dynamic_partition.buckets" = "16",
"dynamic_partition.create_history_partition" = "false",
"dynamic_partition.history_partition_num" = "-1",
"dynamic_partition.hot_partition_num" = "0",
"dynamic_partition.reserved_history_periods" = "NULL",
"dynamic_partition.storage_policy" = "",
"storage_medium" = "hdd",
"storage_format" = "V2",
"light_schema_change" = "true",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false"
);
```
2. Test SQL
```
select pin_id,sum(cost) from ad_effects where pin_id = 200 and day
>= '2024-08-11 00:00:00' and day < '2024-08-12 08:00:00' group by
pin_id
```
the profile:
```
VNewOlapScanNode(ad_effects) (id=0):(Active:
1.41ms, % non-child: 0.00%)
- RuntimeFilters: :
- PushDownPredicates: []
- KeyRanges:
ScanKeys:ScanKey=[200,0000-01-01 : 200,9999-12-31]
- TabletIds: [17701]
- RemainedPredicates:
VectorizedFn[VectorizedFnCall[ge](arguments=(CAST day(DateV2) TO
DateTimeV2), DateTimeV2,return=UInt8)]{
CastExpr(CAST DateTimeV2 to DateTimeV2){SlotRef(slot_id=2 type=DATEV2)},
VLiteral (name = DateTimeV2, type = DateTimeV2, value = (2024-08-11
00:00:00))}, VectorizedFn[VectorizedFnCall[lt](arguments=(CAST day(DateV2)
TO DateTimeV2), DateTimeV2,return=UInt8)]{
CastExpr(CAST DateTimeV2 to DateTimeV2){SlotRef(slot_id=2 type=DATEV2)},
VLiteral (name = DateTimeV2, type = DateTimeV2, value = (2024-08-12
08:00:00))}
```
**Solution**
When the predicate is like date column with datetime literal, modify
datetime literal to column type in legacy planner same as nereids planner;
1. explain
```
mysql> explain select pin_id,sum(cost) from ad_effects where pin_id = 200
and day >= '2024-08-11 00:00:00' and day < '2024-08-12 08:00:00' group by
pin_id;
+------------------------------------------------------------------------------------+
| Explain String(Old Planner)
|
+------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0
|
| OUTPUT EXPRS:
|
| <slot 3> `pin_id`
|
| <slot 4> sum(`cost`)
|
| PARTITION: RANDOM
|
|
|
| HAS_COLO_PLAN_NODE: false
|
|
|
| VRESULT SINK
|
|
|
| 1:VAGGREGATE (update finalize)
|
| | output: sum(`cost`)
|
| | group by: `pin_id`
|
| | cardinality=-1
|
| |
|
| 0:VOlapScanNode
|
| TABLE: default_cluster:db.ad_effects(ad_effects), PREAGGREGATION: ON
|
| PREDICATES: `pin_id` = 200 AND `day` >= '2024-08-11' AND `day` <
'2024-08-13' |
| partitions=1/18 (p20240811)
|
| tablets=1/16, tabletList=17701
|
| cardinality=2, avgRowSize=2527.5, numNodes=2
|
| pushAggOp=NONE
|
+------------------------------------------------------------------------------------+
```
3. profile
```
VNewOlapScanNode(ad_effects) (id=0):(Active: 717.471us, %
non-child: 0.00%)
- RuntimeFilters: :
- PushDownPredicates: []
- KeyRanges: ScanKeys:ScanKey=[200,2024-08-11 :
200,2024-08-13)
- TabletIds: [17701]
- UseSpecificThreadToken: False
- AcquireRuntimeFilterTime: 648ns
- AllocateResourceTime: 194.829us
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]