mongo360 opened a new pull request, #39446:
URL: https://github.com/apache/doris/pull/39446

   ## Proposed changes
   
   **Problem**
   In legacy planner, the predicate like create_day <= '2024-08-08 23:59:59' 
will modify as cast (create_day as DATETIMEV2) <= '2024-08-08 23:59:59' when 
the column _create_day_ is DATE; and this with cause the scan keys range not 
right in olap scan node.
   
   **Example**
   1. Test Table
   ```
   CREATE TABLE `ad_effects` (
     `pin_id` BIGINT NOT NULL COMMENT '广告主pin_id',
     `day` date NOT NULL COMMENT '点击日期(天)',
     `date_time` datetime NOT NULL COMMENT '点击时间',
     `impressions` BIGINT SUM NULL DEFAULT "0" COMMENT '用户总展现',
     `clicks` BIGINT SUM NULL DEFAULT "0" COMMENT '用户总点击',
     `cost` BIGINT SUM NULL DEFAULT "0" COMMENT '广告花费'
   ) ENGINE=OLAP
   AGGREGATE KEY(`pin_id`, `day`, `date_time`)
   COMMENT 'OLAP'
   PARTITION BY RANGE(`day`)
   (PARTITION p20240809 VALUES [('2024-08-09'), ('2024-08-10')),
   PARTITION p20240810 VALUES [('2024-08-10'), ('2024-08-11')),
   PARTITION p20240811 VALUES [('2024-08-11'), ('2024-08-12')),
   PARTITION p20240812 VALUES [('2024-08-12'), ('2024-08-13')),
   PARTITION p20240813 VALUES [('2024-08-13'), ('2024-08-14')),
   PARTITION p20240814 VALUES [('2024-08-14'), ('2024-08-15')))
   DISTRIBUTED BY HASH(`pin_id`) BUCKETS 16
   PROPERTIES (
   "replication_allocation" = "tag.location.default: 2",
   "is_being_synced" = "false",
   "dynamic_partition.enable" = "true",
   "dynamic_partition.time_unit" = "DAY",
   "dynamic_partition.time_zone" = "Asia/Shanghai",
   "dynamic_partition.start" = "-2147483648",
   "dynamic_partition.end" = "3",
   "dynamic_partition.prefix" = "p",
   "dynamic_partition.replication_allocation" = "tag.location.default: 2",
   "dynamic_partition.buckets" = "16",
   "dynamic_partition.create_history_partition" = "false",
   "dynamic_partition.history_partition_num" = "-1",
   "dynamic_partition.hot_partition_num" = "0",
   "dynamic_partition.reserved_history_periods" = "NULL",
   "dynamic_partition.storage_policy" = "",
   "storage_medium" = "hdd",
   "storage_format" = "V2",
   "light_schema_change" = "true",
   "disable_auto_compaction" = "false",
   "enable_single_replica_compaction" = "false"
   );
   ```
   2. Test SQL
   ```
   select  pin_id,sum(cost)  from  ad_effects  where  pin_id  =  200  and  day  
>=  '2024-08-11  00:00:00'  and  day  <  '2024-08-12  08:00:00'  group  by  
pin_id
   ```
   the profile:
   ```
                           VNewOlapScanNode(ad_effects)  (id=0):(Active:  
1.41ms,  %  non-child:  0.00%)
                                 -  RuntimeFilters:  :  
                                 -  PushDownPredicates:  []
                                 -  KeyRanges:  
ScanKeys:ScanKey=[200,0000-01-01  :  200,9999-12-31]
                                 -  TabletIds:  [17701]
                                 -  RemainedPredicates:  
VectorizedFn[VectorizedFnCall[ge](arguments=(CAST  day(DateV2)  TO  
DateTimeV2),  DateTimeV2,return=UInt8)]{
   CastExpr(CAST  DateTimeV2  to  DateTimeV2){SlotRef(slot_id=2  type=DATEV2)},
   VLiteral  (name  =  DateTimeV2,  type  =  DateTimeV2,  value  =  (2024-08-11 
 00:00:00))},  VectorizedFn[VectorizedFnCall[lt](arguments=(CAST  day(DateV2)  
TO  DateTimeV2),  DateTimeV2,return=UInt8)]{
   CastExpr(CAST  DateTimeV2  to  DateTimeV2){SlotRef(slot_id=2  type=DATEV2)},
   VLiteral  (name  =  DateTimeV2,  type  =  DateTimeV2,  value  =  (2024-08-12 
 08:00:00))}
   ```
   
   **Solution**
   When the predicate is like date column with datetime literal, modify 
datetime literal to column type in  legacy planner same as nereids planner;
   
   1. explain
   ```
   mysql> explain select pin_id,sum(cost) from ad_effects where pin_id = 200 
and day >= '2024-08-11 00:00:00' and day < '2024-08-12 08:00:00' group by 
pin_id;
   
+------------------------------------------------------------------------------------+
   | Explain String(Old Planner)                                                
        |
   
+------------------------------------------------------------------------------------+
   | PLAN FRAGMENT 0                                                            
        |
   |   OUTPUT EXPRS:                                                            
        |
   |     <slot 3> `pin_id`                                                      
        |
   |     <slot 4> sum(`cost`)                                                   
        |
   |   PARTITION: RANDOM                                                        
        |
   |                                                                            
        |
   |   HAS_COLO_PLAN_NODE: false                                                
        |
   |                                                                            
        |
   |   VRESULT SINK                                                             
        |
   |                                                                            
        |
   |   1:VAGGREGATE (update finalize)                                           
        |
   |   |  output: sum(`cost`)                                                   
        |
   |   |  group by: `pin_id`                                                    
        |
   |   |  cardinality=-1                                                        
        |
   |   |                                                                        
        |
   |   0:VOlapScanNode                                                          
        |
   |      TABLE: default_cluster:db.ad_effects(ad_effects), PREAGGREGATION: ON  
        |
   |      PREDICATES: `pin_id` = 200 AND `day` >= '2024-08-11' AND `day` < 
'2024-08-13' |
   |      partitions=1/18 (p20240811)                                           
        |
   |      tablets=1/16, tabletList=17701                                        
        |
   |      cardinality=2, avgRowSize=2527.5, numNodes=2                          
        |
   |      pushAggOp=NONE                                                        
        |
   
+------------------------------------------------------------------------------------+
   ```
   3. profile
   ```
             VNewOlapScanNode(ad_effects)  (id=0):(Active:  717.471us,  %  
non-child:  0.00%)
                   -  RuntimeFilters:  :  
                   -  PushDownPredicates:  []
                   -  KeyRanges:  ScanKeys:ScanKey=[200,2024-08-11  :  
200,2024-08-13)
                   -  TabletIds:  [17701]
                   -  UseSpecificThreadToken:  False
                   -  AcquireRuntimeFilterTime:  648ns
                   -  AllocateResourceTime:  194.829us
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to