ddsr-ops opened a new issue, #30542: URL: https://github.com/apache/doris/issues/30542
### Search before asking - [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues. ### Version Doris version doris-1.2.4-rc01 ### What's Wrong? I have tuned the options related to runtime filter, as follows: ``` set runtime_bloom_filter_size = 4194304; set runtime_filter_wait_time_ms = 4000; set runtime_bloom_filter_max_size = 33554432; ``` The DML: ``` set runtime_bloom_filter_size = 4194304; set runtime_filter_wait_time_ms = 4000; set runtime_bloom_filter_max_size = 33554432; insert into mdm.m_trip_order_detail_routine with label m_trip_order_detail_routine_2_1706578101095 with temp as ( select * from ( select order_no, biz_id, trip_no, fellow_no, ttsp_order_no, trip_sts_seq, trans_time, recv_time, batch_no, base_amount, discount_amount, trans_amount, fine_amount, trans_at, settlement_amount, data_time table_update, row_number() over(partition by ttsp_order_no order by offset_id desc ) rk from fdm.f_csm_j_trip_order_flink_load where data_time between str_to_date('2024-01-30 08:24:00','%Y-%m-%d %H:%i:%s') and str_to_date('2024-01-30 09:25:00','%Y-%m-%d %H:%i:%s') ) z where rk =1 ) select b.recv_time, b.trip_no, b.fellow_no, b.biz_id , a.ttsp_order_no, b.zg_order_no, b.trip_recv_time, b.op_id, b.app_id, b.user_id, b.ttsp_trip_no, b.ttsp_fellow_no, b.in_voucher_no, b.out_voucher_no, b.in_line_no, b.out_line_no, b.in_station_no, b.out_station_no, b.in_terminal_no, b.out_terminal_no, b.in_time, b.out_time, b.in_line_name, b.out_line_name, b.in_station_name, b.out_station_name, b.in_status, b.out_status, b.in_sys_time, b.out_sys_time, b.in_confirm_type, b.out_confirm_type, b.trip_version, b.service_id, b.trip_sts, b.sub_app_id, a.order_no, a.trans_time, a.base_amount, a.discount_amount, a.trans_amount, a.fine_amount, a.recv_time order_recv_time, a.trans_at, a.settlement_amount, a.batch_no, a.trip_sts_seq order_version, b.discount, b.zg_order_status, b.refund_amt, b.actual_trans_amt, b.c_update_time, b.zg_fine_amount, b.zg_settlement_amount, b.zg_user_id, b.trade_amt, b.zg_refund_amt, b.zg_trade_time, b.zg_refund_time, b.uo_channel_id, b.uo_trade_amt , b.uo_order_status, b.uo_trade_time, b.uo_refund_amt, b.uo_refund_status, b.uo_refund_time, a.table_update from mdm.m_trip_order_detail_routine b join temp a on a.biz_id =b.biz_id and a.trip_no =b.trip_no and a.fellow_no =b.fellow_no ``` Table schema m_trip_order_detail_routine: ``` Create Table: CREATE TABLE `m_trip_order_detail_routine` ( `recv_time` datetime NULL COMMENT '出行平台接收到数据的时间', `trip_no` varchar(64) NULL COMMENT '行程单号', `fellow_no` varchar(16) NULL COMMENT '带人行程单号', `biz_id` varchar(20) NULL COMMENT '出行业务ID', `ttsp_order_no` varchar(32) NULL COMMENT '平台唯一订单号', `zg_order_no` varchar(36) NULL COMMENT '消费订单号', `trip_recv_time` datetime NULL COMMENT '出行平台接收到数据的时间', `op_id` varchar(20) NULL COMMENT '运营商ID', `app_id` varchar(16) NULL COMMENT '出行应用ID', `user_id` varchar(32) NULL COMMENT '用户id', `ttsp_trip_no` varchar(16) NULL COMMENT '出行平台行程单号', `ttsp_fellow_no` varchar(16) NULL COMMENT '出行平台带人行程单号', `in_voucher_no` varchar(32) NULL COMMENT '进站凭证号', `out_voucher_no` varchar(32) NULL COMMENT '出站凭证号', `in_line_no` varchar(16) NULL COMMENT '进站线路号', `out_line_no` varchar(16) NULL COMMENT '出站线路号', `in_station_no` varchar(16) NULL COMMENT '进站站点号', `out_station_no` varchar(16) NULL COMMENT '出站站点号', `in_terminal_no` varchar(16) NULL COMMENT '进站终端号', `out_terminal_no` varchar(16) NULL COMMENT '出站终端号', `in_time` datetime NULL COMMENT '进站时间', `out_time` datetime NULL COMMENT '出站时间', `in_line_name` varchar(128) NULL, `out_line_name` varchar(128) NULL, `in_station_name` varchar(32) NULL COMMENT '进站站点名称', `out_station_name` varchar(32) NULL COMMENT '出站站点名称', `in_status` varchar(16) NULL COMMENT '进站状态', `out_status` varchar(16) NULL COMMENT '出站状态', `in_sys_time` datetime NULL COMMENT '进站码生成时间', `out_sys_time` datetime NULL COMMENT '出站码生成时间', `in_confirm_type` varchar(16) NULL COMMENT '进站确认类型', `out_confirm_type` varchar(16) NULL COMMENT '出站确认类型', `trip_version` int(11) NULL COMMENT '行程状态版本号', `service_id` varchar(16) NULL COMMENT '业务类型', `trip_sts` varchar(16) NULL COMMENT '状态', `sub_app_id` varchar(16) NULL COMMENT '出行应用子ID', `order_no` varchar(64) NULL COMMENT '消费订单号', `trans_time` datetime NULL COMMENT '消费交易时间', `base_amount` decimal(16, 2) NULL COMMENT '基础金额', `discount_amount` decimal(16, 2) NULL COMMENT '优惠金额', `trans_amount` decimal(16, 2) NULL COMMENT '消费金额', `fine_amount` decimal(16, 2) NULL COMMENT '扣罚金额', `order_recv_time` datetime NULL COMMENT '出行平台接收到数据的时间', `trans_at` decimal(16, 2) NULL COMMENT '清算金额', `settlement_amount` decimal(16, 2) NULL COMMENT '结算票价', `batch_no` varchar(16) NULL, `order_version` int(11) NULL COMMENT '行程版本号', `discount` varchar(50) NULL COMMENT '折扣明细 1-6:人才绿码 7-12:川大 13-18:福利金 19-24:UPS', `zg_order_status` varchar(8) NULL COMMENT '订单状态', `refund_amt` decimal(16, 2) NULL COMMENT '申请退款金额', `actual_trans_amt` decimal(16, 2) NULL COMMENT '消费金额', `c_update_time` datetime NULL COMMENT '更新时间', `zg_fine_amount` decimal(16, 2) NULL COMMENT '扣罚金额', `zg_settlement_amount` decimal(16, 2) NULL COMMENT '结算票价', `zg_user_id` varchar(36) NULL COMMENT '中广用户编号', `trade_amt` decimal(16, 2) NULL COMMENT '交易金额', `zg_refund_amt` decimal(16, 2) NULL COMMENT '退款金额', `zg_trade_time` datetime NULL COMMENT '交易成功时间', `zg_refund_time` datetime NULL COMMENT '退款成功时间', `uo_channel_id` varchar(64) NULL, `uo_trade_amt` decimal(16, 2) NULL COMMENT '结算票价', `uo_order_status` varchar(8) NULL COMMENT '发送状态', `uo_trade_time` datetime NULL COMMENT '修改时间', `uo_refund_amt` decimal(16, 2) NULL COMMENT '退款金额,以元', `uo_refund_status` varchar(8) NULL COMMENT '发送状态 01成功 02失败', `uo_refund_time` datetime NULL COMMENT '撤销请求时间', `table_update` datetime NULL COMMENT '数据更新时间' ) ENGINE=OLAP UNIQUE KEY(`recv_time`, `trip_no`, `fellow_no`, `biz_id`, `ttsp_order_no`, `zg_order_no`) COMMENT '行程同步数据' PARTITION BY RANGE(`recv_time`) (PARTITION p_202012 VALUES [('2020-12-01 00:00:00'), ('2021-01-01 00:00:00')), PARTITION p_202101 VALUES [('2021-01-01 00:00:00'), ('2021-02-01 00:00:00')), ...... PARTITION p_202402 VALUES [('2024-02-01 00:00:00'), ('2024-03-01 00:00:00'))) DISTRIBUTED BY HASH(`ttsp_order_no`) BUCKETS 10 PROPERTIES ( "replication_allocation" = "tag.location.default: 3", "colocate_with" = "__global__t_trip_no", "dynamic_partition.enable" = "true", "dynamic_partition.time_unit" = "MONTH", "dynamic_partition.time_zone" = "Asia/Shanghai", "dynamic_partition.start" = "-2147483648", "dynamic_partition.end" = "1", "dynamic_partition.prefix" = "p_", "dynamic_partition.replication_allocation" = "tag.location.default: 3", "dynamic_partition.buckets" = "10", "dynamic_partition.create_history_partition" = "false", "dynamic_partition.history_partition_num" = "-1", "dynamic_partition.hot_partition_num" = "0", "dynamic_partition.reserved_history_periods" = "NULL", "dynamic_partition.storage_policy" = "", "dynamic_partition.storage_medium" = "SSD", "dynamic_partition.start_day_of_month" = "1", "in_memory" = "false", "storage_format" = "V2", "disable_auto_compaction" = "false" ); ``` Table schema f_csm_j_trip_order_flink_load: ``` Create Table: CREATE TABLE `f_csm_j_trip_order_flink_load` ( `data_time` datetime NULL COMMENT '数据入doris时间', `ttsp_order_no` varchar(32) NULL COMMENT '平台唯一订单号', `partition_id` bigint(20) NULL, `offset_id` bigint(20) NULL, `recv_time` datetime NULL COMMENT '出行平台接收到数据的时间', `trip_no` varchar(64) NULL COMMENT '行程单号', `biz_id` varchar(20) NULL, `order_no` varchar(64) NULL COMMENT '消费订单号', `fellow_no` varchar(16) NULL COMMENT '带人行程单号', `fee_mode` varchar(32) NULL COMMENT '计费模式', `trans_time` datetime NULL COMMENT '消费交易时间', `trans_amount` decimal(16, 2) NULL COMMENT '消费金额', `base_amount` decimal(16, 2) NULL COMMENT '基础金额', `discount_amount` decimal(16, 2) NULL COMMENT '优惠金额', `discount_desc` varchar(128) NULL COMMENT '优惠说明', `fine_amount` decimal(16, 2) NULL COMMENT '扣罚金额', `fine_desc` varchar(128) NULL COMMENT '扣罚说明', `settlement_mode` varchar(16) NULL COMMENT '结算方式', `settlement_amount` decimal(16, 2) NULL COMMENT '结算票价', `app_id` varchar(16) NULL COMMENT '出行应用ID', `trace_no` varchar(6) NULL COMMENT '系统跟踪号,每秒内不重复', `sync_time` datetime NULL COMMENT '同步时间', `sync_status` varchar(32) NULL COMMENT '同步状态', `sync_remark` varchar(128) NULL COMMENT '同步备注', `settle_rec_id` varchar(18) NULL COMMENT '结算任务id', `settle_status` varchar(12) NULL COMMENT '结算状态', `settle_type` varchar(32) NULL COMMENT '清算类型', `terminal_no` varchar(16) NULL COMMENT '(计费时使用的)终端号', `settle_date` varchar(8) NULL COMMENT '出行服务结算日期', `settle_time` datetime NULL COMMENT '结算时间,返回结果文件时处理的时间', `settle_err_code` varchar(255) NULL COMMENT '错误代码', `settle_err_msg` varchar(255) NULL COMMENT '结算错误信息', `settle_err_detail` varchar(255) NULL COMMENT '结算错误信息明细', `trip_sts_seq` int(11) NULL COMMENT '行程版本号', `voucher_data_carrier` varchar(32) NULL COMMENT '数据载体', `extra` varchar(1024) NULL COMMENT '扩展数据', `payment_time` datetime NULL COMMENT '支付成功时间', `channel_serial_no` varchar(32) NULL COMMENT '渠道流水号,对应消费接口的查询流水号query_id(清算之后回填)', `sys_tra_no` varchar(12) NULL COMMENT '6位数字。若是同城清算的,对应城清算拒绝文件的系统跟踪号字段', `order_tra_time` datetime NULL COMMENT '交易传输时间', `clearing_date` varchar(8) NULL COMMENT 'mmdd,返回的清算日期,若是同城清算的,则对应为同城清算拒绝文件中的清算日期settle_dt字段的mmdd部分;对于非同城清算的,对应原消费通知报文中清算日期', `trans_curr_cd` varchar(4) NULL COMMENT '清算币种', `trans_at` decimal(16, 2) NULL COMMENT '清算金额', `batch_no` varchar(16) NULL, `order_source` varchar(32) NULL, `debezium_op` varchar(4) NULL ) ENGINE=OLAP DUPLICATE KEY(`data_time`, `ttsp_order_no`) COMMENT '行程同步数据' PARTITION BY RANGE(`data_time`) (PARTITION p_2024012009 VALUES [('2024-01-20 09:00:00'), ('2024-01-20 10:00:00')), PARTITION p_2024012010 VALUES [('2024-01-20 10:00:00'), ('2024-01-20 11:00:00')), ...... PARTITION p_2024013010 VALUES [('2024-01-30 10:00:00'), ('2024-01-30 11:00:00'))) DISTRIBUTED BY HASH(`ttsp_order_no`) BUCKETS 5 PROPERTIES ( "replication_allocation" = "tag.location.default: 3", "dynamic_partition.enable" = "true", "dynamic_partition.time_unit" = "HOUR", "dynamic_partition.time_zone" = "Asia/Shanghai", "dynamic_partition.start" = "-240", "dynamic_partition.end" = "1", "dynamic_partition.prefix" = "p_", "dynamic_partition.replication_allocation" = "tag.location.default: 3", "dynamic_partition.buckets" = "5", "dynamic_partition.create_history_partition" = "true", "dynamic_partition.history_partition_num" = "10", "dynamic_partition.hot_partition_num" = "0", "dynamic_partition.reserved_history_periods" = "NULL", "dynamic_partition.storage_policy" = "", "dynamic_partition.storage_medium" = "SSD", "in_memory" = "false", "storage_format" = "V2", "disable_auto_compaction" = "false" ); ``` In addition, table rows count : ``` mysql> select count(*) from fdm.f_csm_j_trip_order_flink_load \G *************************** 1. row *************************** count(*): 68450543 1 row in set (0.12 sec) mysql> select count(*) from mdm.m_trip_order_detail_routine \G *************************** 1. row *************************** count(*): 1040415451 1 row in set (5.88 sec) ``` Plan fragment profile: ``` I0130 09:19:27.034929 44927 plan_fragment_executor.cpp:713] Fragment ad9ae4ed5f764155-873e7f0edbd7f903:(Active: 1m8s, non-child: 0.00%) - DoriBeVersion: Unknown - FragmentCpuTime: 1s827ms - RowsProduced: 141.15K BlockMgr: - BlockWritesOutstanding: 0 - BlocksCreated: 0 - BlocksRecycled: 0 - BufferedPins: 0 - BytesWritten: 0 - MaxBlockSize: 8.00 MB - TotalBufferWaitTime: 0.000ns - TotalEncryptionTime: 0.000ns - TotalIntegrityCheckTime: 0.000ns - TotalReadBlockTime: 0.000ns OlapTableSink:(Active: 1s014ms, non-child: 1.48%) - CloseWaitTime: 5.453ms - ConvertBatchTime: 0.000ns - MaxAddBatchExecTime: 331.429ms - NonBlockingSendTime: 1m7s - NonBlockingSendWorkTime: 1s401ms - SerializeBatchTime: 1s238ms - NumberBatchAdded: 50 - NumberNodeChannels: 5 - OpenTime: 1.623ms - RowsFiltered: 0 - RowsRead: 141.15K - RowsReturned: 141.15K - SendDataTime: 908.767ms - WaitMemLimitTime: 0.000ns - TotalAddBatchExecTime: 3s011ms - ValidateDataTime: 93.839ms VHASH_JOIN_NODE (id=5):(Active: 1m7s, non-child: 0.00%) - BroadcastJoin: true - ShareHashTableEnabled: true - HashTableBuckets: 1048576, - HashTableFilledBuckets: 477349, - BuildBuckets: 1.05M - FilledBuckets: 477.35K - JoinFilterTimer: 20.303us - PeakMemoryUsage: 0 - ProjectionTime: 29.545ms - PushDownComputeTime: 55.830ms - PushDownTime: 11.920us - RowsReturned: 141.15K - RowsReturnedRate: 2.09 K/sec ProbePhase: - ProbeExprCallTime: 244.800us - ProbeFindNextTime: 1m6s - ProbeRows: 168.41K - ProbeTime: 1m6s - ProbeWhenBuildSideOutputTime: 123.699ms - ProbeWhenProbeSideOutputTime: 102.629ms - ProbeWhenSearchHashTableTime: 76.187ms BuildPhase: - BuildExprCallTime: 2.982us - BuildRows: 477.35K - BuildSideHashComputingTime: 56.529ms - BuildSideMergeBlockTime: 178.039ms - BuildTableConvertToPartitionedTime: 0.000ns - BuildTableExpanseTime: 61.215ms - BuildTableInsertTime: 194.293ms - BuildTableTime: 194.555ms - BuildTime: 1s035ms VEXCHANGE_NODE (id=7):(Active: 596.178ms, non-child: 0.87%) - BytesReceived: 26.65 MB - DataArrivalWaitTime: 564.256ms - DecompressBytes: 107.24 MB - DecompressTime: 56.169ms - DeserializeRowBatchTimer: 96.386ms - FirstBatchArrivalWaitTime: 348.472ms - PeakMemoryUsage: 2.66 KB - ProjectionTime: 0.000ns - RowsReturned: 477.35K - RowsReturnedRate: 800.68 K/sec - SendersBlockedTotalTimer(*): 0.000ns VNewOlapScanNode(m_trip_order_detail_routine) (id=0):(Active: 1m7s, non-child: 98.00%) - PushDownPredicates: [{__DORIS_DELETE_SIGN__ IN [0]}] - KeyRanges: ScanKeys:ScanKey=[null(0000-01-01 00:00:00) : 9999-12-31 23:59:59] - TabletIds: [3447039, 3447291, 3447839, 3447947, 3447051, 3447043, 3447799, 3447831, 3447027, 3447643, 3448199, 3447803, 3447435, 3448319, 3447451, 3448451, 3447191, 3448203, 3447835, 3447959, 3447963, 3447279, 3447283, 3447951, 3448191, 3448351] - RemainedDownPredicates: VectorizedFn[eq({"__DORIS_DELETE_SIGN__", "Int8"})]{ SlotRef(slot_id=175 type=TINYINT codegen=false), VLiteral (name = Int8, type = Int8, value = (^@176641498629376))} - AcuireRuntimeFilterTime: 988.746ms - GetNextTime: 1m6s - MaxScannerThreadNum: 12 - NewlyCreateFreeBlocksNum: 0 - NumDiskAccess: 8 - NumScanners: 26 - PeakMemoryUsage: 0 - PreAllocFreeBlocksNum: 60 - ProcessConjunctTime: 315.516us - ProjectionTime: 0.000ns - RowsRead: 208.40M - RowsReturned: 168.41K - RowsReturnedRate: 2.50 K/sec - ScannerBatchWaitTime: 1m6s - ScannerCtxSchedCount: 37 - ScannerSchedCount: 35 - ScannerWorkerWaitTime: 383.824ms - TabletNum: 26 - TotalReadThroughput: 0 VScanner: - PreEvaluatePredicates: ComparisonPredicateBase(TINYINT, EQ), column_id=67, opposite=false - BlockConvertTime: 0.000ns - BlockFetchTime: 3m41s - ReaderInitTime: 0.000ns - RowsDelFiltered: 0 - ScannerConvertBlockTime: 0.000ns - ScannerCpuTime: 11m40s - ScannerFilterTime: 24s346ms - ScannerGetBlockTime: 11m18s - ScannerInitTime: 7.292ms - ScannerPrefilterTime: 0.000ns SegmentIterator: - BitmapIndexFilterTimer: 226.769us - BlockConditionsFilteredTime: 5.801ms - BlockInitSeekCount: 1.65K - BlockInitSeekTime: 9.486ms - BlockInitTime: 43.798ms - BlockLoadTime: 3m41s - BlocksLoad: 51.89K - CachedPagesNum: 0 - CompressedBytesRead: 32.85 GB - DecompressorTimer: 14s343ms - FirstReadSeekCount: 234 - FirstReadSeekTime: 300.387ms - FirstReadTime: 4s613ms - IOTimer: 18s978ms - LazyReadSeekCount: 0 - LazyReadSeekTime: 0.000ns - LazyReadTime: 3m36s - NumSegmentFiltered: 0 - NumSegmentTotal: 234 - OutputColumnTime: 223.037ms - RawRowsRead: 208.57M - RowsBitmapIndexFiltered: 0 - RowsBloomFilterFiltered: 0 - RowsConditionsFiltered: 0 - RowsKeyRangeFiltered: 0 - RowsShortCircuitPredFiltered: 0 - RowsShortCircuitPredInput: 0 - RowsStatsFiltered: 0 - RowsVectorPredFiltered: 0 - RowsVectorPredInput: 206.05M - ShortPredEvalTime: 1.152ms - TotalPagesNum: 1.22M - UncompressedBytesRead: 45.97 GB - VectorPredEvalTime: 179.666ms RuntimeFilter: (id = 0, type = in_or_bloomfilter): - Info: [IsPushDown = false, IsEffective = true, IsIgnored = false, HasRemoteTarget = false, HasLocalTarget = true] - RealRuntimeFilterType: bloomfilter - AWaitTimeCost: 988.740ms RuntimeFilter: (id = 1, type = in_or_bloomfilter): - Info: [IsPushDown = false, IsEffective = true, IsIgnored = false, HasRemoteTarget = false, HasLocalTarget = true] - RealRuntimeFilterType: bloomfilter - AWaitTimeCost: 729.000ns RuntimeFilter: (id = 2, type = in_or_bloomfilter): - Info: [IsPushDown = false, IsEffective = true, IsIgnored = false, HasRemoteTarget = false, HasLocalTarget = true] - RealRuntimeFilterType: bloomfilter - AWaitTimeCost: 460.000ns BuildPhase: - BuildExprCallTime: 2.982us - BuildRows: 477.35K - BuildSideHashComputingTime: 56.529ms - BuildSideMergeBlockTime: 178.039ms - BuildTableConvertToPartitionedTime: 0.000ns - BuildTableExpanseTime: 61.215ms - BuildTableInsertTime: 194.293ms - BuildTableTime: 194.555ms - BuildTime: 1s035ms ``` Question 1: the `Rows affected` should not be equal to zero. ``` [2024-01-30, 09:28:21 CST] {sql.py:324} INFO - Rows affected: 0 ``` Question 2: In common, it takes less than 10 seconds to finish issuing the sql by virtue of bloom filter. However, it costs more than one minute to run the sql without the runtime filter pushdown even if I have added the options related to the bloom filter optimizing the join query as the official doc, which occurs occasionally. In the case, rows from CTE `temp` are relatively more than other cases where predicates are pushed down. ### What You Expected? I hope the predicates can be pushed down even if the quantity of the rows count of `temp` is a little big. The rows affected should match the actual number, not zero. ### How to Reproduce? _No response_ ### Anything Else? Due to time plan, my doris env could not be upgraded for now. If the way I use the runtime bloom filter options is wrong, please correct it, thank you very much. ### Are you willing to submit PR? - [X] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
