Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-04-05 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2732615222

   
   
   TPC-H: Total hot run time: 32593 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 7798c7d0e7a225beec50b9af5a4af9d923b9828e, data reload: false
   
   -- Round 1 --
   q1      24109   5260    5016    5016
   q2      2048    314     183     183
   q3      10365   1264    699     699
   q4      10216   1041    539     539
   q5      7882    2404    2384    2384
   q6      188     164     136     136
   q7      926     793     621     621
   q8      9332    1289    1136    1136
   q9      4877    4785    4936    4785
   q10     6818    2322    1900    1900
   q11     491     281     252     252
   q12     351     362     219     219
   q13     17769   3689    3102    3102
   q14     219     221     211     211
   q15     531     476     483     476
   q16     625     608     573     573
   q17     589     874     348     348
   q18     6754    6397    6311    6311
   q19     2469    957     568     568
   q20     308     320     192     192
   q21     2965    2212    1947    1947
   q22     1057    995     1012    995
   Total cold run time: 110889 ms
   Total hot run time: 32593 ms
   
   -- Round 2, with runtime_filter_mode=off --
   q1      5292    5157    5126    5126
   q2      233     338     226     226
   q3      2186    2681    2306    2306
   q4      1412    1860    1375    1375
   q5      4233    4202    4393    4202
   q6      212     170     129     129
   q7      2003    1959    1777    1777
   q8      2673    2646    2586    2586
   q9      7261    7258    7275    7258
   q10     3002    3176    2680    2680
   q11     571     515     489     489
   q12     663     813     627     627
   q13     3506    3976    3277    3277
   q14     271     303     270     270
   q15     522     502     484     484
   q16     639     701     647     647
   q17     1152    1634    1320    1320
   q18     7684    7606    7570    7570
   q19     873     809     916     809
   q20     1982    2062    1875    1875
   q21     5456    4704    4886    4704
   q22     1074    1032    997     997
   Total cold run time: 52900 ms
   Total hot run time: 50734 ms
   ```
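
A reading of the Round 1 table above, offered as an assumption because the bot output carries no column header: the four values per query look like one cold run, two hot runs, and the faster of the two hot runs, and the two totals are the sums of the first and last columns (the Round 1 figures add up to 110889 ms and 32593 ms under that reading). A minimal C++ sketch of that arithmetic, with hypothetical type and function names:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical record for one benchmark row: one cold run and two hot runs, in ms.
struct QueryTiming {
    long cold_ms;
    long hot1_ms;
    long hot2_ms;
};

// Sum of the cold-run column.
long total_cold_ms(const std::vector<QueryTiming>& rows) {
    long total = 0;
    for (const QueryTiming& row : rows) {
        total += row.cold_ms;
    }
    return total;
}

// Sum of the faster hot run of each query (the assumed meaning of the last column).
long total_hot_ms(const std::vector<QueryTiming>& rows) {
    long total = 0;
    for (const QueryTiming& row : rows) {
        total += std::min(row.hot1_ms, row.hot2_ms);
    }
    return total;
}
```

Applied to the first three Round 1 rows (q1 to q3), this gives 36522 ms cold and 5898 ms hot.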
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-04-05 Thread via GitHub


HappenLee commented on code in PR #49212:
URL: https://github.com/apache/doris/pull/49212#discussion_r2004749195


##
be/src/vec/exprs/lambda_function/varray_map_function.cpp:
##
@@ -184,57 +184,73 @@ class ArrayMapFunction : public LambdaFunction {
             data_types.push_back(col_type.get_nested_type());
         }
 
-        ColumnPtr result_col = nullptr;
+        MutableColumnPtr result_col = nullptr;
         DataTypePtr res_type;
         std::string res_name;
 
         //process first row
-        args.array_start = (*args.offsets_ptr)[args.current_row_idx - 1];
-        args.cur_size = (*args.offsets_ptr)[args.current_row_idx] - args.array_start;
-
-        while (args.current_row_idx < block->rows()) {
-            Block lambda_block;
-            for (int i = 0; i < names.size(); i++) {
-                ColumnWithTypeAndName data_column;
-                if (_contains_column_id(args, i) || i >= gap) {
-                    data_column = ColumnWithTypeAndName(data_types[i], names[i]);
+        args_info.array_start = (*args_info.offsets_ptr)[args_info.current_row_idx - 1];
+        args_info.cur_size =
+                (*args_info.offsets_ptr)[args_info.current_row_idx] - args_info.array_start;
+
+        // lambda block to exectute the lambda, and reuse the memory
+        Block lambda_block;
+        auto column_size = names.size();
+        MutableColumns columns(column_size);
+        while (args_info.current_row_idx < block->rows()) {
+            bool mem_reuse = lambda_block.mem_reuse();
+            for (int i = 0; i < column_size; i++) {
+                if (mem_reuse) {
+                    columns[i] = lambda_block.get_by_position(i).column->assume_mutable();
                 } else {
-                    data_column = ColumnWithTypeAndName(
-                            data_types[i]->create_column_const_with_default_value(0), data_types[i],
-                            names[i]);
+                    if (_contains_column_id(output_slot_ref_indexs, i) || i >= gap) {
+                        // TODO: maybe could create const column, so not insert_many_from when extand data
+                        // but now here handle batch_size of array nested data every time, so maybe have different rows
+                        columns[i] = data_types[i]->create_column();
+                    } else {
+                        columns[i] = data_types[i]
+                                             ->create_column_const_with_default_value(0)
+                                             ->assume_mutable();
+                    }
                 }
-                lambda_block.insert(std::move(data_column));
             }
-
-            MutableColumns columns = lambda_block.mutate_columns();
+            // batch_size of array nested data every time inorder to avoid memory overflow
             while (columns[gap]->size() < batch_size) {
                 long max_step = batch_size - columns[gap]->size();
-                long current_step =
-                        std::min(max_step, (long)(args.cur_size - args.current_offset_in_array));
-                size_t pos = args.array_start + args.current_offset_in_array;
+                long current_step = std::min(
+                        max_step, (long)(args_info.cur_size - args_info.current_offset_in_array));
+                size_t pos = args_info.array_start + args_info.current_offset_in_array;
                 for (int i = 0; i < arguments.size(); ++i) {
                     columns[gap + i]->insert_range_from(*lambda_datas[i], pos, current_step);
                 }
-                args.current_offset_in_array += current_step;
-                args.current_repeat_times += current_step;
-                if (args.current_offset_in_array >= args.cur_size) {
-                    args.current_row_eos = true;
+                args_info.current_offset_in_array += current_step;
+                args_info.current_repeat_times += current_step;
+                if (args_info.current_offset_in_array >= args_info.cur_size) {
+                    args_info.current_row_eos = true;
                 }
-                _extend_data(columns, block, args, gap);
-                if (args.current_row_eos) {
-                    args.current_row_idx++;
-                    args.current_offset_in_array = 0;
-                    if (args.current_row_idx >= block->rows()) {
+                _extend_data(columns, block, args_info.current_repeat_times, gap,

Review Comment:
   Set `args_info.current_repeat_times = 0;` inside the `_extend_data` function.
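
One way to read this suggestion, as a sketch rather than the actual Doris code: if the helper that consumes the pending repeat count also resets it, the caller cannot forget to. `ArgsInfo`, `extend_data`, and the `std::vector<int>` column below are hypothetical stand-ins and do not reproduce the real `_extend_data` signature.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-row bookkeeping seen in the diff.
struct ArgsInfo {
    std::size_t current_repeat_times = 0;
};

// Hypothetical helper: repeats `value` once per pending element, then
// clears the counter so the next batch starts counting from zero.
void extend_data(std::vector<int>& column, int value, ArgsInfo& args_info) {
    column.insert(column.end(), args_info.current_repeat_times, value);
    args_info.current_repeat_times = 0;
}
```

Because the counter is passed by reference and cleared in the helper, a second call with no new elements appends nothing.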




Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-04-05 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2739056984

   
   
   TPC-DS: Total hot run time: 185267 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit e0fbf908c3c2f25ea5f4787d0c98edba51895f98, data reload: false
   
   query1      1021    471     457     457
   query2      6550    1942    1861    1861
   query3      6801    223     222     222
   query4      26784   23470   22976   22976
   query5      4394    668     488     488
   query6      305     202     188     188
   query7      4607    497     298     298
   query8      296     297     228     228
   query9      8601    2574    2562    2562
   query10     489     305     256     256
   query11     15418   15110   14814   14814
   query12     154     107     104     104
   query13     1646    509     396     396
   query14     9374    6101    6080    6080
   query15     206     194     175     175
   query16     7332    638     480     480
   query17     1215    719     576     576
   query18     2006    397     298     298
   query19     185     177     151     151
   query20     116     115     123     115
   query21     214     120     104     104
   query22     4279    4213    4284    4213
   query23     33901   33059   33164   33059
   query24     7731    2385    2368    2368
   query25     543     441     386     386
   query26     1243    270     155     155
   query27     2282    473     328     328
   query28     4064    2449    2401    2401
   query29     746     579     424     424
   query30     287     216     194     194
   query31     928     853     782     782
   query32     73      66      65      65
   query33     554     361     295     295
   query34     779     866     493     493
   query35     812     822     724     724
   query36     981     970     896     896
   query37     123     105     75      75
   query38     4266    4090    4098    4090
   query39     1438    1393    1390    1390
   query40     207     118     105     105
   query41     53      50      54      50
   query42     121     100     104     100
   query43     481     508     484     484
   query44     1276    792     803     792
   query45     178     214     169     169
   query46     824     1019    615     615
   query47     1775    1810    1760    1760
   query48     380     431     317     317
   query49     773     509     422     422
   query50     693     720     409     409
   query51     4214    4236    4152    4152
   query52     105     105     101     101
   query53     230     255     189     189
   query54     495     510     418     418
   query55     78      79      78      78
   query56     277     263     267     263
   query57     1115    1148    1094    1094
   query58     243     241     241     241
   query59     2513    2593    2728    2593
   query60     289     273     263     263
   query61     133     148     149     148
   query62     805     740     656     656
   query63     225     197     199     197
   query64     4415    1092    747     747
   query65     4408    4315    4333    4315
   query66     1160    437     318     318
   query67     15804   15432   15463   15432
   query68     8923    873     495     495
   query69     466     301     271     271
   query70     1217    1113    1154    1113
   query71     470     299     264     264
   query72     5377    3581    3722    3581
   query73     794     724     344     344
   query74     9017    9401    8960    8960
   query75     4128    3175    2698    2698
   query76     3648    1183    733     733
   query77     809     368     281     281
   query78     10091   10127   9302    9302
   query79     2539    824     585     585
   query80     633     513     446     446
   query81     498     261     223     223
   query82     690     122     94      94
   query83     198     173     159     159
   query84     281     93      75      75
   query85     784     356     391     356
   query86     388     329     286     286
   query87     4396    4616    4366    4366
   query88     3640    2234    2251    2234
   query89     394     313     281     281
   query90     1899    213     207     207
   query91     152     143     110     110
   query92     77      65      55      55
   query93     1739    1043    572     572
   query94     666     409     307     307
   query95     427     267     258     258
   query96     477     562     276     276
   query97     3343    3456    3274    3274
   query98     240     217     202     202
   query99     1411    1526    1264    1264
   Total cold run time: 275826 ms
   Total hot run time: 185267 ms
   ```
   
   



Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-04-04 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2732883924

   # BE UT Coverage Report
   Increment line coverage `0.00% (0/45)` :tada:
   
   [Increment coverage report](http://coverage.selectdb-in.cc/coverage/7798c7d0e7a225beec50b9af5a4af9d923b9828e_7798c7d0e7a225beec50b9af5a4af9d923b9828e/increment_report/index.html)
   [Complete coverage report](http://coverage.selectdb-in.cc/coverage/7798c7d0e7a225beec50b9af5a4af9d923b9828e_7798c7d0e7a225beec50b9af5a4af9d923b9828e/report/index.html)
   | Category | Coverage |
   |---|---|
   | Function Coverage | 48.88% (13091/26780) |
   | Line Coverage | 38.43% (112824/293571) |
   | Region Coverage | 37.25% (57395/154063) |
   | Branch Coverage | 32.33% (28838/89198) |





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-04-04 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2732690046

   
   
   ClickBench: Total hot run time: 31.47 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit 7798c7d0e7a225beec50b9af5a4af9d923b9828e, data reload: false
   
   query1      0.04    0.04    0.04
   query2      0.12    0.10    0.11
   query3      0.24    0.19    0.19
   query4      1.58    0.20    0.18
   query5      0.61    0.58    0.60
   query6      1.18    0.72    0.73
   query7      0.02    0.02    0.01
   query8      0.04    0.03    0.03
   query9      0.58    0.51    0.53
   query10     0.56    0.59    0.57
   query11     0.16    0.11    0.11
   query12     0.14    0.12    0.11
   query13     0.62    0.60    0.61
   query14     2.80    2.82    2.80
   query15     0.93    0.85    0.85
   query16     0.40    0.38    0.38
   query17     1.08    1.06    1.04
   query18     0.21    0.19    0.20
   query19     1.96    1.98    1.87
   query20     0.02    0.01    0.02
   query21     15.35   0.87    0.55
   query22     0.74    1.13    0.63
   query23     15.08   1.38    0.58
   query24     7.12    1.65    1.06
   query25     0.57    0.17    0.12
   query26     0.73    0.16    0.14
   query27     0.05    0.06    0.05
   query28     9.15    0.84    0.43
   query29     12.56   3.98    3.30
   query30     0.25    0.08    0.07
   query31     2.81    0.60    0.38
   query32     3.23    0.55    0.47
   query33     3.01    3.03    3.05
   query34     15.69   5.25    4.48
   query35     4.53    4.51    4.50
   query36     0.68    0.49    0.48
   query37     0.08    0.06    0.06
   query38     0.04    0.04    0.03
   query39     0.03    0.02    0.02
   query40     0.16    0.14    0.13
   query41     0.07    0.03    0.02
   query42     0.03    0.02    0.02
   query43     0.03    0.03    0.03
   Total cold run time: 105.28 s
   Total hot run time: 31.47 s
   ```
   
   





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-04-04 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2732670523

   
   
   TPC-DS: Total hot run time: 192302 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit 7798c7d0e7a225beec50b9af5a4af9d923b9828e, data reload: false
   
   query1      1405    1089    1028    1028
   query2      6119    1941    1922    1922
   query3      11041   4567    4502    4502
   query4      54396   24930   23238   23238
   query5      5197    534     487     487
   query6      347     195     194     194
   query7      4979    490     290     290
   query8      352     245     234     234
   query9      6130    2604    2630    2604
   query10     436     311     250     250
   query11     15204   15150   14852   14852
   query12     159     107     120     107
   query13     1104    511     396     396
   query14     10128   6436    7081    6436
   query15     198     199     189     189
   query16     7058    645     475     475
   query17     1084    704     564     564
   query18     1543    400     317     317
   query19     198     188     186     186
   query20     131     130     122     122
   query21     209     132     107     107
   query22     4585    4643    4408    4408
   query23     33937   33300   33436   33300
   query24     5673    2396    2406    2396
   query25     462     478     416     416
   query26     679     284     163     163
   query27     1864    513     341     341
   query28     2787    2513    2471    2471
   query29     613     587     455     455
   query30     278     233     189     189
   query31     871     871     803     803
   query32     75      68      65      65
   query33     447     379     321     321
   query34     765     878     499     499
   query35     828     875     750     750
   query36     968     1014    942     942
   query37     118     100     75      75
   query38     4281    4375    4183    4183
   query39     1521    1581    1456    1456
   query40     204     122     108     108
   query41     53      52      53      52
   query42     120     113     102     102
   query43     512     523     494     494
   query44     1324    814     803     803
   query45     184     185     166     166
   query46     843     1026    653     653
   query47     1869    1925    1834    1834
   query48     386     411     314     314
   query49     716     535     440     440
   query50     704     748     421     421
   query51     4353    4313    4195    4195
   query52     108     107     97      97
   query53     230     275     190     190
   query54     478     493     414     414
   query55     84      83      82      82
   query56     265     272     274     272
   query57     1161    1192    1107    1107
   query58     243     251     234     234
   query59     2752    2809    2838    2809
   query60     289     269     264     264
   query61     129     122     119     119
   query62     728     744     663     663
   query63     232     190     194     190
   query64     1734    1024    679     679
   query65     4582    4425    4450    4425
   query66     745     430     305     305
   query67     15764   15534   15308   15308
   query68     7101    870     495     495
   query69     545     289     250     250
   query70     1202    1150    1104    1104
   query71     466     301     290     290
   query72     5817    3719    3891    3719
   query73     1241    742     351     351
   query74     9055    9154    8998    8998
   query75     3310    3150    2646    2646
   query76     3873    1186    732     732
   query77     562     375     267     267
   query78     9873    10091   9389    9389
   query79     2524    834     587     587
   query80     645     521     463     463
   query81     546     263     223     223
   query82     673     125     96      96
   query83     335     174     148     148
   query84     286     96      69      69
   query85     784     357     303     303
   query86     420     322     275     275
   query87     4458    4404    4408    4404
   query88     3789    2277    2274    2274
   query89     413     312     278     278
   query90     1878    209     213     209
   query91     142     139     109     109
   query92     76      57      59      57
   query93     1989    1065    583     583
   query94     685     413     298     298
   query95     347     274     258     258
   query96     494     571     277     277
   query97     3337    3378    3223    3223
   query98     228     206     203     203
   query99     1418    1402    1251    1251
   Total cold run time: 297474 ms
   Total hot run time: 192302 ms
   ```
   
   



Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-04-04 Thread via GitHub


github-actions[bot] commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2742559541

   PR approved by anyone and no changes requested.





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-27 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2739135113

   
   
   ClickBench: Total hot run time: 31.62 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit c5f7184a5aef71cf09db1f1d31383d786768ef75, data reload: false
   
   query1      0.04    0.04    0.03
   query2      0.13    0.11    0.10
   query3      0.25    0.20    0.19
   query4      1.59    0.20    0.19
   query5      0.61    0.58    0.59
   query6      1.21    0.73    0.70
   query7      0.03    0.02    0.01
   query8      0.04    0.03    0.04
   query9      0.58    0.54    0.53
   query10     0.56    0.60    0.57
   query11     0.15    0.11    0.11
   query12     0.14    0.11    0.11
   query13     0.62    0.60    0.60
   query14     2.68    2.80    2.84
   query15     0.94    0.86    0.87
   query16     0.39    0.38    0.38
   query17     1.01    1.02    1.00
   query18     0.21    0.19    0.19
   query19     1.94    1.98    1.81
   query20     0.02    0.01    0.01
   query21     15.36   0.88    0.54
   query22     0.77    1.06    0.66
   query23     15.06   1.42    0.63
   query24     7.24    2.12    1.18
   query25     0.48    0.26    0.09
   query26     0.65    0.17    0.14
   query27     0.05    0.04    0.04
   query28     9.82    0.80    0.42
   query29     12.54   4.03    3.34
   query30     0.24    0.09    0.07
   query31     2.81    0.58    0.39
   query32     3.24    0.54    0.48
   query33     2.93    3.06    3.04
   query34     15.79   5.09    4.49
   query35     4.53    4.55    4.51
   query36     0.68    0.51    0.49
   query37     0.09    0.06    0.06
   query38     0.05    0.04    0.03
   query39     0.03    0.02    0.02
   query40     0.17    0.13    0.14
   query41     0.08    0.02    0.02
   query42     0.04    0.02    0.02
   query43     0.03    0.04    0.03
   Total cold run time: 105.82 s
   Total hot run time: 31.62 s
   ```
   
   





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-21 Thread via GitHub


HappenLee merged PR #49212:
URL: https://github.com/apache/doris/pull/49212





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-21 Thread via GitHub


github-actions[bot] commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2742559480

   PR approved by at least one committer and no changes requested.





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-20 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2739457062

   # BE UT Coverage Report
   Increment line coverage `0.00% (0/84)` :tada:
   
   [Increment coverage report](http://coverage.selectdb-in.cc/coverage/ae43e17d027d5a34914cbc4b9605c3f0b4de2599_ae43e17d027d5a34914cbc4b9605c3f0b4de2599/increment_report/index.html)
   [Complete coverage report](http://coverage.selectdb-in.cc/coverage/ae43e17d027d5a34914cbc4b9605c3f0b4de2599_ae43e17d027d5a34914cbc4b9605c3f0b4de2599/report/index.html)
   | Category | Coverage |
   |---|---|
   | Function Coverage | 48.78% (13064/26781) |
   | Line Coverage | 38.36% (112664/293678) |
   | Region Coverage | 37.15% (57268/154137) |
   | Branch Coverage | 32.26% (28786/89240) |





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-19 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2739268340

   
   
   ClickBench: Total hot run time: 31.36 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit ae43e17d027d5a34914cbc4b9605c3f0b4de2599, data reload: false
   
   query1      0.04    0.04    0.03
   query2      0.12    0.10    0.10
   query3      0.24    0.20    0.19
   query4      1.59    0.19    0.19
   query5      0.60    0.58    0.57
   query6      1.18    0.71    0.73
   query7      0.02    0.02    0.01
   query8      0.04    0.04    0.04
   query9      0.59    0.53    0.53
   query10     0.59    0.61    0.57
   query11     0.15    0.11    0.11
   query12     0.14    0.11    0.12
   query13     0.61    0.60    0.60
   query14     2.80    2.70    2.68
   query15     0.92    0.86    0.85
   query16     0.38    0.39    0.37
   query17     1.02    1.05    1.03
   query18     0.22    0.20    0.19
   query19     1.92    1.89    1.86
   query20     0.01    0.01    0.01
   query21     15.40   0.90    0.53
   query22     0.76    1.10    0.71
   query23     14.95   1.39    0.62
   query24     6.64    1.91    1.08
   query25     0.49    0.16    0.21
   query26     0.69    0.16    0.12
   query27     0.05    0.05    0.05
   query28     9.68    0.88    0.41
   query29     12.59   3.89    3.27
   query30     0.25    0.10    0.06
   query31     2.83    0.59    0.39
   query32     3.22    0.54    0.46
   query33     2.99    3.00    3.02
   query34     15.86   5.09    4.46
   query35     4.51    4.52    4.48
   query36     0.66    0.49    0.48
   query37     0.08    0.07    0.06
   query38     0.05    0.04    0.04
   query39     0.03    0.03    0.03
   query40     0.18    0.13    0.13
   query41     0.08    0.03    0.02
   query42     0.04    0.02    0.02
   query43     0.04    0.04    0.03
   Total cold run time: 105.25 s
   Total hot run time: 31.36 s
   ```
   
   





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-19 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2739252550

   
   
   TPC-DS: Total hot run time: 186279 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit ae43e17d027d5a34914cbc4b9605c3f0b4de2599, data reload: false
   
   query1      990     459     460     459
   query2      6544    1873    1889    1873
   query3      6798    216     221     216
   query4      25978   23968   23551   23551
   query5      5070    665     506     506
   query6      302     198     182     182
   query7      4607    494     294     294
   query8      302     249     248     248
   query9      8648    2621    2615    2615
   query10     502     317     273     273
   query11     15847   15320   14915   14915
   query12     168     119     102     102
   query13     1659    535     417     417
   query14     10686   6734    6426    6426
   query15     205     190     177     177
   query16     7656    623     458     458
   query17     1527    729     558     558
   query18     1984    402     304     304
   query19     188     183     162     162
   query20     121     118     114     114
   query21     210     125     110     110
   query22     4312    4341    4241    4241
   query23     34198   32832   32887   32832
   query24     7104    2472    2388    2388
   query25     513     453     397     397
   query26     1233    265     161     161
   query27     2151    475     324     324
   query28     4007    2407    2411    2407
   query29     724     559     417     417
   query30     286     215     191     191
   query31     919     894     780     780
   query32     74      68      63      63
   query33     549     345     307     307
   query34     787     839     506     506
   query35     809     865     730     730
   query36     981     1003    895     895
   query37     118     98      79      79
   query38     4224    4117    4176    4117
   query39     1447    1412    1429    1412
   query40     201     113     104     104
   query41     55      58      51      51
   query42     121     101     99      99
   query43     492     508     485     485
   query44     1314    812     811     811
   query45     179     168     167     167
   query46     821     1019    632     632
   query47     1740    1770    1739    1739
   query48     384     436     300     300
   query49     780     506     427     427
   query50     684     725     426     426
   query51     4154    4202    4124    4124
   query52     100     102     97      97
   query53     228     261     191     191
   query54     486     498     403     403
   query55     83      82      85      82
   query56     298     274     256     256
   query57     1122    1134    1064    1064
   query58     251     231     232     231
   query59     2555    2743    2580    2580
   query60     281     280     247     247
   query61     121     118     119     118
   query62     782     745     695     695
   query63     244     192     187     187
   query64     4257    1050    661     661
   query65  431343114311
   query66     1056    406     298     298
   query67     15791   15482   15364   15364
   query68     7903    881     513     513
   query69     552     299     265     265
   query70     1216    1126    1128    1126
   query71     474     300     272     272
   query72     5718    3563    3749    3563
   query73     813     753     358     358
   query74     9253    9177    9064    9064
   query75     3796    3152    2718    2718
   query76     3747    1245    773     773
   query77     784     370     301     301
   query78     10206   10049   9282    9282
   query79     3020    828     585     585
   query80     691     588     431     431
   query81     507     255     224     224
   query82     660     127     93      93
   query83     191     165     155     155
   query84     239     93      72      72
   query85     762     352     308     308
   query86     391     320     290     290
   query87  44034712
   query88     3796    2295    2277    2277
   query89     395     316     278     278
   query90     1975    213     225     213
   query91     143     136     109     109
   query92     72      62      54      54
   query93     2183    1062    571     571
   query94     680     418     292     292
   query95     355     271     254     254
   query96     484     566     284     284
   query97     3265    3487    3287    3287
   query98     235     209     221     209
   query99     1354    1385    1257    1257
   Total cold run time: 277715 ms
   Total hot run time: 186279 ms
   ```
   
   



Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-19 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2739216216

   
   
   TPC-H: Total hot run time: 32354 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit ae43e17d027d5a34914cbc4b9605c3f0b4de2599, data reload: false
   
   -- Round 1 --
   q1      24116   5045    5016    5016
   q2      2041    288     179     179
   q3      10402   1224    690     690
   q4      10214   1002    535     535
   q5      7521    2399    2357    2357
   q6      187     168     134     134
   q7      914     742     614     614
   q8      9316    1289    1098    1098
   q9      4968    4743    4680    4680
   q10     6814    2321    1882    1882
   q11     478     289     252     252
   q12     366     355     223     223
   q13     17773   3697    3047    3047
   q14     237     236     210     210
   q15     532     477     480     477
   q16     642     617     583     583
   q17     563     869     333     333
   q18     6943    6519    6423    6423
   q19     1203    942     551     551
   q20     319     325     200     200
   q21     2832    2189    1911    1911
   q22     1049    1035    959     959
   Total cold run time: 109430 ms
   Total hot run time: 32354 ms
   
   -- Round 2, with runtime_filter_mode=off --
   q1      5117    5092    5064    5064
   q2      244     326     223     223
   q3      2151    2686    2328    2328
   q4      1484    1868    1461    1461
   q5      4228    4112    4422    4112
   q6      210     167     130     130
   q7      2021    1944    1798    1798
   q8      2649    2526    2511    2511
   q9      7301    7193    7142    7142
   q10     2973    3256    2746    2746
   q11     597     499     495     495
   q12     693     745     593     593
   q13     3453    3829    3313    3313
   q14     278     278     271     271
   q15     511     473     483     473
   q16     648     683     665     665
   q17     1123    1593    1410    1410
   q18     7756    7650    7532    7532
   q19     788     769     897     769
   q20     2005    2059    1864    1864
   q21     5389    5009    4782    4782
   q22     1047    1003    993     993
   Total cold run time: 52666 ms
   Total hot run time: 50675 ms
   ```
   
   





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-19 Thread via GitHub


zhangstar333 commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2739068477

   run buildall





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-19 Thread via GitHub


zhangstar333 commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2739168792

   run buildall





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-19 Thread via GitHub


zhangstar333 commented on code in PR #49212:
URL: https://github.com/apache/doris/pull/49212#discussion_r2004814123


##
be/src/vec/exprs/lambda_function/varray_map_function.cpp:
##
@@ -184,57 +184,73 @@ class ArrayMapFunction : public LambdaFunction {
 data_types.push_back(col_type.get_nested_type());
 }
 
-ColumnPtr result_col = nullptr;
+MutableColumnPtr result_col = nullptr;
 DataTypePtr res_type;
 std::string res_name;
 
 //process first row
-args.array_start = (*args.offsets_ptr)[args.current_row_idx - 1];
-args.cur_size = (*args.offsets_ptr)[args.current_row_idx] - args.array_start;
-
-while (args.current_row_idx < block->rows()) {
-Block lambda_block;
-for (int i = 0; i < names.size(); i++) {
-ColumnWithTypeAndName data_column;
-if (_contains_column_id(args, i) || i >= gap) {
-data_column = ColumnWithTypeAndName(data_types[i], names[i]);
+args_info.array_start = (*args_info.offsets_ptr)[args_info.current_row_idx - 1];
+args_info.cur_size =
+(*args_info.offsets_ptr)[args_info.current_row_idx] - args_info.array_start;
+
+// lambda block to exectute the lambda, and reuse the memory
+Block lambda_block;
+auto column_size = names.size();
+MutableColumns columns(column_size);
+while (args_info.current_row_idx < block->rows()) {
+bool mem_reuse = lambda_block.mem_reuse();
+for (int i = 0; i < column_size; i++) {
+if (mem_reuse) {

Review Comment:
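   The columns have already been moved into the block by
   `lambda_block.insert(vectorized::ColumnWithTypeAndName(std::move(columns[i]), data_types[i], names[i]))`,
   so they have to be fetched from the block again.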
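
One reading of this reply: once each `columns[i]` has been moved into the block, the vector is left holding empty slots, so the next iteration has to take the columns from the block again. A minimal sketch of that behaviour follows. It uses `std::shared_ptr` and hypothetical `Column` and `Holder` types as stand-ins; it is not the Doris `MutableColumnPtr` or `Block` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Doris types.
struct Column {
    std::vector<int> data;
};
using ColumnPtr = std::shared_ptr<Column>;

struct Holder {
    std::vector<ColumnPtr> owned;
    void insert(ColumnPtr col) { owned.push_back(std::move(col)); }
};

// Moves every working column into the holder. A moved-from shared_ptr is
// guaranteed to be empty, so `columns` is left with null slots.
inline void hand_over(std::vector<ColumnPtr>& columns, Holder& holder) {
    for (auto& col : columns) {
        holder.insert(std::move(col));
    }
}

// Refills the null slots with pointers to the columns the holder now owns.
inline void reacquire(std::vector<ColumnPtr>& columns, const Holder& holder) {
    assert(columns.size() == holder.owned.size());
    for (std::size_t i = 0; i < columns.size(); ++i) {
        columns[i] = holder.owned[i];
    }
}
```

Under these assumptions, skipping `reacquire` on a later iteration would dereference a null pointer, which is why the slots are refilled even though the holder already owns the right columns.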






Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-19 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2739127903

   
   
   TPC-DS: Total hot run time: 185391 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit c5f7184a5aef71cf09db1f1d31383d786768ef75, data reload: false
   
   query1   1022   502    460    460
   query2   6551   1864   1897   1864
   query3   6794   220    217    217
   query4   26370  23793  22967  22967
   query5   4410   685    486    486
   query6   304    199    194    194
   query7   4620   505    291    291
   query8   325    254    259    254
   query9   8642   2578   2579   2578
   query10  461    317    260    260
   query11  15677  15142  15020  15020
   query12  165    112    106    106
   query13  1661   521    405    405
   query14  9901   6571   6592   6571
   query15  202    192    182    182
   query16  7889   628    459    459
   query17  1606   742    563    563
   query18  1992   411    309    309
   query19  205    189    160    160
   query20  122    118    118    118
   query21  216    122    111    111
   query22  4124   4273   4122   4122
   query23  33820  33019  32898  32898
   query24  7668   2355   2356   2355
   query25  519    443    388    388
   query26  1207   272    156    156
   query27  2075   486    329    329
   query28  3909   2401   2410   2401
   query29  734    538    416    416
   query30  283    211    186    186
   query31  955    860    798    798
   query32  73     64     63     63
   query33  591    354    297    297
   query34  773    853    477    477
   query35  795    802    734    734
   query36  971    986    903    903
   query37  117    99     76     76
   query38  4147   4196   4110   4110
   query39  1459   1424   1386   1386
   query40  207    124    103    103
   query41  53     52     50     50
   query42  117    104    101    101
   query43  487    506    486    486
   query44  1291   788    785    785
   query45  174    174    165    165
   query46  847    1027   619    619
   query47  1764   1764   1743   1743
   query48  376    432    293    293
   query49  762    503    426    426
   query50  679    712    411    411
   query51  4232   4213   4175   4175
   query52  107    103    102    102
   query53  233    266    185    185
   query54  479    476    400    400
   query55  82     77     80     77
   query56  272    268    259    259
   query57  1139   1134   1071   1071
   query58  277    229    237    229
   query59  2707   2679   2549   2549
   query60  291    273    271    271
   query61  119    116    114    114
   query62  833    733    672    672
   query63  233    192    182    182
   query64  4306   987    653    653
   query65  4422   4322   4331   4322
   query66  1064   400    306    306
   query67  15758  15552  15364  15364
   query68  8243   874    497    497
   query69  467    291    254    254
   query70  1234   1125   1072   1072
   query71  479    286    267    267
   query72  5594   3535   3753   3535
   query73  775    736    355    355
   query74  8987   9036   8952   8952
   query75  3843   3151   2725   2725
   query76  3702   1170   740    740
   query77  775    366    285    285
   query78  10021  10098  9345   9345
   query79  2781   820    589    589
   query80  644    516    446    446
   query81  462    259    224    224
   query82  674    128    100    100
   query83  203    173    152    152
   query84  283    92     71     71
   query85  770    414    303    303
   query86  332    320    283    283
   query87  4456   4665   4343   4343
   query88  3512   2232   2228   2228
   query89  390    315    278    278
   query90  1959   210    209    209
   query91  139    143    111    111
   query92  81     63     57     57
   query93  1408   1063   578    578
   query94  670    398    299    299
   query95  363    272    262    262
   query96  490    566    276    276
   query97  3345   3348   3323   3323
   query98  221    209    200    200
   query99  1416   1388   1249   1249
   Total cold run time: 275385 ms
   Total hot run time: 185391 ms
   ```
   
   



Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-19 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2739113111

   
   
   TPC-H: Total hot run time: 32496 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit c5f7184a5aef71cf09db1f1d31383d786768ef75, data reload: false
   
   -- Round 1 --
   q1   24528  5074   5048   5048
   q2   2046   314    179    179
   q3   10368  1293   711    711
   q4   10220  1005   543    543
   q5   7558   2450   2354   2354
   q6   209    170    135    135
   q7   915    755    608    608
   q8   9313   1310   1160   1160
   q9   5023   4740   4783   4740
   q10  6807   2307   1905   1905
   q11  504    267    254    254
   q12  350    351    225    225
   q13  17771  3670   3067   3067
   q14  235    233    209    209
   q15  533    484    467    467
   q16  621    617    585    585
   q17  570    849    353    353
   q18  6842   6650   6274   6274
   q19  1062   958    552    552
   q20  323    325    198    198
   q21  2918   2128   1921   1921
   q22  1074   1047   1008   1008
   Total cold run time: 109790 ms
   Total hot run time: 32496 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   5186   5136   5071   5071
   q2   237    328    235    235
   q3   2142   2661   2303   2303
   q4   1398   1843   1390   1390
   q5   4245   4145   4481   4145
   q6   222    170    127    127
   q7   2059   1981   1764   1764
   q8   2610   2643   2541   2541
   q9   7309   7240   7197   7197
   q10  3040   3253   2793   2793
   q11  587    516    509    509
   q12  670    747    607    607
   q13  3531   3827   3316   3316
   q14  295    292    307    292
   q15  532    488    462    462
   q16  630    688    627    627
   q17  1154   1613   1329   1329
   q18  7936   7434   7571   7434
   q19  846    861    966    861
   q20  1984   2044   1851   1851
   q21  5429   4916   4729   4729
   q22  1094   1049   1007   1007
   Total cold run time: 53136 ms
   Total hot run time: 50590 ms
   ```
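
The totals in this report are consistent with summing the first column for the cold total and the last column for the hot total, where the last column equals the smaller of the two hot runs. A minimal C++ sketch of that arithmetic follows. The struct and function names are hypothetical, and the rule is inferred from the reported numbers rather than taken from the tpch-tools scripts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One benchmark row: a cold run followed by two hot runs, in milliseconds.
struct QueryTiming {
    std::int64_t cold_ms;
    std::int64_t hot1_ms;
    std::int64_t hot2_ms;
};

struct Totals {
    std::int64_t cold_ms = 0;
    std::int64_t hot_ms = 0;
};

// Best hot time, assumed here to be the smaller of the two hot runs.
inline std::int64_t best_hot(const QueryTiming& t) {
    return std::min(t.hot1_ms, t.hot2_ms);
}

// Sums the cold runs and the best hot runs over all rows.
inline Totals summarize(const std::vector<QueryTiming>& rows) {
    Totals totals;
    for (const auto& row : rows) {
        totals.cold_ms += row.cold_ms;
        totals.hot_ms += best_hot(row);
    }
    return totals;
}
```

Applied to the 22 rows of Round 1 above, this gives 109790 ms cold and 32496 ms hot, matching the reported totals.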
   
   





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-19 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2739063460

   
   
   ClickBench: Total hot run time: 31.41 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit e0fbf908c3c2f25ea5f4787d0c98edba51895f98, data reload: false
   
   query1   0.04   0.04   0.04
   query2   0.12   0.10   0.11
   query3   0.25   0.19   0.19
   query4   1.60   0.19   0.19
   query5   0.60   0.59   0.60
   query6   1.19   0.71   0.71
   query7   0.02   0.02   0.01
   query8   0.04   0.03   0.04
   query9   0.57   0.53   0.53
   query10  0.57   0.58   0.58
   query11  0.15   0.11   0.11
   query12  0.13   0.11   0.11
   query13  0.62   0.60   0.62
   query14  2.79   2.78   2.73
   query15  0.93   0.85   0.86
   query16  0.39   0.37   0.38
   query17  1.04   1.04   1.02
   query18  0.21   0.20   0.20
   query19  1.87   1.91   1.80
   query20  0.01   0.01   0.02
   query21  15.35  0.89   0.55
   query22  0.75   1.36   0.80
   query23  14.73  1.38   0.66
   query24  6.88   1.66   0.98
   query25  0.51   0.27   0.05
   query26  0.55   0.16   0.15
   query27  0.05   0.05   0.05
   query28  9.72   0.86   0.43
   query29  12.58  3.96   3.27
   query30  0.26   0.09   0.07
   query31  2.83   0.58   0.39
   query32  3.23   0.55   0.47
   query33  3.04   3.04   3.00
   query34  15.84  5.11   4.50
   query35  4.55   4.52   4.47
   query36  0.67   0.50   0.49
   query37  0.09   0.06   0.06
   query38  0.05   0.04   0.03
   query39  0.04   0.02   0.03
   query40  0.17   0.15   0.13
   query41  0.09   0.03   0.02
   query42  0.04   0.02   0.02
   query43  0.04   0.03   0.03
   Total cold run time: 105.2 s
   Total hot run time: 31.41 s
   ```
   
   





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-19 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2739046858

   
   
   TPC-H: Total hot run time: 32658 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit e0fbf908c3c2f25ea5f4787d0c98edba51895f98, data reload: false
   
   -- Round 1 --
   q1   24419  5066   5039   5039
   q2   2047   300    175    175
   q3   10569  1272   694    694
   q4   10286  1026   538    538
   q5   9079   2449   2353   2353
   q6   270    163    136    136
   q7   916    760    623    623
   q8   9329   1277   1100   1100
   q9   5631   4924   4880   4880
   q10  6802   2316   1897   1897
   q11  470    276    257    257
   q12  357    353    213    213
   q13  17767  3692   3067   3067
   q14  246    222    210    210
   q15  556    482    474    474
   q16  636    630    581    581
   q17  601    856    347    347
   q18  6914   6542   6327   6327
   q19  1928   990    570    570
   q20  313    306    192    192
   q21  2796   2125   1971   1971
   q22  1041   1054   1014   1014
   Total cold run time: 112973 ms
   Total hot run time: 32658 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   5200   5144   5429   5144
   q2   237    330    236    236
   q3   2134   2656   2347   2347
   q4   1479   1841   1430   1430
   q5   4251   4489   4481   4481
   q6   220    171    126    126
   q7   1989   1926   1809   1809
   q8   2556   2523   2477   2477
   q9   7227   7216   7186   7186
   q10  3012   3222   2762   2762
   q11  576    525    514    514
   q12  675    760    628    628
   q13  3476   3815   3297   3297
   q14  275    289    269    269
   q15  538    482    480    480
   q16  656    699    647    647
   q17  1146   1581   1344   1344
   q18  7797   7618   7504   7504
   q19  812    819    862    819
   q20  1984   2060   1899   1899
   q21  5448   4802   4736   4736
   q22  1080   1039   1031   1031
   Total cold run time: 52768 ms
   Total hot run time: 51166 ms
   ```
   
   





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-19 Thread via GitHub


HappenLee commented on code in PR #49212:
URL: https://github.com/apache/doris/pull/49212#discussion_r2004750242


##
be/src/vec/exprs/lambda_function/varray_map_function.cpp:
##
@@ -184,57 +184,73 @@ class ArrayMapFunction : public LambdaFunction {
 data_types.push_back(col_type.get_nested_type());
 }
 
-ColumnPtr result_col = nullptr;
+MutableColumnPtr result_col = nullptr;
 DataTypePtr res_type;
 std::string res_name;
 
 //process first row
-args.array_start = (*args.offsets_ptr)[args.current_row_idx - 1];
-args.cur_size = (*args.offsets_ptr)[args.current_row_idx] - args.array_start;
-
-while (args.current_row_idx < block->rows()) {
-Block lambda_block;
-for (int i = 0; i < names.size(); i++) {
-ColumnWithTypeAndName data_column;
-if (_contains_column_id(args, i) || i >= gap) {
-data_column = ColumnWithTypeAndName(data_types[i], names[i]);
+args_info.array_start = (*args_info.offsets_ptr)[args_info.current_row_idx - 1];
+args_info.cur_size =
+(*args_info.offsets_ptr)[args_info.current_row_idx] - args_info.array_start;
+
+// lambda block to exectute the lambda, and reuse the memory
+Block lambda_block;
+auto column_size = names.size();
+MutableColumns columns(column_size);
+while (args_info.current_row_idx < block->rows()) {
+bool mem_reuse = lambda_block.mem_reuse();
+for (int i = 0; i < column_size; i++) {
+if (mem_reuse) {

Review Comment:
   When `mem_reuse` is true, `columns` already seems to contain the right columns. Why is this operation needed:
   ```
   columns[i] = lambda_block.get_by_position(i).column->assume_mutable();
   ```






Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-19 Thread via GitHub


HappenLee commented on code in PR #49212:
URL: https://github.com/apache/doris/pull/49212#discussion_r2004749195


##
be/src/vec/exprs/lambda_function/varray_map_function.cpp:
##
@@ -184,57 +184,73 @@ class ArrayMapFunction : public LambdaFunction {
 data_types.push_back(col_type.get_nested_type());
 }
 
-ColumnPtr result_col = nullptr;
+MutableColumnPtr result_col = nullptr;
 DataTypePtr res_type;
 std::string res_name;
 
 //process first row
-args.array_start = (*args.offsets_ptr)[args.current_row_idx - 1];
-args.cur_size = (*args.offsets_ptr)[args.current_row_idx] - args.array_start;
-
-while (args.current_row_idx < block->rows()) {
-Block lambda_block;
-for (int i = 0; i < names.size(); i++) {
-ColumnWithTypeAndName data_column;
-if (_contains_column_id(args, i) || i >= gap) {
-data_column = ColumnWithTypeAndName(data_types[i], names[i]);
+args_info.array_start = (*args_info.offsets_ptr)[args_info.current_row_idx - 1];
+args_info.cur_size =
+(*args_info.offsets_ptr)[args_info.current_row_idx] - args_info.array_start;
+
+// lambda block to exectute the lambda, and reuse the memory
+Block lambda_block;
+auto column_size = names.size();
+MutableColumns columns(column_size);
+while (args_info.current_row_idx < block->rows()) {
+bool mem_reuse = lambda_block.mem_reuse();
+for (int i = 0; i < column_size; i++) {
+if (mem_reuse) {
+columns[i] = lambda_block.get_by_position(i).column->assume_mutable();
 } else {
-data_column = ColumnWithTypeAndName(
-data_types[i]->create_column_const_with_default_value(0), data_types[i],
-names[i]);
+if (_contains_column_id(output_slot_ref_indexs, i) || i >= gap) {
+// TODO: maybe could create const column, so not insert_many_from when extand data
+// but now here handle batch_size of array nested data every time, so maybe have different rows
+columns[i] = data_types[i]->create_column();
+} else {
+columns[i] = data_types[i]
+ ->create_column_const_with_default_value(0)
+ ->assume_mutable();
+}
 }
-lambda_block.insert(std::move(data_column));
 }
-
-MutableColumns columns = lambda_block.mutate_columns();
+// batch_size of array nested data every time inorder to avoid memory overflow
 while (columns[gap]->size() < batch_size) {
 long max_step = batch_size - columns[gap]->size();
-long current_step =
-std::min(max_step, (long)(args.cur_size - args.current_offset_in_array));
-size_t pos = args.array_start + args.current_offset_in_array;
+long current_step = std::min(
+max_step, (long)(args_info.cur_size - args_info.current_offset_in_array));
+size_t pos = args_info.array_start + args_info.current_offset_in_array;
 for (int i = 0; i < arguments.size(); ++i) {
 columns[gap + i]->insert_range_from(*lambda_datas[i], pos, current_step);
 }
-args.current_offset_in_array += current_step;
-args.current_repeat_times += current_step;
-if (args.current_offset_in_array >= args.cur_size) {
-args.current_row_eos = true;
+args_info.current_offset_in_array += current_step;
+args_info.current_repeat_times += current_step;
+if (args_info.current_offset_in_array >= args_info.cur_size) {
+args_info.current_row_eos = true;
 }
-_extend_data(columns, block, args, gap);
-if (args.current_row_eos) {
-args.current_row_idx++;
-args.current_offset_in_array = 0;
-if (args.current_row_idx >= block->rows()) {
+_extend_data(columns, block, args_info.current_repeat_times, gap,

Review Comment:
   Set `args_info.current_repeat_times = 0;` inside the `_extend_data` function.
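
A minimal sketch of this suggestion, assuming the helper receives the counter by reference so that it can reset it where it is consumed. `extend_rows` and its parameters are hypothetical and only illustrate the idea; they are not the signature of `_extend_data`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends `value` to `out` `repeat_times` times, then
// resets the counter so that callers do not have to remember to do it.
inline void extend_rows(std::vector<int>& out, int value, std::size_t& repeat_times) {
    out.insert(out.end(), repeat_times, value);
    repeat_times = 0;
}
```

With the reset inside the helper, a second call without new work appends nothing, so a forgotten reset at the call site cannot duplicate rows.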




Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-19 Thread via GitHub


zhangstar333 commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2739023676

   run buildall





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-18 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2735262781

   
   
   ClickBench: Total hot run time: 31.02 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit e0fbf908c3c2f25ea5f4787d0c98edba51895f98, data reload: false
   
   query1   0.04   0.03   0.03
   query2   0.12   0.10   0.10
   query3   0.26   0.19   0.19
   query4   1.59   0.19   0.11
   query5   0.57   0.54   0.55
   query6   1.20   0.71   0.71
   query7   0.02   0.02   0.02
   query8   0.04   0.04   0.03
   query9   0.59   0.52   0.52
   query10  0.62   0.59   0.56
   query11  0.16   0.10   0.11
   query12  0.15   0.12   0.11
   query13  0.62   0.61   0.62
   query14  2.67   2.82   2.82
   query15  0.93   0.85   0.85
   query16  0.38   0.39   0.39
   query17  1.01   1.03   1.04
   query18  0.21   0.20   0.20
   query19  2.09   1.91   1.81
   query20  0.02   0.01   0.01
   query21  15.35  0.92   0.53
   query22  0.76   1.28   0.62
   query23  14.94  1.34   0.66
   query24  6.96   1.94   0.76
   query25  0.49   0.25   0.09
   query26  0.50   0.16   0.13
   query27  0.05   0.06   0.05
   query28  9.76   0.94   0.44
   query29  12.56  3.94   3.28
   query30  0.25   0.09   0.06
   query31  2.82   0.59   0.39
   query32  3.23   0.53   0.46
   query33  2.99   3.00   3.02
   query34  15.89  5.16   4.49
   query35  4.58   4.53   4.53
   query36  0.66   0.51   0.48
   query37  0.09   0.06   0.06
   query38  0.05   0.05   0.03
   query39  0.03   0.02   0.02
   query40  0.17   0.12   0.14
   query41  0.08   0.03   0.03
   query42  0.03   0.03   0.02
   query43  0.03   0.03   0.03
   Total cold run time: 105.56 s
   Total hot run time: 31.02 s
   ```
   
   





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-18 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2735387013

   # BE UT Coverage Report
   Increment line coverage `0.00% (0/84)` :tada:
   
   [Increment coverage report](http://coverage.selectdb-in.cc/coverage/e0fbf908c3c2f25ea5f4787d0c98edba51895f98_e0fbf908c3c2f25ea5f4787d0c98edba51895f98/increment_report/index.html)
   [Complete coverage report](http://coverage.selectdb-in.cc/coverage/e0fbf908c3c2f25ea5f4787d0c98edba51895f98_e0fbf908c3c2f25ea5f4787d0c98edba51895f98/report/index.html)
   | Category  | Coverage   |
   |---|---|
   | Function Coverage | 48.90% (13096/26783) |
   | Line Coverage | 38.45% (112903/293650) |
   | Region Coverage   | 37.26% (57421/154103) |
   | Branch Coverage   | 32.34% (28850/89220) |





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-18 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2735257544

   
   
   TPC-DS: Total hot run time: 185839 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit e0fbf908c3c2f25ea5f4787d0c98edba51895f98, data reload: false
   
   query1   994    502    476    476
   query2   6549   1888   1906   1888
   query3   6814   209    214    209
   query4   26598  23719  23506  23506
   query5   4316   662    465    465
   query6   299    207    203    203
   query7   4604   498    290    290
   query8   281    237    230    230
   query9   8590   2619   2610   2610
   query10  474    315    255    255
   query11  15568  15217  15287  15217
   query12  157    111    106    106
   query13  1648   511    382    382
   query14  8685   6348   6224   6224
   query15  212    190    183    183
   query16  7126   654    464    464
   query17  963    703    539    539
   query18  1946   398    300    300
   query19  188    180    150    150
   query20  118    117    113    113
   query21  207    117    103    103
   query22  4088   4318   4039   4039
   query23  33683  32861  32947  32861
   query24  8302   2380   2400   2380
   query25  527    459    375    375
   query26  1230   268    154    154
   query27  2748   503    327    327
   query28  4397   2445   2389   2389
   query29  778    575    417    417
   query30  287    250    188    188
   query31  947    833    750    750
   query32  71     69     66     66
   query33  556    377    390    377
   query34  789    834    500    500
   query35  801    849    749    749
   query36  964    999    933    933
   query37  130    102    78     78
   query38  4198   4174   4103   4103
   query39  1449   1416   1427   1416
   query40  213    115    108    108
   query41  55     55     50     50
   query42  117    105    101    101
   query43  488    489    475    475
   query44  1343   787    782    782
   query45  177    171    166    166
   query46  846    1027   612    612
   query47  1755   1768   1703   1703
   query48  383    414    294    294
   query49  785    488    423    423
   query50  674    726    418    418
   query51  4157   4175   4118   4118
   query52  108    107    98     98
   query53  241    266    186    186
   query54  481    498    419    419
   query55  83     85     81     81
   query56  297    271    248    248
   query57  1137   1183   1094   1094
   query58  256    240    254    240
   query59  2720   2722   2515   2515
   query60  293    278    281    278
   query61  152    140    146    140
   query62  810    746    670    670
   query63  233    194    194    194
   query64  4461   1084   756    756
   query65  4432   4396   4348   4348
   query66  1142   403    304    304
   query67  15792  15647  15757  15647
   query68  8414   885    499    499
   query69  475    319    265    265
   query70  1222   1137   1131   1131
   query71  474    336    265    265
   query72  5415   3565   3736   3565
   query73  796    737    349    349
   query74  8952   9058   8754   8754
   query75  3756   3107   2661   2661
   query76  3633   1171   759    759
   query77  839    388    283    283
   query78  10176  10271  9236   9236
   query79  1982   828    584    584
   query80  636    526    446    446
   query81  473    256    220    220
   query82  498    132    96     96
   query83  181    167    152    152
   query84  281    95     73     73
   query85  861    356    303    303
   query86  370    321    311    311
   query87  4438   4574   4304   4304
   query88  3533   2291   2298   2291
   query89  399    312    277    277
   query90  1928   212    209    209
   query91  142    140    111    111
   query92  76     61     56     56
   query93  1502   1057   575    575
   query94  662    418    303    303
   query95  365    268    253    253
   query96  486    569    279    279
   query97  338332683268
   query98  237    201    202    201
   query99  1486   1417   1277   1277
   Total cold run time: 273934 ms
   Total hot run time: 185839 ms
   ```
   
   



Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-18 Thread via GitHub


doris-robot commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2735244507

   
   
   TPC-H: Total hot run time: 32519 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit e0fbf908c3c2f25ea5f4787d0c98edba51895f98, data reload: false
   
   -- Round 1 --
   q1   24254  5076   5044   5044
   q2   2050   318    177    177
   q3   10460  1274   692    692
   q4   10244  1021   549    549
   q5   7524   2415   2311   2311
   q6   191    166    131    131
   q7   910    748    603    603
   q8   9309   1323   1031   1031
   q9   5075   4744   4749   4744
   q10  6816   2293   1896   1896
   q11  488    278    263    263
   q12  365    363    221    221
   q13  17765  3700   3118   3118
   q14  226    230    213    213
   q15  559    497    499    497
   q16  625    624    578    578
   q17  594    860    342    342
   q18  6853   6442   6356   6356
   q19  1288   956    561    561
   q20  337    348    196    196
   q21  3015   2187   1992   1992
   q22  1095   1055   1004   1004
   Total cold run time: 110043 ms
   Total hot run time: 32519 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   5121   5127   5129   5127
   q2   248    327    233    233
   q3   2129   2670   2307   2307
   q4   1470   1824   1359   1359
   q5   4255   4119   4501   4119
   q6   206    166    125    125
   q7   2035   1959   1744   1744
   q8   2643   2695   2623   2623
   q9   7251   7221   7107   7107
   q10  3010   3215   2750   2750
   q11  588    503    498    498
   q12  687    779    629    629
   q13  3589   3928   3304   3304
   q14  284    306    264    264
   q15  589    517    508    508
   q16  645    691    640    640
   q17  1171   1586   1346   1346
   q18  7767   7549   7401   7401
   q19  896    867    1019   867
   q20  2047   2047   1944   1944
   q21  5536   4910   4826   4826
   q22  1089   1032   1040   1032
   Total cold run time: 53256 ms
   Total hot run time: 50753 ms
   ```
   
   





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-18 Thread via GitHub


zhangstar333 commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2735226120

   run buildall





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-18 Thread via GitHub


HappenLee commented on code in PR #49212:
URL: https://github.com/apache/doris/pull/49212#discussion_r2001598067


##
be/src/vec/exprs/lambda_function/varray_map_function.cpp:
##
@@ -77,9 +77,8 @@ class ArrayMapFunction : public LambdaFunction {
 
 std::string get_name() const override { return name; }
 
-doris::Status execute(VExprContext* context, doris::vectorized::Block* block,
-  int* result_column_id, const DataTypePtr& result_type,
-  const VExprSPtrs& children) override {
+Status execute(VExprContext* context, vectorized::Block* block, int* result_column_id,
+   const DataTypePtr& result_type, const VExprSPtrs& children) override {
 LambdaArgs args;
 // collect used slot ref in lambda function body
 _collect_slot_ref_column_id(children[0], args);

Review Comment:
   Refactor all private functions of ArrayMapFunction so that each one receives only the
   parameters it uses, not the whole `args`.
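
A sketch of the refactoring being requested, under the assumption that it means narrowing each helper's signature to the fields it actually reads. `LambdaArgsLike` and both functions below are simplified, hypothetical stand-ins, not the real Doris declarations.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Simplified stand-in for a struct that bundles per-call state.
struct LambdaArgsLike {
    std::vector<int> output_slot_ref_indexs;
    int current_row_idx = 0;
    int current_offset_in_array = 0;
};

// Before: the helper takes the whole struct although it reads a single field.
inline bool contains_column_id_wide(const LambdaArgsLike& args, int id) {
    const auto& ids = args.output_slot_ref_indexs;
    return std::find(ids.begin(), ids.end(), id) != ids.end();
}

// After: the helper takes only the data it uses, so its dependencies are
// visible in the signature and it can be tested without building the struct.
inline bool contains_column_id(const std::vector<int>& slot_ref_indexs, int id) {
    return std::find(slot_ref_indexs.begin(), slot_ref_indexs.end(), id) !=
           slot_ref_indexs.end();
}
```

The narrower form matches the shape of the later call `_contains_column_id(output_slot_ref_indexs, i)` quoted elsewhere in this thread.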






Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-18 Thread via GitHub


HappenLee commented on code in PR #49212:
URL: https://github.com/apache/doris/pull/49212#discussion_r2001594891


##
be/src/vec/exprs/lambda_function/varray_map_function.cpp:
##
@@ -233,7 +239,14 @@ class ArrayMapFunction : public LambdaFunction {
 }
 }
 
-lambda_block.set_columns(std::move(columns));
+if (!mem_reuse) {
+for (int i = 0; i < column_size; ++i) {
+lambda_block.insert(vectorized::ColumnWithTypeAndName(std::move(columns[i]),
+  data_types[i], names[i]));
+}
+} else {
+columns.clear();

Review Comment:
   Why is `clear()` needed here?






Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-18 Thread via GitHub


zhangstar333 commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2732267325

   run buildall





Re: [PR] [improve](function) memory reuse in array_map fucntion [doris]

2025-03-18 Thread via GitHub


hello-stephen commented on PR #49212:
URL: https://github.com/apache/doris/pull/49212#issuecomment-2732265543

   
   Thank you for your contribution to Apache Doris.
   Don't know what should be done next? See [How to process your 
PR](https://cwiki.apache.org/confluence/display/DORIS/How+to+process+your+PR).
   
   Please clearly describe your PR:
   1. What problem was fixed (it's best to include specific error reporting 
information). How it was fixed.
   2. Which behaviors were modified. What was the previous behavior, what is it 
now, why was it modified, and what possible impacts might there be.
   3. What features were added. Why was this function added?
   4. Which code was refactored and why was this part of the code refactored?
   5. Which functions were optimized and what is the difference before and 
after the optimization?
   

