Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-22 Thread via GitHub


github-actions[bot] commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2070059047

   clang-tidy review says "All clean, LGTM! :+1:"


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-22 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2070033053

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-22 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2068909001

   run p0





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-22 Thread via GitHub


github-actions[bot] commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2068596760

   clang-tidy review says "All clean, LGTM! :+1:"





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-22 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2068591583

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-18 Thread via GitHub


github-actions[bot] commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2065717486

   clang-tidy review says "All clean, LGTM! :+1:"





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-18 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2063151194

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2062917868

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2062762165

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit 1e9c538cc0d20350249adac9c302d25d91fbf73a with default session variables
   Stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
   Stream load orc:  58 seconds loaded 1101869774 Bytes, about 18 MB/s
   Stream load parquet:  33 seconds loaded 861443392 Bytes, about 24 MB/s
   Insert into select:   13.5 seconds inserted 1000 Rows, about 740K ops/s
   ```
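
The "about N MB/s" figures in the stream load lines above are consistent with dividing the byte count by the elapsed seconds and by 2^20, then truncating. The following is a minimal sketch under that assumption; the helper name `mib_per_second` is hypothetical and is not taken from the Doris load-test tooling, which may compute the figure differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the Doris load-test scripts.
// Assumption: "MB/s" in the report means bytes / seconds / 2^20, truncated.
std::int64_t mib_per_second(std::int64_t bytes, std::int64_t seconds) {
    assert(seconds > 0);  // guard against division by zero
    constexpr std::int64_t kMiB = 1024 * 1024;
    return bytes / seconds / kMiB;  // integer division truncates toward zero
}
```

With the values reported above, `mib_per_second(2358488459, 18)` gives 124, `mib_per_second(1101869774, 58)` gives 18, and `mib_per_second(861443392, 33)` gives 24, which matches the three stream load lines.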
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2062759100

   
   
   ClickBench: Total hot run time: 30.28 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit 1e9c538cc0d20350249adac9c302d25d91fbf73a, data reload: false
   
   query1    0.04  0.04  0.03
   query2    0.09  0.04  0.04
   query3    0.24  0.05  0.05
   query4    1.66  0.09  0.08
   query5    0.51  0.50  0.50
   query6    1.48  0.72  0.72
   query7    0.03  0.01  0.02
   query8    0.06  0.05  0.04
   query9    0.55  0.49  0.49
   query10   0.56  0.54  0.55
   query11   0.17  0.12  0.12
   query12   0.15  0.13  0.12
   query13   0.60  0.58  0.58
   query14   0.76  0.78  0.76
   query15   0.83  0.81  0.82
   query16   0.37  0.36  0.38
   query17   1.01  1.04  1.01
   query18   0.20  0.28  0.23
   query19   1.75  1.74  1.76
   query20   0.01  0.02  0.01
   query21  15.43  0.65  0.66
   query22   4.47  8.06  1.58
   query23  18.31  1.36  1.27
   query24   1.30  0.40  0.26
   query25   0.16  0.09  0.08
   query26   0.26  0.16  0.17
   query27   0.09  0.08  0.08
   query28  13.40  1.00  0.97
   query29  12.60  3.31  3.28
   query30   0.27  0.08  0.07
   query31   2.81  0.38  0.38
   query32   3.28  0.45  0.47
   query33   2.77  2.91  2.85
   query34  17.11  4.39  4.44
   query35   4.49  4.49  4.51
   query36   0.65  0.46  0.46
   query37   0.18  0.16  0.16
   query38   0.16  0.14  0.14
   query39   0.05  0.04  0.04
   query40   0.19  0.15  0.14
   query41   0.10  0.05  0.05
   query42   0.06  0.05  0.05
   query43   0.05  0.05  0.04
   Total cold run time: 109.26 s
   Total hot run time: 30.28 s
   ```
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2062754463

   
   
   TPC-DS: Total hot run time: 188253 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit 1e9c538cc0d20350249adac9c302d25d91fbf73a, data reload: false
   
   query1     908    375    377    375
   query2    6501   2564   2420   2420
   query3    6679    217    216    216
   query4   24732  21458  21438  21438
   query5    4251    473    466    466
   query6     291    194    193    193
   query7    4608    323    310    310
   query8     247    190    207    190
   query9    8688   2504   2472   2472
   query10    613    285    289    285
   query11  14799  14318  14240  14240
   query12    153    101     97     97
   query13   1663    385    375    375
   query14   9895   7830   8053   7830
   query15    319    194    207    194
   query16   8179    280    286    280
   query17   1868    606    570    570
   query18   2113    294    301    294
   query19    258    171    174    171
   query20    108     99    101     99
   query21    209    139    129    129
   query22   5028   4784   4918   4784
   query23  33884  33379  33161  33161
   query24  12004   3062   3000   3000
   query25    676    395    403    395
   query26   1758    168    169    168
   query27   2959    329    324    324
   query28   7532   2088   2071   2071
   query29   1061    664    668    664
   query30    295    188    181    181
   query31   1013    750    757    750
   query32    106     65     70     65
   query33    792    296    288    288
   query34    964    495    488    488
   query35    909    747    725    725
   query36   1079    955    928    928
   query37    201     86     84     84
   query38   3415   3245   3211   3211
   query39   1596   1538   1555   1538
   query40    302    144    139    139
   query41     54     53     51     51
   query42    117    108    106    106
   query43    585    558    555    555
   query44   1250    766    764    764
   query45    300    282    277    277
   query46   1107    731    738    731
   query47   1967   1850   1884   1850
   query48    386    312    319    312
   query49   1233    456    429    429
   query50    779    401    417    401
   query51   6593   6690   6712   6690
   query52    112    107    105    105
   query53    367    294    295    294
   query54    364    277    280    277
   query55     89     86     87     86
   query56    299    274    271    271
   query57   1220   1154   1175   1154
   query58    282    249    253    249
   query59   3550   3183   3224   3183
   query60    305    284    281    281
   query61    144    140    140    140
   query62    659    470    464    464
   query63    329    295    296    295
   query64   6534   4170   3949   3949
   query65   3172   3095   3065   3065
   query66   1467    402    403    402
   query67  15331  14881  15028  14881
   query68   5297    555    558    555
   query69    561    340    349    340
   query70   1223   1234   1248   1234
   query71   1423   1288   1285   1285
   query72   6754   2785   2602   2602
   query73    729    336    339    336
   query74   6877   6524   6438   6438
   query75   3439   2723   2682   2682
   query76   3320    964   1048    964
   query77    653    316    318    316
   query78  10893  10332  10339  10332
   query79   9083    561    547    547
   query80   1905    510    514    510
   query81    550    258    258    258
   query82   1582    111    112    111
   query83    346    198    194    194
   query84    275     97     96     96
   query85   1728    330    281    281
   query86    493    328    302    302
   query87   3524   3292   3346   3292
   query88   5581   2529   2533   2529
   query89    568    406    402    402
   query90   1972    210    214    210
   query91    139    108    112    108
   query92     73     61     60     60
   query93   7323    537    519    519
   query94   1261    232    207    207
   query95    428    344    340    340
   query96    634    282    282    282
   query97   3135   2973   2983   2973
   query98    259    233    231    231
   query99   1281    890    847    847
   Total cold run time: 304560 ms
   Total hot run time: 188253 ms
   ```
   
   



Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2062731697

   
   
   TPC-H: Total hot run time: 39316 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 1e9c538cc0d20350249adac9c302d25d91fbf73a, data reload: false
   
   -- Round 1 --
   q1   17986   4465   4356   4356
   q2    2954    211    202    202
   q3   11178   1155   1186   1155
   q4   10669    877    778    778
   q5    7662   2810   2673   2673
   q6     221    135    132    132
   q7    1007    607    597    597
   q8    9221   2062   2046   2046
   q9    7993   7175   7110   7110
   q10   8561   3557   3568   3557
   q11    453    236    237    236
   q12    418    220    220    220
   q13  17763   2953   3017   2953
   q14    281    232    231    231
   q15    534    495    478    478
   q16    534    380    372    372
   q17    961    656    710    656
   q18   7483   6791   6637   6637
   q19   3341   1554   1477   1477
   q20    671    319    312    312
   q21   3524   2842   2832   2832
   q22    363    306    318    306
   Total cold run time: 113778 ms
   Total hot run time: 39316 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1    4351   4207   4224   4207
   q2     367    282    275    275
   q3    2985   2752   2730   2730
   q4    1880   1589   1553   1553
   q5    5341   5360   5290   5290
   q6     219    124    124    124
   q7    2256   1896   1888   1888
   q8    3202   3372   3393   3372
   q9    9314   9275   9354   9275
   q10   3850   3705   3704   3704
   q11    582    477    490    477
   q12    724    593    590    590
   q13  17304   2942   2936   2936
   q14    301    275    274    274
   q15    519    480    478    478
   q16    490    422    448    422
   q17   1749   1469   1450   1450
   q18   7708   7515   7473   7473
   q19   1677   1552   1531   1531
   q20   1938   1756   1780   1756
   q21   5036   4696   4760   4696
   q22    546    470    469    469
   Total cold run time: 72339 ms
   Total hot run time: 54970 ms
   ```
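
The two totals reported for Round 1 above appear to be column sums: the cold total matches the sum of the first column, and the hot total matches the sum of the smaller of the two hot runs per query (the last column). This is an inference from the numbers in the report, not something the message states. A minimal sketch under that assumption, with hypothetical names (`QueryTimes`, `total_cold_ms`, `total_hot_ms`, `tpch_round1`) that do not come from the Doris benchmark scripts:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical type; the benchmark scripts may store results differently.
struct QueryTimes {
    std::int64_t cold_ms;  // first column: cold run
    std::int64_t hot1_ms;  // second column: first hot run
    std::int64_t hot2_ms;  // third column: second hot run
};

// Sum of the cold-run column.
std::int64_t total_cold_ms(const std::vector<QueryTimes>& rows) {
    std::int64_t sum = 0;
    for (const auto& r : rows) sum += r.cold_ms;
    return sum;
}

// Sum of the faster of the two hot runs for each query (assumed rule).
std::int64_t total_hot_ms(const std::vector<QueryTimes>& rows) {
    std::int64_t sum = 0;
    for (const auto& r : rows) sum += std::min(r.hot1_ms, r.hot2_ms);
    return sum;
}

// Round 1 rows q1..q22, copied from the report above.
std::vector<QueryTimes> tpch_round1() {
    return {
        {17986, 4465, 4356}, {2954, 211, 202},    {11178, 1155, 1186},
        {10669, 877, 778},   {7662, 2810, 2673},  {221, 135, 132},
        {1007, 607, 597},    {9221, 2062, 2046},  {7993, 7175, 7110},
        {8561, 3557, 3568},  {453, 236, 237},     {418, 220, 220},
        {17763, 2953, 3017}, {281, 232, 231},     {534, 495, 478},
        {534, 380, 372},     {961, 656, 710},     {7483, 6791, 6637},
        {3341, 1554, 1477},  {671, 319, 312},     {3524, 2842, 2832},
        {363, 306, 318},
    };
}
```

Applied to the Round 1 rows, `total_cold_ms` gives 113778 and `total_hot_ms` gives 39316, the two totals printed in the report.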
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


mrhhsg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2062682515

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2061847108

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit ff77b2ff8f72522cb8f0c8664225d13982f2fcaf with default session variables
   Stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
   Stream load orc:  59 seconds loaded 1101869774 Bytes, about 17 MB/s
   Stream load parquet:  32 seconds loaded 861443392 Bytes, about 25 MB/s
   Insert into select:   13.5 seconds inserted 1000 Rows, about 740K ops/s
   ```
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2061842354

   
   
   ClickBench: Total hot run time: 30.26 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit ff77b2ff8f72522cb8f0c8664225d13982f2fcaf, data reload: false
   
   query1    0.04  0.04  0.03
   query2    0.08  0.04  0.04
   query3    0.23  0.05  0.06
   query4    1.68  0.07  0.08
   query5    0.50  0.48  0.49
   query6    1.46  0.72  0.72
   query7    0.02  0.01  0.01
   query8    0.05  0.04  0.05
   query9    0.55  0.50  0.50
   query10   0.54  0.54  0.53
   query11   0.16  0.12  0.11
   query12   0.15  0.12  0.12
   query13   0.60  0.59  0.59
   query14   0.75  0.78  0.76
   query15   0.83  0.82  0.80
   query16   0.35  0.36  0.36
   query17   0.95  1.01  1.04
   query18   0.20  0.27  0.22
   query19   1.76  1.74  1.69
   query20   0.01  0.01  0.01
   query21  15.40  0.66  0.66
   query22   4.35  7.60  1.60
   query23  18.29  1.41  1.30
   query24   1.78  0.28  0.23
   query25   0.14  0.09  0.08
   query26   0.27  0.19  0.17
   query27   0.07  0.08  0.08
   query28  13.31  1.00  0.99
   query29  12.55  3.28  3.28
   query30   0.28  0.08  0.07
   query31   2.83  0.38  0.39
   query32   3.25  0.47  0.47
   query33   2.83  2.82  2.84
   query34  17.09  4.47  4.47
   query35   4.48  4.47  4.45
   query36   0.66  0.46  0.47
   query37   0.18  0.15  0.17
   query38   0.15  0.15  0.14
   query39   0.05  0.04  0.04
   query40   0.17  0.14  0.14
   query41   0.09  0.05  0.05
   query42   0.05  0.05  0.05
   query43   0.04  0.04  0.04
   Total cold run time: 109.22 s
   Total hot run time: 30.26 s
   ```
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2061825074

   
   
   TPC-DS: Total hot run time: 188966 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit ff77b2ff8f72522cb8f0c8664225d13982f2fcaf, data reload: false
   
   query1     893    380    360    360
   query2    6225   2714   2433   2433
   query3    6658    216    214    214
   query4   22760  21490  21625  21490
   query5    4164    454    472    454
   query6     290    202    191    191
   query7    4595    302    306    302
   query8     243    190    187    187
   query9    8772   2482   2477   2477
   query10    445    280    278    278
   query11  14660  14191  14225  14191
   query12    151    100     97     97
   query13   1669    382    379    379
   query14  10063   7743   8093   7743
   query15    268    195    216    195
   query16   8203    297    293    293
   query17   1860    611    590    590
   query18   2148    296    306    296
   query19    351    167    167    167
   query20    110    102    101    101
   query21    205    136    135    135
   query22   5017   4856   4884   4856
   query23  33876  33365  33413  33365
   query24  10648   3090   3057   3057
   query25    608    409    408    408
   query26    687    180    170    170
   query27   2233    377    397    377
   query28   6007   2134   2128   2128
   query29    880    659    663    659
   query30    308    204    193    193
   query31    989    785    764    764
   query32    104     63     65     63
   query33    694    288    277    277
   query34    996    499    524    499
   query35    880    762    759    759
   query36   1096    924    955    924
   query37    126     87     87     87
   query38   3525   3371   3458   3371
   query39   1661   1616   1584   1584
   query40    193    141    148    141
   query41     52     49     52     49
   query42    116    108    123    108
   query43    602    552    565    552
   query44   1160    781    772    772
   query45    340    297    293    293
   query46   1126    752    731    731
   query47   2038   1967   1954   1954
   query48    384    314    311    311
   query49    877    437    431    431
   query50    789    403    423    403
   query51   6884   6716   6785   6716
   query52    118     98    100     98
   query53    360    286    306    286
   query54    338    268    274    268
   query55     90     87     85     85
   query56    318    271    277    271
   query57   1349   1207   1226   1207
   query58    267    250    245    245
   query59   3599   3447   3047   3047
   query60    302    272    277    272
   query61    124    121    121    121
   query62    610    459    455    455
   query63    316    295    291    291
   query64   4800   3981   4181   3981
   query65   3111   3062   3048   3048
   query66    800    398    386    386
   query67  15550  15301  15185  15185
   query68   5310    557    550    550
   query69    511    342    343    342
   query70   1241   1209   1204   1204
   query71   1407   1276   1276   1276
   query72   6547   2683   2457   2457
   query73    725    330    335    330
   query74   6710   6343   6470   6343
   query75   3365   2696   2700   2696
   query76   3288   1072   1037   1037
   query77    445    311    315    311
   query78  10936  10258  10259  10258
   query79   7073    532    532    532
   query80   1881    489    489    489
   query81    541    252    257    252
   query82    880    116    107    107
   query83    288    191    185    185
   query84    273     90     96     90
   query85   1770    306    336    306
   query86    496    328    328    328
   query87   3510   3317   3346   3317
   query88   5404   2520   2553   2520
   query89    519    399    392    392
   query90   1984    207    207    207
   query91    133    107    107    107
   query92     75     60     58     58
   query93   5180    525    531    525
   query94   1137    205    207    205
   query95    432    341    336    336
   query96    628    279    278    278
   query97   3143   2979   2977   2977
   query98    248    249    238    238
   query99   1209    892    894    892
   Total cold run time: 288970 ms
   Total hot run time: 188966 ms
   ```
   
   



Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2061799707

   
   
   TPC-H: Total hot run time: 38717 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit ff77b2ff8f72522cb8f0c8664225d13982f2fcaf, data reload: false
   
   -- Round 1 --
   q1   17626   4372   4297   4297
   q2    2030    202    207    202
   q3   10419   1113   1171   1113
   q4   10197    771    778    771
   q5    7519   2706   2664   2664
   q6     219    134    136    134
   q7    1029    638    586    586
   q8    9229   2099   2052   2052
   q9    7295   6596   6584   6584
   q10   8563   3543   3523   3523
   q11    451    240    233    233
   q12    479    219    226    219
   q13  17782   2936   2978   2936
   q14    280    222    245    222
   q15    525    486    485    485
   q16    522    379    379    379
   q17    970    713    759    713
   q18   7337   6751   6843   6751
   q19   5782   1564   1538   1538
   q20    659    331    315    315
   q21   3526   2781   2704   2704
   q22    364    296    306    296
   Total cold run time: 112803 ms
   Total hot run time: 38717 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1    4446   4273   4230   4230
   q2     383    283    264    264
   q3    3021   2733   2790   2733
   q4    1853   1584   1568   1568
   q5    5348   5359   5316   5316
   q6     215    127    127    127
   q7    2251   1866   1897   1866
   q8    3219   3342   3312   3312
   q9    8678   8616   8694   8616
   q10   4100   3880   3976   3880
   q11    600    515    509    509
   q12    836    626    644    626
   q13  17317   3245   3119   3119
   q14    315    306    282    282
   q15    536    488    487    487
   q16    509    432    450    432
   q17   1888   1541   1499   1499
   q18   8109   8050   7785   7785
   q19   1677   1528   1566   1528
   q20   2055   1854   1883   1854
   q21   8182   5032   4961   4961
   q22    528    480    487    480
   Total cold run time: 76066 ms
   Total hot run time: 55474 ms
   ```
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


github-actions[bot] commented on code in PR #33291:
URL: https://github.com/apache/doris/pull/33291#discussion_r1569132542


##
be/src/runtime/fragment_mgr.cpp:
##
@@ -828,8 +842,23 @@ std::string FragmentMgr::dump_pipeline_tasks(int64_t duration) {
 
 Status FragmentMgr::exec_plan_fragment(const TPipelineFragmentParams& params,

Review Comment:
   warning: function 'exec_plan_fragment' has cognitive complexity of 80 (threshold 50) [readability-function-cognitive-complexity]
   ```cpp
   Status FragmentMgr::exec_plan_fragment(const TPipelineFragmentParams& params,
   ^
   ```
   
   Additional context
   
   **be/src/runtime/fragment_mgr.cpp:855:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   while (pos < total_size) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:867:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   RETURN_IF_ERROR(_get_query_ctx(params, params.query_id, true, query_ctx));
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:867:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(_get_query_ctx(params, params.query_id, true, query_ctx));
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:869:** +1
   ```cpp
   const bool enable_pipeline_x = params.query_options.__isset.enable_pipeline_x_engine &&
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:871:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   if (enable_pipeline_x) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:883:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   if (!prepare_st.ok()) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:892:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(_runtimefilter_controller.add_entity(
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:892:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   RETURN_IF_ERROR(_runtimefilter_controller.add_entity(
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:895:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   if (handler) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:911:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   if (!params.__isset.need_wait_execution_trigger || !params.need_wait_execution_trigger) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:911:** +1
   ```cpp
   if (!params.__isset.need_wait_execution_trigger || !params.need_wait_execution_trigger) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:932:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(context->submit());
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:932:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   RETURN_IF_ERROR(context->submit());
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:934:** +1, nesting level increased to 1
   ```cpp
   } else {
 ^
   ```
   **be/src/runtime/fragment_mgr.cpp:935:** nesting level increased to 2
   ```cpp
   auto pre_and_submit = [&](int i) {
 ^
   ```
   **be/src/runtime/fragment_mgr.cpp:942:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   if (iter != _pipeline_map.end()) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:950:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   if (!params.__isset.need_wait_execution_trigger ||
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:950:** +1
   ```cpp
   if (!params.__isset.need_wait_execution_trigger ||
   ```

Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


github-actions[bot] commented on code in PR #33291:
URL: https://github.com/apache/doris/pull/33291#discussion_r1569131901


##
be/src/runtime/fragment_mgr.cpp:
##
@@ -828,8 +842,23 @@ std::string FragmentMgr::dump_pipeline_tasks(int64_t 
duration) {
 

Review Comment:
   warning: function 'exec_plan_fragment' has cognitive complexity of 80 
(threshold 50) [readability-function-cognitive-complexity]
   ```cpp
   Status FragmentMgr::exec_plan_fragment(const TPipelineFragmentParams& params,
   ^
   ```
   
   Additional context
   
   **be/src/runtime/fragment_mgr.cpp:854:** +1, including nesting penalty of 0, 
nesting level increased to 1
   ```cpp
   while (pos < total_size) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:866:** +1, including nesting penalty of 0, 
nesting level increased to 1
   ```cpp
   RETURN_IF_ERROR(_get_query_ctx(params, params.query_id, true, 
query_ctx));
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:866:** +2, including nesting penalty of 1, 
nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(_get_query_ctx(params, params.query_id, true, 
query_ctx));
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:868:** +1
   ```cpp
   const bool enable_pipeline_x = 
params.query_options.__isset.enable_pipeline_x_engine &&

^
   ```
   **be/src/runtime/fragment_mgr.cpp:870:** +1, including nesting penalty of 0, 
nesting level increased to 1
   ```cpp
   if (enable_pipeline_x) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:882:** +2, including nesting penalty of 1, 
nesting level increased to 2
   ```cpp
   if (!prepare_st.ok()) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:891:** +2, including nesting penalty of 1, 
nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(_runtimefilter_controller.add_entity(
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:891:** +3, including nesting penalty of 2, 
nesting level increased to 3
   ```cpp
   RETURN_IF_ERROR(_runtimefilter_controller.add_entity(
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:894:** +2, including nesting penalty of 1, 
nesting level increased to 2
   ```cpp
   if (handler) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:910:** +2, including nesting penalty of 1, 
nesting level increased to 2
   ```cpp
   if (!params.__isset.need_wait_execution_trigger || 
!params.need_wait_execution_trigger) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:910:** +1
   ```cpp
   if (!params.__isset.need_wait_execution_trigger || 
!params.need_wait_execution_trigger) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:931:** +2, including nesting penalty of 1, 
nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(context->submit());
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:931:** +3, including nesting penalty of 2, 
nesting level increased to 3
   ```cpp
   RETURN_IF_ERROR(context->submit());
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:933:** +1, nesting level increased to 1
   ```cpp
   } else {
 ^
   ```
   **be/src/runtime/fragment_mgr.cpp:934:** nesting level increased to 2
   ```cpp
   auto pre_and_submit = [&](int i) {
 ^
   ```
   **be/src/runtime/fragment_mgr.cpp:941:** +3, including nesting penalty of 2, 
nesting level increased to 3
   ```cpp
   if (iter != _pipeline_map.end()) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:949:** +3, including nesting penalty of 2, 
nesting level increased to 3
   ```cpp
   if (!params.__isset.need_wait_execution_trigger ||
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:949:** +1
   ```cpp
   if (!params.__isset.need_wait_execution_trigger ||
   ^
   ```
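
The report above shows each `RETURN_IF_ERROR` use being counted twice, once for the `do {` at status.h:541 and once for the `if` at status.h:543, with a larger nesting penalty when the macro sits inside another branch. The sketch below imitates that shape to show why moving macro calls out of a nested branch into a small helper should lower the score. Everything in it is hypothetical: `SketchStatus`, `SKETCH_RETURN_IF_ERROR`, `prepare_step`, `submit_step` and `run_one_fragment` are illustrative names, not the Doris `Status` class, the real macro, or any function in `fragment_mgr.cpp`.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a status type; not the Doris Status class.
class SketchStatus {
public:
    static SketchStatus OK() { return SketchStatus(true, ""); }
    static SketchStatus Error(std::string msg) { return SketchStatus(false, std::move(msg)); }
    bool ok() const { return _ok; }
    const std::string& message() const { return _msg; }

private:
    SketchStatus(bool ok, std::string msg) : _ok(ok), _msg(std::move(msg)) {}
    bool _ok;
    std::string _msg;
};

// Simplified macro with the same do { if (...) { ... } } while shape that the
// report points at. The real macro may differ in detail (assumption).
#define SKETCH_RETURN_IF_ERROR(stmt)    \
    do {                                \
        SketchStatus _status_ = (stmt); \
        if (!_status_.ok()) {           \
            return _status_;            \
        }                               \
    } while (false)

// Hypothetical steps standing in for "prepare" and "submit".
inline SketchStatus prepare_step(bool fail) {
    return fail ? SketchStatus::Error("prepare failed") : SketchStatus::OK();
}

inline SketchStatus submit_step(bool fail) {
    return fail ? SketchStatus::Error("submit failed") : SketchStatus::OK();
}

// Extracted helper: both macro uses sit at the top nesting level of this
// function, so neither one pays the extra penalty of an enclosing branch.
inline SketchStatus run_one_fragment(bool fail_prepare, bool fail_submit) {
    SKETCH_RETURN_IF_ERROR(prepare_step(fail_prepare));
    SKETCH_RETURN_IF_ERROR(submit_step(fail_submit));
    return SketchStatus::OK();
}
```

A caller would then keep only `SKETCH_RETURN_IF_ERROR(run_one_fragment(...))` inside its own branch. Whether such a split is appropriate for `exec_plan_fragment` is a design decision for the PR author; the sketch only illustrates the counting.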
   

Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2061705183

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2060791866

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2060759961

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit 923659b830a0ddd6266a3ab0ed784c8ae124da12 with default session variables
   Stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
   Stream load orc:  58 seconds loaded 1101869774 Bytes, about 18 MB/s
   Stream load parquet:  33 seconds loaded 861443392 Bytes, about 24 MB/s
   Insert into select:   13.6 seconds inserted 1000 Rows, about 735K ops/s
   ```
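
The stream-load figures above are consistent with "MB/s" meaning bytes divided by seconds, divided by 2^20, with the fraction dropped. That reading is an inference from the three reported numbers, not something the bot message states, and `throughput_mib_per_s` below is a hypothetical helper written only to check the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Throughput in MiB/s, truncated toward zero.
// Assumption: the report uses 1 MB = 1024 * 1024 bytes and integer truncation.
inline std::int64_t throughput_mib_per_s(std::int64_t bytes, std::int64_t seconds) {
    return bytes / seconds / (1024 * 1024);
}
```

Under that assumption, the JSON, ORC and Parquet rows give 124, 18 and 24 respectively, matching the report.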
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2060753952

   
   
   ClickBench: Total hot run time: 30.54 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit 923659b830a0ddd6266a3ab0ed784c8ae124da12, data reload: false
   
   query1   0.04   0.04   0.04
   query2   0.08   0.04   0.04
   query3   0.23   0.05   0.06
   query4   1.66   0.10   0.08
   query5   0.51   0.49   0.50
   query6   1.43   0.74   0.72
   query7   0.02   0.01   0.02
   query8   0.05   0.04   0.04
   query9   0.55   0.49   0.49
   query10  0.56   0.56   0.57
   query11  0.17   0.13   0.11
   query12  0.15   0.13   0.12
   query13  0.60   0.58   0.58
   query14  0.75   0.75   0.78
   query15  0.83   0.80   0.81
   query16  0.37   0.37   0.36
   query17  1.03   0.98   0.94
   query18  0.21   0.23   0.27
   query19  1.89   1.72   1.73
   query20  0.02   0.01   0.01
   query21  15.41  0.65   0.65
   query22  4.40   7.41   2.17
   query23  18.24  1.32   1.21
   query24  1.74   0.23   0.24
   query25  0.14   0.09   0.09
   query26  0.26   0.16   0.15
   query27  0.08   0.08   0.08
   query28  13.40  0.98   0.99
   query29  12.60  3.25   3.23
   query30  0.30   0.07   0.08
   query31  2.82   0.38   0.37
   query32  3.28   0.47   0.46
   query33  2.81   2.85   2.82
   query34  17.19  4.35   4.41
   query35  4.54   4.47   4.46
   query36  0.66   0.47   0.46
   query37  0.19   0.16   0.16
   query38  0.15   0.15   0.14
   query39  0.05   0.03   0.04
   query40  0.18   0.15   0.16
   query41  0.10   0.06   0.05
   query42  0.06   0.05   0.05
   query43  0.04   0.04   0.04
   Total cold run time: 109.79 s
   Total hot run time: 30.54 s
   ```
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2060742817

   
   
   TPC-DS: Total hot run time: 187418 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit 923659b830a0ddd6266a3ab0ed784c8ae124da12, data reload: false
   
   query1   935    384    376    376
   query2   6522   2569   2570   2569
   query3   6669   227    220    220
   query4   24145  21264  21447  21264
   query5   4194   462    463    462
   query6   282    192    188    188
   query7   4618   306    310    306
   query8   254    200    196    196
   query9   8545   2454   2470   2454
   query10  637    288    290    288
   query11  14873  14199  14207  14199
   query12  178    104    99     99
   query13  1654   400    382    382
   query14  10245  8081   7625   7625
   query15  285    218    197    197
   query16  8175   292    286    286
   query17  1875   607    588    588
   query18  2136   301    293    293
   query19  343    174    211    174
   query20  109    98     102    98
   query21  201    133    130    130
   query22  5010   4862   4851   4851
   query23  34044  33099  33289  33099
   query24  11752  2969   3015   2969
   query25  687    404    411    404
   query26  1770   177    163    163
   query27  2963   319    322    319
   query28  7364   2075   2054   2054
   query29  1026   638    626    626
   query30  322    184    180    180
   query31  968    763    742    742
   query32  104    67     69     67
   query33  835    322    285    285
   query34  956    487    507    487
   query35  872    745    734    734
   query36  1079   907    934    907
   query37  301    88     88     88
   query38  3468   3247   3216   3216
   query39  1577   1516   1540   1516
   query40  298    144    141    141
   query41  54     52     54     52
   query42  116    108    106    106
   query43  633    590    562    562
   query44  1233   769    755    755
   query45  306    295    275    275
   query46  1090   725    779    725
   query47  1905   1866   1849   1849
   query48  382    310    309    309
   query49  1234   429    428    428
   query50  774    393    390    390
   query51  6885   6763   6789   6763
   query52  121    100    110    100
   query53  360    297    298    297
   query54  371    306    268    268
   query55  88     88     85     85
   query56  290    272    269    269
   query57  1235   1130   1124   1124
   query58  275    249    252    249
   query59  3602   3207   3149   3149
   query60  327    292    286    286
   query61  126    126    122    122
   query62  655    453    458    453
   query63  326    291    297    291
   query64  6255   4001   4055   4001
   query65  3181   3079   3080   3079
   query66  1451   412    401    401
   query67  15304  14886  14963  14886
   query68  5163   541    550    541
   query69  526    340    345    340
   query70  1211   1246   1243   1243
   query71  1389   1287   1282   1282
   query72  6542   2662   2497   2497
   query73  722    336    331    331
   query74  6863   6452   6491   6452
   query75  3408   2780   2689   2689
   query76  3224   1006   1026   1006
   query77  462    311    314    311
   query78  10875  10188  10215  10188
   query79  2890   565    540    540
   query80  2068   479    482    479
   query81  541    254    257    254
   query82  786    114    111    111
   query83  304    193    192    192
   query84  287    96     96     96
   query85  1973   297    284    284
   query86  499    306    305    305
   query87  3478   3251   3283   3251
   query88  4585   2435   2437   2435
   query89  506    391    390    390
   query90  2044   206    207    206
   query91  140    110    179    110
   query92  73     61     61     61
   query93  4547   504    501    501
   query94  1258   203    205    203
   query95  419    335    327    327
   query96  597    278    275    275
   query97  3204   2912   2984   2912
   query98  257    253    228    228
   query99  1239   902    862    862
   Total cold run time: 292930 ms
   Total hot run time: 187418 ms
   ```
   
   



Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2060720298

   
   
   TPC-H: Total hot run time: 39170 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 923659b830a0ddd6266a3ab0ed784c8ae124da12, data reload: false
   
   -- Round 1 --
   q1   17616  4429   4369   4369
   q2   2444   197    206    197
   q3   11017  1146   1194   1146
   q4   10507  792    785    785
   q5   7783   2740   2704   2704
   q6   225    140    140    140
   q7   1044   656    652    652
   q8   9361   2115   2074   2074
   q9   7645   6725   6629   6629
   q10  8500   3541   3555   3541
   q11  456    239    233    233
   q12  488    228    229    228
   q13  17782  2984   2958   2958
   q14  271    233    233    233
   q15  523    499    495    495
   q16  538    384    373    373
   q17  973    718    756    718
   q18  7397   6880   6707   6707
   q19  1940   1522   1535   1522
   q20  663    316    315    315
   q21  3541   2842   2864   2842
   q22  373    309    313    309
   Total cold run time: 111087 ms
   Total hot run time: 39170 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   4303   4203   4194   4194
   q2   388    268    277    268
   q3   3014   2796   2811   2796
   q4   1909   1644   1632   1632
   q5   5364   5292   5277   5277
   q6   206    127    127    127
   q7   2224   1883   1868   1868
   q8   3203   3376   3362   3362
   q9   8613   8569   8583   8569
   q10  3887   3674   3680   3674
   q11  578    481    480    480
   q12  791    584    596    584
   q13  15857  2894   2954   2894
   q14  306    275    287    275
   q15  506    483    475    475
   q16  474    433    411    411
   q17  1770   1489   1481   1481
   q18  7556   7464   7585   7464
   q19  1721   1570   1587   1570
   q20  1944   1753   1731   1731
   q21  4818   4854   4962   4854
   q22  529    459    445    445
   Total cold run time: 69961 ms
   Total hot run time: 54431 ms
   ```
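
The Round 1 totals in this report can be reproduced from the per-query rows if the columns are read as one cold run, two hot runs, and the smaller of the two hot runs. The bot message does not label its columns, so that reading is an inference. The sketch below checks it against the Round 1 numbers; `QueryTiming`, `kRound1`, `total_cold_ms` and `total_hot_ms` are hypothetical names introduced only for this check.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// One row of the report, assuming the first three columns are
// cold run, hot run 1 and hot run 2 (all in milliseconds).
struct QueryTiming {
    std::int64_t cold_ms;
    std::int64_t hot1_ms;
    std::int64_t hot2_ms;
};

// Round 1 rows q1..q22, copied from the report above.
constexpr std::array<QueryTiming, 22> kRound1 = {{
        {17616, 4429, 4369}, {2444, 197, 206},    {11017, 1146, 1194},
        {10507, 792, 785},   {7783, 2740, 2704},  {225, 140, 140},
        {1044, 656, 652},    {9361, 2115, 2074},  {7645, 6725, 6629},
        {8500, 3541, 3555},  {456, 239, 233},     {488, 228, 229},
        {17782, 2984, 2958}, {271, 233, 233},     {523, 499, 495},
        {538, 384, 373},     {973, 718, 756},     {7397, 6880, 6707},
        {1940, 1522, 1535},  {663, 316, 315},     {3541, 2842, 2864},
        {373, 309, 313},
}};

// Sum of the cold-run column.
template <std::size_t N>
std::int64_t total_cold_ms(const std::array<QueryTiming, N>& rows) {
    std::int64_t total = 0;
    for (const QueryTiming& row : rows) {
        total += row.cold_ms;
    }
    return total;
}

// Sum of the faster of the two hot runs for each query.
template <std::size_t N>
std::int64_t total_hot_ms(const std::array<QueryTiming, N>& rows) {
    std::int64_t total = 0;
    for (const QueryTiming& row : rows) {
        total += std::min(row.hot1_ms, row.hot2_ms);
    }
    return total;
}
```

With these rows, `total_cold_ms(kRound1)` gives 111087 and `total_hot_ms(kRound1)` gives 39170, the two totals printed for Round 1.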
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


github-actions[bot] commented on code in PR #33291:
URL: https://github.com/apache/doris/pull/33291#discussion_r1568428351


##
be/src/runtime/fragment_mgr.cpp:
##
@@ -827,8 +841,23 @@ std::string FragmentMgr::dump_pipeline_tasks(int64_t duration) {
 
 Status FragmentMgr::exec_plan_fragment(const TPipelineFragmentParams& params,

Review Comment:
   warning: function 'exec_plan_fragment' has cognitive complexity of 80 
(threshold 50) [readability-function-cognitive-complexity]
   ```cpp
   Status FragmentMgr::exec_plan_fragment(const TPipelineFragmentParams& params,
   ^
   ```
   
   Additional context
   
   **be/src/runtime/fragment_mgr.cpp:854:** +1, including nesting penalty of 0, 
nesting level increased to 1
   ```cpp
   while (pos < total_size) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:866:** +1, including nesting penalty of 0, 
nesting level increased to 1
   ```cpp
   RETURN_IF_ERROR(_get_query_ctx(params, params.query_id, true, query_ctx));
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:866:** +2, including nesting penalty of 1, 
nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(_get_query_ctx(params, params.query_id, true, query_ctx));
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:868:** +1
   ```cpp
   const bool enable_pipeline_x = params.query_options.__isset.enable_pipeline_x_engine &&
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:870:** +1, including nesting penalty of 0, 
nesting level increased to 1
   ```cpp
   if (enable_pipeline_x) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:882:** +2, including nesting penalty of 1, 
nesting level increased to 2
   ```cpp
   if (!prepare_st.ok()) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:891:** +2, including nesting penalty of 1, 
nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(_runtimefilter_controller.add_entity(
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:891:** +3, including nesting penalty of 2, 
nesting level increased to 3
   ```cpp
   RETURN_IF_ERROR(_runtimefilter_controller.add_entity(
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:894:** +2, including nesting penalty of 1, 
nesting level increased to 2
   ```cpp
   if (handler) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:910:** +2, including nesting penalty of 1, 
nesting level increased to 2
   ```cpp
   if (!params.__isset.need_wait_execution_trigger || !params.need_wait_execution_trigger) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:910:** +1
   ```cpp
   if (!params.__isset.need_wait_execution_trigger || !params.need_wait_execution_trigger) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:931:** +2, including nesting penalty of 1, 
nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(context->submit());
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:931:** +3, including nesting penalty of 2, 
nesting level increased to 3
   ```cpp
   RETURN_IF_ERROR(context->submit());
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:933:** +1, nesting level increased to 1
   ```cpp
   } else {
 ^
   ```
   **be/src/runtime/fragment_mgr.cpp:934:** nesting level increased to 2
   ```cpp
   auto pre_and_submit = [&](int i) {
 ^
   ```
   **be/src/runtime/fragment_mgr.cpp:941:** +3, including nesting penalty of 2, 
nesting level increased to 3
   ```cpp
   if (iter != _pipeline_map.end()) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:949:** +3, including nesting penalty of 2, 
nesting level increased to 3
   ```cpp
   if (!params.__isset.need_wait_execution_trigger ||
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:949:** +1
   ```cpp
   if (!params.__isset.need_wait_execution_trigger ||
   ^
   ```

Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-17 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2060673280

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-15 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2058182942

   
   
   TPC-H: Total hot run time: 39157 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 93fd0d4c5cd290d0674cbe2e5819e8a3499851c0, data reload: false
   
   -- Round 1 --
   q1   17632  4815   4622   4622
   q2   2160   185    185    185
   q3   10550  1138   1149   1138
   q4   10238  795    967    795
   q5   7610   2750   2685   2685
   q6   223    137    138    137
   q7   1051   621    585    585
   q8   9274   2118   2070   2070
   q9   7343   6590   6567   6567
   q10  8625   3573   3546   3546
   q11  451    252    239    239
   q12  416    228    222    222
   q13  17807  2992   2951   2951
   q14  298    237    233    233
   q15  521    477    502    477
   q16  522    390    389    389
   q17  977    699    731    699
   q18  7580   6881   6723   6723
   q19  6121   1555   1487   1487
   q20  645    313    313    313
   q21  3547   2788   2827   2788
   q22  359    315    306    306
   Total cold run time: 113950 ms
   Total hot run time: 39157 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   4335   4277   4217   4217
   q2   386    291    272    272
   q3   3011   2784   2788   2784
   q4   1855   1584   1601   1584
   q5   5374   5365   5346   5346
   q6   212    124    128    124
   q7   2234   1872   1882   1872
   q8   3269   3374   3377   3374
   q9   8635   8579   8594   8579
   q10  3881   3748   3756   3748
   q11  570    491    497    491
   q12  772    579    569    569
   q13  16542  2981   2965   2965
   q14  305    284    266    266
   q15  520    469    475    469
   q16  479    447    431    431
   q17  1784   1483   1492   1483
   q18  7989   7968   8073   7968
   q19  1730   1595   1538   1538
   q20  2052   1874   1868   1868
   q21  5243   4939   4958   4939
   q22  573    521    518    518
   Total cold run time: 71751 ms
   Total hot run time: 55405 ms
   ```
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-15 Thread via GitHub


github-actions[bot] commented on code in PR #33291:
URL: https://github.com/apache/doris/pull/33291#discussion_r1566641145


##
be/src/vec/spill/spill_stream_manager.h:
##
@@ -31,48 +31,65 @@ class RuntimeProfile;
 
 namespace vectorized {
 
+class SpillStreamManager;
 class SpillDataDir {
 public:
-SpillDataDir(const std::string& path, int64_t capacity_bytes = -1,
+SpillDataDir(std::string path, int64_t capacity_bytes,
  TStorageMedium::type storage_medium = TStorageMedium::HDD);
 
 Status init();
 
 const std::string& path() const { return _path; }
 
-bool is_ssd_disk() const { return _storage_medium == TStorageMedium::SSD; }
-
 TStorageMedium::type storage_medium() const { return _storage_medium; }
 
 // check if the capacity reach the limit after adding the incoming data
 // return true if limit reached, otherwise, return false.
-// TODO(cmy): for now we can not precisely calculate the capacity Doris used,
-// so in order to avoid running out of disk capacity, we currently use the actual
-// disk available capacity and total capacity to do the calculation.
-// So that the capacity Doris actually used may exceeds the user specified capacity.
 bool reach_capacity_limit(int64_t incoming_data_size);
 
 Status update_capacity();
 
-double get_usage(int64_t incoming_data_size) const {
+void update_spill_data_usage(int64_t incoming_data_size) {
+std::lock_guard l(_mutex);
+_spill_data_bytes += incoming_data_size;
+}
+
+int64_t get_spill_data_bytes() {

Review Comment:
   warning: method 'get_spill_data_bytes' can be made const 
[readability-make-member-function-const]
   
   ```suggestion
   int64_t get_spill_data_bytes() const {
   ```
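
The hunk above also changes the constructor's first parameter from `const std::string& path` to `std::string path`. That is the usual shape for a "sink" parameter that is stored in a member. The following is a minimal sketch of the idiom, assuming the member is initialized with `std::move`; the constructor body is not visible in this hunk, and `DataDirSketch` is a hypothetical class, not `SpillDataDir`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical class illustrating a by-value sink parameter.
class DataDirSketch {
public:
    // `path` is taken by value and moved into the member: callers passing an
    // rvalue pay for a move, callers passing an lvalue pay for one copy.
    DataDirSketch(std::string path, std::int64_t capacity_bytes)
            : _path(std::move(path)), _capacity_bytes(capacity_bytes) {}

    const std::string& path() const { return _path; }
    std::int64_t capacity_bytes() const { return _capacity_bytes; }

private:
    std::string _path;
    std::int64_t _capacity_bytes;
};
```

Passing an lvalue leaves the caller's string untouched, while passing `std::move(p)` hands the buffer over without a copy.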
   



##
be/src/vec/spill/spill_stream_manager.h:
##
@@ -31,48 +31,65 @@
 
 namespace vectorized {
 
+class SpillStreamManager;
 class SpillDataDir {
 public:
-SpillDataDir(const std::string& path, int64_t capacity_bytes = -1,
+SpillDataDir(std::string path, int64_t capacity_bytes,
  TStorageMedium::type storage_medium = TStorageMedium::HDD);
 
 Status init();
 
 const std::string& path() const { return _path; }
 
-bool is_ssd_disk() const { return _storage_medium == TStorageMedium::SSD; }
-
 TStorageMedium::type storage_medium() const { return _storage_medium; }
 
 // check if the capacity reach the limit after adding the incoming data
 // return true if limit reached, otherwise, return false.
-// TODO(cmy): for now we can not precisely calculate the capacity Doris used,
-// so in order to avoid running out of disk capacity, we currently use the actual
-// disk available capacity and total capacity to do the calculation.
-// So that the capacity Doris actually used may exceeds the user specified capacity.
 bool reach_capacity_limit(int64_t incoming_data_size);
 
 Status update_capacity();
 
-double get_usage(int64_t incoming_data_size) const {
+void update_spill_data_usage(int64_t incoming_data_size) {
+std::lock_guard l(_mutex);
+_spill_data_bytes += incoming_data_size;
+}
+
+int64_t get_spill_data_bytes() {
+std::lock_guard l(_mutex);
+return _spill_data_bytes;
+}
+
+int64_t get_spill_data_limit() {

Review Comment:
   warning: method 'get_spill_data_limit' can be made const 
[readability-make-member-function-const]
   
   ```suggestion
   int64_t get_spill_data_limit() const {
   ```
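
One caveat when applying a `const` suggestion to a getter that takes a lock, as `get_spill_data_bytes` does in the hunk above: locking mutates the mutex, so a `const` member function can only lock it if the mutex member is declared `mutable`. How `_mutex` is declared in the PR is not visible here, so the sketch below is an assumption-laden illustration with a hypothetical class `SpillUsageSketch`, not the actual `SpillDataDir`.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical class showing a const getter guarded by a mutex.
class SpillUsageSketch {
public:
    void update_spill_data_usage(std::int64_t incoming_data_size) {
        std::lock_guard<std::mutex> l(_mutex);
        _spill_data_bytes += incoming_data_size;
    }

    // const is possible only because _mutex is declared mutable below.
    std::int64_t get_spill_data_bytes() const {
        std::lock_guard<std::mutex> l(_mutex);
        return _spill_data_bytes;
    }

private:
    mutable std::mutex _mutex;
    std::int64_t _spill_data_bytes = 0;
};
```

Without `mutable`, the `const` getter would fail to compile, because `std::lock_guard<std::mutex>` needs a non-const reference to the mutex.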
   






Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-15 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2058133160

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-14 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2054696456

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit a167937c7eeb4a191c7387e653801ff78510bec9 with default session variables
   Stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
   Stream load orc:  58 seconds loaded 1101869774 Bytes, about 18 MB/s
   Stream load parquet:  33 seconds loaded 861443392 Bytes, about 24 MB/s
   Insert into select:   13.5 seconds inserted 1000 Rows, about 740K ops/s
   ```
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-14 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2054681891

   
   
   ClickBench: Total hot run time: 30.94 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit a167937c7eeb4a191c7387e653801ff78510bec9, data reload: false
   
   query1   0.04   0.03   0.03
   query2   0.09   0.04   0.04
   query3   0.24   0.05   0.05
   query4   1.68   0.07   0.08
   query5   0.49   0.48   0.50
   query6   1.48   0.66   0.66
   query7   0.02   0.01   0.01
   query8   0.05   0.04   0.05
   query9   0.57   0.50   0.49
   query10  0.56   0.57   0.55
   query11  0.16   0.12   0.11
   query12  0.14   0.12   0.12
   query13  0.61   0.60   0.58
   query14  0.75   0.78   0.79
   query15  0.84   0.80   0.81
   query16  0.37   0.38   0.39
   query17  0.96   0.95   0.98
   query18  0.23   0.23   0.25
   query19  1.81   1.75   1.79
   query20  0.02   0.01   0.02
   query21  15.40  0.66   0.67
   query22  4.22   7.01   2.40
   query23  18.28  1.43   1.33
   query24  1.66   0.21   0.25
   query25  0.14   0.08   0.07
   query26  0.28   0.17   0.16
   query27  0.08   0.08   0.08
   query28  13.47  1.00   0.98
   query29  12.63  3.35   3.30
   query30  0.27   0.07   0.08
   query31  2.87   0.38   0.38
   query32  3.26   0.48   0.46
   query33  2.82   2.82   2.77
   query34  17.22  4.40   4.40
   query35  4.49   4.52   4.47
   query36  0.65   0.46   0.46
   query37  0.19   0.16   0.15
   query38  0.15   0.16   0.14
   query39  0.04   0.04   0.04
   query40  0.18   0.14   0.14
   query41  0.10   0.05   0.05
   query42  0.05   0.06   0.05
   query43  0.05   0.04   0.05
   Total cold run time: 109.61 s
   Total hot run time: 30.94 s
   ```
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-14 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2054656128

   
   
   TPC-DS: Total hot run time: 186854 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit a167937c7eeb4a191c7387e653801ff78510bec9, data reload: false
   
   query1   1230   1139   1120   1120
   query2   6213   3005   2430   2430
   query3   6661   228    213    213
   query4   36900  21615  21402  21402
   query5   4182   433    454    433
   query6   267    210    209    209
   query7   4062   324    325    324
   query8   270    182    180    180
   query9   6224   2391   2376   2376
   query10  397    274    308    274
   query11  14766  14487  14290  14290
   query12  157    109    101    101
   query13  1002   374    370    370
   query14  10085  6738   6764   6738
   query15  220    194    209    194
   query16  6708   311    300    300
   query17  1403   618    618    618
   query18  1406   305    296    296
   query19  211    182    174    174
   query20  112    109    101    101
   query21  207    134    129    129
   query22  5026   4841   4821   4821
   query23  34449  33148  33695  33148
   query24  11751  3076   3129   3076
   query25  621    433    424    424
   query26  1370   187    181    181
   query27  3220   403    413    403
   query28  7275   2187   2175   2175
   query29  899    676    666    666
   query30  299    186    185    185
   query31  948    761    821    761
   query32  77     70     69     69
   query33  599    308    314    308
   query34  1002   514    513    513
   query35  931    746    760    746
   query36  1077   958    958    958
   query37  180    92     89     89
   query38  3777   3602   3610   3602
   query39  1664   1628   1587   1587
   query40  195    153    159    153
   query41  59     56     53     53
   query42  120    112    113    112
   query43  597    587    592    587
   query44  1514   781    748    748
   query45  313    277    289    277
   query46  1120   768    808    768
   query47  2090   1979   1972   1972
   query48  400    324    313    313
   query49  899    415    419    415
   query50  807    426    443    426
   query51  6843   6667   6790   6667
   query52  114    108    104    104
   query53  358    290    292    290
   query54  316    284    266    266
   query55  98     85     92     85
   query56  288    270    264    264
   query57  1250   1176   1129   1129
   query58  307    263    269    263
   query59  3290   3221   3362   3221
   query60  290    292    284    284
   query61  169    172    131    131
   query62  695    455    459    455
   query63  314    293    304    293
   query64  4016   4054   3804   3804
   query65  3168   3101   3047   3047
   query66  841    369    364    364
   query67  15513  14935  14962  14935
   query68  5100   558    554    554
   query69  565    374    361    361
   query70  1263   1166   1187   1166
   query71  505    326    319    319
   query72  6610   2715   2464   2464
   query73  752    351    340    340
   query74  6968   6424   6454   6424
   query75  3151   2355   2323   2323
   query76  3403   1175   1144   1144
   query77  658    300    310    300
   query78  10958  10203  10330  10203
   query79  3478   533    539    533
   query80  2238   481    462    462
   query81  532    300    251    251
   query82  1598   113    114    113
   query83  370    203    206    203
   query84  274    97     94     94
   query85  1480   295    279    279
   query86  476    350    282    282
   query87  3842   3539   3582   3539
   query88  5620   2424   2438   2424
   query89  507    413    391    391
   query90  1861   198    197    197
   query91  134    110    109    109
   query92  72     61     66     61
   query93  4751   526    520    520
   query94  1190   221    215    215
   query95  416    313    2137   313
   query96  627    280    280    280
   query97  2669   2491   2494   2491
   query98  253    242    229    229
   query99  1238   893    860    860
   Total cold run time: 297391 ms
   Total hot run time: 186854 ms
   ```
   
   



Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-14 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2054600767

   
   
   TPC-H: Total hot run time: 38773 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit a167937c7eeb4a191c7387e653801ff78510bec9, data reload: false
   
   -- Round 1 --
   q1   17640  4398   4274   4274
   q2   2021   202    188    188
   q3   10462  1133   1186   1133
   q4   10202  779    802    779
   q5   7564   2734   2660   2660
   q6   222    132    133    132
   q7   1007   607    590    590
   q8   9225   2074   2056   2056
   q9   7940   6615   6506   6506
   q10  8631   3554   3553   3553
   q11  468    242    249    242
   q12  498    231    219    219
   q13  19230  2935   2935   2935
   q14  277    227    240    227
   q15  526    492    479    479
   q16  522    376    389    376
   q17  972    616    684    616
   q18  7426   6920   6851   6851
   q19  5930   1538   1546   1538
   q20  706    326    311    311
   q21  3483   2794   2840   2794
   q22  367    314    324    314
   Total cold run time: 115319 ms
   Total hot run time: 38773 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   4402   4276   4279   4276
   q2   376    268    277    268
   q3   2988   2794   2825   2794
   q4   1861   1639   1582   1582
   q5   5351   5386   5319   5319
   q6   212    124    126    124
   q7   2245   1884   1856   1856
   q8   3207   3341   3336   3336
   q9   8610   8569   8814   8569
   q10  4069   3985   3988   3985
   q11  611    494    527    494
   q12  841    612    672    612
   q13  16975  3163   3155   3155
   q14  327    282    325    282
   q15  523    489    469    469
   q16  501    468    458    458
   q17  1861   1538   1528   1528
   q18  8135   8009   7997   7997
   q19  1690   1568   1629   1568
   q20  2056   1878   1850   1850
   q21  10398  5068   4970   4970
   q22  577    479    471    471
   Total cold run time: 77816 ms
   Total hot run time: 55963 ms
   ```
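For anyone checking the numbers in this report: the totals appear to be plain sums, with the hot total taken over the smaller of the two hot runs per query. The sketch below assumes each row reads as cold run, hot run 1, hot run 2, best hot run, all in milliseconds. The struct and function names are made up for illustration and are not part of the Doris test tooling. Under that reading, the 22 Round 1 rows sum to the reported 115319 ms cold and 38773 ms hot.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One benchmark row: a cold run followed by two hot runs, in milliseconds.
// (Hypothetical type; the report itself only prints the numbers.)
struct QueryTiming {
    int64_t cold_ms;
    int64_t hot1_ms;
    int64_t hot2_ms;
};

// Best hot time for a row: the smaller of the two hot runs.
inline int64_t best_hot_ms(const QueryTiming& t) {
    return std::min(t.hot1_ms, t.hot2_ms);
}

// Sum of the cold runs over all rows.
inline int64_t total_cold_ms(const std::vector<QueryTiming>& rows) {
    int64_t total = 0;
    for (const auto& r : rows) total += r.cold_ms;
    return total;
}

// Sum of the per-row best hot times.
inline int64_t total_hot_ms(const std::vector<QueryTiming>& rows) {
    int64_t total = 0;
    for (const auto& r : rows) total += best_hot_ms(r);
    return total;
}

// q1 to q3 of Round 1, copied from the report.
inline std::vector<QueryTiming> sample_rows() {
    return {{17640, 4398, 4274}, {2021, 202, 188}, {10462, 1133, 1186}};
}
```

Feeding all rows of a round through `total_cold_ms` and `total_hot_ms` is a quick way to confirm that a pasted table was not damaged in transit.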
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-14 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2054289737

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-11 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2050863535

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-11 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2049936182

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-11 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2049380566

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-11 Thread via GitHub


github-actions[bot] commented on code in PR #33291:
URL: https://github.com/apache/doris/pull/33291#discussion_r1560781668


##
be/src/vec/spill/spill_stream_manager.cpp:
##
@@ -271,18 +282,68 @@
         RETURN_NOT_OK_STATUS_WITH_WARN(Status::IOError("opendir failed, path={}", _path),
                                        "check file exist failed");
     }
-
+    return update_capacity();
+}
+Status SpillDataDir::update_capacity() {
+    std::lock_guard l(_mutex);
+    RETURN_IF_ERROR(io::global_local_filesystem()->get_space_info(_path, &_disk_capacity_bytes,
+                                                                  &_available_bytes));
+    if (_shared_with_storage_path) {
+        _limit_bytes = (size_t)(_disk_capacity_bytes *
+                                (config::storage_flood_stage_usage_percent / 100.0) *
+                                (config::spill_storage_usage_percent / 100.0));
+    } else {
+        _limit_bytes =
+                (size_t)(_disk_capacity_bytes * (config::spill_storage_usage_percent / 100.0));
+    }
     return Status::OK();
 }
 bool SpillDataDir::reach_capacity_limit(int64_t incoming_data_size) {
-    double used_pct = get_usage(incoming_data_size);
-    int64_t left_bytes = _available_bytes - incoming_data_size;
-    if (used_pct >= config::storage_flood_stage_usage_percent / 100.0 &&
-        left_bytes <= config::storage_flood_stage_left_capacity_bytes) {
-        LOG(WARNING) << "reach capacity limit. used pct: " << used_pct
-                     << ", left bytes: " << left_bytes << ", path: " << _path;
-        return true;
+    std::lock_guard l(_mutex);
+    if (_shared_with_storage_path) {
+        VLOG_DEBUG << fmt::format(
+                "spill data path: {}, limit: {}, used: {}, available: {}, "
+                "incoming "
+                "bytes: {}",
+                _path, PrettyPrinter::print_bytes(_limit_bytes),
+                PrettyPrinter::print_bytes(_used_bytes),
+                PrettyPrinter::print_bytes(_available_bytes),
+                PrettyPrinter::print_bytes(incoming_data_size));
+        int64_t left_bytes = _available_bytes - incoming_data_size;
+        if (_used_bytes + incoming_data_size > _limit_bytes ||
+            left_bytes <= config::storage_flood_stage_left_capacity_bytes) {
+            LOG(WARNING) << fmt::format(
+                    "spill data reach limit, path: {}, limit: {}, used: {}, available: {}, "
+                    "incoming "
+                    "bytes: {}",
+                    _path, PrettyPrinter::print_bytes(_limit_bytes),
+                    PrettyPrinter::print_bytes(_used_bytes),
+                    PrettyPrinter::print_bytes(_available_bytes),
+                    PrettyPrinter::print_bytes(incoming_data_size));
+            return true;
+        }
+        return false;
+    } else {
+        double used_pct = _disk_capacity_bytes == 0
+                                  ? 0
+                                  : (_disk_capacity_bytes - _available_bytes + incoming_data_size) /
+                                            (double)_disk_capacity_bytes;
+        VLOG_DEBUG << fmt::format(
+                "spill data path: {}, capacity: {}, available: {}, used pct: {}, incoming bytes: "
+                "{}",
+                _path, PrettyPrinter::print_bytes(_disk_capacity_bytes),
+                PrettyPrinter::print_bytes(_available_bytes), used_pct,
+                PrettyPrinter::print_bytes(incoming_data_size));
+        if (used_pct >= config::spill_storage_usage_percent / 100.0) {
+            LOG(WARNING) << fmt::format(
+                    "spill data reach limit, path: {}, capacity: {}, available: {}, incoming "
+                    "bytes: {}",
+                    _path, PrettyPrinter::print_bytes(_disk_capacity_bytes),
+                    PrettyPrinter::print_bytes(_available_bytes),
+                    PrettyPrinter::print_bytes(incoming_data_size));
+            return true;

Review Comment:
   warning: redundant boolean literal in conditional return statement [readability-simplify-boolean-expr]
   
   be/src/vec/spill/spill_stream_manager.cpp:336:
   ```diff
   -         if (used_pct >= config::spill_storage_usage_percent / 100.0) {
   -             LOG(WARNING) << fmt::format(
   -                     "spill data reach limit, path: {}, capacity: {}, available: {}, incoming "
   -                     "bytes: {}",
   -                     _path, PrettyPrinter::print_bytes(_disk_capacity_bytes),
   -                     PrettyPrinter::print_bytes(_available_bytes),
   -                     PrettyPrinter::print_bytes(incoming_data_size));
   -             return true;
   -         }
   -         return false;
   +         return used_pct >= config::spill_storage_usage_percent / 100.0;
   ```
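
As a reading aid for the diff under review, here is a minimal standalone sketch of the limit arithmetic in `update_capacity()` and `reach_capacity_limit()`. It is not the Doris implementation: locking and logging are left out, the `SpillLimits` and `DirState` structs are hypothetical stand-ins for the `config::` values and the `SpillDataDir` members, and the field types are assumed.

```cpp
#include <cstdint>

// Hypothetical stand-in for the config:: values referenced in the diff.
struct SpillLimits {
    double storage_flood_stage_usage_percent;
    double spill_storage_usage_percent;
    int64_t storage_flood_stage_left_capacity_bytes;
};

// Hypothetical stand-in for the SpillDataDir members referenced in the diff.
struct DirState {
    bool shared_with_storage_path;
    int64_t disk_capacity_bytes;
    int64_t available_bytes;
    int64_t used_bytes;  // assumed: bytes already occupied by spill data
};

// Limit as computed in update_capacity(): a shared disk gets the spill
// percentage applied on top of the flood-stage percentage.
inline int64_t limit_bytes(const DirState& d, const SpillLimits& cfg) {
    if (d.shared_with_storage_path) {
        return static_cast<int64_t>(d.disk_capacity_bytes *
                                    (cfg.storage_flood_stage_usage_percent / 100.0) *
                                    (cfg.spill_storage_usage_percent / 100.0));
    }
    return static_cast<int64_t>(d.disk_capacity_bytes *
                                (cfg.spill_storage_usage_percent / 100.0));
}

// Decision as made in reach_capacity_limit(), minus locking and logging.
inline bool reach_capacity_limit(const DirState& d, const SpillLimits& cfg,
                                 int64_t incoming_data_size) {
    if (d.shared_with_storage_path) {
        const int64_t left_bytes = d.available_bytes - incoming_data_size;
        return d.used_bytes + incoming_data_size > limit_bytes(d, cfg) ||
               left_bytes <= cfg.storage_flood_stage_left_capacity_bytes;
    }
    // A zero-capacity disk is treated as 0% used, as in the diff.
    const double used_pct =
            d.disk_capacity_bytes == 0
                    ? 0.0
                    : (d.disk_capacity_bytes - d.available_bytes + incoming_data_size) /
                              static_cast<double>(d.disk_capacity_bytes);
    return used_pct >= cfg.spill_storage_usage_percent / 100.0;
}
```

With both percentages at 50 and a 1000-byte disk, the shared limit comes out at 250 bytes and the dedicated limit at 500 bytes, which shows how the two branches diverge.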
   



##
be/src/vec/spill/spill_stream_manager.cpp:
##
@@ -271,18 +282,68 @@ Status SpillDataDir::init() {
 

Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-11 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2049284935

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-11 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2049273996

   run p0





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-11 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2049164994

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-10 Thread via GitHub


github-actions[bot] commented on code in PR #33291:
URL: https://github.com/apache/doris/pull/33291#discussion_r1560424242


##
be/src/runtime/fragment_mgr.cpp:
##
@@ -823,8 +837,23 @@ std::string FragmentMgr::dump_pipeline_tasks(int64_t duration) {
 
 Status FragmentMgr::exec_plan_fragment(const TPipelineFragmentParams& params,

Review Comment:
   warning: function 'exec_plan_fragment' has cognitive complexity of 80 (threshold 50) [readability-function-cognitive-complexity]
   ```cpp
   Status FragmentMgr::exec_plan_fragment(const TPipelineFragmentParams& params,
   ^
   ```
   
   Additional context
   
   **be/src/runtime/fragment_mgr.cpp:850:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   while (pos < total_size) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:862:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   RETURN_IF_ERROR(_get_query_ctx(params, params.query_id, true, query_ctx));
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:862:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(_get_query_ctx(params, params.query_id, true, query_ctx));
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:864:** +1
   ```cpp
   const bool enable_pipeline_x = params.query_options.__isset.enable_pipeline_x_engine &&
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:866:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   if (enable_pipeline_x) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:878:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   if (!prepare_st.ok()) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:887:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(_runtimefilter_controller.add_entity(
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:887:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   RETURN_IF_ERROR(_runtimefilter_controller.add_entity(
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:890:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   if (handler) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:906:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   if (!params.__isset.need_wait_execution_trigger || !params.need_wait_execution_trigger) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:906:** +1
   ```cpp
   if (!params.__isset.need_wait_execution_trigger || !params.need_wait_execution_trigger) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:927:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(context->submit());
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:927:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   RETURN_IF_ERROR(context->submit());
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:929:** +1, nesting level increased to 1
   ```cpp
   } else {
 ^
   ```
   **be/src/runtime/fragment_mgr.cpp:930:** nesting level increased to 2
   ```cpp
   auto pre_and_submit = [&](int i) {
 ^
   ```
   **be/src/runtime/fragment_mgr.cpp:937:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   if (iter != _pipeline_map.end()) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:945:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   if (!params.__isset.need_wait_execution_trigger ||
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:945:** +1
   ```cpp
   if (!params.__isset.need_wait_execution_trigger ||
  

Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-10 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2048907571

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-10 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2047855920

   run p0





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-10 Thread via GitHub


github-actions[bot] commented on code in PR #33291:
URL: https://github.com/apache/doris/pull/33291#discussion_r1559413445


##
be/src/runtime/fragment_mgr.cpp:
##
@@ -823,8 +836,22 @@ std::string FragmentMgr::dump_pipeline_tasks(int64_t duration) {
 
 Status FragmentMgr::exec_plan_fragment(const TPipelineFragmentParams& params,

Review Comment:
   warning: function 'exec_plan_fragment' has cognitive complexity of 90 (threshold 50) [readability-function-cognitive-complexity]
   ```cpp
   Status FragmentMgr::exec_plan_fragment(const TPipelineFragmentParams& params,
   ^
   ```
   
   Additional context
   
   **be/src/runtime/fragment_mgr.cpp:848:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   while (pos < total_size) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:860:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   RETURN_IF_ERROR(_get_query_ctx(params, params.query_id, true, query_ctx));
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:860:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(_get_query_ctx(params, params.query_id, true, query_ctx));
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:862:** +1
   ```cpp
   const bool enable_pipeline_x = params.query_options.__isset.enable_pipeline_x_engine &&
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:864:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   if (enable_pipeline_x) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:876:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   if (!prepare_st.ok()) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:884:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   for (size_t i = 0; i < params.local_params.size(); i++) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:886:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   RETURN_IF_ERROR(_runtimefilter_controller.add_entity(
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:886:** +4, including nesting penalty of 3, nesting level increased to 4
   ```cpp
   RETURN_IF_ERROR(_runtimefilter_controller.add_entity(
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:889:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   if (!i && handler) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:889:** +1
   ```cpp
   if (!i && handler) {
  ^
   ```
   **be/src/runtime/fragment_mgr.cpp:896:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   if (iter != _pipeline_map.end()) {
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:903:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   if (!params.__isset.need_wait_execution_trigger ||
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:903:** +1
   ```cpp
   if (!params.__isset.need_wait_execution_trigger ||
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:925:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(context->submit());
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do {\
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:925:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   RETURN_IF_ERROR(context->submit());
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/runtime/fragment_mgr.cpp:927:** +1, nesting level increased to 1
   ```cpp
   } else {
 ^
   ```
   **be/src/runtime/fragment_mgr.cpp:928:** nesting level increased to 2
   ```cpp
   auto pre_and_submit = [&](int i) {
 ^
   ```
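
The per-line figures in this listing follow a simple pattern: an entry marked "+N, including nesting penalty of P" contributes 1 + P, a bare "+1" (such as the `&&` operator) contributes 1, and an entry that only reports "nesting level increased" contributes nothing. The sketch below adds entries up on that basis. The `Increment` struct and function names are hypothetical and not part of clang-tidy. The listing in this message is cut off, so it cannot be used to reproduce the reported total of 90.

```cpp
#include <vector>

// One entry from a cognitive-complexity listing (hypothetical model of the
// "+N, including nesting penalty of P" lines in the review comment).
struct Increment {
    bool scored;          // false for entries that only raise the nesting level
    int nesting_penalty;  // 0 for flat "+1" entries such as a boolean operator
};

// Each scored entry contributes 1 plus its nesting penalty.
inline int total_complexity(const std::vector<Increment>& entries) {
    int total = 0;
    for (const auto& e : entries) {
        if (e.scored) total += 1 + e.nesting_penalty;
    }
    return total;
}

// The first five scored entries of the listing (fragment_mgr.cpp lines
// 848, 860, 860, 862 and 864): +1, +1, +2, +1, +1.
inline std::vector<Increment> first_entries() {
    return {{true, 0}, {true, 0}, {true, 1}, {true, 0}, {true, 0}};
}
```

Because the penalty grows with depth, the same `if` costs more the deeper it sits, which is why flattening nested branches or extracting helpers tends to lower the score faster than removing individual conditions.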
   

Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-10 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2047495437

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-10 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2046931517

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-09 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2046471162

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-09 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2046436145

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-09 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2046317849

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-09 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2045741051

   TeamCity be ut coverage result:
Function Coverage: 35.63% (8904/24990) 
Line Coverage: 27.36% (73130/267308)
Region Coverage: 26.52% (37789/142471)
Branch Coverage: 23.33% (19257/82538)
Coverage Report: http://coverage.selectdb-in.cc/coverage/bc3113099881a46ebf1fd803527093ebf796e729_bc3113099881a46ebf1fd803527093ebf796e729/report/index.html





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-09 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2045534905

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-07 Thread via GitHub


github-actions[bot] commented on code in PR #33291:
URL: https://github.com/apache/doris/pull/33291#discussion_r1555196151


##
be/src/vec/spill/spill_stream_manager.cpp:
##
@@ -271,18 +275,68 @@ Status SpillDataDir::init() {
         RETURN_NOT_OK_STATUS_WITH_WARN(Status::IOError("opendir failed, path={}", _path),
                                        "check file exist failed");
     }
-
+    return update_capacity();
+}
+Status SpillDataDir::update_capacity() {
+    std::lock_guard l(_mutex);
+    RETURN_IF_ERROR(io::global_local_filesystem()->get_space_info(_path, &_disk_capacity_bytes,
+                                                                  &_available_bytes));
+    if (_shared_with_storage_path) {
+        _limit_bytes = (size_t)(_disk_capacity_bytes *
+                                (config::storage_flood_stage_usage_percent / 100.0) *
+                                (config::spill_storage_usage_percent / 100.0));
+    } else {
+        _limit_bytes =
+                (size_t)(_disk_capacity_bytes * (config::spill_storage_usage_percent / 100.0));
+    }
     return Status::OK();
 }
 bool SpillDataDir::reach_capacity_limit(int64_t incoming_data_size) {
-    double used_pct = get_usage(incoming_data_size);
-    int64_t left_bytes = _available_bytes - incoming_data_size;
-    if (used_pct >= config::storage_flood_stage_usage_percent / 100.0 &&
-        left_bytes <= config::storage_flood_stage_left_capacity_bytes) {
-        LOG(WARNING) << "reach capacity limit. used pct: " << used_pct
-                     << ", left bytes: " << left_bytes << ", path: " << _path;
-        return true;
+    std::lock_guard l(_mutex);
+    if (_shared_with_storage_path) {
+        VLOG_DEBUG << fmt::format(
+                "spill data path: {}, limit: {}, used: {}, available: {}, "
+                "incoming "
+                "bytes: {}",
+                _path, PrettyPrinter::print_bytes(_limit_bytes),
+                PrettyPrinter::print_bytes(_used_bytes),
+                PrettyPrinter::print_bytes(_available_bytes),
+                PrettyPrinter::print_bytes(incoming_data_size));
+        int64_t left_bytes = _available_bytes - incoming_data_size;
+        if (_used_bytes + incoming_data_size > _limit_bytes ||
+            left_bytes <= config::storage_flood_stage_left_capacity_bytes) {
+            LOG(WARNING) << fmt::format(
+                    "spill data reach limit, path: {}, limit: {}, used: {}, available: {}, "
+                    "incoming "
+                    "bytes: {}",
+                    _path, PrettyPrinter::print_bytes(_limit_bytes),
+                    PrettyPrinter::print_bytes(_used_bytes),
+                    PrettyPrinter::print_bytes(_available_bytes),
+                    PrettyPrinter::print_bytes(incoming_data_size));
+            return true;
+        }
+        return false;
+    } else {
+        double used_pct = _disk_capacity_bytes == 0
+                                  ? 0
+                                  : (_disk_capacity_bytes - _available_bytes + incoming_data_size) /
+                                            (double)_disk_capacity_bytes;
+        VLOG_DEBUG << fmt::format(
+                "spill data path: {}, capacity: {}, available: {}, used pct: {}, incoming bytes: "
+                "{}",
+                _path, PrettyPrinter::print_bytes(_disk_capacity_bytes),
+                PrettyPrinter::print_bytes(_available_bytes), used_pct,
+                PrettyPrinter::print_bytes(incoming_data_size));
+        if (used_pct >= config::spill_storage_usage_percent / 100.0) {
+            LOG(WARNING) << fmt::format(
+                    "spill data reach limit, path: {}, capacity: {}, available: {}, incoming "
+                    "bytes: {}",
+                    _path, PrettyPrinter::print_bytes(_disk_capacity_bytes),
+                    PrettyPrinter::print_bytes(_available_bytes),
+                    PrettyPrinter::print_bytes(incoming_data_size));
+            return true;

Review Comment:
   warning: redundant boolean literal in conditional return statement [readability-simplify-boolean-expr]
   
   be/src/vec/spill/spill_stream_manager.cpp:329:
   ```diff
   -         if (used_pct >= config::spill_storage_usage_percent / 100.0) {
   -             LOG(WARNING) << fmt::format(
   -                     "spill data reach limit, path: {}, capacity: {}, available: {}, incoming "
   -                     "bytes: {}",
   -                     _path, PrettyPrinter::print_bytes(_disk_capacity_bytes),
   -                     PrettyPrinter::print_bytes(_available_bytes),
   -                     PrettyPrinter::print_bytes(incoming_data_size));
   -             return true;
   -         }
   -         return false;
   +         return used_pct >= config::spill_storage_usage_percent / 100.0;
   ```
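
Note that the replacement suggested in this comment drops the `LOG(WARNING)` call. If the warning is meant to stay, one possible shape is to compute the comparison once, log when it is true, and return the result. This is only a sketch: the function name, the `std::ostream` parameter standing in for the logger, and the message text are illustrative, and whether clang-tidy accepts this form was not checked.

```cpp
#include <ostream>
#include <sstream>

// Returns true when the usage fraction reaches the configured percentage,
// and writes a warning to `log` in that case only.
// (Hypothetical helper; `log` stands in for the real logging macro.)
inline bool over_usage_limit(double used_pct, double limit_percent, std::ostream& log) {
    const bool reached = used_pct >= limit_percent / 100.0;
    if (reached) {
        log << "spill data reach limit, used pct: " << used_pct << '\n';
    }
    return reached;
}
```

This keeps a single return statement without a boolean literal while preserving the side effect that the original branch had.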
   




Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-07 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041814823

   run builall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-07 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041367935

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit 5c3196f917b933a2cd6b40925fcea2c658135e0d with default session variables
   Stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
   Stream load orc:  58 seconds loaded 1101869774 Bytes, about 18 MB/s
   Stream load parquet:  31 seconds loaded 861443392 Bytes, about 26 MB/s
   Insert into select:   16.1 seconds inserted 10000000 Rows, about 621K ops/s
   ```
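
The stream load rates in this report can be reproduced from the byte counts and durations if "MB/s" is read as MiB per second (2^20 bytes) with the fraction truncated. That reading is an inference from the three figures, not something the report states, and the function name below is made up for illustration.

```cpp
#include <cstdint>

// Throughput in whole MiB/s, truncating at each step (assumed rounding mode,
// chosen because it matches the three stream load figures in the report).
constexpr int64_t throughput_mib_per_s(int64_t bytes, int64_t seconds) {
    return bytes / seconds / (int64_t{1} << 20);
}
```

For example, 2358488459 bytes in 19 seconds gives 118, 1101869774 bytes in 58 seconds gives 18, and 861443392 bytes in 31 seconds gives 26, matching the json, orc and parquet lines.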
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-07 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041367030

   
   
   ClickBench: Total hot run time: 30.13 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit 5c3196f917b933a2cd6b40925fcea2c658135e0d, data reload: false
   
   query1   0.03    0.03    0.03
   query2   0.08    0.04    0.04
   query3   0.24    0.05    0.05
   query4   1.66    0.06    0.06
   query5   0.48    0.48    0.49
   query6   1.15    0.66    0.66
   query7   0.02    0.02    0.01
   query8   0.05    0.04    0.04
   query9   0.57    0.49    0.51
   query10  0.55    0.57    0.56
   query11  0.16    0.12    0.11
   query12  0.14    0.12    0.12
   query13  0.61    0.60    0.59
   query14  0.78    0.79    0.81
   query15  0.86    0.83    0.84
   query16  0.35    0.36    0.36
   query17  0.98    0.96    0.98
   query18  0.25    0.25    0.26
   query19  1.86    1.74    1.78
   query20  0.02    0.01    0.01
   query21  15.43   0.65    0.64
   query22  4.15    5.97    1.85
   query23  17.74   1.31    1.27
   query24  1.51    0.20    0.19
   query25  0.15    0.09    0.08
   query26  0.27    0.16    0.16
   query27  0.08    0.07    0.08
   query28  13.85   0.96    0.94
   query29  12.65   3.30    3.23
   query30  0.26    0.06    0.06
   query31  2.85    0.38    0.39
   query32  3.27    0.47    0.47
   query33  2.86    2.87    2.89
   query34  15.51   4.35    4.34
   query35  4.38    4.34    4.39
   query36  0.68    0.47    0.46
   query37  0.19    0.16    0.16
   query38  0.15    0.14    0.14
   query39  0.04    0.03    0.04
   query40  0.18    0.15    0.14
   query41  0.10    0.05    0.05
   query42  0.07    0.06    0.04
   query43  0.04    0.04    0.04
   Total cold run time: 107.25 s
   Total hot run time: 30.13 s
   ```
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-07 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041365320

   
   
   TPC-DS: Total hot run time: 181269 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit 5c3196f917b933a2cd6b40925fcea2c658135e0d, data reload: false
   
   query1   899     1124    1106    1106
   query2   6309    1998    1885    1885
   query3   6678    206     207     206
   query4   24568   21404   21389   21389
   query5   4176    398     407     398
   query6   281     206     179     179
   query7   4598    301     295     295
   query8   229     167     168     167
   query9   8446    2256    2247    2247
   query10  451     232     234     232
   query11  14948   14396   14508   14396
   query12  139     90      84      84
   query13  1634    365     366     365
   query14  8563    6946    6841    6841
   query15  210     178     184     178
   query16  6609    268     267     267
   query17  984     593     563     563
   query18  1788    281     279     279
   query19  199     152     155     152
   query20  93      85      88      85
   query21  207     130     128     128
   query22  5001    4830    4781    4781
   query23  33536   32943   32519   32519
   query24  11776   3198    3139    3139
   query25  721     406     414     406
   query26  1836    164     160     160
   query27  3245    372     382     372
   query28  7455    1915    1873    1873
   query29  1136    610     612     610
   query30  307     175     175     175
   query31  1019    732     773     732
   query32  107     60      57      57
   query33  673     252     240     240
   query34  1208    496     509     496
   query35  859     711     722     711
   query36  1016    868     898     868
   query37  242     76      74      74
   query38  3585    3623    3569    3569
   query39  1630    1590    1576    1576
   query40  247     135     128     128
   query41  50      46      46      46
   query42  118     103     106     103
   query43  462     442     436     436
   query44  1165    748     742     742
   query45  271     265     272     265
   query46  1117    829     792     792
   query47  1979    1897    1875    1875
   query48  377     310     300     300
   query49  919     363     362     362
   query50  817     412     418     412
   query51  6891    6749    6751    6749
   query52  106     93      93      93
   query53  350     289     292     289
   query54  284     229     220     220
   query55  90      71      72      71
   query56  246     230     251     230
   query57  1283    1163    1161    1161
   query58  241     235     217     217
   query59  2848    2605    2460    2460
   query60  253     225     227     225
   query61  95      88      85      85
   query62  678     434     453     434
   query63  308     272     270     270
   query64  5887    3137    3159    3137
   query65  3015    2998    2984    2984
   query66  1333    313     316     313
   query67  15686   14883   14745   14745
   query68  7389    561     575     561
   query69  545     298     288     288
   query70  1322    1118    1080    1080
   query71  530     266     262     262
   query72  6260    2573    2393    2393
   query73  800     317     316     316
   query74  6793    6315    6371    6315
   query75  3594    2330    2296    2296
   query76  6072    1213    1216    1213
   query77  607     245     245     245
   query78  10870   10184   10013   10013
   query79  9501    531     530     530
   query80  1337    427     434     427
   query81  508     234     240     234
   query82  639     96      95      95
   query83  214     168     181     168
   query84  270     86      88      86
   query85  1348    289     287     287
   query86  442     298     290     290
   query87  3680    3451    3480    3451
   query88  3528    2272    2277    2272
   query89  545     373     367     367
   query90  2019    179     176     176
   query91  133     101     110     101
   query92  58      47      48      47
   query93  6572    535     533     533
   query94  972     189     183     183
   query95  437     319     319     319
   query96  615     271     266     266
   query97  2679    2487    2490    2487
   query98  228     208     217     208
   query99  1364    868     816     816
   Total cold run time: 296348 ms
   Total hot run time: 181269 ms
   ```
   
   



Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-07 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041361893

   
   
   TPC-H: Total hot run time: 38419 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 5c3196f917b933a2cd6b40925fcea2c658135e0d, data reload: false
   
   -- Round 1 --
   q1   17596   4138    4082    4082
   q2   2016    187     179     179
   q3   10473   1162    1286    1162
   q4   10207   819     1002    819
   q5   7556    2920    2949    2920
   q6   217     129     131     129
   q7   1090    651     613     613
   q8   9414    2010    2049    2010
   q9   6784    6233    6163    6163
   q10  8450    3507    3522    3507
   q11  420     235     231     231
   q12  389     204     202     202
   q13  17783   2923    2921    2921
   q14  265     232     228     228
   q15  509     481     487     481
   q16  512     382     371     371
   q17  960     910     861     861
   q18  7273    6464    6385    6385
   q19  1612    1551    1520    1520
   q20  569     309     296     296
   q21  3547    3076    3046    3046
   q22  335     293     294     293
   Total cold run time: 107977 ms
   Total hot run time: 38419 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   4066    4031    4068    4031
   q2   332     219     219     219
   q3   2976    2962    2927    2927
   q4   1880    1850    1822    1822
   q5   5251    5223    5239    5223
   q6   209     125     125     125
   q7   2245    1813    1813    1813
   q8   3221    3286    3282    3282
   q9   8423    8465    8444    8444
   q10  3779    3964    4012    3964
   q11  566     459     474     459
   q12  763     576     600     576
   q13  15771   3072    3135    3072
   q14  323     272     267     267
   q15  531     505     469     469
   q16  498     441     439     439
   q17  1779    1762    1762    1762
   q18  8363    7859    7546    7546
   q19  2495    1684    1668    1668
   q20  2056    1814    1882    1814
   q21  5122    4988    4935    4935
   q22  489     427     427     427
   Total cold run time: 71138 ms
   Total hot run time: 55284 ms
   ```
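Reading note on the totals: in the Round 1 table the fourth column looks like the faster of the two hot runs, and the reported totals seem to be plain column sums. The sketch below checks that reading against the Round 1 rows. The aggregation rule and the names `QueryTiming`, `Totals`, `summarize`, and `tpch_round1` are assumptions made for illustration; they are not taken from the tpch-tools scripts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One per-query row: a cold run followed by two hot runs, in milliseconds.
struct QueryTiming {
    int64_t cold_ms;
    int64_t hot1_ms;
    int64_t hot2_ms;
};

struct Totals {
    int64_t cold_ms;
    int64_t hot_ms;
};

// Assumed aggregation: the cold total sums the first column; the hot total
// sums the faster of the two hot runs for each query.
Totals summarize(const std::vector<QueryTiming>& rows) {
    Totals totals{0, 0};
    for (const QueryTiming& row : rows) {
        assert(row.cold_ms >= 0 && row.hot1_ms >= 0 && row.hot2_ms >= 0);
        totals.cold_ms += row.cold_ms;
        totals.hot_ms += std::min(row.hot1_ms, row.hot2_ms);
    }
    return totals;
}

// Round 1 rows q1..q22 copied from the TPC-H report above.
std::vector<QueryTiming> tpch_round1() {
    return {
            {17596, 4138, 4082}, {2016, 187, 179},    {10473, 1162, 1286},
            {10207, 819, 1002},  {7556, 2920, 2949},  {217, 129, 131},
            {1090, 651, 613},    {9414, 2010, 2049},  {6784, 6233, 6163},
            {8450, 3507, 3522},  {420, 235, 231},     {389, 204, 202},
            {17783, 2923, 2921}, {265, 232, 228},     {509, 481, 487},
            {512, 382, 371},     {960, 910, 861},     {7273, 6464, 6385},
            {1612, 1551, 1520},  {569, 309, 296},     {3547, 3076, 3046},
            {335, 293, 294},
    };
}
```

Calling `summarize(tpch_round1())` gives 107977 ms cold and 38419 ms hot, which agrees with the Round 1 totals in the report.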
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-07 Thread via GitHub


github-actions[bot] commented on code in PR #33291:
URL: https://github.com/apache/doris/pull/33291#discussion_r1554836376


##
be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:
##
@@ -55,10 +57,69 @@ Status PartitionedHashJoinSinkLocalState::close(RuntimeState* state, Status exec
 return PipelineXSpillSinkLocalState::close(state, exec_status);
 }
 
+size_t PartitionedHashJoinSinkLocalState::revocable_mem_size(RuntimeState* state) const {

Review Comment:
   warning: method 'revocable_mem_size' can be made static [readability-convert-member-functions-to-static]
   
   ```suggestion
   size_t PartitionedHashJoinSinkLocalState::revocable_mem_size(RuntimeState* state) {
   ```
   
   be/src/pipeline/exec/partitioned_hash_join_sink_operator.h:50:
   ```diff
   - size_t revocable_mem_size(RuntimeState* state) const;
   + static size_t revocable_mem_size(RuntimeState* state) ;
   ```
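
Editorial note on this check: `readability-convert-member-functions-to-static` reports non-static member functions whose bodies never use `this`. The sketch below shows the distinction with stand-in types. `RuntimeStateSketch`, `SinkLocalStateSketch`, and all member names are hypothetical and are not Doris code; whether the flagged Doris functions should really become static depends on their full bodies, which this excerpt does not show.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a runtime state object; only here so the sketch compiles.
struct RuntimeStateSketch {
    std::size_t min_revocable_bytes = 0;
};

class SinkLocalStateSketch {
public:
    explicit SinkLocalStateSketch(std::size_t buffered_bytes)
            : _buffered_bytes(buffered_bytes) {}

    // Touches no data member and calls no non-static member, so clang-tidy
    // would report that it "can be made static".
    std::size_t threshold(const RuntimeStateSketch* state) const {
        return state->min_revocable_bytes;
    }

    // The same logic after applying the suggestion.
    static std::size_t threshold_static(const RuntimeStateSketch* state) {
        return state->min_revocable_bytes;
    }

    // Reads _buffered_bytes through `this`, so it must stay a non-static member.
    std::size_t revocable_mem_size(const RuntimeStateSketch* state) const {
        return _buffered_bytes >= state->min_revocable_bytes ? _buffered_bytes : 0;
    }

private:
    std::size_t _buffered_bytes;
};
```

A function that is flagged only because its body is still a stub will stop being flagged once it starts reading members, so the suggestion is worth re-checking after the implementation settles.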
   



##
be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:
##
@@ -183,10 +283,53 @@
 }
 
 Status PartitionedHashJoinSinkOperatorX::prepare(RuntimeState* state) {
-return Status::OK();
+RETURN_IF_ERROR(_inner_sink_operator->set_child(_child_x));
+return _inner_sink_operator->prepare(state);
 }
 
 Status PartitionedHashJoinSinkOperatorX::open(RuntimeState* state) {
+return _inner_sink_operator->open(state);
+}
+
+Status PartitionedHashJoinSinkOperatorX::_setup_internal_operator(RuntimeState* state) {

Review Comment:
   warning: method '_setup_internal_operator' can be made static [readability-convert-member-functions-to-static]
   
   be/src/pipeline/exec/partitioned_hash_join_sink_operator.h:138:
   ```diff
   - Status _setup_internal_operator(RuntimeState* state);
   + static Status _setup_internal_operator(RuntimeState* state);
   ```
   



##
be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:
##
@@ -124,14 +185,53 @@
 return Status::OK();
 }
 
+Status PartitionedHashJoinSinkLocalState::_partition_block(RuntimeState* state,

Review Comment:
   warning: method '_partition_block' can be made static [readability-convert-member-functions-to-static]
   
   be/src/pipeline/exec/partitioned_hash_join_sink_operator.h:59:
   ```diff
   - Status _partition_block(RuntimeState* state, vectorized::Block* in_block, size_t begin,
   + static Status _partition_block(RuntimeState* state, vectorized::Block* in_block, size_t begin,
   ```
   



##
be/src/pipeline/exec/partitioned_hash_join_probe_operator.cpp:
##
@@ -570,6 +563,15 @@ Status PartitionedHashJoinProbeOperatorX::push(RuntimeState* state, vectorized::
 return Status::OK();
 }
 
+Status PartitionedHashJoinProbeOperatorX::_setup_internal_operator_for_non_spill(

Review Comment:
   warning: method '_setup_internal_operator_for_non_spill' can be made static [readability-convert-member-functions-to-static]
   
   be/src/pipeline/exec/partitioned_hash_join_probe_operator.h:197:
   ```diff
   - [[nodiscard]] Status _setup_internal_operator_for_non_spill(
   + [[nodiscard]] static Status _setup_internal_operator_for_non_spill(
   ```
   






Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-07 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041349312

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-06 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041289769

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit 4bdb91326ceeced20d90db9bd2102e833499f60d with default session variables
   Stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
   Stream load orc:  58 seconds loaded 1101869774 Bytes, about 18 MB/s
   Stream load parquet:  31 seconds loaded 861443392 Bytes, about 26 MB/s
   Insert into select:   17.0 seconds inserted 1000 Rows, about 588K ops/s
   ```
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-06 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041289016

   
   
   ClickBench: Total hot run time: 29.84 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit 4bdb91326ceeced20d90db9bd2102e833499f60d, data reload: false
   
   query1   0.04    0.04    0.03
   query2   0.09    0.04    0.04
   query3   0.24    0.06    0.05
   query4   1.66    0.08    0.08
   query5   0.48    0.48    0.48
   query6   1.16    0.66    0.66
   query7   0.03    0.01    0.01
   query8   0.05    0.04    0.04
   query9   0.57    0.50    0.50
   query10  0.54    0.56    0.57
   query11  0.16    0.12    0.11
   query12  0.14    0.12    0.12
   query13  0.61    0.60    0.60
   query14  0.77    0.78    0.81
   query15  0.86    0.84    0.83
   query16  0.35    0.35    0.36
   query17  1.00    1.00    1.00
   query18  0.25    0.24    0.27
   query19  1.88    1.74    1.72
   query20  0.02    0.01    0.01
   query21  15.40   0.64    0.63
   query22  3.78    6.90    1.42
   query23  17.92   1.35    1.33
   query24  1.61    0.21    0.19
   query25  0.15    0.08    0.07
   query26  0.27    0.16    0.16
   query27  0.08    0.07    0.08
   query28  13.80   0.96    0.95
   query29  12.57   3.31    3.28
   query30  0.26    0.06    0.06
   query31  2.86    0.40    0.37
   query32  3.27    0.47    0.47
   query33  2.90    2.89    2.96
   query34  15.50   4.33    4.34
   query35  4.37    4.39    4.37
   query36  0.69    0.47    0.47
   query37  0.18    0.15    0.16
   query38  0.16    0.14    0.13
   query39  0.04    0.03    0.04
   query40  0.18    0.15    0.18
   query41  0.09    0.04    0.04
   query42  0.06    0.04    0.05
   query43  0.04    0.03    0.04
   Total cold run time: 107.08 s
   Total hot run time: 29.84 s
   ```
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-06 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041287865

   
   
   TPC-DS: Total hot run time: 180915 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit 4bdb91326ceeced20d90db9bd2102e833499f60d, data reload: false
   
   query1   1214    370     1112    370
   query2   6194    2004    1950    1950
   query3   6662    205     203     203
   query4   24683   21374   21448   21374
   query5   4170    397     407     397
   query6   270     190     175     175
   query7   4612    299     297     297
   query8   233     170     181     170
   query9   8478    2250    2263    2250
   query10  454     251     244     244
   query11  14961   14443   14434   14434
   query12  140     88      86      86
   query13  1636    364     362     362
   query14  8495    6865    6972    6865
   query15  208     171     183     171
   query16  6720    269     268     268
   query17  999     591     569     569
   query18  1848    283     277     277
   query19  195     156     159     156
   query20  92      89      87      87
   query21  197     140     130     130
   query22  4934    4852    4843    4843
   query23  33439   33122   32590   32590
   query24  11702   3234    3228    3228
   query25  711     432     433     432
   query26  1903    166     158     158
   query27  3287    375     370     370
   query28  7358    1872    1869    1869
   query29  1302    598     636     598
   query30  307     168     172     168
   query31  987     781     752     752
   query32  101     57      55      55
   query33  667     243     243     243
   query34  1331    518     526     518
   query35  855     706     716     706
   query36  998     882     885     882
   query37  276     73      71      71
   query38  3562    3691    3535    3535
   query39  1650    1570    1588    1570
   query40  243     138     129     129
   query41  54      45      45      45
   query42  112     105     107     105
   query43  467     432     426     426
   query44  1205    763     769     763
   query45  291     288     274     274
   query46  1113    827     813     813
   query47  1963    1843    1862    1843
   query48  372     307     304     304
   query49  972     375     387     375
   query50  822     413     403     403
   query51  6824    6721    6721    6721
   query52  108     96      96      96
   query53  363     298     288     288
   query54  282     239     240     239
   query55  86      80      82      80
   query56  251     242     233     233
   query57  1255    1202    1141    1141
   query58  231     212     220     212
   query59  2857    2427    2346    2346
   query60  244     228     232     228
   query61  95      90      89      89
   query62  671     446     462     446
   query63  305     288     286     286
   query64  5790    3191    3157    3157
   query65  3077    3007    3016    3007
   query66  1311    322     318     318
   query67  15282   14760   14717   14717
   query68  7756    560     584     560
   query69  559     314     300     300
   query70  1342    1096    1133    1096
   query71  521     273     272     272
   query72  6260    2553    2409    2409
   query73  795     320     321     320
   query74  6784    6283    6493    6283
   query75  3595    2271    2332    2271
   query76  6059    1240    1246    1240
   query77  636     246     245     245
   query78  10899   10130   10102   10102
   query79  9797    527     536     527
   query80  1851    434     430     430
   query81  522     238     230     230
   query82  685     91      95      91
   query83  206     166     165     165
   query84  269     88      84      84
   query85  1373    293     281     281
   query86  452     285     297     285
   query87  3725    3531    3472    3472
   query88  4164    2251    2282    2251
   query89  553     368     373     368
   query90  1931    185     179     179
   query91  133     106     107     106
   query92  59      52      47      47
   query93  6802    527     522     522
   query94  1041    176     179     176
   query95  422     322     307     307
   query96  624     267     267     267
   query97  2682    2464    2502    2464
   query98  235     220     228     220
   query99  1309    838     838     838
   Total cold run time: 298518 ms
   Total hot run time: 180915 ms
   ```
   
   



Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-06 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041285275

   
   
   TPC-H: Total hot run time: 38561 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 4bdb91326ceeced20d90db9bd2102e833499f60d, data reload: false
   
   -- Round 1 --
   q1   17688   4090    4083    4083
   q2   2014    195     183     183
   q3   10470   1152    1245    1152
   q4   10210   857     988     857
   q5   7507    2990    2942    2942
   q6   217     138     135     135
   q7   1097    629     603     603
   q8   9442    2034    2018    2018
   q9   6964    6190    6199    6190
   q10  8468    3576    3518    3518
   q11  421     242     242     242
   q12  382     217     212     212
   q13  17782   2891    2923    2891
   q14  268     234     237     234
   q15  527     482     477     477
   q16  508     370     378     370
   q17  954     886     889     886
   q18  7235    6566    6345    6345
   q19  3539    1537    1530    1530
   q20  551     321     305     305
   q21  3572    3115    3088    3088
   q22  357     300     302     300
   Total cold run time: 110173 ms
   Total hot run time: 38561 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   4129    4063    4039    4039
   q2   322     212     217     212
   q3   2956    2962    2932    2932
   q4   1869    1833    1827    1827
   q5   5213    5210    5243    5210
   q6   210     122     128     122
   q7   2241    1792    1808    1792
   q8   3197    3254    3252    3252
   q9   8438    8475    8448    8448
   q10  3787    4470    4013    4013
   q11  570     461     465     461
   q12  761     616     642     616
   q13  17164   3138    3127    3127
   q14  302     270     272     270
   q15  529     484     502     484
   q16  467     420     409     409
   q17  1783    1781    1757    1757
   q18  7991    7782    7653    7653
   q19  1748    1665    1646    1646
   q20  1976    1805    1824    1805
   q21  5210    4899    5013    4899
   q22  520     438     450     438
   Total cold run time: 71383 ms
   Total hot run time: 55412 ms
   ```
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-06 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041277301

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-06 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041161378

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit 7563f8e38c5069f01dd6ec4bdf89b3e4fefe13ba with default session variables
   Stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
   Stream load orc:  58 seconds loaded 1101869774 Bytes, about 18 MB/s
   Stream load parquet:  31 seconds loaded 861443392 Bytes, about 26 MB/s
   Insert into select:   16.0 seconds inserted 1000 Rows, about 625K ops/s
   ```
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-06 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041160716

   
   
   ClickBench: Total hot run time: 29.44 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit 7563f8e38c5069f01dd6ec4bdf89b3e4fefe13ba, data reload: false
   
   query1   0.04    0.04    0.04
   query2   0.07    0.04    0.04
   query3   0.23    0.05    0.05
   query4   1.68    0.08    0.07
   query5   0.49    0.48    0.49
   query6   1.15    0.66    0.65
   query7   0.02    0.02    0.01
   query8   0.06    0.04    0.04
   query9   0.56    0.50    0.52
   query10  0.56    0.57    0.57
   query11  0.15    0.12    0.11
   query12  0.14    0.12    0.13
   query13  0.62    0.59    0.60
   query14  0.77    0.78    0.78
   query15  0.84    0.83    0.85
   query16  0.35    0.36    0.35
   query17  0.96    1.00    0.98
   query18  0.25    0.25    0.25
   query19  1.77    1.70    1.72
   query20  0.01    0.01    0.01
   query21  15.40   0.64    0.63
   query22  4.02    7.04    1.24
   query23  17.87   1.29    1.17
   query24  1.52    0.42    0.19
   query25  0.16    0.07    0.08
   query26  0.29    0.17    0.16
   query27  0.07    0.07    0.08
   query28  13.31   0.97    0.94
   query29  12.54   3.34    3.29
   query30  0.26    0.06    0.05
   query31  2.87    0.38    0.38
   query32  3.30    0.47    0.47
   query33  2.87    2.84    2.89
   query34  15.47   4.39    4.34
   query35  4.37    4.39    4.41
   query36  0.67    0.47    0.47
   query37  0.18    0.15    0.14
   query38  0.15    0.14    0.13
   query39  0.04    0.04    0.03
   query40  0.18    0.15    0.15
   query41  0.10    0.04    0.05
   query42  0.05    0.05    0.05
   query43  0.04    0.04    0.03
   Total cold run time: 106.45 s
   Total hot run time: 29.44 s
   ```
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-06 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041159263

   
   
   TPC-DS: Total hot run time: 180630 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit 7563f8e38c5069f01dd6ec4bdf89b3e4fefe13ba, data reload: false
   
   query1   1222    358     1108    358
   query2   6355    2001    1880    1880
   query3   6663    210     200     200
   query4   24307   21334   21469   21334
   query5   4168    386     395     386
   query6   287     186     171     171
   query7   4594    292     302     292
   query8   238     183     174     174
   query9   8483    2263    2246    2246
   query10  445     236     252     236
   query11  14970   14477   14432   14432
   query12  138     87      88      87
   query13  1641    375     378     375
   query14  8535    6841    6910    6841
   query15  214     169     169     169
   query16  6800    262     262     262
   query17  1018    591     563     563
   query18  1841    278     271     271
   query19  205     152     157     152
   query20  91      89      92      89
   query21  198     127     128     127
   query22  4947    4850    4775    4775
   query23  33482   32846   32732   32732
   query24  10474   3211    3193    3193
   query25  665     425     413     413
   query26  952     172     159     159
   query27  3093    367     374     367
   query28  6943    1897    1918    1897
   query29  1069    606     606     606
   query30  306     169     177     169
   query31  995     761     751     751
   query32  100     55      58      55
   query33  665     249     244     244
   query34  1236    501     542     501
   query35  833     726     720     720
   query36  1031    878     869     869
   query37  112     73      79      73
   query38  3674    3575    3541    3541
   query39  1617    1596    1598    1596
   query40  175     136     132     132
   query41  49      46      51      46
   query42  114     110     110     110
   query43  482     444     423     423
   query44  1211    752     782     752
   query45  284     276     288     276
   query46  1077    818     829     818
   query47  1946    1845    1845    1845
   query48  378     306     301     301
   query49  894     386     371     371
   query50  818     398     410     398
   query51  6814    6700    6766    6700
   query52  103     93      87      87
   query53  355     284     302     284
   query54  263     225     234     225
   query55  89      74      71      71
   query56  243     226     231     226
   query57  1285    1201    1166    1166
   query58  235     218     215     215
   query59  2739    2506    2398    2398
   query60  253     228     227     227
   query61  91      89      90      89
   query62  654     441     459     441
   query63  299     278     282     278
   query64  4492    3183    3137    3137
   query65  3035    2991    2994    2991
   query66  846     316     310     310
   query67  15255   14920   14998   14920
   query68  9191    569     572     569
   query69  579     322     312     312
   query70  1450    1153    1099    1099
   query71  478     266     263     263
   query72  6982    2571    2371    2371
   query73  1599    319     318     318
   query74  6675    6276    6216    6216
   query75  3579    2298    2261    2261
   query76  5901    1084    1245    1084
   query77  663     242     242     242
   query78  10776   10175   10050   10050
   query79  10062   523     523     523
   query80  1675    418     444     418
   query81  490     234     224     224
   query82  391     94      91      91
   query83  202     161     167     161
   query84  265     84      82      82
   query85  887     285     296     285
   query86  350     286     290     286
   query87  3713    3438    3499    3438
   query88  3805    2279    2288    2279
   query89  547     375     377     375
   query90  1910    173     178     173
   query91  130     98      98      98
   query92  60      46      47      46
   query93  6703    526     528     526
   query94  1171    180     177     177
   query95  438     314     307     307
   query96  619     270     278     270
   query97  2659    2514    2526    2514
   query98  228     212     218     212
   query99  1171    882     817     817
   Total cold run time: 294435 ms
   Total hot run time: 180630 ms
   ```
   
   



Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-06 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041156058

   
   
   TPC-H: Total hot run time: 38736 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 7563f8e38c5069f01dd6ec4bdf89b3e4fefe13ba, data reload: false
   
   -- Round 1 --
   q1   17682   4145    4073    4073
   q2   2018    186     179     179
   q3   10484   1136    1365    1136
   q4   10195   862     976     862
   q5   7491    3002    2961    2961
   q6   219     135     137     135
   q7   1113    603     605     603
   q8   9395    2076    2051    2051
   q9   6710    6222    6156    6156
   q10  8483    3581    3514    3514
   q11  420     240     251     240
   q12  382     214     210     210
   q13  17784   2905    2897    2897
   q14  265     234     239     234
   q15  517     484     480     480
   q16  499     396     368     368
   q17  969     932     922     922
   q18  7268    6506    6469    6469
   q19  1635    1543    1549    1543
   q20  563     325     325     325
   q21  3488    3091    3073    3073
   q22  353     305     311     305
   Total cold run time: 107933 ms
   Total hot run time: 38736 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   4092    4080    4034    4034
   q2   327     216     214     214
   q3   2955    2935    2949    2935
   q4   1862    1863    1838    1838
   q5   5238    5220    5236    5220
   q6   204     124     125     124
   q7   2246    1803    1788    1788
   q8   3222    3288    3265    3265
   q9   8454    8482    8514    8482
   q10  3739    4009    4046    4009
   q11  557     456     445     445
   q12  743     571     582     571
   q13  16858   3087    3159    3087
   q14  311     259     291     259
   q15  534     506     503     503
   q16  487     440     468     440
   q17  1776    1756    1748    1748
   q18  8283    7624    7655    7624
   q19  1712    1698    1684    1684
   q20  2009    1817    1857    1817
   q21  5311    5014    4988    4988
   q22  499     456     433     433
   Total cold run time: 71419 ms
   Total hot run time: 55508 ms
   ```
   
   





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-06 Thread via GitHub


github-actions[bot] commented on code in PR #33291:
URL: https://github.com/apache/doris/pull/33291#discussion_r1554640592


##
be/src/vec/spill/spill_stream.cpp:
##
@@ -103,9 +113,9 @@ Status SpillStream::wait_spill() {
 return Status::OK();
 }
 
-Status SpillStream::spill_block(const Block& block, bool eof) {
+Status SpillStream::spill_block(RuntimeState* state, const Block& block, bool eof) {

Review Comment:
   warning: method 'spill_block' can be made static [readability-convert-member-functions-to-static]
   
   be/src/vec/spill/spill_stream.h:55:
   ```diff
   - Status spill_block(RuntimeState* state, const Block& block, bool eof);
   + static Status spill_block(RuntimeState* state, const Block& block, bool eof);
   ```
   



##
be/src/vec/spill/spill_stream_manager.cpp:
##
@@ -273,18 +275,67 @@ Status SpillDataDir::init() {
 RETURN_NOT_OK_STATUS_WITH_WARN(Status::IOError("opendir failed, path={}", _path),
"check file exist failed");
 }
-
+return update_capacity();
+}
+Status SpillDataDir::update_capacity() {
+std::lock_guard l(_mutex);
+RETURN_IF_ERROR(io::global_local_filesystem()->get_space_info(_path, &_disk_capacity_bytes,
+  &_available_bytes));
+if (_shared_with_storage_path) {
+_limit_bytes = (size_t)(_disk_capacity_bytes *
+(config::storage_flood_stage_usage_percent / 100.0) *
+(config::spill_storage_usage_percent / 100.0));
+} else {
+_limit_bytes =
+(size_t)(_disk_capacity_bytes * (config::spill_storage_usage_percent / 100.0));
+}
 return Status::OK();
 }
 bool SpillDataDir::reach_capacity_limit(int64_t incoming_data_size) {
-double used_pct = get_usage(incoming_data_size);
-int64_t left_bytes = _available_bytes - incoming_data_size;
-if (used_pct >= config::storage_flood_stage_usage_percent / 100.0 &&
-left_bytes <= config::storage_flood_stage_left_capacity_bytes) {
-LOG(WARNING) << "reach capacity limit. used pct: " << used_pct
- << ", left bytes: " << left_bytes << ", path: " << _path;
-return true;
+std::lock_guard l(_mutex);
+if (_shared_with_storage_path) {
+VLOG_DEBUG << fmt::format(
+"spill data path: {}, limit: {}, used: {}, available: {}, "
+"incoming "
+"bytes: {}",
+_path, PrettyPrinter::print_bytes(_limit_bytes),
+PrettyPrinter::print_bytes(_used_bytes),
+PrettyPrinter::print_bytes(_available_bytes),
+PrettyPrinter::print_bytes(incoming_data_size));
+int64_t left_bytes = _available_bytes - incoming_data_size;
+if (_used_bytes + incoming_data_size > _limit_bytes ||
+left_bytes <= config::storage_flood_stage_left_capacity_bytes) {
+LOG(WARNING) << fmt::format(
+"spill data reach limit, path: {}, limit: {}, used: {}, available: {}, "
+"incoming "
+"bytes: {}",
+_path, PrettyPrinter::print_bytes(_limit_bytes),
+PrettyPrinter::print_bytes(_used_bytes),
+PrettyPrinter::print_bytes(_available_bytes),
+PrettyPrinter::print_bytes(incoming_data_size));
+return true;

Review Comment:
   warning: redundant boolean literal in conditional return statement [readability-simplify-boolean-expr]
   
   be/src/vec/spill/spill_stream_manager.cpp:305:
   ```diff
   - if (_used_bytes + incoming_data_size > _limit_bytes ||
   - left_bytes <= config::storage_flood_stage_left_capacity_bytes) {
   - LOG(WARNING) << fmt::format(
   - "spill data reach limit, path: {}, limit: {}, used: {}, available: {}, "
   - "incoming "
   - "bytes: {}",
   - _path, PrettyPrinter::print_bytes(_limit_bytes),
   - PrettyPrinter::print_bytes(_used_bytes),
   - PrettyPrinter::print_bytes(_available_bytes),
   - PrettyPrinter::print_bytes(incoming_data_size));
   - return true;
   - }
   - return false;
   + return _used_bytes + incoming_data_size > _limit_bytes ||
   + left_bytes <= config::storage_flood_stage_left_capacity_bytes;
   ```
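
Note that applying the suggested simplification as written would also drop the `LOG(WARNING)` call. A minimal sketch of one way to keep both, assuming the predicate is pulled into a pure helper so the caller can still log when it returns true. `SpillLimits`, `compute_limit_bytes`, and `shared_path_over_limit` are hypothetical names for illustration, not part of the PR; the struct stands in for the three `config::` values used in the diff, and only the shared-path branch shown above is modeled.

```cpp
#include <cstdint>

// Hypothetical stand-in for the three config:: values referenced in the diff.
struct SpillLimits {
    int storage_flood_stage_usage_percent;
    int spill_storage_usage_percent;
    int64_t storage_flood_stage_left_capacity_bytes;
};

// Follows the arithmetic in SpillDataDir::update_capacity(): the spill share
// of the disk, further scaled by the flood-stage percentage when the spill
// directory shares a disk with a storage path.
int64_t compute_limit_bytes(int64_t disk_capacity_bytes, bool shared_with_storage_path,
                            const SpillLimits& cfg) {
    double fraction = cfg.spill_storage_usage_percent / 100.0;
    if (shared_with_storage_path) {
        fraction *= cfg.storage_flood_stage_usage_percent / 100.0;
    }
    return static_cast<int64_t>(disk_capacity_bytes * fraction);
}

// Pure predicate for the shared-path case: true when the incoming data would
// exceed the spill limit or leave too little free space on the disk.
// Returning the expression directly is what the clang-tidy check asks for;
// logging stays with the caller.
bool shared_path_over_limit(int64_t used_bytes, int64_t available_bytes, int64_t limit_bytes,
                            int64_t incoming_data_size, const SpillLimits& cfg) {
    const int64_t left_bytes = available_bytes - incoming_data_size;
    return used_bytes + incoming_data_size > limit_bytes ||
           left_bytes <= cfg.storage_flood_stage_left_capacity_bytes;
}
```

With a helper like this, `reach_capacity_limit` could emit the warning only when the helper returns true, which satisfies the check without losing the log line.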
   



##
be/src/vec/spill/spill_stream_manager.h:
##
@@ -54,25 +55,61 @@
 
 Status update_capacity();
 
-double get_usage(int64_t incoming_data_size) const {
-return _disk_capacity_bytes == 0
-   ? 0
-   : (_disk_capacity_bytes - _available_bytes + incoming_data_size) /
- (double)_disk_capacity_bytes;
+

Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-06 Thread via GitHub


jacktengg commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041137219

   run buildall





Re: [PR] [improvement](spill) fuzzy spill and improve config [doris]

2024-04-06 Thread via GitHub


doris-robot commented on PR #33291:
URL: https://github.com/apache/doris/pull/33291#issuecomment-2041136966

   Thank you for your contribution to Apache Doris.
   Don't know what should be done next? See [How to process your PR](https://cwiki.apache.org/confluence/display/DORIS/How+to+process+your+PR)
   
   Since 2024-03-18, the Document has been moved to 
[doris-website](https://github.com/apache/doris-website).
   See [Doris Document](https://cwiki.apache.org/confluence/display/DORIS/Doris+Document).

