Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


dataroaring merged PR #33168:
URL: https://github.com/apache/doris/pull/33168


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


qidaye commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2035996364

   run p0
   





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


github-actions[bot] commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2034379722

   PR approved by anyone and no changes requested.





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


github-actions[bot] commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2034379647

   PR approved by at least one committer and no changes requested.





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


qidaye commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2034333587

   run p0





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2034024724

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit c17947f2c10b4461a6a1b817942370ed682842f2 with default session variables
   Stream load json:     18 seconds loaded 2358488459 Bytes, about 124 MB/s
   Stream load orc:      58 seconds loaded 1101869774 Bytes, about 18 MB/s
   Stream load parquet:  32 seconds loaded 861443392 Bytes, about 25 MB/s
   Insert into select:   16.4 seconds inserted 1000 Rows, about 609K ops/s
   ```
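
The "about N MB/s" figures in the load report look consistent with dividing the loaded byte count by the elapsed seconds and truncating to whole mebibytes per second. That is an inference from the numbers, not something the bot states. A minimal sketch under that assumption, where `throughput_mib_per_s` is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: whole MiB/s for `bytes` loaded in `seconds`.
// Integer division truncates; this is an assumption that happens to
// reproduce the figures quoted in the load report.
std::uint64_t throughput_mib_per_s(std::uint64_t bytes, std::uint64_t seconds) {
    assert(seconds > 0);
    constexpr std::uint64_t kMiB = 1024ULL * 1024ULL;
    return bytes / seconds / kMiB;
}
```

For the JSON stream load, `throughput_mib_per_s(2358488459, 18)` gives 124, which matches the reported value.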
   





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2034017668

   
   
   ClickBench: Total hot run time: 30.59 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit c17947f2c10b4461a6a1b817942370ed682842f2, data reload: false
   
   query1   0.03    0.04    0.03
   query2   0.07    0.05    0.04
   query3   0.23    0.05    0.05
   query4   1.74    0.07    0.07
   query5   0.48    0.49    0.49
   query6   1.13    0.65    0.65
   query7   0.02    0.01    0.01
   query8   0.05    0.05    0.04
   query9   0.59    0.50    0.51
   query10  0.55    0.57    0.56
   query11  0.14    0.11    0.11
   query12  0.14    0.11    0.11
   query13  0.61    0.60    0.59
   query14  0.76    0.79    0.79
   query15  0.87    0.84    0.85
   query16  0.35    0.34    0.36
   query17  0.98    1.01    0.99
   query18  0.25    0.25    0.26
   query19  1.78    1.73    1.75
   query20  0.02    0.01    0.01
   query21  15.44   0.75    0.72
   query22  3.53    5.19    2.36
   query23  17.76   1.21    0.97
   query24  1.60    0.23    0.27
   query25  0.12    0.09    0.09
   query26  0.28    0.17    0.20
   query27  0.09    0.09    0.08
   query28  13.60   0.96    0.95
   query29  12.53   3.24    3.48
   query30  0.25    0.06    0.06
   query31  2.86    0.38    0.39
   query32  3.35    0.47    0.48
   query33  2.86    2.85    2.92
   query34  15.48   4.35    4.31
   query35  4.37    4.38    4.39
   query36  0.66    0.48    0.47
   query37  0.19    0.18    0.19
   query38  0.17    0.16    0.15
   query39  0.05    0.04    0.04
   query40  0.17    0.14    0.15
   query41  0.09    0.05    0.05
   query42  0.06    0.06    0.06
   query43  0.04    0.04    0.04
   Total cold run time: 106.34 s
   Total hot run time: 30.59 s
   ```
   
   





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2034006728

   
   
   TPC-DS: Total hot run time: 182523 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit c17947f2c10b4461a6a1b817942370ed682842f2, data reload: false
   
   query1   1250    1122    1124    1122
   query2   6487    1890    1840    1840
   query3   6655    217     211     211
   query4   24096   21355   21386   21355
   query5   4169    406     414     406
   query6   285     184     185     184
   query7   4591    302     290     290
   query8   238     178     180     178
   query9   8453    2241    2261    2241
   query10  581     262     252     252
   query11  14843   14594   14431   14431
   query12  150     99      97      97
   query13  1645    390     392     390
   query14  8569    6820    6829    6820
   query15  208     184     177     177
   query16  7162    275     273     273
   query17  1005    613     590     590
   query18  1914    292     292     292
   query19  212     161     166     161
   query20  98      96      95      95
   query21  199     142     125     125
   query22  5089    4852    4919    4852
   query23  33846   33145   32968   32968
   query24  12636   3204    3101    3101
   query25  680     385     388     385
   query26  1892    169     156     156
   query27  3017    326     338     326
   query28  6774    1828    1824    1824
   query29  1392    585     578     578
   query30  305     164     171     164
   query31  987     717     715     715
   query32  96      61      60      60
   query33  690     251     232     232
   query34  1064    492     529     492
   query35  844     697     702     697
   query36  976     876     873     873
   query37  282     75      75      75
   query38  3556    3378    3406    3378
   query39  1573    1548    1546    1546
   query40  306     129     133     129
   query41  47      44      45      44
   query42  116     101     103     101
   query43  440     411     398     398
   query44  1093    701     710     701
   query45  285     263     260     260
   query46  1080    823     787     787
   query47  1908    1824    1822    1822
   query48  377     303     304     303
   query49  1172    368     367     367
   query50  798     392     394     392
   query51  6516    6620    6574    6574
   query52  113     96      104     96
   query53  359     291     286     286
   query54  328     244     253     244
   query55  91      78      84      78
   query56  244     225     226     225
   query57  1254    1125    1126    1125
   query58  250     227     229     227
   query59  2533    2448    2417    2417
   query60  269     241     257     241
   query61  112     109     108     108
   query62  712     446     453     446
   query63  316     284     291     284
   query64  6490    3393    3163    3163
   query65  3072    3050    3006    3006
   query66  1461    333     333     333
   query67  15520   15094   15390   15094
   query68  5341    568     574     568
   query69  544     350     329     329
   query70  1165    1099    1092    1092
   query71  432     296     280     280
   query72  6494    2737    2479    2479
   query73  726     324     324     324
   query74  6850    6378    6425    6378
   query75  3091    2285    2319    2285
   query76  3639    1122    1228    1122
   query77  566     252     264     252
   query78  10992   10324   1       1
   query79  8685    548     530     530
   query80  1770    434     423     423
   query81  521     247     238     238
   query82  1378    109     108     108
   query83  303     159     161     159
   query84  269     90      90      90
   query85  1622    292     294     292
   query86  469     278     283     278
   query87  3661    3495    3498    3495
   query88  4380    2303    2304    2303
   query89  581     379     373     373
   query90  1985    178     187     178
   query91  138     108     105     105
   query92  63      57      54      54
   query93  7174    538     529     529
   query94  1212    186     188     186
   query95  1097    1094    1089    1089
   query96  619     271     269     269
   query97  2667    2463    2478    2463
   query98  239     219     216     216
   query99  1285    833     842     833
   Total cold run time: 295729 ms
   Total hot run time: 182523 ms
   ```
   
   



Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2033982357

   
   
   TPC-H: Total hot run time: 39229 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit c17947f2c10b4461a6a1b817942370ed682842f2, data reload: false
   
   -- Round 1 --
   q1   17937   4230    4215    4215
   q2   2335    198     192     192
   q3   10744   1363    1447    1363
   q4   10845   931     1001    931
   q5   7788    2979    2938    2938
   q6   218     133     143     133
   q7   1125    620     610     610
   q8   9392    2080    2043    2043
   q9   6767    6223    6231    6223
   q10  8437    3538    3524    3524
   q11  416     239     229     229
   q12  386     219     215     215
   q13  17776   2909    2899    2899
   q14  271     246     252     246
   q15  533     501     478     478
   q16  492     388     382     382
   q17  958     931     935     931
   q18  7412    6484    6458    6458
   q19  1616    1553    1545    1545
   q20  616     328     302     302
   q21  3605    3064    3087    3064
   q22  378     309     308     308
   Total cold run time: 110047 ms
   Total hot run time: 39229 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   4083    4068    4076    4068
   q2   328     220     223     220
   q3   2955    2963    2960    2960
   q4   1906    1847    1837    1837
   q5   5240    5257    5229    5229
   q6   211     128     125     125
   q7   2244    1806    1794    1794
   q8   3244    3286    3322    3286
   q9   8448    8480    8476    8476
   q10  3767    3807    3833    3807
   q11  535     451     447     447
   q12  719     553     538     538
   q13  16807   2890    2874    2874
   q14  292     267     278     267
   q15  510     463     479     463
   q16  450     392     426     392
   q17  1722    1701    1677    1677
   q18  7669    7302    7279    7279
   q19  1646    1643    1652    1643
   q20  1950    1715    1728    1715
   q21  4993    4739    4774    4739
   q22  486     433     430     430
   Total cold run time: 70205 ms
   Total hot run time: 54266 ms
   ```
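
Each result row appears to list a cold run, two hot runs, and the smaller of the two hot runs, with the reported totals being column sums. The sketch below works under that reading. The struct and function names are illustrative and are not taken from the Doris benchmark tools.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One benchmark row: a cold run followed by two hot runs, in milliseconds.
// Names are hypothetical; they only mirror the column order of the report.
struct QueryTiming {
    std::int64_t cold_ms;
    std::int64_t hot1_ms;
    std::int64_t hot2_ms;
};

// Sum of the cold-run column.
std::int64_t total_cold_ms(const std::vector<QueryTiming>& rows) {
    std::int64_t total = 0;
    for (const auto& r : rows) {
        assert(r.cold_ms >= 0);
        total += r.cold_ms;
    }
    return total;
}

// Sum of the per-query best (minimum) hot run, i.e. the last column.
std::int64_t total_hot_ms(const std::vector<QueryTiming>& rows) {
    std::int64_t total = 0;
    for (const auto& r : rows) {
        total += std::min(r.hot1_ms, r.hot2_ms);
    }
    return total;
}
```

Applied to the 22 Round 1 rows of this report, the two sums come to 110047 ms and 39229 ms, which agrees with the totals the bot printed.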
   
   





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2033968731

   TeamCity be ut coverage result:
Function Coverage: 35.65% (8883/24915) 
Line Coverage: 27.37% (72913/266354)
Region Coverage: 26.55% (37711/142022)
Branch Coverage: 23.35% (19212/82290)
Coverage Report: 
http://coverage.selectdb-in.cc/coverage/c17947f2c10b4461a6a1b817942370ed682842f2_c17947f2c10b4461a6a1b817942370ed682842f2/report/index.html
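
Each coverage percentage above is the covered count divided by the total count. The branch figure (19212/82290, reported as 23.35%) suggests rounding to two decimals rather than truncation. A small sketch of that arithmetic, with `coverage_percent` as a hypothetical helper name:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: covered/total as a percentage, rounded to two decimals.
double coverage_percent(std::uint64_t covered, std::uint64_t total) {
    assert(total > 0);
    const double pct = 100.0 * static_cast<double>(covered) / static_cast<double>(total);
    return std::round(pct * 100.0) / 100.0;
}
```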





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


github-actions[bot] commented on code in PR #33168:
URL: https://github.com/apache/doris/pull/33168#discussion_r1549256758


##
be/src/olap/compaction.cpp:
##
@@ -449,7 +449,7 @@ Status CompactionMixin::execute_compact_impl(int64_t permits) {
     return Status::OK();
 }
 
-Status CompactionMixin::do_inverted_index_compaction() {
+Status Compaction::do_inverted_index_compaction() {

Review Comment:
   warning: function 'do_inverted_index_compaction' has cognitive complexity of 69 (threshold 50) [readability-function-cognitive-complexity]
   ```cpp
   Status Compaction::do_inverted_index_compaction() {
  ^
   ```
   
   Additional context
   
   **be/src/olap/compaction.cpp:453:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   if (!config::inverted_index_compaction_enable || _input_row_num <= 0 ||
   ^
   ```
   **be/src/olap/compaction.cpp:454:** +1
   ```cpp
   !_stats.rowid_conversion || ctx.skip_inverted_index.empty()) {
^
   ```
   **be/src/olap/compaction.cpp:474:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   if (!_allow_delete_in_cumu_compaction) {
   ^
   ```
   **be/src/olap/compaction.cpp:475:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   if (compaction_type() == ReaderType::READER_CUMULATIVE_COMPACTION &&
   ^
   ```
   **be/src/olap/compaction.cpp:475:** +1
   ```cpp
   if (compaction_type() == ReaderType::READER_CUMULATIVE_COMPACTION &&
 ^
   ```
   **be/src/olap/compaction.cpp:488:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   RETURN_IF_ERROR(_tablet->check_rowid_conversion(_output_rowset, location_map));
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do { \
   ^
   ```
   **be/src/olap/compaction.cpp:488:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(_tablet->check_rowid_conversion(_output_rowset, location_map));
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/olap/compaction.cpp:505:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   RETURN_IF_ERROR(_output_rs_writer->get_segment_num_rows(_segment_num_rows));
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do { \
   ^
   ```
   **be/src/olap/compaction.cpp:505:** +2, including nesting penalty of 1, nesting level increased to 2
   ```cpp
   RETURN_IF_ERROR(_output_rs_writer->get_segment_num_rows(_segment_num_rows));
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/olap/compaction.cpp:510:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   if (dest_segment_num <= 0) {
   ^
   ```
   **be/src/olap/compaction.cpp:531:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   for (int i = 0; i < dest_segment_num; ++i) {
   ^
   ```
   **be/src/olap/compaction.cpp:538:** +1, including nesting penalty of 0, nesting level increased to 1
   ```cpp
   if (config::debug_inverted_index_compaction) {
   ^
   ```
   **be/src/olap/compaction.cpp:539:** nesting level increased to 2
   ```cpp
   auto write_json_to_file = [&](const nlohmann::json& json_obj,
 ^
   ```
   **be/src/olap/compaction.cpp:544:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   RETURN_IF_ERROR(io::global_local_filesystem()->create_file(file_path, &file_writer));
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do { \
   ^
   ```
   **be/src/olap/compaction.cpp:544:** +4, including nesting penalty of 3, nesting level increased to 4
   ```cpp
   RETURN_IF_ERROR(io::global_local_filesystem()->create_file(file_path, &file_writer));
   ^
   ```
   **be/src/common/status.h:543:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   if (UNLIKELY(!_status_.ok())) { \
   ^
   ```
   **be/src/olap/compaction.cpp:545:** +3, including nesting penalty of 2, nesting level increased to 3
   ```cpp
   RETURN_IF_ERROR(file_writer->append(json_obj.dump()));
   ^
   ```
   **be/src/common/status.h:541:** expanded from macro 'RETURN_IF_ERROR'
   ```cpp
   do { \
   ^
   ```
   ```
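
The expansion notes above show that every `RETURN_IF_ERROR` use contributes a `do {` loop and an `if (...)` branch at the call site, which is why clang-tidy charges each use twice. The sketch below illustrates that shape with a stand-in status type and a macro named `SKETCH_RETURN_IF_ERROR`. Both are assumptions made for illustration; the real definitions in `be/src/common/status.h` may differ.

```cpp
#include <cassert>

// Minimal stand-in for a status type; NOT the Doris Status class.
class Status {
public:
    static Status OK() { return Status(true); }
    static Status Error() { return Status(false); }
    bool ok() const { return _ok; }

private:
    explicit Status(bool ok) : _ok(ok) {}
    bool _ok;
};

// Simplified early-return macro in the do { if (...) } while (false) shape
// suggested by the expansion notes; hypothetical, for illustration only.
#define SKETCH_RETURN_IF_ERROR(stmt) \
    do {                             \
        Status _status_ = (stmt);    \
        if (!_status_.ok()) {        \
            return _status_;         \
        }                            \
    } while (false)

// Hypothetical step standing in for a real compaction call.
inline Status step(bool succeed) {
    return succeed ? Status::OK() : Status::Error();
}

// Each macro use below expands to a loop plus a branch, so a function with
// many such checks accumulates cognitive-complexity points quickly.
inline Status run_steps(bool first_ok, bool second_ok, int* steps_done) {
    assert(steps_done != nullptr);
    *steps_done = 0;
    SKETCH_RETURN_IF_ERROR(step(first_ok));
    ++*steps_done;
    SKETCH_RETURN_IF_ERROR(step(second_ok));
    ++*steps_done;
    return Status::OK();
}
```

One common way to bring such a function under the threshold is to move groups of checked calls into smaller helper functions, so the macro expansions are counted against the helpers instead of the caller.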
   

Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


qidaye commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2033919078

   run buildall





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2033900852

   
   
   TPC-H: Total hot run time: 39045 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 9dbd7e4d9e316d63f910ee9cee01062f5cadbe31, data reload: false
   
   -- Round 1 --
   q1   18223   4258    4222    4222
   q2   2702    194     202     194
   q3   10898   1212    1424    1212
   q4   10205   846     989     846
   q5   7463    3017    2947    2947
   q6   219     136     135     135
   q7   1130    630     618     618
   q8   9405    2045    2052    2045
   q9   6671    6245    6242    6242
   q10  8424    3550    3507    3507
   q11  423     242     233     233
   q12  386     225     221     221
   q13  17764   2890    2915    2890
   q14  272     242     244     242
   q15  532     488     471     471
   q16  505     391     387     387
   q17  969     921     915     915
   q18  7277    6491    6458    6458
   q19  1613    1539    1551    1539
   q20  610     335     310     310
   q21  3582    3101    3202    3101
   q22  370     313     310     310
   Total cold run time: 109643 ms
   Total hot run time: 39045 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   4100    4065    4112    4065
   q2   328     219     219     219
   q3   2999    2991    2937    2937
   q4   1904    1867    1847    1847
   q5   5272    5232    5237    5232
   q6   207     128     128     128
   q7   2265    1827    1858    1827
   q8   3267    3326    3328    3326
   q9   8528    8531    8524    8524
   q10  3778    3837    3836    3836
   q11  551     464     458     458
   q12  716     569     534     534
   q13  12630   2926    2934    2926
   q14  302     264     265     264
   q15  517     470     479     470
   q16  463     413     399     399
   q17  1733    1671    1691    1671
   q18  7722    7373    7220    7220
   q19  1645    1655    1653    1653
   q20  1942    1741    1724    1724
   q21  5066    4805    4797    4797
   q22  488     427     441     427
   Total cold run time: 66423 ms
   Total hot run time: 54484 ms
   ```
   
   





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


github-actions[bot] commented on code in PR #33168:
URL: https://github.com/apache/doris/pull/33168#discussion_r1549092928


##
be/src/olap/compaction.cpp:
##
@@ -230,226 +230,7 @@
(_input_rowsets_size / (_input_row_num + 1) + 1);
 }
 
-CompactionMixin::CompactionMixin(StorageEngine& engine, TabletSharedPtr tablet,
-                                 const std::string& label)
-        : Compaction(tablet, label), _engine(engine) {}
-
-CompactionMixin::~CompactionMixin() {
-    if (_state != CompactionState::SUCCESS && _output_rowset != nullptr) {
-        if (!_output_rowset->is_local()) {
-            tablet()->record_unused_remote_rowset(_output_rowset->rowset_id(),
-                                                  _output_rowset->rowset_meta()->resource_id(),
-                                                  _output_rowset->num_segments());
-            return;
-        }
-        _engine.add_unused_rowset(_output_rowset);
-    }
-}
-
-Tablet* CompactionMixin::tablet() {
-    return static_cast<Tablet*>(_tablet.get());
-}
-
-Status CompactionMixin::do_compact_ordered_rowsets() {
-    build_basic_info();
-    RowsetWriterContext ctx;
-    RETURN_IF_ERROR(construct_output_rowset_writer(ctx));
-
-    LOG(INFO) << "start to do ordered data compaction, tablet=" << _tablet->tablet_id()
-              << ", output_version=" << _output_version;
-    // link data to new rowset
-    auto seg_id = 0;
-    std::vector<KeyBoundsPB> segment_key_bounds;
-    for (auto rowset : _input_rowsets) {
-        RETURN_IF_ERROR(rowset->link_files_to(_tablet->tablet_path(),
-                                              _output_rs_writer->rowset_id(), seg_id));
-        seg_id += rowset->num_segments();
-
-        std::vector<KeyBoundsPB> key_bounds;
-        RETURN_IF_ERROR(rowset->get_segments_key_bounds(&key_bounds));
-        segment_key_bounds.insert(segment_key_bounds.end(), key_bounds.begin(), key_bounds.end());
-    }
-    // build output rowset
-    RowsetMetaSharedPtr rowset_meta = std::make_shared<RowsetMeta>();
-    rowset_meta->set_num_rows(_input_row_num);
-    rowset_meta->set_total_disk_size(_input_rowsets_size);
-    rowset_meta->set_data_disk_size(_input_rowsets_size);
-    rowset_meta->set_index_disk_size(_input_index_size);
-    rowset_meta->set_empty(_input_row_num == 0);
-    rowset_meta->set_num_segments(_input_num_segments);
-    rowset_meta->set_segments_overlap(NONOVERLAPPING);
-    rowset_meta->set_rowset_state(VISIBLE);
-
-    rowset_meta->set_segments_key_bounds(segment_key_bounds);
-    _output_rowset = _output_rs_writer->manual_build(rowset_meta);
-    return Status::OK();
-}
-
-void CompactionMixin::build_basic_info() {
-    for (auto& rowset : _input_rowsets) {
-        _input_rowsets_size += rowset->data_disk_size();
-        _input_index_size += rowset->index_disk_size();
-        _input_row_num += rowset->num_rows();
-        _input_num_segments += rowset->num_segments();
-    }
-    COUNTER_UPDATE(_input_rowsets_data_size_counter, _input_rowsets_size);
-    COUNTER_UPDATE(_input_row_num_counter, _input_row_num);
-    COUNTER_UPDATE(_input_segments_num_counter, _input_num_segments);
-
-    _output_version =
-            Version(_input_rowsets.front()->start_version(), _input_rowsets.back()->end_version());
-
-    _newest_write_timestamp = _input_rowsets.back()->newest_write_timestamp();
-
-    std::vector<RowsetMetaSharedPtr> rowset_metas(_input_rowsets.size());
-    std::transform(_input_rowsets.begin(), _input_rowsets.end(), rowset_metas.begin(),
-                   [](const RowsetSharedPtr& rowset) { return rowset->rowset_meta(); });
-    _cur_tablet_schema = _tablet->tablet_schema_with_merged_max_schema_version(rowset_metas);
-}
-
-bool CompactionMixin::handle_ordered_data_compaction() {
-    if (!config::enable_ordered_data_compaction) {
-        return false;
-    }
-    if (compaction_type() == ReaderType::READER_COLD_DATA_COMPACTION) {
-        // The remote file system does not support to link files.
-        return false;
-    }
-    if (_tablet->keys_type() == KeysType::UNIQUE_KEYS &&
-        _tablet->enable_unique_key_merge_on_write()) {
-        return false;
-    }
-
-    if (_tablet->tablet_meta()->tablet_schema()->skip_write_index_on_load()) {
-        // Expected to create index through normal compaction
-        return false;
-    }
-
-    // check delete version: if compaction type is base compaction and
-    // has a delete version, use original compaction
-    if (compaction_type() == ReaderType::READER_BASE_COMPACTION) {
-        for (auto& rowset : _input_rowsets) {
-            if (rowset->rowset_meta()->has_delete_predicate()) {
-                return false;
-            }
-        }
-    }
-
-    // check if rowsets are tidy so we can just modify meta and do link
-    // files to handle compaction
-    auto input_size = _input_rowsets.size();
-    std::string pre_max_key;
-    for (auto i = 0; i < input_size; ++i) {
-        if (!is_rowset_tidy(pre_max_key, _input_rowsets[i])) {
-            if

Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-03 Thread via GitHub


qidaye commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2033774090

   run buildall





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2033506535

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit acb2ef7c7ce1e97d2bc4df2c7850b46e99f5690a with default session variables
   Stream load json:     18 seconds loaded 2358488459 Bytes, about 124 MB/s
   Stream load orc:      59 seconds loaded 1101869774 Bytes, about 17 MB/s
   Stream load parquet:  32 seconds loaded 861443392 Bytes, about 25 MB/s
   Insert into select:   17.1 seconds inserted 1000 Rows, about 584K ops/s
   ```
   





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2033504311

   
   
   ClickBench: Total hot run time: 30.86 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit acb2ef7c7ce1e97d2bc4df2c7850b46e99f5690a, data reload: false
   
   query1   0.04    0.04    0.04
   query2   0.07    0.04    0.04
   query3   0.24    0.04    0.05
   query4   1.68    0.07    0.07
   query5   0.48    0.48    0.49
   query6   1.12    0.65    0.65
   query7   0.02    0.02    0.01
   query8   0.04    0.04    0.04
   query9   0.56    0.53    0.51
   query10  0.57    0.57    0.55
   query11  0.16    0.12    0.11
   query12  0.15    0.12    0.12
   query13  0.61    0.60    0.60
   query14  0.76    0.80    0.79
   query15  0.85    0.83    0.85
   query16  0.35    0.35    0.34
   query17  0.99    0.98    0.98
   query18  0.24    0.26    0.25
   query19  1.79    1.70    1.75
   query20  0.01    0.02    0.01
   query21  15.41   0.72    0.71
   query22  3.16    5.47    2.47
   query23  17.71   1.37    1.08
   query24  1.39    0.22    0.23
   query25  0.14    0.09    0.09
   query26  0.28    0.18    0.18
   query27  0.08    0.08    0.08
   query28  13.90   0.95    0.93
   query29  12.53   3.38    3.37
   query30  0.26    0.06    0.06
   query31  2.86    0.40    0.38
   query32  3.27    0.48    0.48
   query33  2.83    2.86    2.87
   query34  15.48   4.33    4.32
   query35  4.40    4.39    4.36
   query36  0.67    0.47    0.47
   query37  0.19    0.18    0.17
   query38  0.17    0.15    0.16
   query39  0.05    0.04    0.04
   query40  0.17    0.15    0.15
   query41  0.10    0.05    0.05
   query42  0.07    0.04    0.06
   query43  0.04    0.04    0.04
   Total cold run time: 105.89 s
   Total hot run time: 30.86 s
   ```
   
   





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2033500434

   
   
   TPC-DS: Total hot run time: 182609 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit acb2ef7c7ce1e97d2bc4df2c7850b46e99f5690a, data reload: false
   
   query1   1217    1118    1118    1118
   query2   6354    2031    1946    1946
   query3   6657    219     208     208
   query4   24829   21411   21460   21411
   query5   4172    394     411     394
   query6   278     189     188     188
   query7   4603    315     311     311
   query8   227     176     176     176
   query9   8431    2296    2296    2296
   query10  454     269     254     254
   query11  15010   14500   14490   14490
   query12  145     96      96      96
   query13  1635    393     389     389
   query14  8478    6992    6851    6851
   query15  232     178     182     178
   query16  6861    286     281     281
   query17  968     619     579     579
   query18  1852    290     285     285
   query19  208     163     167     163
   query20  102     93      94      93
   query21  196     131     131     131
   query22  4952    4815    4698    4698
   query23  33381   32683   32557   32557
   query24  12754   3214    3212    3212
   query25  728     437     440     437
   query26  1936    174     177     174
   query27  3336    385     417     385
   query28  6994    1952    1906    1906
   query29  1384    642     636     636
   query30  324     172     181     172
   query31  1026    785     779     779
   query32  99      66      65      65
   query33  738     271     262     262
   query34  1426    548     540     540
   query35  872     750     736     736
   query36  1004    884     890     884
   query37  239     88      83      83
   query38  3687    3620    3588    3588
   query39  1653    1615    1618    1615
   query40  240     148     150     148
   query41  48      48      50      48
   query42  121     118     118     118
   query43  456     413     410     410
   query44  1128    740     729     729
   query45  298     283     274     274
   query46  1102    856     813     813
   query47  1978    1874    1880    1874
   query48  403     317     324     317
   query49  959     374     388     374
   query50  855     423     425     423
   query51  7025    6870    6951    6870
   query52  108     98      101     98
   query53  368     298     296     296
   query54  308     235     254     235
   query55  92      82      82      82
   query56  239     222     233     222
   query57  1295    1204    1164    1164
   query58  250     239     242     239
   query59  2746    2252    2320    2252
   query60  273     243     240     240
   query61  91      87      88      87
   query62  677     452     485     452
   query63  312     284     288     284
   query64  5825    3089    3398    3089
   query65  3036    2994    2973    2973
   query66  1300    340     317     317
   query67  15362   15003   15046   15003
   query68  9551    583     598     583
   query69  608     328     329     328
   query70  1394    1130    1132    1130
   query71  540     276     270     270
   query72  6436    2577    2414    2414
   query73  1570    329     326     326
   query74  6772    6263    6397    6263
   query75  3807    2292    2301    2292
   query76  5819    1143    1209    1143
   query77  562     252     255     252
   query78  10884   10098   10034   10034
   query79  10521   559     549     549
   query80  1958    442     437     437
   query81  527     245     237     237
   query82  463     110     106     106
   query83  209     164     165     164
   query84  269     90      88      88
   query85  964     321     305     305
   query86  356     281     295     281
   query87  3744    3527    3488    3488
   query88  3640    2359    2373    2359
   query89  557     368     374     368
   query90  1994    178     178     178
   query91  134     104     108     104
   query92  61      50      53      50
   query93  6835    544     538     538
   query94  1202    190     190     190
   query95  439     324     335     324
   query96  603     277     274     274
   query97  2708    2510    2494    2494
   query98  227     221     214     214
   query99  1266    828     828     828
   Total cold run time: 302957 ms
   Total hot run time: 182609 ms
   ```
   
   



Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2033488776

   
   
   TPC-H: Total hot run time: 38709 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit acb2ef7c7ce1e97d2bc4df2c7850b46e99f5690a, data reload: false
   
   -- Round 1 --
   q1   17633   4145    4146    4145
   q2   2008    192     182     182
   q3   10470   1227    1417    1227
   q4   10204   856     988     856
   q5   7462    3016    2951    2951
   q6   218     135     134     134
   q7   1114    642     613     613
   q8   9399    2062    2047    2047
   q9   6642    6177    6121    6121
   q10  8481    3509    3484    3484
   q11  417     237     241     237
   q12  391     233     220     220
   q13  17776   2890    2892    2890
   q14  276     240     243     240
   q15  530     487     490     487
   q16  506     386     376     376
   q17  966     919     895     895
   q18  7276    6472    6376    6376
   q19  1610    1559    1533    1533
   q20  603     327     308     308
   q21  3597    3177    3086    3086
   q22  369     302     301     301
   Total cold run time: 107948 ms
   Total hot run time: 38709 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   4091    4053    4046    4046
   q2   326     212     217     212
   q3   2968    2958    2960    2958
   q4   1909    1844    1823    1823
   q5   5284    5253    5236    5236
   q6   209     125     127     125
   q7   2241    1793    1775    1775
   q8   3245    3310    3313    3310
   q9   8558    8503    8546    8503
   q10  3764    4000    4024    4000
   q11  567     464     470     464
   q12  771     598     641     598
   q13  12853   3033    3091    3033
   q14  316     272     268     268
   q15  543     488     493     488
   q16  512     449     458     449
   q17  1782    1762    1691    1691
   q18  8279    7785    7963    7785
   q19  1915    1692    1692    1692
   q20  2053    1815    1865    1815
   q21  5198    4949    5025    4949
   q22  523     473     449     449
   Total cold run time: 67907 ms
   Total hot run time: 55669 ms
   ```
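
The Round 1 totals can be cross-checked from the per-query rows. The sketch below assumes, as an inference from the numbers rather than anything the bot states, that each row holds one cold run, two hot runs, and the faster of the two hot runs, and that the reported totals are the sums of the first and last columns. `QueryTiming`, `total_cold_ms`, `total_hot_ms`, and `tpch_round1` are hypothetical names, not part of the tpch-tools scripts.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical row type: one cold run and two hot runs, in milliseconds.
struct QueryTiming {
    std::int64_t cold_ms;
    std::int64_t hot1_ms;
    std::int64_t hot2_ms;
};

// Sum of the cold-run column.
inline std::int64_t total_cold_ms(const std::vector<QueryTiming>& rows) {
    std::int64_t total = 0;
    for (const auto& r : rows) total += r.cold_ms;
    return total;
}

// Sum of the per-query best (minimum) hot run.
inline std::int64_t total_hot_ms(const std::vector<QueryTiming>& rows) {
    std::int64_t total = 0;
    for (const auto& r : rows) total += std::min(r.hot1_ms, r.hot2_ms);
    return total;
}

// Round 1 rows q1..q22, copied from the report in this message.
inline std::vector<QueryTiming> tpch_round1() {
    return {
            {17633, 4145, 4146}, {2008, 192, 182},    {10470, 1227, 1417},
            {10204, 856, 988},   {7462, 3016, 2951},  {218, 135, 134},
            {1114, 642, 613},    {9399, 2062, 2047},  {6642, 6177, 6121},
            {8481, 3509, 3484},  {417, 237, 241},     {391, 233, 220},
            {17776, 2890, 2892}, {276, 240, 243},     {530, 487, 490},
            {506, 386, 376},     {966, 919, 895},     {7276, 6472, 6376},
            {1610, 1559, 1533},  {603, 327, 308},     {3597, 3177, 3086},
            {369, 302, 301},
    };
}
```

Under that assumption, `total_cold_ms(tpch_round1())` and `total_hot_ms(tpch_round1())` reproduce the two reported Round 1 totals, 107948 ms and 38709 ms.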
   
   





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2033486622

   TeamCity be ut coverage result:
Function Coverage: 35.64% (8880/24918) 
Line Coverage: 27.37% (72911/266404)
Region Coverage: 26.54% (37702/142058)
Branch Coverage: 23.34% (19217/82344)
Coverage Report: 
http://coverage.selectdb-in.cc/coverage/acb2ef7c7ce1e97d2bc4df2c7850b46e99f5690a_acb2ef7c7ce1e97d2bc4df2c7850b46e99f5690a/report/index.html





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


github-actions[bot] commented on code in PR #33168:
URL: https://github.com/apache/doris/pull/33168#discussion_r1548855311


##
be/src/olap/base_tablet.h:
##
@@ -86,6 +86,11 @@ class BaseTablet {
 return _max_version_schema;
 }
 
+inline Version max_version() const {

Review Comment:
   warning: method 'max_version' can be made static [readability-convert-member-functions-to-static]
   
   ```suggestion
   static inline Version max_version() {
   ```
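
For context on this check: `readability-convert-member-functions-to-static` flags a member function whose body does not use `this`. The sketch below is a standalone illustration with hypothetical stand-in types (`Version` is a plain struct here and `TabletLike` is not Doris's `BaseTablet`). Whether the suggestion suits the real `max_version()` depends on whether its body reads tablet state, which the truncated hunk does not show.

```cpp
#include <cstdint>

// Stand-in for a version range; not the Doris definition.
struct Version {
    std::int64_t first;
    std::int64_t second;
};

class TabletLike {
public:
    explicit TabletLike(std::int64_t max_end) : _max_end(max_end) {}

    // Reads a data member, so it depends on `this` and cannot be static.
    Version max_version() const { return Version{0, _max_end}; }

    // Touches no member at all; this is the shape the check reports.
    static Version empty_version() { return Version{0, 0}; }

private:
    std::int64_t _max_end;
};
```

If the real method reads a member, applying the suggestion as written would not compile, so the warning is best checked against the full function body.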
   



##
be/src/olap/compaction.cpp:
##
@@ -230,226 +230,7 @@
(_input_rowsets_size / (_input_row_num + 1) + 1);
 }
 
-CompactionMixin::CompactionMixin(StorageEngine& engine, TabletSharedPtr tablet,
-                                 const std::string& label)
-        : Compaction(tablet, label), _engine(engine) {}
-
-CompactionMixin::~CompactionMixin() {
-    if (_state != CompactionState::SUCCESS && _output_rowset != nullptr) {
-        if (!_output_rowset->is_local()) {
-            tablet()->record_unused_remote_rowset(_output_rowset->rowset_id(),
-                                                  _output_rowset->rowset_meta()->resource_id(),
-                                                  _output_rowset->num_segments());
-            return;
-        }
-        _engine.add_unused_rowset(_output_rowset);
-    }
-}
-
-Tablet* CompactionMixin::tablet() {
-    return static_cast<Tablet*>(_tablet.get());
-}
-
-Status CompactionMixin::do_compact_ordered_rowsets() {
-    build_basic_info();
-    RowsetWriterContext ctx;
-    RETURN_IF_ERROR(construct_output_rowset_writer(ctx));
-
-    LOG(INFO) << "start to do ordered data compaction, tablet=" << _tablet->tablet_id()
-              << ", output_version=" << _output_version;
-    // link data to new rowset
-    auto seg_id = 0;
-    std::vector<KeyBoundsPB> segment_key_bounds;
-    for (auto rowset : _input_rowsets) {
-        RETURN_IF_ERROR(rowset->link_files_to(_tablet->tablet_path(),
-                                              _output_rs_writer->rowset_id(), seg_id));
-        seg_id += rowset->num_segments();
-
-        std::vector<KeyBoundsPB> key_bounds;
-        RETURN_IF_ERROR(rowset->get_segments_key_bounds(&key_bounds));
-        segment_key_bounds.insert(segment_key_bounds.end(), key_bounds.begin(), key_bounds.end());
-    }
-    // build output rowset
-    RowsetMetaSharedPtr rowset_meta = std::make_shared<RowsetMeta>();
-    rowset_meta->set_num_rows(_input_row_num);
-    rowset_meta->set_total_disk_size(_input_rowsets_size);
-    rowset_meta->set_data_disk_size(_input_rowsets_size);
-    rowset_meta->set_index_disk_size(_input_index_size);
-    rowset_meta->set_empty(_input_row_num == 0);
-    rowset_meta->set_num_segments(_input_num_segments);
-    rowset_meta->set_segments_overlap(NONOVERLAPPING);
-    rowset_meta->set_rowset_state(VISIBLE);
-
-    rowset_meta->set_segments_key_bounds(segment_key_bounds);
-    _output_rowset = _output_rs_writer->manual_build(rowset_meta);
-    return Status::OK();
-}
-
-void CompactionMixin::build_basic_info() {
-    for (auto& rowset : _input_rowsets) {
-        _input_rowsets_size += rowset->data_disk_size();
-        _input_index_size += rowset->index_disk_size();
-        _input_row_num += rowset->num_rows();
-        _input_num_segments += rowset->num_segments();
-    }
-    COUNTER_UPDATE(_input_rowsets_data_size_counter, _input_rowsets_size);
-    COUNTER_UPDATE(_input_row_num_counter, _input_row_num);
-    COUNTER_UPDATE(_input_segments_num_counter, _input_num_segments);
-
-    _output_version =
-            Version(_input_rowsets.front()->start_version(), _input_rowsets.back()->end_version());
-
-    _newest_write_timestamp = _input_rowsets.back()->newest_write_timestamp();
-
-    std::vector<RowsetMetaSharedPtr> rowset_metas(_input_rowsets.size());
-    std::transform(_input_rowsets.begin(), _input_rowsets.end(), rowset_metas.begin(),
-                   [](const RowsetSharedPtr& rowset) { return rowset->rowset_meta(); });
-    _cur_tablet_schema = _tablet->tablet_schema_with_merged_max_schema_version(rowset_metas);
-}
-
-bool CompactionMixin::handle_ordered_data_compaction() {
-    if (!config::enable_ordered_data_compaction) {
-        return false;
-    }
-    if (compaction_type() == ReaderType::READER_COLD_DATA_COMPACTION) {
-        // The remote file system does not support to link files.
-        return false;
-    }
-    if (_tablet->keys_type() == KeysType::UNIQUE_KEYS &&
-        _tablet->enable_unique_key_merge_on_write()) {
-        return false;
-    }
-
-    if (_tablet->tablet_meta()->tablet_schema()->skip_write_index_on_load()) {
-        // Expected to create index through normal compaction
-        return false;
-    }
-
-    // check delete version: if compaction type is base compaction and
-    // has a delete version, use original compaction
-    if (compaction_type() == ReaderType::READER_BASE_COMPACTION) {
-        for (auto& rowset : _input_rowsets) {
-            if (rowset->rowset_meta()->has_delete_predicate()) {
-  

Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


qidaye commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2033450290

   run buildall





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


platoneko commented on code in PR #33168:
URL: https://github.com/apache/doris/pull/33168#discussion_r1548058442


##
be/src/olap/compaction.h:
##
@@ -65,8 +65,15 @@ class Compaction {
 virtual std::string_view compaction_name() const = 0;
 
 protected:
+// Convert `_tablet` from `BaseTablet` to `Tablet`
+Tablet* tablet();

Review Comment:
   It is not recommended to do a forced cast (downcast) in the base class shared by cloud and local.
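
One way to read this suggestion, sketched with simplified stand-in types (none of the class bodies below are the real Doris definitions, and `tablet_kind`, `link_target`, and the path string are hypothetical): the shared base works only against the `BaseTablet` interface, and the `Tablet*` downcast stays in the local-only subclass, which is where the removed `CompactionMixin::tablet()` lived.

```cpp
#include <memory>
#include <string>
#include <utility>

// Stand-ins for the real classes; only the shape of the hierarchy matters here.
class BaseTablet {
public:
    virtual ~BaseTablet() = default;
    virtual std::string kind() const = 0;
};

class Tablet : public BaseTablet {  // local storage
public:
    std::string kind() const override { return "local"; }
    std::string tablet_path() const { return "/hypothetical/path"; }
};

class CloudTablet : public BaseTablet {  // cloud storage
public:
    std::string kind() const override { return "cloud"; }
};

// Shared base: sees only BaseTablet and never casts.
class Compaction {
public:
    explicit Compaction(std::shared_ptr<BaseTablet> tablet) : _tablet(std::move(tablet)) {}
    virtual ~Compaction() = default;
    std::string tablet_kind() const { return _tablet->kind(); }

protected:
    std::shared_ptr<BaseTablet> _tablet;
};

// Local-only subclass: the constructor accepts only a local Tablet,
// so the downcast in tablet() is valid by construction.
class CompactionMixin : public Compaction {
public:
    explicit CompactionMixin(std::shared_ptr<Tablet> tablet) : Compaction(std::move(tablet)) {}
    std::string link_target() const { return tablet()->tablet_path(); }

protected:
    Tablet* tablet() const { return static_cast<Tablet*>(_tablet.get()); }
};

// Cloud-only subclass: has no access to a Tablet* accessor at all.
class CloudCompactionMixin : public Compaction {
public:
    explicit CloudCompactionMixin(std::shared_ptr<CloudTablet> tablet)
            : Compaction(std::move(tablet)) {}
};
```

With this split, code that needs local-only members such as `tablet_path()` can only be written in the local subclass, so a cloud compaction cannot reach the cast by accident.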






Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2031669986

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit eeec2216671a669442ceb570715b75f5f0a3b705 with default session variables
   Stream load json:     19 seconds loaded 2358488459 Bytes, about 118 MB/s
   Stream load orc:      59 seconds loaded 1101869774 Bytes, about 17 MB/s
   Stream load parquet:  31 seconds loaded 861443392 Bytes, about 26 MB/s
   Insert into select:   16.1 seconds inserted 1000 Rows, about 621K ops/s
   ```
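
The three stream-load MB/s figures are consistent with dividing the byte count by the elapsed seconds and by 2^20, then truncating. That formula is inferred from the numbers in this message, not taken from the load-test script, and `throughput_mib_per_s` is a hypothetical helper.

```cpp
#include <cstdint>

// Throughput in MiB/s (1 MiB = 2^20 bytes), truncated toward zero.
// `seconds` must be greater than zero.
inline std::int64_t throughput_mib_per_s(std::int64_t bytes, std::int64_t seconds) {
    constexpr std::int64_t kMiB = std::int64_t{1} << 20;
    return bytes / (seconds * kMiB);
}
```

For example, `throughput_mib_per_s(2358488459, 19)` gives 118, matching the json row. The orc and parquet rows give 17 and 26, which only match if the result is truncated rather than rounded.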
   





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2031664436

   
   
   ClickBench: Total hot run time: 30.22 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit eeec2216671a669442ceb570715b75f5f0a3b705, data reload: false
   
   query1   0.04    0.03    0.03
   query2   0.07    0.04    0.04
   query3   0.23    0.05    0.04
   query4   1.68    0.06    0.06
   query5   0.48    0.47    0.48
   query6   1.14    0.65    0.66
   query7   0.02    0.01    0.01
   query8   0.05    0.04    0.05
   query9   0.58    0.51    0.51
   query10  0.56    0.58    0.57
   query11  0.15    0.12    0.11
   query12  0.14    0.11    0.12
   query13  0.61    0.59    0.60
   query14  0.78    0.78    0.79
   query15  0.86    0.85    0.84
   query16  0.35    0.35    0.36
   query17  0.98    0.96    0.96
   query18  0.25    0.26    0.25
   query19  1.80    1.71    1.76
   query20  0.01    0.00    0.00
   query21  15.40   0.75    0.72
   query22  2.86    5.50    1.79
   query23  18.07   1.24    1.16
   query24  1.33    0.21    0.40
   query25  0.14    0.09    0.08
   query26  0.30    0.17    0.20
   query27  0.08    0.08    0.09
   query28  13.51   0.95    0.93
   query29  12.67   3.61    3.38
   query30  0.26    0.06    0.07
   query31  2.84    0.39    0.39
   query32  3.28    0.49    0.47
   query33  2.91    2.85    2.86
   query34  15.48   4.34    4.33
   query35  4.36    4.39    4.40
   query36  0.68    0.46    0.47
   query37  0.18    0.17    0.16
   query38  0.17    0.16    0.17
   query39  0.04    0.04    0.03
   query40  0.17    0.14    0.16
   query41  0.10    0.06    0.05
   query42  0.06    0.05    0.05
   query43  0.05    0.04    0.04
   Total cold run time: 105.72 s
   Total hot run time: 30.22 s
   ```
   
   





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2031655349

   
   
   TPC-DS: Total hot run time: 182752 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit eeec2216671a669442ceb570715b75f5f0a3b705, data reload: false
   
   query1   1255    1115    1115    1115
   query2   6341    2001    2110    2001
   query3   220 218 218
   query4   24694   21750   21385   21385
   query5   4256    405     411     405
   query6   277     192     185     185
   query7   4599    316     302     302
   query8   234     165     186     165
   query9   8486    2310    2333    2310
   query10  464     256     266     256
   query11  14990   1       14395   14395
   query12  145     102     96      96
   query13  1632    387     391     387
   query14  8542    7017    6959    6959
   query15  224     186     185     185
   query16  6855    290     285     285
   query17  965     647     584     584
   query18  1856    295     299     295
   query19  207     169     172     169
   query20  101     95      95      95
   query21  202     138     133     133
   query22  4936    4845    4765    4765
   query23  33713   32860   32897   32860
   query24  13344   3251    3183    3183
   query25  719     449     459     449
   query26  1862    180     178     178
   query27  3183    385     372     372
   query28  7178    1926    1904    1904
   query29  1329    666     628     628
   query30  323     176     174     174
   query31  1010    764     794     764
   query32  102     63      64      63
   query33  740     269     268     268
   query34  1110    529     526     526
   query35  923     737     773     737
   query36  987     871     887     871
   query37  267     88      90      88
   query38  3750    3662    3598    3598
   query39  1066    1033    1084    1033
   query40  239     154     144     144
   query41  53      47      51      47
   query42  117     110     121     110
   query43  473     414     414     414
   query44  1125    730     723     723
   query45  285     279     287     279
   query46  1136    877     818     818
   query47  1964    1886    1903    1886
   query48  398     321     323     321
   query49  950     362     375     362
   query50  838     434     433     433
   query51  7193    6902    6919    6902
   query52  117     108     107     107
   query53  383     303     305     303
   query54  308     252     251     251
   query55  90      84      84      84
   query56  253     233     239     233
   query57  1279    1180    1189    1180
   query58  249     236     234     234
   query59  2863    2540    2504    2504
   query60  258     253     245     245
   query61  93      96      92      92
   query62  671     458     449     449
   query63  319     289     294     289
   query64  5820    3098    3100    3098
   query65  3049    3045    3027    3027
   query66  1302    316     310     310
   query67  15382   14828   14840   14828
   query68  8478    585     597     585
   query69  568     333     339     333
   query70  1226    1098    1116    1098
   query71  499     277     272     272
   query72  6253    2591    2445    2445
   query73  802     335     333     333
   query74  6767    6311    6309    6309
   query75  3481    2294    2354    2294
   query76  5131    1041    1233    1041
   query77  574     259     260     259
   query78  10964   10165   10205   10165
   query79  8597    542     528     528
   query80  1285    454     445     445
   query81  500     238     237     237
   query82  725     110     104     104
   query83  201     168     162     162
   query84  275     89      89      89
   query85  1322    296     294     294
   query86  407     276     271     271
   query87  3677    3510    3483    3483
   query88  4016    2357    2371    2357
   query89  550     374     380     374
   query90  1938    179     188     179
   query91  135     108     108     108
   query92  62      51      52      51
   query93  6940    532     524     524
   query94  1024    203     192     192
   query95  448     343     344     343
   query96  614     277     274     274
   query97  2670    2522    2505    2505
   query98  233     215     215     215
   query99  1283    824     828     824
   Total cold run time: 298385 ms
   Total hot run time: 182752 ms
   ```
   
   



Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2031630528

   
   
   TPC-H: Total hot run time: 38598 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit eeec2216671a669442ceb570715b75f5f0a3b705, data reload: false
   
   -- Round 1 --
   q1   17615   4073    4067    4067
   q2   2012    193     185     185
   q3   10476   1234    1391    1234
   q4   10207   886     968     886
   q5   7474    2950    2896    2896
   q6   219     135     135     135
   q7   1106    628     610     610
   q8   9400    2039    2020    2020
   q9   6726    6172    6101    6101
   q10  8435    3503    3522    3503
   q11  416     249     255     249
   q12  390     222     220     220
   q13  17786   2879    2916    2879
   q14  268     242     240     240
   q15  532     492     489     489
   q16  512     386     378     378
   q17  952     927     891     891
   q18  7236    6374    6441    6374
   q19  1624    1537    1524    1524
   q20  596     322     327     322
   q21  3558    3099    3085    3085
   q22  373     310     310     310
   Total cold run time: 107913 ms
   Total hot run time: 38598 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   4204    4014    4057    4014
   q2   339     218     218     218
   q3   2980    2973    2948    2948
   q4   1872    1840    1828    1828
   q5   5234    5217    5222    5217
   q6   208     126     125     125
   q7   2238    1757    1801    1757
   q8   3209    3278    3278    3278
   q9   8458    8487    8469    8469
   q10  3746    3961    4012    3961
   q11  562     478     479     478
   q12  747     595     582     582
   q13  16822   3126    3100    3100
   q14  294     278     275     275
   q15  540     495     521     495
   q16  510     470     472     470
   q17  1783    1758    1743    1743
   q18  8202    7683    7817    7683
   q19  1686    1676    1663    1663
   q20  2066    1826    1871    1826
   q21  5174    4984    4984    4984
   q22  534     444     417     417
   Total cold run time: 71408 ms
   Total hot run time: 55531 ms
   ```
   
   





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2031595838

   TeamCity be ut coverage result:
Function Coverage: 35.64% (8879/24911) 
Line Coverage: 27.38% (72905/266309)
Region Coverage: 26.56% (37709/141993)
Branch Coverage: 23.34% (19216/82318)
Coverage Report: 
http://coverage.selectdb-in.cc/coverage/eeec2216671a669442ceb570715b75f5f0a3b705_eeec2216671a669442ceb570715b75f5f0a3b705/report/index.html





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


github-actions[bot] commented on code in PR #33168:
URL: https://github.com/apache/doris/pull/33168#discussion_r1547516835


##
be/src/olap/compaction.cpp:
##
@@ -230,226 +234,7 @@
(_input_rowsets_size / (_input_row_num + 1) + 1);
 }
 
-CompactionMixin::CompactionMixin(StorageEngine& engine, TabletSharedPtr tablet,
-                                 const std::string& label)
-        : Compaction(tablet, label), _engine(engine) {}
-
-CompactionMixin::~CompactionMixin() {
-    if (_state != CompactionState::SUCCESS && _output_rowset != nullptr) {
-        if (!_output_rowset->is_local()) {
-            tablet()->record_unused_remote_rowset(_output_rowset->rowset_id(),
-                                                  _output_rowset->rowset_meta()->resource_id(),
-                                                  _output_rowset->num_segments());
-            return;
-        }
-        _engine.add_unused_rowset(_output_rowset);
-    }
-}
-
-Tablet* CompactionMixin::tablet() {
-    return static_cast<Tablet*>(_tablet.get());
-}
-
-Status CompactionMixin::do_compact_ordered_rowsets() {
-    build_basic_info();
-    RowsetWriterContext ctx;
-    RETURN_IF_ERROR(construct_output_rowset_writer(ctx));
-
-    LOG(INFO) << "start to do ordered data compaction, tablet=" << _tablet->tablet_id()
-              << ", output_version=" << _output_version;
-    // link data to new rowset
-    auto seg_id = 0;
-    std::vector<KeyBoundsPB> segment_key_bounds;
-    for (auto rowset : _input_rowsets) {
-        RETURN_IF_ERROR(rowset->link_files_to(_tablet->tablet_path(),
-                                              _output_rs_writer->rowset_id(), seg_id));
-        seg_id += rowset->num_segments();
-
-        std::vector<KeyBoundsPB> key_bounds;
-        RETURN_IF_ERROR(rowset->get_segments_key_bounds(&key_bounds));
-        segment_key_bounds.insert(segment_key_bounds.end(), key_bounds.begin(), key_bounds.end());
-    }
-    // build output rowset
-    RowsetMetaSharedPtr rowset_meta = std::make_shared<RowsetMeta>();
-    rowset_meta->set_num_rows(_input_row_num);
-    rowset_meta->set_total_disk_size(_input_rowsets_size);
-    rowset_meta->set_data_disk_size(_input_rowsets_size);
-    rowset_meta->set_index_disk_size(_input_index_size);
-    rowset_meta->set_empty(_input_row_num == 0);
-    rowset_meta->set_num_segments(_input_num_segments);
-    rowset_meta->set_segments_overlap(NONOVERLAPPING);
-    rowset_meta->set_rowset_state(VISIBLE);
-
-    rowset_meta->set_segments_key_bounds(segment_key_bounds);
-    _output_rowset = _output_rs_writer->manual_build(rowset_meta);
-    return Status::OK();
-}
-
-void CompactionMixin::build_basic_info() {
-    for (auto& rowset : _input_rowsets) {
-        _input_rowsets_size += rowset->data_disk_size();
-        _input_index_size += rowset->index_disk_size();
-        _input_row_num += rowset->num_rows();
-        _input_num_segments += rowset->num_segments();
-    }
-    COUNTER_UPDATE(_input_rowsets_data_size_counter, _input_rowsets_size);
-    COUNTER_UPDATE(_input_row_num_counter, _input_row_num);
-    COUNTER_UPDATE(_input_segments_num_counter, _input_num_segments);
-
-    _output_version =
-            Version(_input_rowsets.front()->start_version(), _input_rowsets.back()->end_version());
-
-    _newest_write_timestamp = _input_rowsets.back()->newest_write_timestamp();
-
-    std::vector<RowsetMetaSharedPtr> rowset_metas(_input_rowsets.size());
-    std::transform(_input_rowsets.begin(), _input_rowsets.end(), rowset_metas.begin(),
-                   [](const RowsetSharedPtr& rowset) { return rowset->rowset_meta(); });
-    _cur_tablet_schema = _tablet->tablet_schema_with_merged_max_schema_version(rowset_metas);
-}
-
-bool CompactionMixin::handle_ordered_data_compaction() {
-    if (!config::enable_ordered_data_compaction) {
-        return false;
-    }
-    if (compaction_type() == ReaderType::READER_COLD_DATA_COMPACTION) {
-        // The remote file system does not support to link files.
-        return false;
-    }
-    if (_tablet->keys_type() == KeysType::UNIQUE_KEYS &&
-        _tablet->enable_unique_key_merge_on_write()) {
-        return false;
-    }
-
-    if (_tablet->tablet_meta()->tablet_schema()->skip_write_index_on_load()) {
-        // Expected to create index through normal compaction
-        return false;
-    }
-
-    // check delete version: if compaction type is base compaction and
-    // has a delete version, use original compaction
-    if (compaction_type() == ReaderType::READER_BASE_COMPACTION) {
-        for (auto& rowset : _input_rowsets) {
-            if (rowset->rowset_meta()->has_delete_predicate()) {
-                return false;
-            }
-        }
-    }
-
-    // check if rowsets are tidy so we can just modify meta and do link
-    // files to handle compaction
-    auto input_size = _input_rowsets.size();
-    std::string pre_max_key;
-    for (auto i = 0; i < input_size; ++i) {
-        if (!is_rowset_tidy(pre_max_key, _input_rowsets[i])) {
-if 

Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


qidaye commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2031534395

   run buildall





Re: [PR] [feature](index compaction)support index compaction in cloud mode [doris]

2024-04-02 Thread via GitHub


doris-robot commented on PR #33168:
URL: https://github.com/apache/doris/pull/33168#issuecomment-2031534072

   Thank you for your contribution to Apache Doris.
   Don't know what should be done next? See [How to process your PR](https://cwiki.apache.org/confluence/display/DORIS/How+to+process+your+PR)
   
   Since 2024-03-18, the Document has been moved to [doris-website](https://github.com/apache/doris-website).
   See [Doris Document](https://cwiki.apache.org/confluence/display/DORIS/Doris+Document).

