Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-05-09 Thread via GitHub


DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2102638419

   run p0


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-25 Thread via GitHub


github-actions[bot] commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2076816632

   PR approved by anyone and no changes requested.





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-25 Thread via GitHub


github-actions[bot] commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2076816565

   PR approved by at least one committer and no changes requested.





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-25 Thread via GitHub


starocean999 commented on code in PR #33630:
URL: https://github.com/apache/doris/pull/33630#discussion_r1579206615


##
fe/fe-core/src/main/java/org/apache/doris/analysis/StmtRewriter.java:
##
@@ -1365,4 +1371,166 @@ public static boolean rewriteByPolicy(StatementBase statementBase, Analyzer anal
         }
         return reAnalyze;
     }
+
+    /**
+     *
+     * @param column the column of SlotRef
+     * @param selectList new selectList for selectStmt
+     * @param groupByExprs group by Exprs for selectStmt
+     * @return true if ref can be rewritten
+     */
+    private static boolean rewriteSelectList(Column column, SelectList selectList, ArrayList groupByExprs) {
+        SlotRef slot = new SlotRef(null, column.getName());
+        if (column.isKey()) {
+            selectList.addItem(new SelectListItem(slot, column.getName()));
+            groupByExprs.add(slot);
+            return true;
+        } else {
+            AggregateType aggregateType = column.getAggregationType();
+            if (aggregateType != AggregateType.SUM && aggregateType != AggregateType.MAX
+                    && aggregateType != AggregateType.MIN) {
+                return false;
+            } else {
+                FunctionName funcName = new FunctionName(aggregateType.toString().toLowerCase());
+                List arrayList = Lists.newArrayList(slot);
+                FunctionCallExpr func = new FunctionCallExpr(funcName, new FunctionParams(false, arrayList));
+                selectList.addItem(new SelectListItem(func, column.getName()));
+                return true;
+            }
+        }
+    }
+
+    /**
+     * rewrite stmt for querying random distributed table, construct an aggregation node for pre-agg
+     * * CREATE TABLE `tbl` (
+     *   `k1` BIGINT NULL DEFAULT "10",
+     *   `k3` SMALLINT NULL,
+     *   `a` BIGINT SUM NULL DEFAULT "0"
+     * ) ENGINE=OLAP
+     * AGGREGATE KEY(`k1`, `k2`)
+     * DISTRIBUTED BY RANDOM BUCKETS 1
+     * PROPERTIES (
+     * "replication_allocation" = "tag.location.default: 1"
+     * )
+     * e.g.,
+     * original: select * from tbl
+     * rewrite: select * from (select k1, k2, sum(pv) from tbl group by k1, k2) t
+     * do not rewrite if no need two phase agg:
+     * e.g.,
+     * 1. select max(k1) from tbl
+     * 2. select sum(a) from tbl
+     *
+     * @param statementBase stmt to rewrite
+     * @param analyzer the analyzer
+     * @return true if rewritten
+     * @throws UserException
+     */
+    public static boolean rewriteForRandomDistribution(StatementBase statementBase, Analyzer analyzer)
+            throws UserException {
+        boolean reAnalyze = false;
+        if (!(statementBase instanceof SelectStmt)) {
+            return false;
+        }
+        SelectStmt selectStmt = (SelectStmt) statementBase;
+        for (int i = 0; i < selectStmt.fromClause.size(); i++) {
+            TableRef tableRef = selectStmt.fromClause.get(i);
+            // Recursively rewrite subquery
+            if (tableRef instanceof InlineViewRef) {
+                InlineViewRef viewRef = (InlineViewRef) tableRef;
+                if (rewriteForRandomDistribution(viewRef.getQueryStmt(), viewRef.getAnalyzer())) {
+                    reAnalyze = true;
+                }
+                continue;
+            }
+            TableIf table = tableRef.getTable();
+            if (!(table instanceof OlapTable)) {
+                continue;
+            }
+            // only rewrite random distributed AGG_KEY table
+            OlapTable olapTable = (OlapTable) table;
+            if (olapTable.getKeysType() != KeysType.AGG_KEYS) {
+                continue;
+            }
+            DistributionInfo distributionInfo = olapTable.getDefaultDistributionInfo();
+            if (distributionInfo.getType() != DistributionInfo.DistributionInfoType.RANDOM) {
+                continue;
+            }
+
+            // check agg function and column agg type
+            boolean aggTypeMatch = true;
+            if (selectStmt.getAggInfo() != null) {
+                ArrayList aggExprs = selectStmt.getAggInfo().getAggregateExprs();
+                if (aggExprs.stream().allMatch(expr -> aggTypeMatch(expr.getFnName().getFunction(), expr))) {
+                    continue;
+                }
+                aggTypeMatch = false;
+            }
+            // construct a new InlineViewRef for pre-agg
+            boolean canRewrite = true;
+            SelectList selectList = new SelectList();
+            ArrayList groupingExprs = new ArrayList<>();
+            TupleDescriptor desc = tableRef.getDesc();
+            List columns = desc.getSlots().stream().map(SlotDescriptor::getColumn).collect(Collectors.toList());
+            columns = columns.isEmpty() || !aggTypeMatch ? olapTable.getBaseSchema() : columns;
+            for (Column col : columns) {
+                if (!rewriteSelectList(col, selectList, groupingExprs)) 
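
The per-column rule in the quoted `rewriteSelectList` (key columns go into both the select list and the GROUP BY; SUM/MAX/MIN value columns become an aggregate call aliased to the column name; any other aggregation type makes the rewrite give up) can be sketched outside Doris roughly as follows. This is only an illustrative sketch: `Col`, `AggType` and `buildPreAggSubquery` are hypothetical stand-ins rather than Doris classes, and the real code builds analyzer objects instead of SQL text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

public class PreAggRewriteSketch {
    /** Hypothetical stand-in for Doris' AggregateType; only the values needed here. */
    public enum AggType { NONE, SUM, MAX, MIN, REPLACE }

    /** Hypothetical stand-in for a table column: name, key flag, aggregation type. */
    public static final class Col {
        final String name;
        final boolean isKey;
        final AggType aggType;

        public Col(String name, boolean isKey, AggType aggType) {
            this.name = name;
            this.isKey = isKey;
            this.aggType = aggType;
        }
    }

    /**
     * Builds the text of a pre-aggregation subquery, or returns empty when a
     * value column uses an aggregation type other than SUM, MAX or MIN.
     */
    public static Optional<String> buildPreAggSubquery(String table, List<Col> columns) {
        List<String> selectItems = new ArrayList<>();
        List<String> groupBy = new ArrayList<>();
        for (Col col : columns) {
            if (col.isKey) {
                // Key columns are selected as-is and also grouped on.
                selectItems.add(col.name);
                groupBy.add(col.name);
            } else if (col.aggType == AggType.SUM || col.aggType == AggType.MAX
                    || col.aggType == AggType.MIN) {
                // Value columns are re-aggregated and keep their original name as alias.
                String fn = col.aggType.name().toLowerCase(Locale.ROOT);
                selectItems.add(fn + "(" + col.name + ") AS " + col.name);
            } else {
                // Unsupported aggregation type: signal "cannot rewrite".
                return Optional.empty();
            }
        }
        StringBuilder sql = new StringBuilder("SELECT ")
                .append(String.join(", ", selectItems))
                .append(" FROM ").append(table);
        if (!groupBy.isEmpty()) {
            sql.append(" GROUP BY ").append(String.join(", ", groupBy));
        }
        return Optional.of(sql.toString());
    }

    public static void main(String[] args) {
        List<Col> schema = List.of(
                new Col("k1", true, AggType.NONE),
                new Col("k2", true, AggType.NONE),
                new Col("a", false, AggType.SUM));
        System.out.println(buildPreAggSubquery("tbl", schema).orElse("<not rewritable>"));
    }
}
```

The sketch uses the column names `k1`, `k2` and `a` throughout; the quoted Javadoc itself mixes `k3` with `k2` and `pv` with `a`.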

Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-16 Thread via GitHub


doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2060174696

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit d2150a8a0e489dbc168529f4205f269619ccc98c with default session variables
   Stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
   Stream load orc:  58 seconds loaded 1101869774 Bytes, about 18 MB/s
   Stream load parquet:  32 seconds loaded 861443392 Bytes, about 25 MB/s
   Insert into select:   13.5 seconds inserted 1000 Rows, about 740K ops/s
   ```
   





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-16 Thread via GitHub


DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2060173068

   run p0





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-16 Thread via GitHub


doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2060172115

   
   
   ClickBench: Total hot run time: 30.35 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit d2150a8a0e489dbc168529f4205f269619ccc98c, data reload: false
   
   query1     0.03   0.04   0.03
   query2     0.09   0.04   0.04
   query3     0.23   0.06   0.06
   query4     1.66   0.08   0.07
   query5     0.50   0.49   0.50
   query6     1.47   0.72   0.72
   query7     0.02   0.01   0.02
   query8     0.05   0.04   0.04
   query9     0.54   0.47   0.49
   query10    0.54   0.55   0.55
   query11    0.15   0.11   0.12
   query12    0.15   0.12   0.12
   query13    0.60   0.62   0.58
   query14    0.75   0.78   0.77
   query15    0.84   0.81   0.81
   query16    0.36   0.37   0.35
   query17    0.95   0.95   1.02
   query18    0.19   0.26   0.24
   query19    1.75   1.65   1.65
   query20    0.02   0.01   0.01
   query21   15.40   0.65   0.64
   query22    4.32   7.14   1.85
   query23   18.31   1.45   1.32
   query24    1.80   0.27   0.20
   query25    0.15   0.08   0.08
   query26    0.26   0.17   0.17
   query27    0.08   0.08   0.08
   query28   13.33   1.00   0.98
   query29   12.56   3.26   3.27
   query30    0.26   0.07   0.06
   query31    2.85   0.38   0.38
   query32    3.30   0.46   0.48
   query33    2.77   2.86   2.84
   query34   17.14   4.44   4.51
   query35    4.52   4.51   4.47
   query36    0.64   0.49   0.46
   query37    0.19   0.15   0.16
   query38    0.14   0.14   0.15
   query39    0.05   0.04   0.05
   query40    0.18   0.14   0.14
   query41    0.09   0.05   0.05
   query42    0.05   0.05   0.04
   query43    0.03   0.03   0.04
   Total cold run time: 109.31 s
   Total hot run time: 30.35 s
   ```
   
   





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-16 Thread via GitHub


doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2060167872

   
   
   TPC-DS: Total hot run time: 186263 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit d2150a8a0e489dbc168529f4205f269619ccc98c, data reload: false
   
   query1      899    367    368    367
   query2     6433   2579   2306   2306
   query3     6650    205    210    205
   query4    24222  21419  21385  21385
   query5     4128    394    413    394
   query6      278    189    173    173
   query7     4586    294    290    290
   query8      241    174    172    172
   query9     8499   2366   2338   2338
   query10     409    236    256    236
   query11   14843  14254  14264  14254
   query12     137     90     83     83
   query13    1646    357    348    348
   query14    9330   8016   7997   7997
   query15     275    180    184    180
   query16    8239    257    264    257
   query17    1994    573    547    547
   query18    2107    279    281    279
   query19     323    157    145    145
   query20      88     85     82     82
   query21     197    129    124    124
   query22    4978   4777   4843   4777
   query23   33733  33094  33316  33094
   query24   11058   3105   3114   3105
   query25     583    382    389    382
   query26     716    164    159    159
   query27    2355    362    381    362
   query28    6029   2098   2068   2068
   query29     877    636    633    633
   query30     300    179    179    179
   query31     967    809    767    767
   query32      90     53     54     53
   query33     645    251    241    241
   query34     886    494    500    494
   query35     843    715    705    705
   query36    1064    901    909    901
   query37     119     74     72     72
   query38    3449   3369   3346   3346
   query39    1617   1593   1719   1593
   query40     177    129    129    129
   query41      54     42     43     42
   query42     106     97     95     95
   query43     592    540    529    529
   query44    1120    746    741    741
   query45     285    280    242    242
   query46    1095    774    736    736
   query47    2046   1913   1947   1913
   query48     382    299    309    299
   query49     834    379    391    379
   query50     807    397    390    390
   query51    6881   6756   6797   6756
   query52      97     96     88     88
   query53     339    273    271    271
   query54     298    236    244    236
   query55      74     68     69     68
   query56     237    222    219    219
   query57    1204   1142   1138   1138
   query58     216    194    199    194
   query59    3352   3129   3168   3129
   query60     251    236    231    231
   query61      89     87     88     87
   query62     610    454    444    444
   query63     304    281    278    278
   query64    4703   4179   3981   3981
   query65    3037   3046   3076   3046
   query66     755    344    339    339
   query67   15177  14998  15020  14998
   query68    5196    547    548    547
   query69     514    307    303    303
   query70    1201   1173   1215   1173
   query71    1420   1281   1271   1271
   query72    6499   2745   2547   2547
   query73     725    323    321    321
   query74    6747   6470   6380   6380
   query75    3371   2672   2597   2597
   query76    3418    991    903    903
   query77     470    267    269    267
   query78   10793  10206  10136  10136
   query79    8512    527    536    527
   query80    2378    447    475    447
   query81     521    243    256    243
   query82    1360    102     95     95
   query83     314    166    166    166
   query84     266     85     85     85
   query85    1788    265    262    262
   query86     478    298    292    292
   query87    3474   3279   3274   3274
   query88    5313   2416   2425   2416
   query89     466    366    368    366
   query90    1943    181    182    181
   query91     126     98     98     98
   query92      61     50     53     50
   query93    6295    514    506    506
   query94    1086    181    178    178
   query95     382    301    291    291
   query96     610    276    267    267
   query97    3105   2918   2950   2918
   query98     232    226    217    217
   query99    1231    848    853    848
   Total cold run time: 291250 ms
   Total hot run time: 186263 ms
   ```
   
   



Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-16 Thread via GitHub


doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2060159280

   
   
   TPC-H: Total hot run time: 38329 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit d2150a8a0e489dbc168529f4205f269619ccc98c, data reload: false
   
   -- Round 1 --
   q1     17637   4666   4207   4207
   q2      2025    192    186    186
   q3     10517   1189   1229   1189
   q4     10191    786    706    706
   q5      7497   2660   2668   2660
   q6       220    132    131    131
   q7      1012    614    588    588
   q8      9229   2045   2035   2035
   q9      7284   6575   6506   6506
   q10     8578   3533   3531   3531
   q11      435    229    233    229
   q12      499    223    214    214
   q13    17775   2954   2937   2937
   q14      291    235    224    224
   q15      518    493    480    480
   q16      498    393    385    385
   q17      948    611    625    611
   q18     7300   6749   6799   6749
   q19     7000   1538   1497   1497
   q20      646    316    312    312
   q21     3458   2651   2950   2651
   q22      366    301    318    301
   Total cold run time: 113924 ms
   Total hot run time: 38329 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1      4345   4241   4222   4222
   q2       362    275    276    275
   q3      2983   2759   2764   2759
   q4      1866   1591   1554   1554
   q5      5309   5342   5280   5280
   q6       209    124    122    122
   q7      2251   1868   1850   1850
   q8      3201   3340   3341   3340
   q9      8547   8557   8669   8557
   q10     4121   3913   4059   3913
   q11      598    498    498    498
   q12      791    635    632    632
   q13    16439   3225   3176   3176
   q14      327    306    275    275
   q15      538    499    484    484
   q16      490    446    459    446
   q17     1813   1548   1507   1507
   q18     8036   8131   7865   7865
   q19     1668   1594   1535   1535
   q20     2071   1893   1829   1829
   q21     5089   5045   4980   4980
   q22      548    466    462    462
   Total cold run time: 71602 ms
   Total hot run time: 55561 ms
   ```
   
   





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-16 Thread via GitHub


DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2060116612

   run buildall





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-16 Thread via GitHub


DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2060100550

   run p0





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-16 Thread via GitHub


DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2059378354

   run p0





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-16 Thread via GitHub


DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2059254040

   run p0





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-15 Thread via GitHub


DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2058181263

   run pipeline





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-15 Thread via GitHub


DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2058179036

   run p0





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-15 Thread via GitHub


DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2056830183

   run p0





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-15 Thread via GitHub


DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2056475615

   run p0





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-15 Thread via GitHub


DarvenDuan commented on code in PR #33630:
URL: https://github.com/apache/doris/pull/33630#discussion_r1565545970


##
fe/fe-core/src/main/java/org/apache/doris/analysis/StmtRewriter.java:
##
@@ -1365,4 +1370,116 @@ public static boolean rewriteByPolicy(StatementBase statementBase, Analyzer anal
         }
         return reAnalyze;
     }
+
+    /**
+     *
+     * @param ref the SlotRef to rewrite
+     * @param selectList new selectList for selectStmt
+     * @param groupByExprs group by Exprs for selectStmt
+     * @return true if ref can be rewritten
+     */
+    private static boolean rewriteSelectList(SlotRef ref, SelectList selectList, ArrayList groupByExprs,
+                                             ArrayList aggExprs) {
+        Column column = ref.getColumn();
+        if (column.isKey()) {
+            selectList.addItem(new SelectListItem(ref, null));
+            groupByExprs.add(ref);
+            return true;
+        } else {
+            AggregateType aggregateType = column.getAggregationType();
+            if (aggregateType != AggregateType.SUM && aggregateType != AggregateType.MAX
+                    && aggregateType != AggregateType.MIN) {
+                return false;
+            } else {
+                FunctionName funcName = new FunctionName(aggregateType.toString().toLowerCase());
+                List arrayList = Lists.newArrayList(ref);
+                FunctionCallExpr func = new FunctionCallExpr(funcName, new FunctionParams(false, arrayList));
+                selectList.addItem(new SelectListItem(func, null));
+                aggExprs.add(func);
+                return true;
+            }
+        }
+    }
+
+    public static boolean rewriteForRandomDistribution(StatementBase statementBase, Analyzer analyzer)

Review Comment:
   done






Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-15 Thread via GitHub


DarvenDuan commented on code in PR #33630:
URL: https://github.com/apache/doris/pull/33630#discussion_r1565545630


##
fe/fe-core/src/main/java/org/apache/doris/qe/StmtExecutor.java:
##
@@ -1426,21 +1426,24 @@ private void analyzeAndGenerateQueryPlan(TQueryOptions tQueryOptions) throws Use
                 reAnalyze = true;
             }
             if (parsedStmt instanceof SelectStmt) {
-                if (StmtRewriter.rewriteByPolicy(parsedStmt, analyzer)) {
+                if (StmtRewriter.rewriteByPolicy(parsedStmt, analyzer)
+                        || StmtRewriter.rewriteForRandomDistribution(parsedStmt, analyzer)) {
                     reAnalyze = true;
                 }
             }
             if (parsedStmt instanceof SetOperationStmt) {
                 List operands = ((SetOperationStmt) parsedStmt).getOperands();
                 for (SetOperationStmt.SetOperand operand : operands) {
-                    if (StmtRewriter.rewriteByPolicy(operand.getQueryStmt(), analyzer)) {
+                    if (StmtRewriter.rewriteByPolicy(operand.getQueryStmt(), analyzer)
+                            || StmtRewriter.rewriteForRandomDistribution(operand.getQueryStmt(), analyzer)) {
                         reAnalyze = true;
                     }
                 }
             }
             if (parsedStmt instanceof InsertStmt) {
                 QueryStmt queryStmt = ((InsertStmt) parsedStmt).getQueryStmt();
-                if (queryStmt != null && StmtRewriter.rewriteByPolicy(queryStmt, analyzer)) {
+                if (queryStmt != null && StmtRewriter.rewriteByPolicy(queryStmt, analyzer)
+                        || StmtRewriter.rewriteForRandomDistribution(queryStmt, analyzer)) {

Review Comment:
   fixed
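
The thread does not record what the reviewer asked for or what "fixed" changed. One thing a reader may notice in the quoted `InsertStmt` branch is that Java parses `a && b || c` as `(a && b) || c`, so the null check on `queryStmt` does not guard the second call. The sketch below illustrates only that language rule; the two methods are hypothetical stand-ins that record whether they were invoked, not the Doris `StmtRewriter` methods.

```java
public class PrecedenceSketch {
    /** Counts how often the second (hypothetical) rewrite was invoked. */
    public static int secondRewriteCalls = 0;

    /** Hypothetical stand-in for the first rewrite; never rewrites anything here. */
    public static boolean rewriteByPolicy(Object stmt) {
        return false;
    }

    /** Hypothetical stand-in for the second rewrite; records each invocation. */
    public static boolean rewriteForRandomDistribution(Object stmt) {
        secondRewriteCalls++;
        return stmt != null;
    }

    /** Same shape as the quoted condition: parsed as (a && b) || c. */
    public static boolean asQuoted(Object stmt) {
        return stmt != null && rewriteByPolicy(stmt) || rewriteForRandomDistribution(stmt);
    }

    /** Parenthesized variant: the null check guards both calls. */
    public static boolean guarded(Object stmt) {
        return stmt != null && (rewriteByPolicy(stmt) || rewriteForRandomDistribution(stmt));
    }

    public static void main(String[] args) {
        secondRewriteCalls = 0;
        asQuoted(null);
        System.out.println("asQuoted(null) reached second rewrite: " + (secondRewriteCalls == 1));

        secondRewriteCalls = 0;
        guarded(null);
        System.out.println("guarded(null) reached second rewrite: " + (secondRewriteCalls == 1));
    }
}
```

Separately, `||` short-circuits, so in all three quoted branches the second rewrite is skipped whenever the first one returns true; whether that is intended is not stated in the thread.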






Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-15 Thread via GitHub


DarvenDuan commented on code in PR #33630:
URL: https://github.com/apache/doris/pull/33630#discussion_r1565544780


##
fe/fe-core/src/main/java/org/apache/doris/analysis/StmtRewriter.java:
##
@@ -1365,4 +1370,116 @@ public static boolean rewriteByPolicy(StatementBase statementBase, Analyzer anal
         }
         return reAnalyze;
     }
+
+    /**
+     *
+     * @param ref the SlotRef to rewrite
+     * @param selectList new selectList for selectStmt
+     * @param groupByExprs group by Exprs for selectStmt
+     * @return true if ref can be rewritten
+     */
+    private static boolean rewriteSelectList(SlotRef ref, SelectList selectList, ArrayList groupByExprs,
+                                             ArrayList aggExprs) {
+        Column column = ref.getColumn();
+        if (column.isKey()) {
+            selectList.addItem(new SelectListItem(ref, null));
+            groupByExprs.add(ref);
+            return true;
+        } else {
+            AggregateType aggregateType = column.getAggregationType();
+            if (aggregateType != AggregateType.SUM && aggregateType != AggregateType.MAX
+                    && aggregateType != AggregateType.MIN) {
+                return false;
+            } else {
+                FunctionName funcName = new FunctionName(aggregateType.toString().toLowerCase());
+                List arrayList = Lists.newArrayList(ref);
+                FunctionCallExpr func = new FunctionCallExpr(funcName, new FunctionParams(false, arrayList));
+                selectList.addItem(new SelectListItem(func, null));
+                aggExprs.add(func);
+                return true;
+            }
+        }
+    }
+
+    public static boolean rewriteForRandomDistribution(StatementBase statementBase, Analyzer analyzer)
+            throws UserException {
+        boolean reAnalyze = false;
+        if (!(statementBase instanceof SelectStmt)) {
+            return false;
+        }
+        SelectStmt selectStmt = (SelectStmt) statementBase;
+        for (int i = 0; i < selectStmt.fromClause.size(); i++) {
+            TableRef tableRef = selectStmt.fromClause.get(i);
+            // Recursively rewrite subquery
+            if (tableRef instanceof InlineViewRef) {
+                InlineViewRef viewRef = (InlineViewRef) tableRef;
+                if (rewriteForRandomDistribution(viewRef.getQueryStmt(), viewRef.getAnalyzer())) {
+                    reAnalyze = true;
+                }
+                continue;
+            }
+            // already has agg and group by info
+            if (selectStmt.hasAggInfo() && selectStmt.hasGroupByClause()) {
+                continue;
+            }
+            TableIf table = tableRef.getTable();
+            if (!(table instanceof OlapTable)) {
+                continue;
+            }
+            OlapTable olapTable = (OlapTable) table;
+            if (olapTable.getKeysType() != KeysType.AGG_KEYS) {
+                continue;
+            }
+            DistributionInfo distributionInfo = olapTable.getDefaultDistributionInfo();
+            if (distributionInfo.getType() != DistributionInfo.DistributionInfoType.RANDOM) {
+                continue;
+            }
+
+            SelectList selectList = selectStmt.getSelectList();
+            SelectList newSelectList = new SelectList();
+            ArrayList groupingExprs = new ArrayList<>();
+            ArrayList aggExprs = new ArrayList<>();
+            boolean canRewrite = true;
+            for (SelectListItem item : selectList.getItems()) {
+                if (item.isStar()) {
+                    TupleDescriptor desc = tableRef.getDesc();
+                    for (Column col : desc.getTable().getBaseSchema()) {
+                        SlotRef slot = new SlotRef(null, col.getName());
+                        slot.setTable(desc.getTable());
+                        slot.setTupleId(desc.getId());
+                        slot.setDesc(desc.getColumnSlot(col.getName()));
+                        if (!rewriteSelectList(slot, newSelectList, groupingExprs, aggExprs)) {
+                            canRewrite = false;
+                            break;
+                        }
+                    }
+                    if (!canRewrite) {
+                        break;
+                    }
+                } else {
+                    Expr expr = item.getExpr();
+                    // just for SlotRef
+                    if (!(expr instanceof SlotRef)) {
+                        break;

Review Comment:
   I have refactored the rewriting logic: instead of changing the original
   selectStmt, I now add an aggregation node for pre-aggregation, so the rewrite
   works regardless of how complex the expressions in the query are.
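
To make the idea of a pre-aggregation step concrete, here is a minimal standalone sketch. It is not Doris code: the class, method, and sample rows are hypothetical, and it assumes a single SUM-typed value column. It presumes that a randomly distributed aggregate-key table can hold several un-merged rows for the same key, which a pre-aggregation step then merges before the rest of the query runs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PreAggSketch {
    /**
     * Merges rows that share a key by summing their value column,
     * i.e. the effect of "SELECT k, SUM(v) ... GROUP BY k" over un-merged rows.
     * (Hypothetical helper, not part of Doris.)
     */
    static Map<String, Long> preAggregateSum(String[] keys, long[] values) {
        Map<String, Long> merged = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            merged.merge(keys[i], values[i], Long::sum);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical rows: the key "beijing" was written twice and never merged in storage.
        String[] keys = {"beijing", "shanghai", "beijing"};
        long[] pv = {10L, 5L, 7L};
        System.out.println(preAggregateSum(keys, pv)); // {beijing=17, shanghai=5}
    }
}
```

Because the merge happens below the query's own expressions, something like `siteid + 1` in the select list sees already-merged rows and needs no special handling.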




Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-14 Thread via GitHub


starocean999 commented on code in PR #33630:
URL: https://github.com/apache/doris/pull/33630#discussion_r1565128487


##
fe/fe-core/src/main/java/org/apache/doris/qe/StmtExecutor.java:
##
@@ -1426,21 +1426,24 @@ private void analyzeAndGenerateQueryPlan(TQueryOptions tQueryOptions) throws Use
                 reAnalyze = true;
             }
             if (parsedStmt instanceof SelectStmt) {
-                if (StmtRewriter.rewriteByPolicy(parsedStmt, analyzer)) {
+                if (StmtRewriter.rewriteByPolicy(parsedStmt, analyzer)
+                        || StmtRewriter.rewriteForRandomDistribution(parsedStmt, analyzer)) {
                     reAnalyze = true;
                 }
             }
             if (parsedStmt instanceof SetOperationStmt) {
                 List operands = ((SetOperationStmt) parsedStmt).getOperands();
                 for (SetOperationStmt.SetOperand operand : operands) {
-                    if (StmtRewriter.rewriteByPolicy(operand.getQueryStmt(), analyzer)) {
+                    if (StmtRewriter.rewriteByPolicy(operand.getQueryStmt(), analyzer)
+                            || StmtRewriter.rewriteForRandomDistribution(operand.getQueryStmt(), analyzer)) {
                         reAnalyze = true;
                     }
                 }
             }
             if (parsedStmt instanceof InsertStmt) {
                 QueryStmt queryStmt = ((InsertStmt) parsedStmt).getQueryStmt();
-                if (queryStmt != null && StmtRewriter.rewriteByPolicy(queryStmt, analyzer)) {
+                if (queryStmt != null && StmtRewriter.rewriteByPolicy(queryStmt, analyzer)
+                        || StmtRewriter.rewriteForRandomDistribution(queryStmt, analyzer)) {

Review Comment:
   queryStmt != null && (StmtRewriter.rewriteByPolicy(queryStmt, analyzer)
           || StmtRewriter.rewriteForRandomDistribution(queryStmt, analyzer))
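
The parentheses matter because `&&` binds tighter than `||` in Java, so the unparenthesized condition parses as `(queryStmt != null && a) || b`, and `b` is still evaluated when `queryStmt` is null. The sketch below illustrates this with hypothetical stand-in methods that dereference their argument; whether the real `rewriteForRandomDistribution` tolerates a null argument is not something this example claims.

```java
public class PrecedenceDemo {
    // Hypothetical stand-ins for the two rewrite calls; both dereference stmt.
    static boolean rewriteByPolicy(String stmt) {
        return stmt.isEmpty();
    }

    static boolean rewriteForRandomDistribution(String stmt) {
        return stmt.length() > 3;
    }

    // As in the diff: parsed as (stmt != null && a(stmt)) || b(stmt).
    static boolean ungrouped(String stmt) {
        return stmt != null && rewriteByPolicy(stmt) || rewriteForRandomDistribution(stmt);
    }

    // As suggested in the review: the null check guards both calls.
    static boolean grouped(String stmt) {
        return stmt != null && (rewriteByPolicy(stmt) || rewriteForRandomDistribution(stmt));
    }

    public static void main(String[] args) {
        System.out.println(grouped(null)); // false, neither call runs
        try {
            ungrouped(null);
        } catch (NullPointerException e) {
            System.out.println("ungrouped(null) reached the second call");
        }
    }
}
```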






Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-14 Thread via GitHub


starocean999 commented on code in PR #33630:
URL: https://github.com/apache/doris/pull/33630#discussion_r1565128341


##
fe/fe-core/src/main/java/org/apache/doris/analysis/StmtRewriter.java:
##
@@ -1365,4 +1370,116 @@ public static boolean rewriteByPolicy(StatementBase statementBase, Analyzer anal
         }
         return reAnalyze;
     }
+
+    /**
+     *
+     * @param ref the SlotRef to rewrite
+     * @param selectList new selectList for selectStmt
+     * @param groupByExprs group by Exprs for selectStmt
+     * @return true if ref can be rewritten
+     */
+    private static boolean rewriteSelectList(SlotRef ref, SelectList selectList, ArrayList groupByExprs,
+            ArrayList aggExprs) {
+        Column column = ref.getColumn();
+        if (column.isKey()) {
+            selectList.addItem(new SelectListItem(ref, null));
+            groupByExprs.add(ref);
+            return true;
+        } else {
+            AggregateType aggregateType = column.getAggregationType();
+            if (aggregateType != AggregateType.SUM && aggregateType != AggregateType.MAX
+                    && aggregateType != AggregateType.MIN) {
+                return false;
+            } else {
+                FunctionName funcName = new FunctionName(aggregateType.toString().toLowerCase());
+                List arrayList = Lists.newArrayList(ref);
+                FunctionCallExpr func = new FunctionCallExpr(funcName, new FunctionParams(false, arrayList));
+                selectList.addItem(new SelectListItem(func, null));
+                aggExprs.add(func);
+                return true;
+            }
+        }
+    }
+
+    public static boolean rewriteForRandomDistribution(StatementBase statementBase, Analyzer analyzer)
+            throws UserException {
+        boolean reAnalyze = false;
+        if (!(statementBase instanceof SelectStmt)) {
+            return false;
+        }
+        SelectStmt selectStmt = (SelectStmt) statementBase;
+        for (int i = 0; i < selectStmt.fromClause.size(); i++) {
+            TableRef tableRef = selectStmt.fromClause.get(i);
+            // Recursively rewrite subquery
+            if (tableRef instanceof InlineViewRef) {
+                InlineViewRef viewRef = (InlineViewRef) tableRef;
+                if (rewriteForRandomDistribution(viewRef.getQueryStmt(), viewRef.getAnalyzer())) {
+                    reAnalyze = true;
+                }
+                continue;
+            }
+            // already has agg and group by info
+            if (selectStmt.hasAggInfo() && selectStmt.hasGroupByClause()) {
+                continue;
+            }
+            TableIf table = tableRef.getTable();
+            if (!(table instanceof OlapTable)) {
+                continue;
+            }
+            OlapTable olapTable = (OlapTable) table;
+            if (olapTable.getKeysType() != KeysType.AGG_KEYS) {
+                continue;
+            }
+            DistributionInfo distributionInfo = olapTable.getDefaultDistributionInfo();
+            if (distributionInfo.getType() != DistributionInfo.DistributionInfoType.RANDOM) {
+                continue;
+            }
+
+            SelectList selectList = selectStmt.getSelectList();
+            SelectList newSelectList = new SelectList();
+            ArrayList groupingExprs = new ArrayList<>();
+            ArrayList aggExprs = new ArrayList<>();
+            boolean canRewrite = true;
+            for (SelectListItem item : selectList.getItems()) {
+                if (item.isStar()) {
+                    TupleDescriptor desc = tableRef.getDesc();
+                    for (Column col : desc.getTable().getBaseSchema()) {
+                        SlotRef slot = new SlotRef(null, col.getName());
+                        slot.setTable(desc.getTable());
+                        slot.setTupleId(desc.getId());
+                        slot.setDesc(desc.getColumnSlot(col.getName()));
+                        if (!rewriteSelectList(slot, newSelectList, groupingExprs, aggExprs)) {
+                            canRewrite = false;
+                            break;
+                        }
+                    }
+                    if (!canRewrite) {
+                        break;
+                    }
+                } else {
+                    Expr expr = item.getExpr();
+                    // just for SlotRef
+                    if (!(expr instanceof SlotRef)) {
+                        break;

Review Comment:
   Simply breaking out of the loop is not enough. Try this SQL:
   `select citycode, username, pv, siteid + 1 from ${tableName} order by siteid;`
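
A small sketch of why a bare `break` can be a problem. The quoted hunk stops at the `break`, so what the real code does afterwards is not visible here. This example assumes nothing else resets `canRewrite`, and it uses plain strings and a hypothetical `isSimple` check in place of `SlotRef` detection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BreakFlagDemo {
    // Hypothetical stand-in for "expr instanceof SlotRef": a bare column name.
    static boolean isSimple(String item) {
        return item.matches("[A-Za-z_]+");
    }

    // Pattern from the diff: leave the loop without recording the failure.
    static boolean canRewriteBuggy(List<String> items, List<String> rewritten) {
        boolean canRewrite = true;
        for (String item : items) {
            if (!isSimple(item)) {
                break; // flag stays true, 'rewritten' is left incomplete
            }
            rewritten.add(item);
        }
        return canRewrite;
    }

    // One possible fix: record the failure before leaving the loop.
    static boolean canRewriteFixed(List<String> items, List<String> rewritten) {
        boolean canRewrite = true;
        for (String item : items) {
            if (!isSimple(item)) {
                canRewrite = false;
                break;
            }
            rewritten.add(item);
        }
        return canRewrite;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("citycode", "username", "pv", "siteid + 1");
        List<String> partial = new ArrayList<>();
        System.out.println(canRewriteBuggy(items, partial) + " " + partial);
        System.out.println(canRewriteFixed(items, new ArrayList<>()));
    }
}
```

With the select list from the SQL above, the buggy variant reports success while having collected only the first three columns, silently dropping `siteid + 1`.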




Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-14 Thread via GitHub


morrySnow commented on code in PR #33630:
URL: https://github.com/apache/doris/pull/33630#discussion_r1565117424


##
fe/fe-core/src/main/java/org/apache/doris/analysis/StmtRewriter.java:
##
@@ -1365,4 +1370,116 @@ public static boolean rewriteByPolicy(StatementBase statementBase, Analyzer anal
         }
         return reAnalyze;
     }
+
+    /**
+     *
+     * @param ref the SlotRef to rewrite
+     * @param selectList new selectList for selectStmt
+     * @param groupByExprs group by Exprs for selectStmt
+     * @return true if ref can be rewritten
+     */
+    private static boolean rewriteSelectList(SlotRef ref, SelectList selectList, ArrayList groupByExprs,
+            ArrayList aggExprs) {
+        Column column = ref.getColumn();
+        if (column.isKey()) {
+            selectList.addItem(new SelectListItem(ref, null));
+            groupByExprs.add(ref);
+            return true;
+        } else {
+            AggregateType aggregateType = column.getAggregationType();
+            if (aggregateType != AggregateType.SUM && aggregateType != AggregateType.MAX
+                    && aggregateType != AggregateType.MIN) {
+                return false;
+            } else {
+                FunctionName funcName = new FunctionName(aggregateType.toString().toLowerCase());
+                List arrayList = Lists.newArrayList(ref);
+                FunctionCallExpr func = new FunctionCallExpr(funcName, new FunctionParams(false, arrayList));
+                selectList.addItem(new SelectListItem(func, null));
+                aggExprs.add(func);
+                return true;
+            }
+        }
+    }
+
+    public static boolean rewriteForRandomDistribution(StatementBase statementBase, Analyzer analyzer)

Review Comment:
   Please add a comment explaining this function.






Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-14 Thread via GitHub


DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2054015774

   run p0





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-14 Thread via GitHub


doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053970302

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit a4f6fb1d9f0301fee11291ac546a958d6a2b3089 with default session variables
   Stream load json:     18 seconds loaded 2358488459 Bytes, about 124 MB/s
   Stream load orc:      58 seconds loaded 1101869774 Bytes, about 18 MB/s
   Stream load parquet:  33 seconds loaded 861443392 Bytes, about 24 MB/s
   Insert into select:   13.6 seconds inserted 1000 Rows, about 735K ops/s
   ```
   





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-14 Thread via GitHub


doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053969446

   
   
   ClickBench: Total hot run time: 30.35 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit a4f6fb1d9f0301fee11291ac546a958d6a2b3089, data reload: false
   
   query1   0.04    0.03    0.03
   query2   0.08    0.03    0.03
   query3   0.23    0.04    0.05
   query4   1.68    0.06    0.07
   query5   0.50    0.47    0.48
   query6   1.44    0.64    0.66
   query7   0.02    0.01    0.01
   query8   0.05    0.04    0.05
   query9   0.56    0.50    0.49
   query10  0.56    0.57    0.53
   query11  0.15    0.11    0.12
   query12  0.15    0.12    0.12
   query13  0.61    0.59    0.59
   query14  0.75    0.76    0.77
   query15  0.82    0.80    0.80
   query16  0.40    0.36    0.37
   query17  1.01    1.01    1.00
   query18  0.21    0.26    0.21
   query19  1.84    1.80    1.67
   query20  0.01    0.00    0.01
   query21  15.40   0.64    0.64
   query22  4.46    6.14    2.22
   query23  18.33   1.38    1.24
   query24  1.76    0.27    0.21
   query25  0.14    0.08    0.08
   query26  0.27    0.16    0.16
   query27  0.07    0.08    0.07
   query28  13.42   0.99    1.00
   query29  12.61   3.29    3.28
   query30  0.27    0.06    0.05
   query31  2.88    0.37    0.37
   query32  3.28    0.47    0.46
   query33  2.84    2.76    2.82
   query34  17.07   4.40    4.38
   query35  4.46    4.46    4.43
   query36  0.63    0.48    0.46
   query37  0.19    0.16    0.16
   query38  0.16    0.14    0.15
   query39  0.04    0.04    0.04
   query40  0.18    0.15    0.14
   query41  0.09    0.04    0.05
   query42  0.05    0.05    0.04
   query43  0.04    0.04    0.03
   Total cold run time: 109.75 s
   Total hot run time: 30.35 s
   ```
   
   





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-14 Thread via GitHub


doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053967782

   
   
   TPC-DS: Total hot run time: 183526 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit a4f6fb1d9f0301fee11291ac546a958d6a2b3089, data reload: false
   
   query1   868     1117    1132    1117
   query2   7448    2552    2323    2323
   query3   6656    211     201     201
   query4   37017   21437   21343   21343
   query5   4151    390     388     388
   query6   230     179     175     175
   query7   4030    282     282     282
   query8   212     170     172     170
   query9   5771    2282    2259    2259
   query10  361     232     241     232
   query11  14803   14219   14275   14219
   query12  135     93      86      86
   query13  988     361     365     361
   query14  8690    6907    6867    6867
   query15  216     182     174     174
   query16  7235    272     254     254
   query17  1688    594     557     557
   query18  1474    299     273     273
   query19  210     156     154     154
   query20  94      87      87      87
   query21  198     131     125     125
   query22  5045    4921    4851    4851
   query23  33767   32849   33381   32849
   query24  11203   2986    3002    2986
   query25  568     420     404     404
   query26  908     160     158     158
   query27  3067    361     358     358
   query28  6646    2078    2104    2078
   query29  886     659     634     634
   query30  284     177     173     173
   query31  954     737     750     737
   query32  63      120     54      54
   query33  559     244     244     244
   query34  892     485     505     485
   query35  829     691     717     691
   query36  1055    946     938     938
   query37  108     70      73      70
   query38  3666    3581    3539    3539
   query39  1628    1568    1551    1551
   query40  185     128     128     128
   query41  47      44      43      43
   query42  104     97      97      97
   query43  579     541     541     541
   query44  1316    734     723     723
   query45  296     300     255     255
   query46  1080    731     741     731
   query47  2029    2006    1973    1973
   query48  369     294     295     294
   query49  834     379     364     364
   query50  796     395     386     386
   query51  6869    6810    6824    6810
   query52  101     83      90      83
   query53  340     282     276     276
   query54  254     216     225     216
   query55  72      70      69      69
   query56  235     216     224     216
   query57  1235    1132    1139    1132
   query58  214     199     198     198
   query59  3207    3348    3056    3056
   query60  259     256     236     236
   query61  111     90      86      86
   query62  593     439     427     427
   query63  307     273     271     271
   query64  3961    4094    3974    3974
   query65  3068    2998    3018    2998
   query66  721     310     364     310
   query67  15939   15033   14787   14787
   query68  8820    543     550     543
   query69  604     299     309     299
   query70  1277    1206    1139    1139
   query71  499     281     262     262
   query72  6842    2644    2429    2429
   query73  903     316     317     316
   query74  7132    6367    6443    6367
   query75  3512    2395    2317    2317
   query76  5323    1156    1165    1156
   query77  620     250     249     249
   query78  10946   10372   10101   10101
   query79  9868    520     514     514
   query80  1983    458     420     420
   query81  509     232     226     226
   query82  818     91      91      91
   query83  212     165     166     165
   query84  258     84      79      79
   query85  992     314     266     266
   query86  420     293     315     293
   query87  3739    3479    3492    3479
   query88  5993    2264    2362    2264
   query89  526     366     370     366
   query90  1979    177     172     172
   query91  119     96      98      96
   query92  64      48      47      47
   query93  6661    500     499     499
   query94  1125    176     178     176
   query95  378     286     285     285
   query96  601     256     259     256
   query97  2659    2448    2459    2448
   query98  229     224     219     219
   query99  1182    860     887     860
   Total cold run time: 306517 ms
   Total hot run time: 183526 ms
   ```
   
   



Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-14 Thread via GitHub


doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053963449

   
   
   TPC-H: Total hot run time: 38061 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit a4f6fb1d9f0301fee11291ac546a958d6a2b3089, data reload: false
   
   -- Round 1 --
   q1   17609   4290    4225    4225
   q2   2005    184     182     182
   q3   10466   1149    1211    1149
   q4   10191   726     829     726
   q5   7542    2688    2641    2641
   q6   219     132     130     130
   q7   991     597     571     571
   q8   9224    2044    2032    2032
   q9   7903    6515    6504    6504
   q10  8552    3536    3508    3508
   q11  470     240     229     229
   q12  472     220     205     205
   q13  17783   2908    2929    2908
   q14  272     220     233     220
   q15  515     487     473     473
   q16  502     373     377     373
   q17  951     618     677     618
   q18  7354    6662    6546    6546
   q19  5423    1529    1486    1486
   q20  692     314     308     308
   q21  3467    2732    2836    2732
   q22  364     295     304     295
   Total cold run time: 112967 ms
   Total hot run time: 38061 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   4376    4277    4252    4252
   q2   367     255     271     255
   q3   3015    2756    2781    2756
   q4   1861    1578    1557    1557
   q5   5328    5316    5277    5277
   q6   210     123     124     123
   q7   2230    1879    1902    1879
   q8   3186    3345    3318    3318
   q9   8564    8533    8557    8533
   q10  4093    3938    3920    3920
   q11  641     521     510     510
   q12  779     670     643     643
   q13  17814   3270    3021    3021
   q14  321     306     302     302
   q15  519     480     472     472
   q16  504     442     445     442
   q17  1827    1530    1527    1527
   q18  8113    7887    7856    7856
   q19  1650    1554    1618    1554
   q20  2038    1859    1816    1816
   q21  5227    4892    4945    4892
   q22  524     455     468     455
   Total cold run time: 73187 ms
   Total hot run time: 55360 ms
   ```
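
The report does not label its columns. Reading each row as one cold run, two hot runs, and the better of the two hot runs is an assumption, but the reported totals bear it out. The sketch below (class and method names are hypothetical) recomputes both Round 1 totals from the first three columns of the TPC-H rows above.

```java
public class BenchmarkTotals {
    // {cold, hot1, hot2} in ms for q1..q22, copied from Round 1 of the TPC-H report above.
    static final int[][] ROUND1 = {
        {17609, 4290, 4225}, {2005, 184, 182}, {10466, 1149, 1211}, {10191, 726, 829},
        {7542, 2688, 2641}, {219, 132, 130}, {991, 597, 571}, {9224, 2044, 2032},
        {7903, 6515, 6504}, {8552, 3536, 3508}, {470, 240, 229}, {472, 220, 205},
        {17783, 2908, 2929}, {272, 220, 233}, {515, 487, 473}, {502, 373, 377},
        {951, 618, 677}, {7354, 6662, 6546}, {5423, 1529, 1486}, {692, 314, 308},
        {3467, 2732, 2836}, {364, 295, 304}
    };

    // Sum of the first column.
    static int coldTotal(int[][] rows) {
        int total = 0;
        for (int[] r : rows) {
            total += r[0];
        }
        return total;
    }

    // Sum over queries of the faster of the two hot runs.
    static int hotTotal(int[][] rows) {
        int total = 0;
        for (int[] r : rows) {
            total += Math.min(r[1], r[2]);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Total cold run time: " + coldTotal(ROUND1) + " ms"); // 112967
        System.out.println("Total hot run time: " + hotTotal(ROUND1) + " ms");   // 38061
    }
}
```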
   
   





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-14 Thread via GitHub


DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053951296

   run buildall





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-14 Thread via GitHub


doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053932802

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit 3daf04bc6a177cca660cf2ba968ad577a3cd58da with default session variables
   Stream load json:     18 seconds loaded 2358488459 Bytes, about 124 MB/s
   Stream load orc:      58 seconds loaded 1101869774 Bytes, about 18 MB/s
   Stream load parquet:  32 seconds loaded 861443392 Bytes, about 25 MB/s
   Insert into select:   13.3 seconds inserted 1000 Rows, about 751K ops/s
   ```
   





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-14 Thread via GitHub


doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053932055

   
   
   ClickBench: Total hot run time: 30.2 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit 3daf04bc6a177cca660cf2ba968ad577a3cd58da, data reload: false
   
   query1   0.04    0.04    0.03
   query2   0.07    0.04    0.04
   query3   0.23    0.06    0.06
   query4   1.66    0.09    0.10
   query5   0.51    0.49    0.51
   query6   1.43    0.64    0.65
   query7   0.02    0.01    0.02
   query8   0.04    0.04    0.04
   query9   0.55    0.50    0.51
   query10  0.55    0.55    0.56
   query11  0.15    0.11    0.11
   query12  0.14    0.11    0.12
   query13  0.62    0.59    0.58
   query14  0.76    0.77    0.77
   query15  0.81    0.81    0.81
   query16  0.37    0.36    0.38
   query17  0.97    1.01    1.03
   query18  0.22    0.24    0.22
   query19  1.78    1.67    1.66
   query20  0.01    0.01    0.01
   query21  15.41   0.66    0.64
   query22  4.37    7.44    1.90
   query23  18.29   1.48    1.26
   query24  2.13    0.22    0.21
   query25  0.14    0.08    0.08
   query26  0.27    0.16    0.15
   query27  0.08    0.08    0.08
   query28  13.35   1.00    0.99
   query29  12.58   3.28    3.28
   query30  0.27    0.05    0.06
   query31  2.95    0.38    0.37
   query32  3.22    0.47    0.46
   query33  2.84    2.81    2.82
   query34  17.28   4.34    4.43
   query35  4.52    4.49    4.43
   query36  0.65    0.46    0.46
   query37  0.17    0.16    0.15
   query38  0.15    0.15    0.14
   query39  0.04    0.04    0.04
   query40  0.17    0.14    0.14
   query41  0.10    0.05    0.05
   query42  0.06    0.05    0.05
   query43  0.04    0.03    0.04
   Total cold run time: 110.01 s
   Total hot run time: 30.2 s
   ```
   
   





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-14 Thread via GitHub


doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053930394

   
   
   TPC-DS: Total hot run time: 184601 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit 3daf04bc6a177cca660cf2ba968ad577a3cd58da, data reload: false
   
   query1   1235    1112    1116    1112
   query2   6194    2771    2507    2507
   query3   6648    208     203     203
   query4   36767   21653   21462   21462
   query5   4159    388     399     388
   query6   240     189     182     182
   query7   4046    294     284     284
   query8   223     171     167     167
   query9   5770    2355    2292    2292
   query10  361     239     241     239
   query11  14578   14217   14136   14136
   query12  141     90      83      83
   query13  991     353     349     349
   query14  9951    6838    6956    6838
   query15  205     174     184     174
   query16  6869    258     262     258
   query17  1683    542     550     542
   query18  1530    273     269     269
   query19  185     149     151     149
   query20  91      85      85      85
   query21  198     129     121     121
   query22  4988    4835    4804    4804
   query23  33760   33118   33456   33118
   query24  11198   2959    2984    2959
   query25  531     410     372     372
   query26  810     172     153     153
   query27  3085    360     363     360
   query28  6614    2118    2075    2075
   query29  877     644     614     614
   query30  311     182     167     167
   query31  967     774     738     738
   query32  59      54      51      51
   query33  516     256     257     256
   query34  931     500     493     493
   query35  857     723     715     715
   query36  1057    976     988     976
   query37  116     73      76      73
   query38  3728    3569    3651    3569
   query39  1643    1577    1573    1573
   query40  172     134     128     128
   query41  46      44      46      44
   query42  104     96      96      96
   query43  606     547     578     547
   query44  1356    727     713     713
   query45  277     249     279     249
   query46  1081    744     729     729
   query47  2024    1957    1978    1957
   query48  377     301     298     298
   query49  832     373     357     357
   query50  779     396     401     396
   query51  6975    6858    6752    6752
   query52  107     89      89      89
   query53  341     277     279     277
   query54  245     219     220     219
   query55  73      70      70      70
   query56  239     218     227     218
   query57  1214    1143    1115    1115
   query58  216     198     195     195
   query59  3409    3381    3245    3245
   query60  243     234     228     228
   query61  92      87      102     87
   query62  603     453     443     443
   query63  304     283     283     283
   query64  4147    3929    4197    3929
   query65  3066    3029    3037    3029
   query66  750     321     322     321
   query67  15645   14889   14908   14889
   query68  7022    550     543     543
   query69  531     313     305     305
   query70  1298    1190    1131    1131
   query71  472     282     278     278
   query72  6585    2732    2562    2562
   query73  823     322     320     320
   query74  7145    6410    6414    6410
   query75  3145    2411    2360    2360
   query76  4210    1135    1116    1116
   query77  595     263     253     253
   query78  10973   10162   10192   10162
   query79  7357    522     529     522
   query80  2113    442     445     442
   query81  527     235     240     235
   query82  1575    97      95      95
   query83  338     174     169     169
   query84  267     84      85      84
   query85  1432    312     307     307
   query86  463     297     278     278
   query87  3788    3503    3528    3503
   query88  6153    2263    2280    2263
   query89  479     377     377     377
   query90  1959    175     174     174
   query91  121     95      96      95
   query92  56      46      47      46
   query93  6345    516     507     507
   query94  1125    185     179     179
   query95  381     289     290     289
   query96  597     257     259     257
   query97  2641    2490    2465    2465
   query98  231     222     223     222
   query99  1213    843     855     843
   Total cold run time: 301396 ms
   Total hot run time: 184601 ms
   ```
   
   



Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-14 Thread via GitHub


doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053927326

   
   
   TPC-H: Total hot run time: 38165 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 3daf04bc6a177cca660cf2ba968ad577a3cd58da, data reload: false
   
   -- Round 1 --
   q1   17620   4263    4174    4174
   q2   2010    188     180     180
   q3   10472   1160    1148    1148
   q4   10197   833     709     709
   q5   7524    2678    2630    2630
   q6   212     129     128     128
   q7   976     599     583     583
   q8   9214    2044    2038    2038
   q9   7966    6575    6473    6473
   q10  8579    3553    3480    3480
   q11  457     239     228     228
   q12  422     219     211     211
   q13  18618   2924    2917    2917
   q14  271     224     234     224
   q15  523     481     480     480
   q16  512     399     372     372
   q17  958     732     745     732
   q18  7320    6770    6684    6684
   q19  5811    1538    1500    1500
   q20  686     311     305     305
   q21  3546    2661    2837    2661
   q22  363     315     308     308
   Total cold run time: 114257 ms
   Total hot run time: 38165 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   4439    4262    4254    4254
   q2   371     271     267     267
   q3   2977    2742    2773    2742
   q4   1850    1555    1584    1555
   q5   5359    5323    5325    5323
   q6   209     122     121     121
   q7   2247    1882    1911    1882
   q8   3189    3312    3307    3307
   q9   8594    8503    8657    8503
   q10  4071    3921    3977    3921
   q11  617     507     506     506
   q12  794     623     650     623
   q13  16820   3243    3156    3156
   q14  317     276     319     276
   q15  523     483     461     461
   q16  511     462     444     444
   q17  1830    1554    1487    1487
   q18  8061    8208    7718    7718
   q19  1680    1544    1590    1544
   q20  2041    1872    1858    1858
   q21  5136    4974    4843    4843
   q22  575     504     473     473
   Total cold run time: 72211 ms
   Total hot run time: 55264 ms
   ```
   
   





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-13 Thread via GitHub


DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053916849

   run buildall





Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]

2024-04-13 Thread via GitHub


doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053915942

   Thank you for your contribution to Apache Doris.
   Don't know what should be done next? See [How to process your PR](https://cwiki.apache.org/confluence/display/DORIS/How+to+process+your+PR)
   
   Since 2024-03-18, the Document has been moved to [doris-website](https://github.com/apache/doris-website).
   See [Doris Document](https://cwiki.apache.org/confluence/display/DORIS/Doris+Document).

