Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2102638419 run p0 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
github-actions[bot] commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2076816632 PR approved by anyone and no changes requested.
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
github-actions[bot] commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2076816565 PR approved by at least one committer and no changes requested.
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
starocean999 commented on code in PR #33630: URL: https://github.com/apache/doris/pull/33630#discussion_r1579206615

## fe/fe-core/src/main/java/org/apache/doris/analysis/StmtRewriter.java: ##
@@ -1365,4 +1371,166 @@ public static boolean rewriteByPolicy(StatementBase statementBase, Analyzer anal
         }
         return reAnalyze;
     }
+
+    /**
+     *
+     * @param column the column of SlotRef
+     * @param selectList new selectList for selectStmt
+     * @param groupByExprs group by Exprs for selectStmt
+     * @return true if ref can be rewritten
+     */
+    private static boolean rewriteSelectList(Column column, SelectList selectList, ArrayList groupByExprs) {
+        SlotRef slot = new SlotRef(null, column.getName());
+        if (column.isKey()) {
+            selectList.addItem(new SelectListItem(slot, column.getName()));
+            groupByExprs.add(slot);
+            return true;
+        } else {
+            AggregateType aggregateType = column.getAggregationType();
+            if (aggregateType != AggregateType.SUM && aggregateType != AggregateType.MAX
+                    && aggregateType != AggregateType.MIN) {
+                return false;
+            } else {
+                FunctionName funcName = new FunctionName(aggregateType.toString().toLowerCase());
+                List arrayList = Lists.newArrayList(slot);
+                FunctionCallExpr func = new FunctionCallExpr(funcName, new FunctionParams(false, arrayList));
+                selectList.addItem(new SelectListItem(func, column.getName()));
+                return true;
+            }
+        }
+    }
+
+    /**
+     * rewrite stmt for querying random distributed table, construct an aggregation node for pre-agg
+     * * CREATE TABLE `tbl` (
+     * `k1` BIGINT NULL DEFAULT "10",
+     * `k3` SMALLINT NULL,
+     * `a` BIGINT SUM NULL DEFAULT "0"
+     * ) ENGINE=OLAP
+     * AGGREGATE KEY(`k1`, `k2`)
+     * DISTRIBUTED BY RANDOM BUCKETS 1
+     * PROPERTIES (
+     * "replication_allocation" = "tag.location.default: 1"
+     * )
+     * e.g.,
+     * original: select * from tbl
+     * rewrite: select * from (select k1, k2, sum(pv) from tbl group by k1, k2) t
+     * do not rewrite if no need two phase agg:
+     * e.g.,
+     * 1. select max(k1) from tbl
+     * 2. select sum(a) from tbl
+     *
+     * @param statementBase stmt to rewrite
+     * @param analyzer the analyzer
+     * @return true if rewritten
+     * @throws UserException
+     */
+    public static boolean rewriteForRandomDistribution(StatementBase statementBase, Analyzer analyzer)
+            throws UserException {
+        boolean reAnalyze = false;
+        if (!(statementBase instanceof SelectStmt)) {
+            return false;
+        }
+        SelectStmt selectStmt = (SelectStmt) statementBase;
+        for (int i = 0; i < selectStmt.fromClause.size(); i++) {
+            TableRef tableRef = selectStmt.fromClause.get(i);
+            // Recursively rewrite subquery
+            if (tableRef instanceof InlineViewRef) {
+                InlineViewRef viewRef = (InlineViewRef) tableRef;
+                if (rewriteForRandomDistribution(viewRef.getQueryStmt(), viewRef.getAnalyzer())) {
+                    reAnalyze = true;
+                }
+                continue;
+            }
+            TableIf table = tableRef.getTable();
+            if (!(table instanceof OlapTable)) {
+                continue;
+            }
+            // only rewrite random distributed AGG_KEY table
+            OlapTable olapTable = (OlapTable) table;
+            if (olapTable.getKeysType() != KeysType.AGG_KEYS) {
+                continue;
+            }
+            DistributionInfo distributionInfo = olapTable.getDefaultDistributionInfo();
+            if (distributionInfo.getType() != DistributionInfo.DistributionInfoType.RANDOM) {
+                continue;
+            }
+
+            // check agg function and column agg type
+            boolean aggTypeMatch = true;
+            if (selectStmt.getAggInfo() != null) {
+                ArrayList aggExprs = selectStmt.getAggInfo().getAggregateExprs();
+                if (aggExprs.stream().allMatch(expr -> aggTypeMatch(expr.getFnName().getFunction(), expr))) {
+                    continue;
+                }
+                aggTypeMatch = false;
+            }
+            // construct a new InlineViewRef for pre-agg
+            boolean canRewrite = true;
+            SelectList selectList = new SelectList();
+            ArrayList groupingExprs = new ArrayList<>();
+            TupleDescriptor desc = tableRef.getDesc();
+            List columns = desc.getSlots().stream().map(SlotDescriptor::getColumn).collect(Collectors.toList());
+            columns = columns.isEmpty() || !aggTypeMatch ? olapTable.getBaseSchema() : columns;
+            for (Column col : columns) {
+                if (!rewriteSelectList(col, selectList, groupingExprs))
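The column rule in `rewriteSelectList` can be tried outside Doris. What follows is a minimal Java sketch under stated assumptions, not the PR's code: `PreAggRewriteSketch`, `Col` and `buildPreAggSubquery` are hypothetical names, and plain strings stand in for `SelectList` and `FunctionCallExpr`. It mirrors the quoted logic: key columns go into both the select list and the GROUP BY, value columns are wrapped in their own aggregate when it is SUM, MAX or MIN, and any other aggregate type is reported as not rewritable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Illustrative only: a string-based model of the pre-aggregation rewrite.
// None of these types are Doris classes.
class PreAggRewriteSketch {

    /** Hypothetical stand-in for a table column; aggType is null for key columns. */
    static final class Col {
        final String name;
        final boolean isKey;
        final String aggType;

        Col(String name, boolean isKey, String aggType) {
            this.name = name;
            this.isKey = isKey;
            this.aggType = aggType;
        }
    }

    /**
     * Builds the inline view that would stand in for the base table,
     * or returns empty when some value column cannot be re-aggregated.
     */
    static Optional<String> buildPreAggSubquery(String table, List<Col> columns) {
        List<String> selectItems = new ArrayList<>();
        List<String> groupBy = new ArrayList<>();
        for (Col c : columns) {
            if (c.isKey) {
                // Key columns are selected as-is and become grouping expressions.
                selectItems.add(c.name);
                groupBy.add(c.name);
            } else if ("SUM".equals(c.aggType) || "MAX".equals(c.aggType) || "MIN".equals(c.aggType)) {
                // Value columns are wrapped in their own aggregate and keep their name.
                selectItems.add(c.aggType.toLowerCase(Locale.ROOT) + "(" + c.name + ") as " + c.name);
            } else {
                // Any other aggregate type: report that the rewrite is not possible.
                return Optional.empty();
            }
        }
        StringBuilder sql = new StringBuilder("select " + String.join(", ", selectItems) + " from " + table);
        if (!groupBy.isEmpty()) {
            sql.append(" group by ").append(String.join(", ", groupBy));
        }
        return Optional.of("(" + sql + ") t");
    }

    public static void main(String[] args) {
        List<Col> cols = Arrays.asList(
                new Col("k1", true, null),
                new Col("k2", true, null),
                new Col("a", false, "SUM"));
        // Prints: select * from (select k1, k2, sum(a) as a from tbl group by k1, k2) t
        System.out.println("select * from " + buildPreAggSubquery("tbl", cols).orElse("tbl"));
    }
}
```

The sketch aliases each aggregate back to the column name, as the quoted `new SelectListItem(func, column.getName())` does, so the outer `select *` still sees the original column names.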
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
doris-robot commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2060174696

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
```
Load test result on commit d2150a8a0e489dbc168529f4205f269619ccc98c with default session variables
Stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc: 58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select: 13.5 seconds inserted 1000 Rows, about 740K ops/s
```
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2060173068 run p0
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
doris-robot commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2060172115

ClickBench: Total hot run time: 30.35 s
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d2150a8a0e489dbc168529f4205f269619ccc98c, data reload: false

query1 0.03 0.04 0.03
query2 0.09 0.04 0.04
query3 0.23 0.06 0.06
query4 1.66 0.08 0.07
query5 0.50 0.49 0.50
query6 1.47 0.72 0.72
query7 0.02 0.01 0.02
query8 0.05 0.04 0.04
query9 0.54 0.47 0.49
query10 0.54 0.55 0.55
query11 0.15 0.11 0.12
query12 0.15 0.12 0.12
query13 0.60 0.62 0.58
query14 0.75 0.78 0.77
query15 0.84 0.81 0.81
query16 0.36 0.37 0.35
query17 0.95 0.95 1.02
query18 0.19 0.26 0.24
query19 1.75 1.65 1.65
query20 0.02 0.01 0.01
query21 15.40 0.65 0.64
query22 4.32 7.14 1.85
query23 18.31 1.45 1.32
query24 1.80 0.27 0.20
query25 0.15 0.08 0.08
query26 0.26 0.17 0.17
query27 0.08 0.08 0.08
query28 13.33 1.00 0.98
query29 12.56 3.26 3.27
query30 0.26 0.07 0.06
query31 2.85 0.38 0.38
query32 3.30 0.46 0.48
query33 2.77 2.86 2.84
query34 17.14 4.44 4.51
query35 4.52 4.51 4.47
query36 0.64 0.49 0.46
query37 0.19 0.15 0.16
query38 0.14 0.14 0.15
query39 0.05 0.04 0.05
query40 0.18 0.14 0.14
query41 0.09 0.05 0.05
query42 0.05 0.05 0.04
query43 0.03 0.03 0.04
Total cold run time: 109.31 s
Total hot run time: 30.35 s
```
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
doris-robot commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2060167872

TPC-DS: Total hot run time: 186263 ms
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d2150a8a0e489dbc168529f4205f269619ccc98c, data reload: false

query1 899 367 368 367
query2 6433 2579 2306 2306
query3 6650 205 210 205
query4 24222 21419 21385 21385
query5 4128 394 413 394
query6 278 189 173 173
query7 4586 294 290 290
query8 241 174 172 172
query9 8499 2366 2338 2338
query10 409 236 256 236
query11 14843 14254 14264 14254
query12 137 90 83 83
query13 1646 357 348 348
query14 9330 8016 7997 7997
query15 275 180 184 180
query16 8239 257 264 257
query17 1994 573 547 547
query18 2107 279 281 279
query19 323 157 145 145
query20 88 85 82 82
query21 197 129 124 124
query22 4978 4777 4843 4777
query23 33733 33094 33316 33094
query24 11058 3105 3114 3105
query25 583 382 389 382
query26 716 164 159 159
query27 2355 362 381 362
query28 6029 2098 2068 2068
query29 877 636 633 633
query30 300 179 179 179
query31 967 809 767 767
query32 90 53 54 53
query33 645 251 241 241
query34 886 494 500 494
query35 843 715 705 705
query36 1064 901 909 901
query37 119 74 72 72
query38 3449 3369 3346 3346
query39 1617 1593 1719 1593
query40 177 129 129 129
query41 54 42 43 42
query42 106 97 95 95
query43 592 540 529 529
query44 1120 746 741 741
query45 285 280 242 242
query46 1095 774 736 736
query47 2046 1913 1947 1913
query48 382 299 309 299
query49 834 379 391 379
query50 807 397 390 390
query51 6881 6756 6797 6756
query52 97 96 88 88
query53 339 273 271 271
query54 298 236 244 236
query55 74 68 69 68
query56 237 222 219 219
query57 1204 1142 1138 1138
query58 216 194 199 194
query59 3352 3129 3168 3129
query60 251 236 231 231
query61 89 87 88 87
query62 610 454 444 444
query63 304 281 278 278
query64 4703 4179 3981 3981
query65 3037 3046 3076 3046
query66 755 344 339 339
query67 15177 14998 15020 14998
query68 5196 547 548 547
query69 514 307 303 303
query70 1201 1173 1215 1173
query71 1420 1281 1271 1271
query72 6499 2745 2547 2547
query73 725 323 321 321
query74 6747 6470 6380 6380
query75 3371 2672 2597 2597
query76 3418 991 903 903
query77 470 267 269 267
query78 10793 10206 10136 10136
query79 8512 527 536 527
query80 2378 447 475 447
query81 521 243 256 243
query82 1360 102 95 95
query83 314 166 166 166
query84 266 85 85 85
query85 1788 265 262 262
query86 478 298 292 292
query87 3474 3279 3274 3274
query88 5313 2416 2425 2416
query89 466 366 368 366
query90 1943 181 182 181
query91 126 98 98 98
query92 61 50 53 50
query93 6295 514 506 506
query94 1086 181 178 178
query95 382 301 291 291
query96 610 276 267 267
query97 3105 2918 2950 2918
query98 232 226 217 217
query99 1231 848 853 848
Total cold run time: 291250 ms
Total hot run time: 186263 ms
```
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
doris-robot commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2060159280

TPC-H: Total hot run time: 38329 ms
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d2150a8a0e489dbc168529f4205f269619ccc98c, data reload: false

-- Round 1 --
q1 17637 4666 4207 4207
q2 2025 192 186 186
q3 10517 1189 1229 1189
q4 10191 786 706 706
q5 7497 2660 2668 2660
q6 220 132 131 131
q7 1012 614 588 588
q8 9229 2045 2035 2035
q9 7284 6575 6506 6506
q10 8578 3533 3531 3531
q11 435 229 233 229
q12 499 223 214 214
q13 17775 2954 2937 2937
q14 291 235 224 224
q15 518 493 480 480
q16 498 393 385 385
q17 948 611 625 611
q18 7300 6749 6799 6749
q19 7000 1538 1497 1497
q20 646 316 312 312
q21 3458 2651 2950 2651
q22 366 301 318 301
Total cold run time: 113924 ms
Total hot run time: 38329 ms

- Round 2, with runtime_filter_mode=off -
q1 4345 4241 4222 4222
q2 362 275 276 275
q3 2983 2759 2764 2759
q4 1866 1591 1554 1554
q5 5309 5342 5280 5280
q6 209 124 122 122
q7 2251 1868 1850 1850
q8 3201 3340 3341 3340
q9 8547 8557 8669 8557
q10 4121 3913 4059 3913
q11 598 498 498 498
q12 791 635 632 632
q13 16439 3225 3176 3176
q14 327 306 275 275
q15 538 499 484 484
q16 490 446 459 446
q17 1813 1548 1507 1507
q18 8036 8131 7865 7865
q19 1668 1594 1535 1535
q20 2071 1893 1829 1829
q21 5089 5045 4980 4980
q22 548 466 462 462
Total cold run time: 71602 ms
Total hot run time: 55561 ms
```
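The totals in the TPC-H report can be cross-checked against its per-query rows. The sketch below assumes each Round 1 row reads as query name, one cold run, two hot runs, and a final column. Under that reading the final column is the smaller of the two hot runs and the reported totals are plain column sums. The column meanings are inferred from the numbers rather than stated by the bot, and `TpchTotalsSketch` is a hypothetical name.

```java
// Illustrative check of the Round 1 totals; the column meanings are an assumption.
class TpchTotalsSketch {

    // Round 1 rows: name, cold run, hot run 1, hot run 2, final column (all in ms).
    static final String[] ROUND1 = {
        "q1 17637 4666 4207 4207", "q2 2025 192 186 186",
        "q3 10517 1189 1229 1189", "q4 10191 786 706 706",
        "q5 7497 2660 2668 2660", "q6 220 132 131 131",
        "q7 1012 614 588 588", "q8 9229 2045 2035 2035",
        "q9 7284 6575 6506 6506", "q10 8578 3533 3531 3531",
        "q11 435 229 233 229", "q12 499 223 214 214",
        "q13 17775 2954 2937 2937", "q14 291 235 224 224",
        "q15 518 493 480 480", "q16 498 393 385 385",
        "q17 948 611 625 611", "q18 7300 6749 6799 6749",
        "q19 7000 1538 1497 1497", "q20 646 316 312 312",
        "q21 3458 2651 2950 2651", "q22 366 301 318 301",
    };

    /** Sums one whitespace-separated column (0 is the query name). */
    static long sumColumn(String[] rows, int col) {
        long total = 0;
        for (String row : rows) {
            total += Long.parseLong(row.trim().split("\\s+")[col]);
        }
        return total;
    }

    /** True when the final column equals min(hot run 1, hot run 2) on every row. */
    static boolean lastColumnIsMinOfHotRuns(String[] rows) {
        for (String row : rows) {
            String[] f = row.trim().split("\\s+");
            long hot1 = Long.parseLong(f[2]);
            long hot2 = Long.parseLong(f[3]);
            if (Long.parseLong(f[4]) != Math.min(hot1, hot2)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("cold total: " + sumColumn(ROUND1, 1) + " ms");
        System.out.println("hot total:  " + sumColumn(ROUND1, 4) + " ms");
        System.out.println("final column is min of hot runs: " + lastColumnIsMinOfHotRuns(ROUND1));
    }
}
```

Summing column 1 reproduces the reported cold total of 113924 ms, and summing the final column reproduces the hot total of 38329 ms.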
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2060116612 run buildall
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2060100550 run p0
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2059378354 run p0
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2059254040 run p0
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2058181263 run pipeline
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2058179036 run p0
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2056830183 run p0
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on PR #33630: URL: https://github.com/apache/doris/pull/33630#issuecomment-2056475615 run p0
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on code in PR #33630: URL: https://github.com/apache/doris/pull/33630#discussion_r1565545970

## fe/fe-core/src/main/java/org/apache/doris/analysis/StmtRewriter.java: ##
@@ -1365,4 +1370,116 @@ public static boolean rewriteByPolicy(StatementBase statementBase, Analyzer anal
         }
         return reAnalyze;
     }
+
+    /**
+     *
+     * @param ref the SlotRef to rewrite
+     * @param selectList new selectList for selectStmt
+     * @param groupByExprs group by Exprs for selectStmt
+     * @return true if ref can be rewritten
+     */
+    private static boolean rewriteSelectList(SlotRef ref, SelectList selectList, ArrayList groupByExprs,
+            ArrayList aggExprs) {
+        Column column = ref.getColumn();
+        if (column.isKey()) {
+            selectList.addItem(new SelectListItem(ref, null));
+            groupByExprs.add(ref);
+            return true;
+        } else {
+            AggregateType aggregateType = column.getAggregationType();
+            if (aggregateType != AggregateType.SUM && aggregateType != AggregateType.MAX
+                    && aggregateType != AggregateType.MIN) {
+                return false;
+            } else {
+                FunctionName funcName = new FunctionName(aggregateType.toString().toLowerCase());
+                List arrayList = Lists.newArrayList(ref);
+                FunctionCallExpr func = new FunctionCallExpr(funcName, new FunctionParams(false, arrayList));
+                selectList.addItem(new SelectListItem(func, null));
+                aggExprs.add(func);
+                return true;
+            }
+        }
+    }
+
+    public static boolean rewriteForRandomDistribution(StatementBase statementBase, Analyzer analyzer)

Review Comment:
   done
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on code in PR #33630: URL: https://github.com/apache/doris/pull/33630#discussion_r1565545630

## fe/fe-core/src/main/java/org/apache/doris/qe/StmtExecutor.java: ##
@@ -1426,21 +1426,24 @@ private void analyzeAndGenerateQueryPlan(TQueryOptions tQueryOptions) throws Use
                 reAnalyze = true;
             }
             if (parsedStmt instanceof SelectStmt) {
-                if (StmtRewriter.rewriteByPolicy(parsedStmt, analyzer)) {
+                if (StmtRewriter.rewriteByPolicy(parsedStmt, analyzer)
+                        || StmtRewriter.rewriteForRandomDistribution(parsedStmt, analyzer)) {
                     reAnalyze = true;
                 }
             }
             if (parsedStmt instanceof SetOperationStmt) {
                 List operands = ((SetOperationStmt) parsedStmt).getOperands();
                 for (SetOperationStmt.SetOperand operand : operands) {
-                    if (StmtRewriter.rewriteByPolicy(operand.getQueryStmt(), analyzer)) {
+                    if (StmtRewriter.rewriteByPolicy(operand.getQueryStmt(), analyzer)
+                            || StmtRewriter.rewriteForRandomDistribution(operand.getQueryStmt(), analyzer)) {
                         reAnalyze = true;
                     }
                 }
             }
             if (parsedStmt instanceof InsertStmt) {
                 QueryStmt queryStmt = ((InsertStmt) parsedStmt).getQueryStmt();
-                if (queryStmt != null && StmtRewriter.rewriteByPolicy(queryStmt, analyzer)) {
+                if (queryStmt != null && StmtRewriter.rewriteByPolicy(queryStmt, analyzer)
+                        || StmtRewriter.rewriteForRandomDistribution(queryStmt, analyzer)) {

Review Comment:
   fixed
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on code in PR #33630: URL: https://github.com/apache/doris/pull/33630#discussion_r1565544780

## fe/fe-core/src/main/java/org/apache/doris/analysis/StmtRewriter.java: ##
@@ -1365,4 +1370,116 @@ public static boolean rewriteByPolicy(StatementBase statementBase, Analyzer anal
         }
         return reAnalyze;
     }
+
+    /**
+     *
+     * @param ref the SlotRef to rewrite
+     * @param selectList new selectList for selectStmt
+     * @param groupByExprs group by Exprs for selectStmt
+     * @return true if ref can be rewritten
+     */
+    private static boolean rewriteSelectList(SlotRef ref, SelectList selectList, ArrayList groupByExprs,
+            ArrayList aggExprs) {
+        Column column = ref.getColumn();
+        if (column.isKey()) {
+            selectList.addItem(new SelectListItem(ref, null));
+            groupByExprs.add(ref);
+            return true;
+        } else {
+            AggregateType aggregateType = column.getAggregationType();
+            if (aggregateType != AggregateType.SUM && aggregateType != AggregateType.MAX
+                    && aggregateType != AggregateType.MIN) {
+                return false;
+            } else {
+                FunctionName funcName = new FunctionName(aggregateType.toString().toLowerCase());
+                List arrayList = Lists.newArrayList(ref);
+                FunctionCallExpr func = new FunctionCallExpr(funcName, new FunctionParams(false, arrayList));
+                selectList.addItem(new SelectListItem(func, null));
+                aggExprs.add(func);
+                return true;
+            }
+        }
+    }
+
+    public static boolean rewriteForRandomDistribution(StatementBase statementBase, Analyzer analyzer)
+            throws UserException {
+        boolean reAnalyze = false;
+        if (!(statementBase instanceof SelectStmt)) {
+            return false;
+        }
+        SelectStmt selectStmt = (SelectStmt) statementBase;
+        for (int i = 0; i < selectStmt.fromClause.size(); i++) {
+            TableRef tableRef = selectStmt.fromClause.get(i);
+            // Recursively rewrite subquery
+            if (tableRef instanceof InlineViewRef) {
+                InlineViewRef viewRef = (InlineViewRef) tableRef;
+                if (rewriteForRandomDistribution(viewRef.getQueryStmt(), viewRef.getAnalyzer())) {
+                    reAnalyze = true;
+                }
+                continue;
+            }
+            // already has agg and group by info
+            if (selectStmt.hasAggInfo() && selectStmt.hasGroupByClause()) {
+                continue;
+            }
+            TableIf table = tableRef.getTable();
+            if (!(table instanceof OlapTable)) {
+                continue;
+            }
+            OlapTable olapTable = (OlapTable) table;
+            if (olapTable.getKeysType() != KeysType.AGG_KEYS) {
+                continue;
+            }
+            DistributionInfo distributionInfo = olapTable.getDefaultDistributionInfo();
+            if (distributionInfo.getType() != DistributionInfo.DistributionInfoType.RANDOM) {
+                continue;
+            }
+
+            SelectList selectList = selectStmt.getSelectList();
+            SelectList newSelectList = new SelectList();
+            ArrayList groupingExprs = new ArrayList<>();
+            ArrayList aggExprs = new ArrayList<>();
+            boolean canRewrite = true;
+            for (SelectListItem item : selectList.getItems()) {
+                if (item.isStar()) {
+                    TupleDescriptor desc = tableRef.getDesc();
+                    for (Column col : desc.getTable().getBaseSchema()) {
+                        SlotRef slot = new SlotRef(null, col.getName());
+                        slot.setTable(desc.getTable());
+                        slot.setTupleId(desc.getId());
+                        slot.setDesc(desc.getColumnSlot(col.getName()));
+                        if (!rewriteSelectList(slot, newSelectList, groupingExprs, aggExprs)) {
+                            canRewrite = false;
+                            break;
+                        }
+                    }
+                    if (!canRewrite) {
+                        break;
+                    }
+                } else {
+                    Expr expr = item.getExpr();
+                    // just for SlotRef
+                    if (!(expr instanceof SlotRef)) {
+                        break;

Review Comment:
   I have refactored the rewriting logic: instead of changing the original selectStmt, I now add an aggregation node for pre-aggregation, so the rewrite no longer depends on how complex the expressions in the query are.
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
starocean999 commented on code in PR #33630: URL: https://github.com/apache/doris/pull/33630#discussion_r1565128487

## fe/fe-core/src/main/java/org/apache/doris/qe/StmtExecutor.java: ##
@@ -1426,21 +1426,24 @@ private void analyzeAndGenerateQueryPlan(TQueryOptions tQueryOptions) throws Use
                 reAnalyze = true;
             }
             if (parsedStmt instanceof SelectStmt) {
-                if (StmtRewriter.rewriteByPolicy(parsedStmt, analyzer)) {
+                if (StmtRewriter.rewriteByPolicy(parsedStmt, analyzer)
+                        || StmtRewriter.rewriteForRandomDistribution(parsedStmt, analyzer)) {
                     reAnalyze = true;
                 }
             }
             if (parsedStmt instanceof SetOperationStmt) {
                 List operands = ((SetOperationStmt) parsedStmt).getOperands();
                 for (SetOperationStmt.SetOperand operand : operands) {
-                    if (StmtRewriter.rewriteByPolicy(operand.getQueryStmt(), analyzer)) {
+                    if (StmtRewriter.rewriteByPolicy(operand.getQueryStmt(), analyzer)
+                            || StmtRewriter.rewriteForRandomDistribution(operand.getQueryStmt(), analyzer)) {
                         reAnalyze = true;
                     }
                 }
             }
             if (parsedStmt instanceof InsertStmt) {
                 QueryStmt queryStmt = ((InsertStmt) parsedStmt).getQueryStmt();
-                if (queryStmt != null && StmtRewriter.rewriteByPolicy(queryStmt, analyzer)) {
+                if (queryStmt != null && StmtRewriter.rewriteByPolicy(queryStmt, analyzer)
+                        || StmtRewriter.rewriteForRandomDistribution(queryStmt, analyzer)) {

Review Comment:
   queryStmt != null && (StmtRewriter.rewriteByPolicy(queryStmt, analyzer) || StmtRewriter.rewriteForRandomDistribution(queryStmt, analyzer))
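The parentheses requested in this review matter because Java's `&&` binds tighter than `||`, so `a != null && f(a) || g(a)` is parsed as `(a != null && f(a)) || g(a)` and `g` still runs when `a` is null. A minimal sketch of that effect follows; `NullGuardPrecedenceSketch`, `firstRewrite` and `secondRewrite` are hypothetical stand-ins rather than the Doris methods, and the second one counts null arguments instead of throwing.

```java
// Illustrative only: shows how the null guard is bypassed without parentheses.
class NullGuardPrecedenceSketch {

    /** Counts how often the second rewrite was invoked with a null statement. */
    static int nullCalls = 0;

    // Hypothetical stand-in for the first rewrite; always reports "nothing rewritten".
    static boolean firstRewrite(Object stmt) {
        return false;
    }

    // Hypothetical stand-in for the second rewrite; records null arguments.
    static boolean secondRewrite(Object stmt) {
        if (stmt == null) {
            nullCalls++;
            return false;
        }
        return true;
    }

    // Shape of the condition as submitted: parsed as (guard && first) || second.
    static boolean unguarded(Object stmt) {
        return stmt != null && firstRewrite(stmt) || secondRewrite(stmt);
    }

    // Shape suggested in the review: guard && (first || second).
    static boolean guarded(Object stmt) {
        return stmt != null && (firstRewrite(stmt) || secondRewrite(stmt));
    }

    public static void main(String[] args) {
        unguarded(null);
        System.out.println("null calls after unguarded(null): " + nullCalls);
        guarded(null);
        System.out.println("null calls after guarded(null): " + nullCalls);
    }
}
```

With a non-null statement both forms behave the same; only the null case differs, which is the case the guard exists for.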
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
starocean999 commented on code in PR #33630:
URL: https://github.com/apache/doris/pull/33630#discussion_r1565128341

## fe/fe-core/src/main/java/org/apache/doris/analysis/StmtRewriter.java:
##
@@ -1365,4 +1370,116 @@ public static boolean rewriteByPolicy(StatementBase statementBase, Analyzer anal
         }
         return reAnalyze;
     }
+
+    /**
+     *
+     * @param ref the SlotRef to rewrite
+     * @param selectList new selectList for selectStmt
+     * @param groupByExprs group by Exprs for selectStmt
+     * @return true if ref can be rewritten
+     */
+    private static boolean rewriteSelectList(SlotRef ref, SelectList selectList, ArrayList groupByExprs,
+            ArrayList aggExprs) {
+        Column column = ref.getColumn();
+        if (column.isKey()) {
+            selectList.addItem(new SelectListItem(ref, null));
+            groupByExprs.add(ref);
+            return true;
+        } else {
+            AggregateType aggregateType = column.getAggregationType();
+            if (aggregateType != AggregateType.SUM && aggregateType != AggregateType.MAX
+                    && aggregateType != AggregateType.MIN) {
+                return false;
+            } else {
+                FunctionName funcName = new FunctionName(aggregateType.toString().toLowerCase());
+                List arrayList = Lists.newArrayList(ref);
+                FunctionCallExpr func = new FunctionCallExpr(funcName, new FunctionParams(false, arrayList));
+                selectList.addItem(new SelectListItem(func, null));
+                aggExprs.add(func);
+                return true;
+            }
+        }
+    }
+
+    public static boolean rewriteForRandomDistribution(StatementBase statementBase, Analyzer analyzer)
+            throws UserException {
+        boolean reAnalyze = false;
+        if (!(statementBase instanceof SelectStmt)) {
+            return false;
+        }
+        SelectStmt selectStmt = (SelectStmt) statementBase;
+        for (int i = 0; i < selectStmt.fromClause.size(); i++) {
+            TableRef tableRef = selectStmt.fromClause.get(i);
+            // Recursively rewrite subquery
+            if (tableRef instanceof InlineViewRef) {
+                InlineViewRef viewRef = (InlineViewRef) tableRef;
+                if (rewriteForRandomDistribution(viewRef.getQueryStmt(), viewRef.getAnalyzer())) {
+                    reAnalyze = true;
+                }
+                continue;
+            }
+            // already has agg and group by info
+            if (selectStmt.hasAggInfo() && selectStmt.hasGroupByClause()) {
+                continue;
+            }
+            TableIf table = tableRef.getTable();
+            if (!(table instanceof OlapTable)) {
+                continue;
+            }
+            OlapTable olapTable = (OlapTable) table;
+            if (olapTable.getKeysType() != KeysType.AGG_KEYS) {
+                continue;
+            }
+            DistributionInfo distributionInfo = olapTable.getDefaultDistributionInfo();
+            if (distributionInfo.getType() != DistributionInfo.DistributionInfoType.RANDOM) {
+                continue;
+            }
+
+            SelectList selectList = selectStmt.getSelectList();
+            SelectList newSelectList = new SelectList();
+            ArrayList groupingExprs = new ArrayList<>();
+            ArrayList aggExprs = new ArrayList<>();
+            boolean canRewrite = true;
+            for (SelectListItem item : selectList.getItems()) {
+                if (item.isStar()) {
+                    TupleDescriptor desc = tableRef.getDesc();
+                    for (Column col : desc.getTable().getBaseSchema()) {
+                        SlotRef slot = new SlotRef(null, col.getName());
+                        slot.setTable(desc.getTable());
+                        slot.setTupleId(desc.getId());
+                        slot.setDesc(desc.getColumnSlot(col.getName()));
+                        if (!rewriteSelectList(slot, newSelectList, groupingExprs, aggExprs)) {
+                            canRewrite = false;
+                            break;
+                        }
+                    }
+                    if (!canRewrite) {
+                        break;
+                    }
+                } else {
+                    Expr expr = item.getExpr();
+                    // just for SlotRef
+                    if (!(expr instanceof SlotRef)) {
+                        break;

Review Comment:
A plain `break` is not enough here. Try sql` select citycode, username, pv, siteid + 1 from ${tableName} order by siteid;`
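One way to read this comment: in the quoted loop, the `break` for a non-`SlotRef` expression leaves `canRewrite` set to `true`, unlike the star branch. The sketch below models that with plain strings. It assumes the code after the loop (not shown in the quoted hunk) consults only `canRewrite`; `isPlainColumn` is a hypothetical stand-in for the `instanceof SlotRef` check, and none of these names come from the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class BreakWithoutFlag {
    // Hypothetical stand-in for "expr instanceof SlotRef": a bare column name.
    static boolean isPlainColumn(String item) {
        return item.matches("[A-Za-z_][A-Za-z0-9_]*");
    }

    // Mirrors the quoted loop: break exits early but canRewrite stays true,
    // so a truncated select list is returned as if the rewrite had succeeded.
    static Optional<List<String>> buggy(List<String> items) {
        List<String> rewritten = new ArrayList<>();
        boolean canRewrite = true;
        for (String item : items) {
            if (!isPlainColumn(item)) {
                break;
            }
            rewritten.add(item);
        }
        return canRewrite ? Optional.of(rewritten) : Optional.empty();
    }

    // Records the failure before leaving the loop, so no rewrite happens.
    static Optional<List<String>> fixed(List<String> items) {
        List<String> rewritten = new ArrayList<>();
        boolean canRewrite = true;
        for (String item : items) {
            if (!isPlainColumn(item)) {
                canRewrite = false;
                break;
            }
            rewritten.add(item);
        }
        return canRewrite ? Optional.of(rewritten) : Optional.empty();
    }

    public static void main(String[] args) {
        // Select list from the reviewer's example query.
        List<String> items = List.of("citycode", "username", "pv", "siteid + 1");
        System.out.println(buggy(items)); // Optional[[citycode, username, pv]]
        System.out.println(fixed(items)); // Optional.empty
    }
}
```

In the buggy variant the `siteid + 1` item silently disappears from the rewritten list, which is the kind of failure the reviewer's query would expose under the stated assumption.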
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
morrySnow commented on code in PR #33630:
URL: https://github.com/apache/doris/pull/33630#discussion_r1565117424

## fe/fe-core/src/main/java/org/apache/doris/analysis/StmtRewriter.java:
##
@@ -1365,4 +1370,116 @@ public static boolean rewriteByPolicy(StatementBase statementBase, Analyzer anal
         }
         return reAnalyze;
     }
+
+    /**
+     *
+     * @param ref the SlotRef to rewrite
+     * @param selectList new selectList for selectStmt
+     * @param groupByExprs group by Exprs for selectStmt
+     * @return true if ref can be rewritten
+     */
+    private static boolean rewriteSelectList(SlotRef ref, SelectList selectList, ArrayList groupByExprs,
+            ArrayList aggExprs) {
+        Column column = ref.getColumn();
+        if (column.isKey()) {
+            selectList.addItem(new SelectListItem(ref, null));
+            groupByExprs.add(ref);
+            return true;
+        } else {
+            AggregateType aggregateType = column.getAggregationType();
+            if (aggregateType != AggregateType.SUM && aggregateType != AggregateType.MAX
+                    && aggregateType != AggregateType.MIN) {
+                return false;
+            } else {
+                FunctionName funcName = new FunctionName(aggregateType.toString().toLowerCase());
+                List arrayList = Lists.newArrayList(ref);
+                FunctionCallExpr func = new FunctionCallExpr(funcName, new FunctionParams(false, arrayList));
+                selectList.addItem(new SelectListItem(func, null));
+                aggExprs.add(func);
+                return true;
+            }
+        }
+    }
+
+    public static boolean rewriteForRandomDistribution(StatementBase statementBase, Analyzer analyzer)

Review Comment:
Please add a comment explaining what this function does.
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2054015774

run p0
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053970302

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
```
Load test result on commit a4f6fb1d9f0301fee11291ac546a958d6a2b3089 with default session variables
Stream load json:     18 seconds loaded 2358488459 Bytes, about 124 MB/s
Stream load orc:      58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:  33 seconds loaded 861443392 Bytes, about 24 MB/s
Insert into select:   13.6 seconds inserted 1000 Rows, about 735K ops/s
```
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053969446

ClickBench: Total hot run time: 30.35 s
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit a4f6fb1d9f0301fee11291ac546a958d6a2b3089, data reload: false

query1   0.04  0.03  0.03
query2   0.08  0.03  0.03
query3   0.23  0.04  0.05
query4   1.68  0.06  0.07
query5   0.50  0.47  0.48
query6   1.44  0.64  0.66
query7   0.02  0.01  0.01
query8   0.05  0.04  0.05
query9   0.56  0.50  0.49
query10  0.56  0.57  0.53
query11  0.15  0.11  0.12
query12  0.15  0.12  0.12
query13  0.61  0.59  0.59
query14  0.75  0.76  0.77
query15  0.82  0.80  0.80
query16  0.40  0.36  0.37
query17  1.01  1.01  1.00
query18  0.21  0.26  0.21
query19  1.84  1.80  1.67
query20  0.01  0.00  0.01
query21  15.40  0.64  0.64
query22  4.46  6.14  2.22
query23  18.33  1.38  1.24
query24  1.76  0.27  0.21
query25  0.14  0.08  0.08
query26  0.27  0.16  0.16
query27  0.07  0.08  0.07
query28  13.42  0.99  1.00
query29  12.61  3.29  3.28
query30  0.27  0.06  0.05
query31  2.88  0.37  0.37
query32  3.28  0.47  0.46
query33  2.84  2.76  2.82
query34  17.07  4.40  4.38
query35  4.46  4.46  4.43
query36  0.63  0.48  0.46
query37  0.19  0.16  0.16
query38  0.16  0.14  0.15
query39  0.04  0.04  0.04
query40  0.18  0.15  0.14
query41  0.09  0.04  0.05
query42  0.05  0.05  0.04
query43  0.04  0.04  0.03
Total cold run time: 109.75 s
Total hot run time: 30.35 s
```
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053967782

TPC-DS: Total hot run time: 183526 ms
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a4f6fb1d9f0301fee11291ac546a958d6a2b3089, data reload: false

query1   868  1117  1132  1117
query2   7448  2552  2323  2323
query3   6656  211  201  201
query4   37017  21437  21343  21343
query5   4151  390  388  388
query6   230  179  175  175
query7   4030  282  282  282
query8   212  170  172  170
query9   5771  2282  2259  2259
query10  361  232  241  232
query11  14803  14219  14275  14219
query12  135  93  86  86
query13  988  361  365  361
query14  8690  6907  6867  6867
query15  216  182  174  174
query16  7235  272  254  254
query17  1688  594  557  557
query18  1474  299  273  273
query19  210  156  154  154
query20  94  87  87  87
query21  198  131  125  125
query22  5045  4921  4851  4851
query23  33767  32849  33381  32849
query24  11203  2986  3002  2986
query25  568  420  404  404
query26  908  160  158  158
query27  3067  361  358  358
query28  6646  2078  2104  2078
query29  886  659  634  634
query30  284  177  173  173
query31  954  737  750  737
query32  63  120  54  54
query33  559  244  244  244
query34  892  485  505  485
query35  829  691  717  691
query36  1055  946  938  938
query37  108  70  73  70
query38  3666  3581  3539  3539
query39  1628  1568  1551  1551
query40  185  128  128  128
query41  47  44  43  43
query42  104  97  97  97
query43  579  541  541  541
query44  1316  734  723  723
query45  296  300  255  255
query46  1080  731  741  731
query47  2029  2006  1973  1973
query48  369  294  295  294
query49  834  379  364  364
query50  796  395  386  386
query51  6869  6810  6824  6810
query52  101  83  90  83
query53  340  282  276  276
query54  254  216  225  216
query55  72  70  69  69
query56  235  216  224  216
query57  1235  1132  1139  1132
query58  214  199  198  198
query59  3207  3348  3056  3056
query60  259  256  236  236
query61  111  90  86  86
query62  593  439  427  427
query63  307  273  271  271
query64  3961  4094  3974  3974
query65  3068  2998  3018  2998
query66  721  310  364  310
query67  15939  15033  14787  14787
query68  8820  543  550  543
query69  604  299  309  299
query70  1277  1206  1139  1139
query71  499  281  262  262
query72  6842  2644  2429  2429
query73  903  316  317  316
query74  7132  6367  6443  6367
query75  3512  2395  2317  2317
query76  5323  1156  1165  1156
query77  620  250  249  249
query78  10946  10372  10101  10101
query79  9868  520  514  514
query80  1983  458  420  420
query81  509  232  226  226
query82  818  91  91  91
query83  212  165  166  165
query84  258  84  79  79
query85  992  314  266  266
query86  420  293  315  293
query87  3739  3479  3492  3479
query88  5993  2264  2362  2264
query89  526  366  370  366
query90  1979  177  172  172
query91  119  96  98  96
query92  64  48  47  47
query93  6661  500  499  499
query94  1125  176  178  176
query95  378  286  285  285
query96  601  256  259  256
query97  2659  2448  2459  2448
query98  229  224  219  219
query99  1182  860  887  860
Total cold run time: 306517 ms
Total hot run time: 183526 ms
```
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053963449

TPC-H: Total hot run time: 38061 ms
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a4f6fb1d9f0301fee11291ac546a958d6a2b3089, data reload: false

-- Round 1 --
q1   17609  4290  4225  4225
q2   2005  184  182  182
q3   10466  1149  1211  1149
q4   10191  726  829  726
q5   7542  2688  2641  2641
q6   219  132  130  130
q7   991  597  571  571
q8   9224  2044  2032  2032
q9   7903  6515  6504  6504
q10  8552  3536  3508  3508
q11  470  240  229  229
q12  472  220  205  205
q13  17783  2908  2929  2908
q14  272  220  233  220
q15  515  487  473  473
q16  502  373  377  373
q17  951  618  677  618
q18  7354  6662  6546  6546
q19  5423  1529  1486  1486
q20  692  314  308  308
q21  3467  2732  2836  2732
q22  364  295  304  295
Total cold run time: 112967 ms
Total hot run time: 38061 ms

-- Round 2, with runtime_filter_mode=off --
q1   4376  4277  4252  4252
q2   367  255  271  255
q3   3015  2756  2781  2756
q4   1861  1578  1557  1557
q5   5328  5316  5277  5277
q6   210  123  124  123
q7   2230  1879  1902  1879
q8   3186  3345  3318  3318
q9   8564  8533  8557  8533
q10  4093  3938  3920  3920
q11  641  521  510  510
q12  779  670  643  643
q13  17814  3270  3021  3021
q14  321  306  302  302
q15  519  480  472  472
q16  504  442  445  442
q17  1827  1530  1527  1527
q18  8113  7887  7856  7856
q19  1650  1554  1618  1554
q20  2038  1859  1816  1816
q21  5227  4892  4945  4892
q22  524  455  468  455
Total cold run time: 73187 ms
Total hot run time: 55360 ms
```
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053951296

run buildall
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053932802

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
```
Load test result on commit 3daf04bc6a177cca660cf2ba968ad577a3cd58da with default session variables
Stream load json:     18 seconds loaded 2358488459 Bytes, about 124 MB/s
Stream load orc:      58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:  32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:   13.3 seconds inserted 1000 Rows, about 751K ops/s
```
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053932055

ClickBench: Total hot run time: 30.2 s
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 3daf04bc6a177cca660cf2ba968ad577a3cd58da, data reload: false

query1   0.04  0.04  0.03
query2   0.07  0.04  0.04
query3   0.23  0.06  0.06
query4   1.66  0.09  0.10
query5   0.51  0.49  0.51
query6   1.43  0.64  0.65
query7   0.02  0.01  0.02
query8   0.04  0.04  0.04
query9   0.55  0.50  0.51
query10  0.55  0.55  0.56
query11  0.15  0.11  0.11
query12  0.14  0.11  0.12
query13  0.62  0.59  0.58
query14  0.76  0.77  0.77
query15  0.81  0.81  0.81
query16  0.37  0.36  0.38
query17  0.97  1.01  1.03
query18  0.22  0.24  0.22
query19  1.78  1.67  1.66
query20  0.01  0.01  0.01
query21  15.41  0.66  0.64
query22  4.37  7.44  1.90
query23  18.29  1.48  1.26
query24  2.13  0.22  0.21
query25  0.14  0.08  0.08
query26  0.27  0.16  0.15
query27  0.08  0.08  0.08
query28  13.35  1.00  0.99
query29  12.58  3.28  3.28
query30  0.27  0.05  0.06
query31  2.95  0.38  0.37
query32  3.22  0.47  0.46
query33  2.84  2.81  2.82
query34  17.28  4.34  4.43
query35  4.52  4.49  4.43
query36  0.65  0.46  0.46
query37  0.17  0.16  0.15
query38  0.15  0.15  0.14
query39  0.04  0.04  0.04
query40  0.17  0.14  0.14
query41  0.10  0.05  0.05
query42  0.06  0.05  0.05
query43  0.04  0.03  0.04
Total cold run time: 110.01 s
Total hot run time: 30.2 s
```
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053930394

TPC-DS: Total hot run time: 184601 ms
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3daf04bc6a177cca660cf2ba968ad577a3cd58da, data reload: false

query1   1235  1112  1116  1112
query2   6194  2771  2507  2507
query3   6648  208  203  203
query4   36767  21653  21462  21462
query5   4159  388  399  388
query6   240  189  182  182
query7   4046  294  284  284
query8   223  171  167  167
query9   5770  2355  2292  2292
query10  361  239  241  239
query11  14578  14217  14136  14136
query12  141  90  83  83
query13  991  353  349  349
query14  9951  6838  6956  6838
query15  205  174  184  174
query16  6869  258  262  258
query17  1683  542  550  542
query18  1530  273  269  269
query19  185  149  151  149
query20  91  85  85  85
query21  198  129  121  121
query22  4988  4835  4804  4804
query23  33760  33118  33456  33118
query24  11198  2959  2984  2959
query25  531  410  372  372
query26  810  172  153  153
query27  3085  360  363  360
query28  6614  2118  2075  2075
query29  877  644  614  614
query30  311  182  167  167
query31  967  774  738  738
query32  59  54  51  51
query33  516  256  257  256
query34  931  500  493  493
query35  857  723  715  715
query36  1057  976  988  976
query37  116  73  76  73
query38  3728  3569  3651  3569
query39  1643  1577  1573  1573
query40  172  134  128  128
query41  46  44  46  44
query42  104  96  96  96
query43  606  547  578  547
query44  1356  727  713  713
query45  277  249  279  249
query46  1081  744  729  729
query47  2024  1957  1978  1957
query48  377  301  298  298
query49  832  373  357  357
query50  779  396  401  396
query51  6975  6858  6752  6752
query52  107  89  89  89
query53  341  277  279  277
query54  245  219  220  219
query55  73  70  70  70
query56  239  218  227  218
query57  1214  1143  1115  1115
query58  216  198  195  195
query59  3409  3381  3245  3245
query60  243  234  228  228
query61  92  87  102  87
query62  603  453  443  443
query63  304  283  283  283
query64  4147  3929  4197  3929
query65  3066  3029  3037  3029
query66  750  321  322  321
query67  15645  14889  14908  14889
query68  7022  550  543  543
query69  531  313  305  305
query70  1298  1190  1131  1131
query71  472  282  278  278
query72  6585  2732  2562  2562
query73  823  322  320  320
query74  7145  6410  6414  6410
query75  3145  2411  2360  2360
query76  4210  1135  1116  1116
query77  595  263  253  253
query78  10973  10162  10192  10162
query79  7357  522  529  522
query80  2113  442  445  442
query81  527  235  240  235
query82  1575  97  95  95
query83  338  174  169  169
query84  267  84  85  84
query85  1432  312  307  307
query86  463  297  278  278
query87  3788  3503  3528  3503
query88  6153  2263  2280  2263
query89  479  377  377  377
query90  1959  175  174  174
query91  121  95  96  95
query92  56  46  47  46
query93  6345  516  507  507
query94  1125  185  179  179
query95  381  289  290  289
query96  597  257  259  257
query97  2641  2490  2465  2465
query98  231  222  223  222
query99  1213  843  855  843
Total cold run time: 301396 ms
Total hot run time: 184601 ms
```
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053927326

TPC-H: Total hot run time: 38165 ms
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3daf04bc6a177cca660cf2ba968ad577a3cd58da, data reload: false

-- Round 1 --
q1   17620  4263  4174  4174
q2   2010  188  180  180
q3   10472  1160  1148  1148
q4   10197  833  709  709
q5   7524  2678  2630  2630
q6   212  129  128  128
q7   976  599  583  583
q8   9214  2044  2038  2038
q9   7966  6575  6473  6473
q10  8579  3553  3480  3480
q11  457  239  228  228
q12  422  219  211  211
q13  18618  2924  2917  2917
q14  271  224  234  224
q15  523  481  480  480
q16  512  399  372  372
q17  958  732  745  732
q18  7320  6770  6684  6684
q19  5811  1538  1500  1500
q20  686  311  305  305
q21  3546  2661  2837  2661
q22  363  315  308  308
Total cold run time: 114257 ms
Total hot run time: 38165 ms

-- Round 2, with runtime_filter_mode=off --
q1   4439  4262  4254  4254
q2   371  271  267  267
q3   2977  2742  2773  2742
q4   1850  1555  1584  1555
q5   5359  5323  5325  5323
q6   209  122  121  121
q7   2247  1882  1911  1882
q8   3189  3312  3307  3307
q9   8594  8503  8657  8503
q10  4071  3921  3977  3921
q11  617  507  506  506
q12  794  623  650  623
q13  16820  3243  3156  3156
q14  317  276  319  276
q15  523  483  461  461
q16  511  462  444  444
q17  1830  1554  1487  1487
q18  8061  8208  7718  7718
q19  1680  1544  1590  1544
q20  2041  1872  1858  1858
q21  5136  4974  4843  4843
q22  575  504  473  473
Total cold run time: 72211 ms
Total hot run time: 55264 ms
```
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
DarvenDuan commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053916849

run buildall
Re: [PR] [fix] (planner) support auto aggregation for random distributed table on legacy planner [doris]
doris-robot commented on PR #33630:
URL: https://github.com/apache/doris/pull/33630#issuecomment-2053915942

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See [How to process your PR](https://cwiki.apache.org/confluence/display/DORIS/How+to+process+your+PR)

Since 2024-03-18, the Document has been moved to [doris-website](https://github.com/apache/doris-website).
See [Doris Document](https://cwiki.apache.org/confluence/display/DORIS/Doris+Document).