[ https://issues.apache.org/jira/browse/IMPALA-9792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119137#comment-17119137 ]
Tim Armstrong commented on IMPALA-9792:
---------------------------------------
This one-line change to KuduScanNode seems to be enough to get the planner to generate more parallelism:
{noformat}
diff --git a/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java b/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
index 9e17a32..0b2c271 100644
--- a/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
+++ b/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
@@ -261,6 +261,7 @@ public class KuduScanNode extends ScanNode {
     }
     KuduScanTokenBuilder tokenBuilder = client.newScanTokenBuilder(rpcTable);
     tokenBuilder.setProjectedColumnNames(projectedCols);
+    tokenBuilder.setSplitSizeBytes(1024L * 1024L * 8L);
     for (KuduPredicate predicate: kuduPredicates_)
       tokenBuilder.addPredicate(predicate);
     return tokenBuilder.build();
   }
{noformat}
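To get a feel for how the split size in the patch translates into scan tokens, here is a rough back-of-the-envelope sketch. It assumes each tablet yields about ceil(tabletBytes / splitSizeBytes) tokens, which only approximates what Kudu does internally. The class name, the helper methods, and the 12 MiB tablet size are all hypothetical and are not taken from the Impala or Kudu code.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Impala or Kudu: estimates how many scan
// tokens a given split size might produce.
class SplitTokenEstimate {
    // Same value as in the patch: 8 MiB.
    static final long SPLIT_SIZE_BYTES = 1024L * 1024L * 8L;

    // Assumed model: one token per split-size chunk, and at least one token
    // per tablet. A non-positive split size is treated as "no splitting".
    static long estimateTokens(long tabletBytes, long splitSizeBytes) {
        if (splitSizeBytes <= 0) {
            return 1;
        }
        long chunks = (tabletBytes + splitSizeBytes - 1) / splitSizeBytes; // ceiling division
        return Math.max(1, chunks);
    }

    // Sums the per-tablet estimates over all tablets of a table.
    static long estimateTotalTokens(List<Long> tabletSizes, long splitSizeBytes) {
        long total = 0;
        for (long size : tabletSizes) {
            total += estimateTokens(size, splitSizeBytes);
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up input: 9 tablets of 12 MiB each (not measured from tpch_kudu.lineitem).
        List<Long> tablets = Collections.nCopies(9, 12L * 1024L * 1024L);
        System.out.println("tokens without splitting: "
            + estimateTotalTokens(tablets, 0));
        System.out.println("tokens with 8 MiB splits: "
            + estimateTotalTokens(tablets, SPLIT_SIZE_BYTES));
    }
}
```

Under these made-up sizes each tablet is split in two, so the token count doubles. Real token counts depend on the actual on-disk tablet sizes and on how the tablet servers choose split points.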
{noformat}
[localhost:21000] default> set mt_dop=8; select count(*) from tpch_kudu.lineitem; summary;
MT_DOP set to 8
Query: select count(*) from tpch_kudu.lineitem
Query submitted at: 2020-05-28 15:21:01 (Coordinator: http://tarmstrong-box:25000)
Query progress can be monitored at: http://tarmstrong-box:25000/query_plan?query_id=5147b6d4ef07380c:db99791500000000
+----------+
| count(*) |
+----------+
| 6001215  |
+----------+
Fetched 1 row(s) in 6.56s
+---------------------+--------+-------+----------+----------+-------+------------+-----------+---------------+--------------------+
| Operator            | #Hosts | #Inst | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem  | Est. Peak Mem | Detail             |
+---------------------+--------+-------+----------+----------+-------+------------+-----------+---------------+--------------------+
| F01:ROOT            | 1      | 1     | 23.15us  | 23.15us  |       |            | 0 B       | 0 B           |                    |
| 03:AGGREGATE        | 1      | 1     | 330.88us | 330.88us | 1     | 1          | 16.00 KB  | 10.00 MB      | FINALIZE           |
| 02:EXCHANGE         | 1      | 1     | 253.10us | 253.10us | 18    | 1          | 152.00 KB | 16.00 KB      | UNPARTITIONED      |
| F00:EXCHANGE SENDER | 3      | 18    | 135.04us | 431.50us |       |            | 16.00 KB  | 0 B           |                    |
| 01:AGGREGATE        | 3      | 18    | 0ns      | 0ns      | 18    | 1          | 20.00 KB  | 10.00 MB      |                    |
| 00:SCAN KUDU        | 3      | 18    | 82.23ms  | 86.85ms  | 18    | 6.00M      | 0 B       | 384.00 KB     | tpch_kudu.lineitem |
+---------------------+--------+-------+----------+----------+-------+------------+-----------+---------------+--------------------+
[localhost:21000] default> show partitions tpch_kudu.lineitem;
Query: show partitions tpch_kudu.lineitem
+-----------+----------+-----------------+-----------+
| Start Key | Stop Key | Leader Replica  | #Replicas |
+-----------+----------+-----------------+-----------+
|           | 00000001 | 127.0.0.1:31202 | 3         |
| 00000001  | 00000002 | 127.0.0.1:31201 | 3         |
| 00000002  | 00000003 | 127.0.0.1:31200 | 3         |
| 00000003  | 00000004 | 127.0.0.1:31200 | 3         |
| 00000004  | 00000005 | 127.0.0.1:31201 | 3         |
| 00000005  | 00000006 | 127.0.0.1:31202 | 3         |
| 00000006  | 00000007 | 127.0.0.1:31202 | 3         |
| 00000007  | 00000008 | 127.0.0.1:31202 | 3         |
| 00000008  |          | 127.0.0.1:31200 | 3         |
+-----------+----------+-----------------+-----------+
Fetched 9 row(s) in 0.03s
{noformat}
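One way to read the output above: the table has 9 tablets, yet the scan ran with 18 instances across 3 hosts. The sketch below illustrates that arithmetic under a simplified model in which each host runs min(mt_dop, scan ranges assigned to that host) instances. That model, the even spread of ranges across hosts, and every name in the code are assumptions made for illustration, not the actual scheduler logic.

```java
// Hypothetical helper, not part of Impala: estimates scan fragment instances
// under a simplified "min(mt_dop, ranges on host)" model.
class MtDopInstanceEstimate {
    // Instances on one host are capped both by mt_dop and by how many scan
    // ranges the host was given. mt_dop values below 1 are treated as 1.
    static int instancesOnHost(int mtDop, int rangesOnHost) {
        return Math.min(Math.max(1, mtDop), rangesOnHost);
    }

    // Sums the per-host instance counts.
    static int totalInstances(int mtDop, int[] rangesPerHost) {
        int total = 0;
        for (int ranges : rangesPerHost) {
            total += instancesOnHost(mtDop, ranges);
        }
        return total;
    }

    public static void main(String[] args) {
        int mtDop = 8;
        // Assumed even spread: 9 tablets over 3 hosts, one range per tablet.
        int[] unsplit = {3, 3, 3};
        // Assumed even spread after each tablet is split into two ranges.
        int[] split = {6, 6, 6};
        System.out.println("instances, one range per tablet: " + totalInstances(mtDop, unsplit));
        System.out.println("instances, two ranges per tablet: " + totalInstances(mtDop, split));
    }
}
```

In this model, mt_dop=8 cannot be used fully with one range per tablet because each host has only 3 ranges. Splitting the ranges raises the per-host count to 6, which is still below the cap of 8.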
> Split Kudu scan ranges into smaller chunks for greater parallelism
> ------------------------------------------------------------------
>
> Key: IMPALA-9792
> URL: https://issues.apache.org/jira/browse/IMPALA-9792
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Tim Armstrong
> Priority: Major
> Labels: kudu, multithreading
>
> We currently use one thread to scan each tablet, which may underparallelise
> queries in many cases. Kudu added an API in KUDU-2437 and KUDU-2670 to split
> tokens at a finer granularity.
> See
> https://github.com/apache/kudu/commit/22a6faa44364dec3a171ec79c15b814ad9277d8f#diff-a4afa9dba99c7612b2cb9176134ff2b0