Hello Alexey Serbin, Kurt Deschler, Wenzhe Zhou, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/18536
to look at the new patch set (#5).
Change subject: IMPALA-10465: Use IGNORE variant of Kudu write operations
......................................................................
IMPALA-10465: Use IGNORE variant of Kudu write operations
KUDU-1563 added support for INSERT_IGNORE, UPDATE_IGNORE, and
DELETE_IGNORE to handle cases where users want to ignore primary key
errors efficiently. Impala already ignores these errors for its
INSERT behavior, but it does so by discarding the per-row errors on the
Kudu client side. This requires a large error buffer (which may need to
be expanded in rare cases) to log warning messages that users often do
not care about, and it causes significant RPC overhead.
This patch changes the Kudu write operations issued by Impala to use
INSERT_IGNORE, UPDATE_IGNORE, and DELETE_IGNORE if the Kudu master
server supports them and the backend flag "kudu_ignore_conflicts" is
true.
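The selection rule described above can be sketched as follows. This is
only an illustration in Python, not the code in kudu-table-sink.cc:
the WriteOp enum, choose_write_op(), and the master_supports_ignore
parameter are hypothetical names; only the operation names and the
"kudu_ignore_conflicts" flag come from this change description.

```python
from enum import Enum


class WriteOp(Enum):
    """Kudu write operation kinds named in the change description."""
    INSERT = "INSERT"
    UPDATE = "UPDATE"
    DELETE = "DELETE"
    INSERT_IGNORE = "INSERT_IGNORE"
    UPDATE_IGNORE = "UPDATE_IGNORE"
    DELETE_IGNORE = "DELETE_IGNORE"


# Hypothetical lookup from each regular operation to its IGNORE variant.
_IGNORE_VARIANT = {
    WriteOp.INSERT: WriteOp.INSERT_IGNORE,
    WriteOp.UPDATE: WriteOp.UPDATE_IGNORE,
    WriteOp.DELETE: WriteOp.DELETE_IGNORE,
}


def choose_write_op(op, master_supports_ignore, kudu_ignore_conflicts):
    """Return the IGNORE variant only when both conditions hold.

    master_supports_ignore: assumed result of asking the Kudu master
        whether it supports the IGNORE operations.
    kudu_ignore_conflicts: value of the backend flag of the same name.
    """
    if master_supports_ignore and kudu_ignore_conflicts:
        return _IGNORE_VARIANT.get(op, op)
    # Otherwise keep the regular operation.
    return op
```

For example, choose_write_op(WriteOp.INSERT, True, True) yields
WriteOp.INSERT_IGNORE, while either condition being false leaves the
operation as WriteOp.INSERT.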
The table below shows the performance difference after the patch, from
running the kudu-ignore workload for 9 iterations:
+----------------------+--------+-------------+------------+------------+----------------+
| Query                | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) |
+----------------------+--------+-------------+------------+------------+----------------+
| KUDU-IGNORE-1-CREATE | 3.61   | 3.16        | +13.99%    | * 33.84% * | * 38.04% *     |
| KUDU-IGNORE-3-UPDATE | 30.06  | 30.52       | -1.53%     | 0.18%      | 0.58%          |
| KUDU-IGNORE-0-DROP   | 0.17   | 0.18        | -6.63%     | * 14.22% * | * 13.17% *     |
| KUDU-IGNORE-4-DELETE | 30.21  | 32.19       | -6.16%     | 8.22%      | * 10.09% *     |
| KUDU-IGNORE-2-INSERT | 48.91  | 71.09       | -31.20%    | 0.60%      | 0.72%          |
+----------------------+--------+-------------+------------+------------+----------------+
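As a rough sanity check of the Delta(Avg) column, the sketch below
assumes it is the relative change of Avg(s) against Base Avg(s). That
formula and the helper name delta_avg_pct() are assumptions, not taken
from the perf tooling. Because the table shows rounded averages,
recomputing other rows this way may differ slightly from the reported
values.

```python
def delta_avg_pct(avg_s, base_avg_s):
    """Relative change of avg_s against base_avg_s, in percent."""
    return (avg_s - base_avg_s) / base_avg_s * 100.0


# KUDU-IGNORE-2-INSERT row: Avg(s) = 48.91, Base Avg(s) = 71.09.
insert_delta = delta_avg_pct(48.91, 71.09)
print(f"{insert_delta:+.2f}%")  # prints -31.20%
```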
Additionally, this patch adds a 'workload_iterations' option to
single_node_perf_run.py to loop a workload, and modifies
DDL_CRUD_PATTERN in query_executor.py so that it does not trigger
EXPLAIN for a regular CREATE query (non-CTAS).
Testing:
- Pass core tests.
Change-Id: I8da7c41d61b0888378b390b8b643238433eb3b52
---
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M bin/single_node_perf_run.py
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
A testdata/workloads/targeted-perf/queries/kudu-ignore.test
M tests/custom_cluster/test_kudu.py
M tests/performance/query_executor.py
7 files changed, 254 insertions(+), 29 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/18536/5
--
To view, visit http://gerrit.cloudera.org:8080/18536
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8da7c41d61b0888378b390b8b643238433eb3b52
Gerrit-Change-Number: 18536
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>