Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables
......................................................................


Patch Set 4:

Perf results from running on the 10 node cluster:

For smaller queries that we were already able to handle, there's a 
regression of roughly 7-9% in overall query running time (all averaged over 3 
runs):
240.13s with the patch vs. 224.58s previously for 200m rows inserted
472.97s with the patch vs. 433.05s previously for 400m rows inserted
That's unfortunate, but it can be improved in the future, e.g. by codegen-ing 
the partition function.
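
As a rough check on the size of the regression, the relative slowdown can be 
computed directly from the timings quoted above. This is only a sketch of the 
arithmetic; the names `timings` and `slowdown_pct` are illustrative and not 
part of the patch.

```python
# Timings in seconds (averaged over 3 runs), copied from the figures above.
# Each entry is (with the patch, previously).
timings = {
    "200m rows": (240.13, 224.58),
    "400m rows": (472.97, 433.05),
}


def slowdown_pct(with_patch, before):
    """Increase in running time, as a percentage of the previous time."""
    return (with_patch - before) / before * 100.0


# Round to one decimal place for reporting.
results = {label: round(slowdown_pct(*t), 1) for label, t in timings.items()}

for label, pct in results.items():
    print(f"{label}: {pct}% slower with the patch")
```

With these inputs the slowdown works out to about 6.9% for the 200m row 
insert and about 9.2% for the 400m row insert.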

However, we can now handle significantly larger inserts. Previously I was 
regularly seeing timeouts at > 400m rows, but with the patch I've tested 
inserts of up to 6b rows without any timeouts.

-- 
To view, visit http://gerrit.cloudera.org:8080/6559

Gerrit-MessageType: comment
Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-HasComments: No
