Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables
......................................................................

Patch Set 4:

> > Perf results from running on the 10 node cluster:
> >
> > For smaller queries that we were able to handle previously, there's
> > a regression of about 10% in overall query running time (all
> > averaged over 3 runs):
> > 240.13s with the patch vs. 224.58s previously for 200m rows inserted
> > 472.97s with the patch vs. 433.05s previously for 400m rows inserted
> > That's unfortunate, but can be improved in the future, e.g. by
> > codegen-ing the partition function.
> >
> > But, we can now handle significantly larger inserts - I was seeing
> > timeouts regularly at > 400m rows previously, but with the patch
> > I've tested up to 6b row inserts without any timeouts.
>
> Those are good results, and the perf regression is fairly negligible,
> so let's not worry about that for now.

Currently KuduPartitionExpr::GetIntVal is very expensive, although it is not currently the bottleneck we should.

Was the soft memory limit reached during the tests?

--
To view, visit http://gerrit.cloudera.org:8080/6559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-HasComments: No
