Todd Lipcon has submitted this change and it was merged.

Change subject: rowset_tree: add bulk queries
......................................................................
rowset_tree: add bulk queries

This extends the bulk query API from IntervalTree up into RowSet.

TestRowSetTreePerformance.TestPerformance shows a noticeable improvement
when both the number of query points (Q) and the number of rowsets (R) are
large; in the runs below, that is roughly Q >= 500 with R >= 100. The last
column is the 1-by-1 time divided by the batched time:

     Q     R     1-by-1    batched    speedup
    10    10       3 ms       0 ms    (3.00x)
   100    10       9 ms       9 ms    (1.11x)
   500    10      54 ms      57 ms    (0.95x)
  1000    10      92 ms     125 ms    (0.74x)
  5000    10     490 ms     814 ms    (0.60x)
    10   100       4 ms       4 ms    (1.25x)
   100   100      19 ms      22 ms    (0.87x)
   500   100     112 ms      78 ms    (1.43x)
  1000   100     210 ms     181 ms    (1.16x)
  5000   100    1031 ms     941 ms    (1.10x)
    10   250       1 ms       7 ms    (0.13x)
   100   250      23 ms      20 ms    (1.14x)
   500   250     166 ms     112 ms    (1.48x)
  1000   250     333 ms     207 ms    (1.61x)
  5000   250    1568 ms    1155 ms    (1.36x)
    10   500      10 ms       8 ms    (1.37x)
   100   500      46 ms      45 ms    (1.02x)
   500   500     238 ms     169 ms    (1.41x)
  1000   500     451 ms     307 ms    (1.47x)
  5000   500    2234 ms    1487 ms    (1.50x)

The cases where the number of query points is small relative to the number
of rowsets are worse, as predicted by big-O analysis, but in those cases the
other fixed overhead of operations (e.g. RPC overhead) is probably so much
larger that the slowdown isn't noticeable. If it does turn out to be
noticeable for the small-batch case, we could easily switch to the
one-at-a-time algorithm when Q << R.
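To make the one-at-a-time versus batched distinction concrete, here is a
minimal Python sketch. It is not Kudu's C++ IntervalTree or RowSetTree code:
the Node class, query_point, query_bulk, find_rowsets, and the
small_batch_ratio threshold are all hypothetical names chosen for
illustration. It assumes closed key intervals and uses a simple centered
interval tree. The batched query sorts the points once and walks the tree a
single time, handing each subtree only the slice of points that can match
there.

```python
from bisect import bisect_left, bisect_right
from itertools import takewhile


class Node:
    """Centered interval tree over closed intervals (lo, hi, rowset_id)."""

    def __init__(self, intervals):
        endpoints = sorted(p for lo, hi, _ in intervals for p in (lo, hi))
        self.split = endpoints[len(endpoints) // 2]
        here = [iv for iv in intervals if iv[0] <= self.split <= iv[1]]
        self.by_lo = sorted(here, key=lambda iv: iv[0])                # ascending lo
        self.by_hi = sorted(here, key=lambda iv: iv[1], reverse=True)  # descending hi
        left = [iv for iv in intervals if iv[1] < self.split]
        right = [iv for iv in intervals if iv[0] > self.split]
        self.left = Node(left) if left else None
        self.right = Node(right) if right else None

    def matches(self, p):
        """IDs of the intervals stored at this node that contain p."""
        if p == self.split:
            return [rid for _, _, rid in self.by_lo]
        if p < self.split:
            # Every interval here ends at or after split, so only lo matters.
            return [iv[2] for iv in takewhile(lambda iv: iv[0] <= p, self.by_lo)]
        # Every interval here starts at or before split, so only hi matters.
        return [iv[2] for iv in takewhile(lambda iv: iv[1] >= p, self.by_hi)]


def query_point(root, p):
    """One-at-a-time query: a root-to-leaf walk for a single point."""
    ids, node = [], root
    while node is not None:
        ids.extend(node.matches(p))
        if p == node.split:
            break  # neither child can contain the split point
        node = node.left if p < node.split else node.right
    return sorted(ids)


def query_bulk(root, points):
    """Batched query: sort the points once, then traverse the tree once."""
    pts = sorted(set(points))
    found = {p: [] for p in pts}

    def visit(node, begin, end):
        if node is None or begin >= end:
            return
        for i in range(begin, end):
            found[pts[i]].extend(node.matches(pts[i]))
        # Points below the split go left, points above it go right.
        visit(node.left, begin, bisect_left(pts, node.split, begin, end))
        visit(node.right, bisect_right(pts, node.split, begin, end), end)

    visit(root, 0, len(pts))
    return {p: sorted(ids) for p, ids in found.items()}


def find_rowsets(root, num_rowsets, points, small_batch_ratio=0.1):
    """Hypothetical fallback: use point queries when Q << R."""
    if len(points) < small_batch_ratio * num_rowsets:
        return {p: query_point(root, p) for p in set(points)}
    return query_bulk(root, points)


rowsets = [(0, 10, "a"), (5, 15, "b"), (20, 30, "c"), (12, 25, "d")]
root = Node(rowsets)
print(query_bulk(root, [7, 13, 22, 40, 0]))
# {0: ['a'], 7: ['a', 'b'], 13: ['b', 'd'], 22: ['c', 'd'], 40: []}
```

The 0.1 default for small_batch_ratio is an arbitrary placeholder; a real
cutoff would have to come from measurements like the ones above. Node
requires at least one interval, so an empty rowset list should be
represented by root = None, which both query functions accept.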
More important than the above CPU optimizations, however, is that this will
allow for other higher-level optimizations during the process of applying
row operations.

Change-Id: I6ab24681dfbb3b1e6f08d52eb0647a5f3ca6851f
Reviewed-on: http://gerrit.cloudera.org:8080/6482
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves <[email protected]>
---
M src/kudu/tablet/rowset_tree-test.cc
M src/kudu/tablet/rowset_tree.cc
M src/kudu/tablet/rowset_tree.h
3 files changed, 130 insertions(+), 14 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/6482
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I6ab24681dfbb3b1e6f08d52eb0647a5f3ca6851f
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
