Hello David Ribeiro Alves, Mike Percy, Adar Dembo, I'd like you to do a code review. Please visit
http://gerrit.cloudera.org:8080/3221 to review the following change. Change subject: KUDU-749 (part 2): avoid O(n^2) behavior when compacting deltas ...................................................................... KUDU-749 (part 2): avoid O(n^2) behavior when compacting deltas When handling a zipfian update workload, we have a lot of delta records applied to a single row. In the compaction code path, we were previously using Mutation::AppendToList() to append each such delta to the end of a linked list. Thus, constructing the linked list was O(n^2) in the number of deltas corresponding to a row. Each such iteration also most likely was a CPU cache miss, making this quite slow. In fact, in a YCSB workload I'm currently testing, a single compaction has been running for over an hour and a half, spending all of its time in the AppendToList function. This patch changes the flow so that the deltas are collected by prepending them to the linked list (an O(1) operation). The resulting list is then reversed (an O(n) operation) before feeding into the rest of the unchanged compaction code path. This patch is also notable for being one of the few times in real life where a "reverse the linked list" algorithm was actually necessary. Due to the notoriously tricky nature of this foundational computer science problem (I once botched it on a job interview), and the fact that this was a so-called "real life situation", I googled the algorithm. Thus, I am confident of its correctness. Aside from such assurances, this code path is also very well covered by existing test cases which perform multiple updates against the same row (in particular fuzz-itest). Change-Id: If6bfa3fc6f41998b0f1ff58c5c8ea39881022de5 --- M src/kudu/tablet/compaction.cc M src/kudu/tablet/compaction.h M src/kudu/tablet/delta_store.h M src/kudu/tablet/deltafile.cc M src/kudu/tablet/deltamemstore.cc M src/kudu/tablet/memrowset.cc M src/kudu/tablet/memrowset.h M src/kudu/tablet/mutation.cc M src/kudu/tablet/mutation.h 9 files changed, 39 insertions(+), 44 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/21/3221/1 -- To view, visit http://gerrit.cloudera.org:8080/3221 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If6bfa3fc6f41998b0f1ff58c5c8ea39881022de5 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: David Ribeiro Alves <david.al...@cloudera.com> Gerrit-Reviewer: Mike Percy <mpe...@apache.org>