Todd Lipcon has uploaded a new change for review. http://gerrit.cloudera.org:8080/3224
Change subject: KUDU-749 (part 2): avoid O(n^2) behavior when compacting deltas ...................................................................... KUDU-749 (part 2): avoid O(n^2) behavior when compacting deltas When handling a zipfian update workload, we have a lot of delta records applied to a single row. In the compaction code path, we were previously using Mutation::AppendToList() to append each such delta to the end of a linked list. Thus, constructing the linked list was O(n^2) in the number of deltas corresponding to a row. Each such iteration also most likely was a CPU cache miss, making this quite slow. In fact, in a YCSB workload I'm currently testing, a single compaction has been running for over an hour and a half, spending all of its time in the AppendToList function. This patch changes the flow so that the deltas are collected by prepending them to the linked list (an O(1) operation). The resulting list is then reversed (an O(n) operation) before feeding into the rest of the unchanged compaction code path. This patch is also notable for being one of the few times in real life where a "reverse the linked list" algorithm was actually necessary. Due to the notoriously tricky nature of this foundational computer science problem (I once botched it on a job interview), and the fact that this was a so-called "real life situation", I googled the algorithm. Thus, I am confident of its correctness. Aside from such assurances, this code path is also very well covered by existing test cases which perform multiple updates against the same row (in particular fuzz-itest). Change-Id: If6bfa3fc6f41998b0f1ff58c5c8ea39881022de5 Reviewed-on: http://gerrit.cloudera.org:8080/3221 Tested-by: Kudu Jenkins Reviewed-by: Adar Dembo <a...@cloudera.com> (cherry picked from commit c7178e97e842f42e9ed9d5e9e2a4f521fbe70b6b) --- M src/kudu/tablet/compaction.cc M src/kudu/tablet/compaction.h M src/kudu/tablet/delta_compaction.cc M src/kudu/tablet/delta_store.h M src/kudu/tablet/deltafile.cc M src/kudu/tablet/deltamemstore.cc M src/kudu/tablet/memrowset.cc M src/kudu/tablet/memrowset.h M src/kudu/tablet/mutation.cc M src/kudu/tablet/mutation.h 10 files changed, 40 insertions(+), 44 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/24/3224/1 -- To view, visit http://gerrit.cloudera.org:8080/3224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If6bfa3fc6f41998b0f1ff58c5c8ea39881022de5 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-0.9.x Gerrit-Owner: Todd Lipcon <t...@apache.org>