Thanks for being modest. I wondered why it's an Append (unlike what other
databases are doing).

Just curious, what the perf gain after the change for this specific test?

On Wed, May 25, 2016 at 4:01 PM, Todd Lipcon (Code Review) <
ger...@cloudera.org> wrote:

> Hello David Ribeiro Alves, Mike Percy, Adar Dembo,
>
> I'd like you to do a code review.  Please visit
>
>     http://gerrit.cloudera.org:8080/3221
>
> to review the following change.
>
> Change subject: KUDU-749 (part 2): avoid O(n^2) behavior when compacting
> deltas
> ......................................................................
>
> KUDU-749 (part 2): avoid O(n^2) behavior when compacting deltas
>
> When handling a zipfian update workload, we have a lot of delta records
> applied
> to a single row. In the compaction code path, we were previously using
> Mutation::AppendToList() to append each such delta to the end of a linked
> list.
> Thus, constructing the linked list was O(n^2) in the number of deltas
> corresponding to a row. Each such iteration also most likely was a CPU
> cache
> miss, making this quite slow.
>
> In fact, in a YCSB workload I'm currently testing, a single compaction has
> been
> running for over an hour and a half, spending all of its time in the
> AppendToList function.
>
> This patch changes the flow so that the deltas are collected by prepending
> them
> to the linked list (an O(1) operation). The resulting list is then
> reversed (an
> O(n) operation) before feeding into the rest of the unchanged compaction
> code
> path.
>
> This patch is also notable for being one of the few times in real life
> where a
> "reverse the linked list" algorithm was actually necessary. Due to the
> notoriously tricky nature of this foundational computer science problem (I
> once
> botched it on a job interview), and the fact that this was a so-called
> "real
> life situation", I googled the algorithm. Thus, I am confident of its
> correctness. Aside from such assurances, this code path is also very well
> covered by existing test cases which perform multiple updates against the
> same
> row (in particular fuzz-itest).
>
> Change-Id: If6bfa3fc6f41998b0f1ff58c5c8ea39881022de5
> ---
> M src/kudu/tablet/compaction.cc
> M src/kudu/tablet/compaction.h
> M src/kudu/tablet/delta_store.h
> M src/kudu/tablet/deltafile.cc
> M src/kudu/tablet/deltamemstore.cc
> M src/kudu/tablet/memrowset.cc
> M src/kudu/tablet/memrowset.h
> M src/kudu/tablet/mutation.cc
> M src/kudu/tablet/mutation.h
> 9 files changed, 39 insertions(+), 44 deletions(-)
>
>
>   git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/21/3221/1
> --
> To view, visit http://gerrit.cloudera.org:8080/3221
> To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
>
> Gerrit-MessageType: newchange
> Gerrit-Change-Id: If6bfa3fc6f41998b0f1ff58c5c8ea39881022de5
> Gerrit-PatchSet: 1
> Gerrit-Project: kudu
> Gerrit-Branch: master
> Gerrit-Owner: Todd Lipcon <t...@apache.org>
> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
> Gerrit-Reviewer: David Ribeiro Alves <david.al...@cloudera.com>
> Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
>

Reply via email to