Hello David Ribeiro Alves, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6483

to look at the new patch set (#11).

Change subject: tablet: check for row presence in rowset-wise order
......................................................................

tablet: check for row presence in rowset-wise order

This changes the overall flow of applying batches of writes from:

  for each key in batch:
    find which rowsets may contain this key
    for each rowset:
      check if key is present in rowset
    if not present in any rowset, apply the write

to:

  sort all ops by key
  bulk-query the interval tree
  for each rowset:
    for each key which might be in that rowset, in sorted order:
      check if the key is present
      if so, mark it as such
  for each op:
    if insert:
      if it's not present in any rowset (based on above marking)
        apply it
    if mutate:
      apply to the rowset determined above

The one wrinkle not expressed in the above pseudo-code is the handling
of batches with duplicate keys (eg insert/update/delete/insert
sequences of the same PK in one batch). In those cases, the later
operations may find their keys in different RowSets based on the results
of the earlier operations, and thus we need to continue to do the
presence checks one-by-one.

The aim with this patch is that because we end up accessing each rowset
many times sequentially in ascending key order, we open up the
opportunity for two potential optimizations:

1) a thread-local cache can keep seeked iterators within that rowset, as
   well as hot blocks such as the index blocks, and reuse it cross-op
   within the same batch
2) we can detect when several of the ops in a batch all fall into a
   "gap" in a particular rowset, and "skip ahead", short circuiting a
   bunch of checks

This patch isn't meant to give big wins on its own, but the further
patches in the series reap big benefits (see later commit messages for details)

As a simple benchmark to make sure performance didn't regress, I used
tpch_real_world with 12 writer threads on a single machine for scale
factor 100 (600M rows).

Before:
  Wall time: 1007s
  Server CPU time: 9333s

After:
  Wall time: 1014s (+0.7%, probably noise)
  Server CPU time: 8861s (-5.3%, could be significant)

Change-Id: I9a678bb0da130ce94aae3634c82ace2d6898ffa2
---
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/tablet/row_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/transactions/write_transaction.cc
M src/kudu/tablet/transactions/write_transaction.h
6 files changed, 306 insertions(+), 85 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/6483/11
-- 
To view, visit http://gerrit.cloudera.org:8080/6483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9a678bb0da130ce94aae3634c82ace2d6898ffa2
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>

Reply via email to