Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11856 )

Change subject: (01/05) delta_store: avoid copying deleted row data in 
ApplyUpdates
......................................................................

(01/05) delta_store: avoid copying deleted row data in ApplyUpdates

When applying deltas, the scan path will first populate a selection vector
with deletes (i.e. a row is unset if there's a relevant DELETE for it), and
then use it to enable two optimizations:
1. Short-circuit out if all rows in the block have been deleted.
2. Skip predicate evaluation and base data copying for deleted rows.

To this we can add a third optimization: don't apply a delta whose mutation
is to a deleted row. Note that it's not incorrect to apply such deltas (as
we'll skip deleted rows when serializing the scan response to the client);
it's just unnecessary work.

I wrote a microbenchmark to evaluate the impact of this optimization. Below
is the timing and perf stat output of the microbenchmark before and after
applying the optimization (specifically, applying this patch, then
commenting out the filtering in DeltaPrepare::ApplyUpdates for the first
run, and uncommenting it for the next run).

  Time spent running 1000 scans with 0 deletes: real 0.523s     user 0.524s     
sys 0.000s
  Time spent running 1000 scans with 10 deletes: real 0.515s    user 0.516s     
sys 0.000s
  Time spent running 1000 scans with 100 deletes: real 0.512s   user 0.512s     
sys 0.000s
  Time spent running 1000 scans with 1000 deletes: real 0.553s  user 0.552s     
sys 0.000s

   Performance counter stats for 'bin/deltamemstore-test 
--gtest_filter=*Varying* --benchmark_num_passes=1000':

       2201.029290      task-clock (msec)         #    0.991 CPUs utilized
                 5      context-switches          #    0.002 K/sec
                 0      cpu-migrations            #    0.000 K/sec
             4,950      page-faults               #    0.002 M/sec
     8,276,723,832      cycles                    #    3.760 GHz
    24,539,935,941      instructions              #    2.96  insn per cycle
     4,709,709,705      branches                  # 2139.776 M/sec
        12,631,579      branch-misses             #    0.27% of all branches

       2.220370506 seconds time elapsed

  Time spent running 1000 scans with 0 deletes: real 0.474s     user 0.475s     
sys 0.000s
  Time spent running 1000 scans with 10 deletes: real 0.475s    user 0.472s     
sys 0.004s
  Time spent running 1000 scans with 100 deletes: real 0.478s   user 0.476s     
sys 0.004s
  Time spent running 1000 scans with 1000 deletes: real 0.550s  user 0.552s     
sys 0.000s

   Performance counter stats for 'bin/deltamemstore-test 
--gtest_filter=*Varying* --benchmark_num_passes=1000':

       2074.795741      task-clock (msec)         #    0.990 CPUs utilized
                23      context-switches          #    0.011 K/sec
                 1      cpu-migrations            #    0.000 K/sec
             4,951      page-faults               #    0.002 M/sec
     7,675,100,058      cycles                    #    3.699 GHz
    23,100,692,252      instructions              #    3.01  insn per cycle
     4,539,777,117      branches                  # 2188.060 M/sec
        11,819,267      branch-misses             #    0.26% of all branches

       2.096193851 seconds time elapsed

Note: I originally wrote this patch thinking it was necessary for diff scan
correctness, but have since convinced myself that it's just an optimization.

Change-Id: I6a646d0816a96e9aba486c0d0a1e6b4a7e15c144
Reviewed-on: http://gerrit.cloudera.org:8080/11856
Tested-by: Kudu Jenkins
Reviewed-by: Grant Henke <[email protected]>
Reviewed-by: Mike Percy <[email protected]>
---
M src/kudu/tablet/delta_applier.cc
M src/kudu/tablet/delta_applier.h
M src/kudu/tablet/delta_iterator_merger.cc
M src/kudu/tablet/delta_iterator_merger.h
M src/kudu/tablet/delta_store.cc
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore-test.cc
M src/kudu/tablet/deltamemstore.cc
M src/kudu/tablet/deltamemstore.h
M src/kudu/tablet/tablet-test-util.h
13 files changed, 116 insertions(+), 46 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Grant Henke: Looks good to me, but someone else must approve
  Mike Percy: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/11856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6a646d0816a96e9aba486c0d0a1e6b4a7e15c144
Gerrit-Change-Number: 11856
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>

Reply via email to