Hello Tidy Bot, Alexey Serbin, Attila Bukor, Kudu Jenkins, Adar Dembo, 
Volodymyr Verovkin,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15145

to look at the new patch set (#3).

Change subject: KUDU-1625: background op to GC ancient, fully deleted rowsets
......................................................................

KUDU-1625: background op to GC ancient, fully deleted rowsets

This adds an experimental background op that deletes disk rowsets that
have had all of their rows deleted. If the most recent update to a
rowset is older than the ancient history mark, and the rowset contains
no live rows, that rowset will be deleted.
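
As a rough illustration of the eligibility rule described above, here is a
minimal sketch. The names RowSetInfo, IsAncientAndFullyDeleted, and
SelectRowSetsToGC are hypothetical and are not taken from the patch, and
timestamps are modeled as plain integers. It only shows the predicate, not
how the actual op is scheduled or how blocks are removed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of a disk rowset; not Kudu's actual types.
struct RowSetInfo {
  int64_t live_row_count;      // number of rows that have not been deleted
  int64_t newest_update_time;  // timestamp of the most recent update or delete
};

// A rowset is eligible when it has no live rows and its most recent update
// is strictly older than the ancient history mark (AHM).
bool IsAncientAndFullyDeleted(const RowSetInfo& rs, int64_t ancient_history_mark) {
  return rs.live_row_count == 0 && rs.newest_update_time < ancient_history_mark;
}

// Returns the indexes of all rowsets that satisfy the rule above.
std::vector<size_t> SelectRowSetsToGC(const std::vector<RowSetInfo>& rowsets,
                                      int64_t ancient_history_mark) {
  std::vector<size_t> eligible;
  for (size_t i = 0; i < rowsets.size(); ++i) {
    if (IsAncientAndFullyDeleted(rowsets[i], ancient_history_mark)) {
      eligible.push_back(i);
    }
  }
  return eligible;
}
```

Note that a rowset with even one live row is skipped entirely, which is what
keeps the rule free of any percentage threshold.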

It'd be nice if we could have the policy work for rowsets that are
mostly deleted, but such a solution would come with difficult questions
around write amplification and compatibility with the existing
compaction strategies. For instance, a more complete solution would
need to consider whether to rewrite a rowset if it had 25%, 50%, or 75%
deleted rows: some operators wouldn't mind the write amplification to
save space. However, picking a good heuristic (or exposing some knobs to
turn) makes this tricky.

The benefit of the approach in this patch is that no such tradeoff needs
to be made: the "write amplification" is minimal here because no new
data blocks are written in performing the operation -- the tablet
metadata is rewritten to exclude the blocks, and the underlying blocks
are deleted, which isn't IO intensive either.

There's still room for improvement in this implementation in that,
currently, a DMS flush will write stats to disk and we'll only read the
stats if we Init() the DeltaFileReader (e.g. on scan).

Change-Id: I696e2a29ea52ad4e54801b495c322bc371787124
---
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/deltamemstore.cc
M src/kudu/tablet/deltamemstore.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/mt-tablet-test.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test-base.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_history_gc-test.cc
M src/kudu/tablet/tablet_metrics.cc
M src/kudu/tablet/tablet_metrics.h
M src/kudu/tablet/tablet_mm_ops.cc
M src/kudu/tablet/tablet_mm_ops.h
19 files changed, 469 insertions(+), 79 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/45/15145/3
--
To view, visit http://gerrit.cloudera.org:8080/15145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I696e2a29ea52ad4e54801b495c322bc371787124
Gerrit-Change-Number: 15145
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <abu...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Volodymyr Verovkin <verjov...@cloudera.com>
