Hello David Ribeiro Alves, Kudu Jenkins, Todd Lipcon,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/9372
to look at the new patch set (#7).
Change subject: rowset_metadata: cache min/max encoded keys
......................................................................
rowset_metadata: cache min/max encoded keys
This patch adds a new flag, rowset_metadata_store_keys, that, when true,
tells Kudu to duplicate diskrowset min/max keys into the rowset
metadata. Doing so allows Kudu to read the keys from tablet metadata
and bootstrap tablets without having to fully initialize the
CFileReaders for the key columns of each rowset.
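As a rough illustration of the idea (not the code in this patch), the
lookup can be sketched as "use the cached bounds if they are present,
otherwise fall back to the key column's CFile". All names below
(KeyBounds, RowSetMetadataSketch, LoadKeyBounds, prefer_cached_keys)
are hypothetical stand-ins rather than Kudu classes or flags, and the
fallback behavior is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-in for a rowset's encoded min/max keys.
struct KeyBounds {
  std::string min_encoded_key;
  std::string max_encoded_key;
};

// Hypothetical stand-in for rowset metadata. The cached bounds are only
// present if the rowset was written while key caching was enabled.
struct RowSetMetadataSketch {
  bool has_cached_bounds = false;
  KeyBounds cached_bounds;
};

// Returns the key bounds for a rowset. When prefer_cached_keys is set and
// the metadata carries cached bounds, the expensive reader is never invoked.
// read_from_key_cfile models the slow path that opens the key column's CFile.
KeyBounds LoadKeyBounds(const RowSetMetadataSketch& meta,
                        bool prefer_cached_keys,
                        const std::function<KeyBounds()>& read_from_key_cfile) {
  assert(read_from_key_cfile);  // the fallback must always be callable
  if (prefer_cached_keys && meta.has_cached_bounds) {
    return meta.cached_bounds;
  }
  return read_from_key_cfile();
}
```

Passing the slow path in as a callable keeps the sketch self-contained
and makes it easy to count how often the fallback is actually taken.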
A small test is added to tablet_server-test that ensures we don't read
any extraneous bytes when starting up the tablet server if we're reading
keys from the rowset metadata.
I benchmarked this with ~50GB of flushed YCSB data (92 tablets of
varying sizes) on a single node with 4 data directories and a separate
WAL/metadata directory. To set up, I let the server flush/compact for a
while so bootstrap times wouldn't be dominated by reading WAL segments,
and set rowset_metadata_store_keys to true so the tserver had the option
of reading the cached keys from the rowset metadata at startup.
With the above setup, I started the tserver with a disabled maintenance
manager (to avoid further IO) and waited for the tablets to get to a
RUNNING state, recording the sum of the logged bootstrap times of each
tablet. I ran this three times with Kudu configured to read the keys
from the rowset metadata and three times with it reading the keys from
the data blocks, dropping OS caches in between runs. The results are below.
Run number:                   1        2        3        Avg
Reading cached keys (s):      26.430   24.143   20.826   23.800
Not reading cached keys (s):  40.578   38.428   37.093   38.700
Based on this, ~15 seconds' worth of bootstrapping time on average was
spent initializing the key index readers, which could be avoided by
reading the keys from the rowset metadata instead.
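For reference, the averages and the ~15 second difference can be
re-derived from the per-run numbers with a few lines of C++. The helper
names (Mean, SavedSeconds, SavedFraction) are made up for this sketch;
the timings are copied from the results above.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list of timings, in seconds.
double Mean(const std::vector<double>& xs) {
  assert(!xs.empty());
  return std::accumulate(xs.begin(), xs.end(), 0.0) / xs.size();
}

// Summed per-tablet bootstrap times for each run, in seconds.
const std::vector<double> kCachedSecs = {26.430, 24.143, 20.826};
const std::vector<double> kUncachedSecs = {40.578, 38.428, 37.093};

// Average bootstrap time saved by reading the cached keys.
double SavedSeconds() {
  return Mean(kUncachedSecs) - Mean(kCachedSecs);
}

// The same saving as a fraction of the uncached average.
double SavedFraction() {
  return SavedSeconds() / Mean(kUncachedSecs);
}
```

With these inputs the saving comes out to about 14.9 seconds, or
roughly 38% of the uncached average.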
Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tserver/tablet_server-test.cc
7 files changed, 124 insertions(+), 17 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/9372/7
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d
Gerrit-Change-Number: 9372
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>