Hello David Ribeiro Alves, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9372

to look at the new patch set (#8).

Change subject: rowset_metadata: cache min/max encoded keys
......................................................................

rowset_metadata: cache min/max encoded keys

This patch adds a new flag, rowset_metadata_store_keys, that, when true,
indicates that Kudu should duplicate diskrowset min/max keys into the
rowset metadata. Doing so allows Kudu to read the keys from tablet
metadata and bootstrap tablets without fully initializing the
CFileReaders for the key columns of each rowset.

A small test is added to tablet_server-test that ensures we don't read
any extraneous bytes when starting up the tablet server if we're reading
keys from the rowset metadata.

I benchmarked this with ~50GB of flushed YCSB data (92 tablets of
varying sizes) on a single node with 4 data directories and a separate
WAL/metadata directory. To set up, I let the server flush/compact for a
while so bootstrap times wouldn't be dominated by reading WAL segments,
and set rowset_metadata_store_keys to true so the tserver had the option
of reading the cached keys from the rowset metadata at startup.

With the above setup, I started the tserver with the maintenance manager
disabled (to avoid further IO) and waited for the tablets to reach the
RUNNING state, recording the sum of the logged bootstrap times of each
tablet. I repeated this with Kudu configured to read the keys from the
rowset metadata, and again with Kudu configured to read the keys from
the data blocks, dropping OS caches between runs. The results are below.

Run number:                     1        2        3        Avg
Reading cached keys (s):        26.430   24.143   20.826   23.800
Not reading cached keys (s):    40.578   38.428   37.093   38.700

Based on this, ~15 seconds' worth of bootstrapping time was spent
initializing the key index readers, which could be avoided by reading
the keys from the rowset metadata instead.
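For anyone who wants to check the arithmetic behind the ~15 second figure, here is a minimal Python sketch that recomputes the averages from the three runs reported above. The variable names are illustrative only and are not part of the patch; the relative saving is derived from the reported numbers and is not a separately measured result.

```python
# Bootstrap times in seconds, copied from the three benchmark runs above.
cached = [26.430, 24.143, 20.826]      # reading keys from rowset metadata
uncached = [40.578, 38.428, 37.093]    # reading keys from the data blocks


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


avg_cached = mean(cached)
avg_uncached = mean(uncached)

# Absolute and relative reduction in summed bootstrap time.
saved = avg_uncached - avg_cached
fraction = saved / avg_uncached

print(f"avg, reading cached keys:     {avg_cached:.3f} s")
print(f"avg, not reading cached keys: {avg_uncached:.3f} s")
print(f"time saved:                   {saved:.1f} s ({fraction:.0%})")
```

On these numbers the difference works out to roughly 14.9 seconds, or a bit under 40% of the uncached bootstrap time, which matches the ~15 second estimate.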

Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tserver/tablet_server-test.cc
7 files changed, 143 insertions(+), 18 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/9372/8
--
To view, visit http://gerrit.cloudera.org:8080/9372
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d
Gerrit-Change-Number: 9372
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>