[kudu-CR] rowset metadata: cache min/max encoded keys
Hello David Ribeiro Alves, Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9372 to look at the new patch set (#9). Change subject: rowset_metadata: cache min/max encoded keys .. rowset_metadata: cache min/max encoded keys This patch adds a new flag rowset_metadata_store_keys that, when true, will indicate that Kudu should duplicate diskrowset min/max keys into the rowset metadata. Doing so allows Kudu to read the keys from tablet metadata and bootstrap tablets without having to fully initializing the CFileReaders for the key columns of each rowset. A small test is added to tablet_server-test that ensures we don't read any extraneous bytes when starting up the tablet server if we're reading keys from the rowset metadata. I benchmarked this with ~50GB of flushed YCSB data (92 tablets of varying sizes) on a single node with 4 data directories and a separate WAL/metadata directory. To set up, I let the server flush/compact for a while so bootstrap times wouldn't be dominated by reading WAL segments, and set rowset_metadata_store_keys to true so the tserver had the option of reading the cached keys from the rowset metadata at startup. With the above setup, I started the tserver with a disabled maintenance manager (to avoid further IO) and waited for the tablets to get to a RUNNING state, recording the sum of the logged bootstrap times of each tablet. I repeated this, configuring Kudu to read the keys from the rowset metadata, and to read the keys from the data blocks, dropping OS caches in between runs. The results are below. Run number: 1 2 3 Avg Reading cached keys (s): 26.430 24.143 20.826 23.800 Not reading cached keys (s): 40.578 38.428 37.093 38.700 Based on this, ~15 seconds worth of bootstrapping time was spent on initializing the key index readers, that could be avoided by reading the keys from the rowset metadata instead. Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d --- M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/rowset_metadata.cc M src/kudu/tablet/rowset_metadata.h M src/kudu/tserver/tablet_server-test.cc 7 files changed, 139 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/9372/9 -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 9 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] rowset metadata: cache min/max encoded keys
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9372 ) Change subject: rowset_metadata: cache min/max encoded keys .. Patch Set 8: (2 comments) > Patch Set 8: > > (5 comments) http://gerrit.cloudera.org:8080/#/c/9372/8/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/9372/8/src/kudu/tablet/cfile_set.cc@159 PS8, Line 159: rowset_metadata_->has_min_encoded_key() && : rowset_metadata_->has_max_encoded_key() Gonna merge these since I think it'll be cleaner. http://gerrit.cloudera.org:8080/#/c/9372/8/src/kudu/tablet/rowset_metadata.cc File src/kudu/tablet/rowset_metadata.cc: http://gerrit.cloudera.org:8080/#/c/9372/8/src/kudu/tablet/rowset_metadata.cc@162 PS8, Line 162: DCHECK_EQ(min_encoded_key_ == boost::none, max_encoded_key_ == boost::none); I think this is a valid intermediary state so I'm going to remove this. -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 8 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 04 Apr 2018 01:49:10 + Gerrit-HasComments: Yes
[kudu-CR] rowset metadata: cache min/max encoded keys
Hello David Ribeiro Alves, Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9372 to look at the new patch set (#8). Change subject: rowset_metadata: cache min/max encoded keys .. rowset_metadata: cache min/max encoded keys This patch adds a new flag rowset_metadata_store_keys that, when true, will indicate that Kudu should duplicate diskrowset min/max keys into the rowset metadata. Doing so allows Kudu to read the keys from tablet metadata and bootstrap tablets without having to fully initializing the CFileReaders for the key columns of each rowset. A small test is added to tablet_server-test that ensures we don't read any extraneous bytes when starting up the tablet server if we're reading keys from the rowset metadata. I benchmarked this with ~50GB of flushed YCSB data (92 tablets of varying sizes) on a single node with 4 data directories and a separate WAL/metadata directory. To set up, I let the server flush/compact for a while so bootstrap times wouldn't be dominated by reading WAL segments, and set rowset_metadata_store_keys to true so the tserver had the option of reading the cached keys from the rowset metadata at startup. With the above setup, I started the tserver with a disabled maintenance manager (to avoid further IO) and waited for the tablets to get to a RUNNING state, recording the sum of the logged bootstrap times of each tablet. I repeated this, configuring Kudu to read the keys from the rowset metadata, and to read the keys from the data blocks, dropping OS caches in between runs. The results are below. Run number: 1 2 3 Avg Reading cached keys (s): 26.430 24.143 20.826 23.800 Not reading cached keys (s): 40.578 38.428 37.093 38.700 Based on this, ~15 seconds worth of bootstrapping time was spent on initializing the key index readers, that could be avoided by reading the keys from the rowset metadata instead. Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d --- M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/rowset_metadata.cc M src/kudu/tablet/rowset_metadata.h M src/kudu/tserver/tablet_server-test.cc 7 files changed, 143 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/9372/8 -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 8 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] rowset metadata: cache min/max encoded keys
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9372 ) Change subject: rowset_metadata: cache min/max encoded keys .. Patch Set 8: (5 comments) http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/cfile_set.cc@154 PS7, Line 154: : > instead of checking for .empty(), should we make these optional? an empty e Done http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/diskrowset.cc File src/kudu/tablet/diskrowset.cc: http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/diskrowset.cc@80 PS7, Line 80: TAG_FLAG(rowset_metadata_store_keys, experimental); > let's tag as experimental for now as well, since we may well make this defa Done http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/rowset_metadata.h File src/kudu/tablet/rowset_metadata.h: http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/rowset_metadata.h@244 PS7, Line 244: last_durable_redo_dms_id_(kNoDurableMemStore) { > per comment elsewhere, these probably should be optional<> Done http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/rowset_metadata.cc File src/kudu/tablet/rowset_metadata.cc: http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/rowset_metadata.cc@83 PS7, Line 83: max_encoded_key_ = boost::none; > I think we need to set them to 'none' in the else case -- it seems all of t Done http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/rowset_metadata.cc@164 PS7, Line 164: set_min_encoded_key(*min_encoded_ > is this different than just using set_mutable_min_encoded_key(min_encoded_k Ah, good callout. I tried looking at the generated code and it seems like there's not too much of a difference, but set_mutable_min_encoded_key() is definitely more idiomatic. -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 8 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 04 Apr 2018 01:30:32 + Gerrit-HasComments: Yes
[kudu-CR] rowset metadata: cache min/max encoded keys
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9372 ) Change subject: rowset_metadata: cache min/max encoded keys .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/cfile_set.cc@154 PS7, Line 154: rowset_metadata_->min_encoded_key().empty() && : !rowset_metadata_->max_encoded_key().empty() instead of checking for .empty(), should we make these optional? an empty encoded key is valid in the case that you have a string column and insert "". Granted, you'll only have one of them, but still seems it would be more correct to base it on whether we have the data, not whether it's an empty string http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/diskrowset.cc File src/kudu/tablet/diskrowset.cc: http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/diskrowset.cc@80 PS7, Line 80: TAG_FLAG(rowset_metadata_store_keys, advanced); let's tag as experimental for now as well, since we may well make this default and remove the flag at some point http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/rowset_metadata.h File src/kudu/tablet/rowset_metadata.h: http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/rowset_metadata.h@244 PS7, Line 244: std::string min_encoded_key_; per comment elsewhere, these probably should be optional<> http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/rowset_metadata.cc File src/kudu/tablet/rowset_metadata.cc: http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/rowset_metadata.cc@83 PS7, Line 83: min_encoded_key_ = pb.min_encoded_key(); I think we need to set them to 'none' in the else case -- it seems all of the other code explicitly "zeros" the fields so that you could call LoadFromPB() twice with different PBs and not have leftover data from the prior call http://gerrit.cloudera.org:8080/#/c/9372/7/src/kudu/tablet/rowset_metadata.cc@164 PS7, Line 164: mutable_min_encoded_key()->assign is this different than just using set_mutable_min_encoded_key(min_encoded_key)? -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 7 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 04 Apr 2018 00:30:40 + Gerrit-HasComments: Yes
[kudu-CR] rowset metadata: cache min/max encoded keys
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9372 ) Change subject: rowset_metadata: cache min/max encoded keys .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/9372/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9372/6//COMMIT_MSG@29 PS6, Line 29: tablet. I repeated this, configuring Kudu to read the keys from the > This seems to be missing the punch line. Is this implying that, with the pa We could cut out the 15s difference (these two rows are time spent bootstrapping when reading keys from the rowset meta vs reading from the data blocks. Updated to be a bit more clear. -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 7 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 30 Mar 2018 17:35:33 + Gerrit-HasComments: Yes
[kudu-CR] rowset metadata: cache min/max encoded keys
Hello David Ribeiro Alves, Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9372 to look at the new patch set (#7). Change subject: rowset_metadata: cache min/max encoded keys .. rowset_metadata: cache min/max encoded keys This patch adds a new flag rowset_metadata_store_keys that, when true, will indicate that Kudu should duplicate diskrowset min/max keys into the rowset metadata. Doing so allows Kudu to read the keys from tablet metadata and bootstrap tablets without having to fully initializing the CFileReaders for the key columns of each rowset. A small test is added to tablet_server-test that ensures we don't read any extraneous bytes when starting up the tablet server if we're reading keys from the rowset metadata. I benchmarked this with ~50GB of flushed YCSB data (92 tablets of varying sizes) on a single node with 4 data directories and a separate WAL/metadata directory. To set up, I let the server flush/compact for a while so bootstrap times wouldn't be dominated by reading WAL segments, and set rowset_metadata_store_keys to true so the tserver had the option of reading the cached keys from the rowset metadata at startup. With the above setup, I started the tserver with a disabled maintenance manager (to avoid further IO) and waited for the tablets to get to a RUNNING state, recording the sum of the logged bootstrap times of each tablet. I repeated this, configuring Kudu to read the keys from the rowset metadata, and to read the keys from the data blocks, dropping OS caches in between runs. The results are below. Run number: 1 2 3 Avg Reading cached keys (s): 26.430 24.143 20.826 23.800 Not reading cached keys (s): 40.578 38.428 37.093 38.700 Based on this, ~15 seconds worth of bootstrapping time was spent on initializing the key index readers, that could be avoided by reading the keys from the rowset metadata instead. Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d --- M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/rowset_metadata.cc M src/kudu/tablet/rowset_metadata.h M src/kudu/tserver/tablet_server-test.cc 7 files changed, 124 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/9372/7 -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 7 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] rowset metadata: cache min/max encoded keys
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9372 ) Change subject: rowset_metadata: cache min/max encoded keys .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/9372/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9372/6//COMMIT_MSG@29 PS6, Line 29: Reading from blocks (s): 40.578 38.428 37.093 38.700 This seems to be missing the punch line. Is this implying that, with the patch active, we would cut out this 38sec entirely because we wouldn't read from blocks anymore? -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 6 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 30 Mar 2018 02:00:15 + Gerrit-HasComments: Yes
[kudu-CR] rowset metadata: cache min/max encoded keys
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9372 ) Change subject: rowset_metadata: cache min/max encoded keys .. Patch Set 5: Rebased. Failures were flakes that Alexey fixed in 91e207fccf95230476b4455b825e9eadd106a19f -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 28 Mar 2018 21:21:51 + Gerrit-HasComments: No
[kudu-CR] rowset metadata: cache min/max encoded keys
Hello David Ribeiro Alves, Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9372 to look at the new patch set (#5). Change subject: rowset_metadata: cache min/max encoded keys .. rowset_metadata: cache min/max encoded keys This patch adds a new flag rowset_metadata_store_keys that, when true, will indicate that Kudu should duplicate diskrowset min/max keys into the rowset metadata. Doing so allows Kudu to read the keys from tablet metadata and bootstrap tablets without having to fully open the CFileReaders for the key columns of each rowset. A small test is added to tablet_server-test that ensures we don't read any extraneous bytes when starting up the tablet server if we're reading keys from the rowset metadata. I benchmarked this with ~50GB of flushed YCSB data (92 tablets of varying sizes) on a single node with 4 data directories and a separate WAL/metadata directory. To set up, I let the server flush/compact for a while so bootstrap times wouldn't be dominated by reading WAL segments. I started the tserver, disabling its maintenance manager to avoid further IO, and waited for the tablets to get to a RUNNING state. The summed up logged tablet bootstrap times are reported below: Run number: 1 2 3 Avg Reading from meta (s):26.430 24.143 20.826 23.800 Reading from blocks (s): 40.578 38.428 37.093 38.700 Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d --- M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/rowset_metadata.cc M src/kudu/tablet/rowset_metadata.h M src/kudu/tserver/tablet_server-test.cc 7 files changed, 124 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/9372/5 -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] rowset metadata: cache min/max encoded keys
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9372 ) Change subject: rowset_metadata: cache min/max encoded keys .. Patch Set 3: Added some units and fixed a typo. -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 28 Mar 2018 20:15:31 + Gerrit-HasComments: No
[kudu-CR] rowset metadata: cache min/max encoded keys
Hello David Ribeiro Alves, Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9372 to look at the new patch set (#4). Change subject: rowset_metadata: cache min/max encoded keys .. rowset_metadata: cache min/max encoded keys This patch adds a new flag rowset_metadata_store_keys that, when true, will indicate that Kudu should duplicate diskrowset min/max keys into the rowset metadata. Doing so allows Kudu to read the keys from tablet metadata and bootstrap tablets without having to fully open the CFileReaders for the key columns of each rowset. A small test is added to tablet_server-test that ensures we don't read any extraneous bytes when starting up the tablet server if we're reading keys from the rowset metadata. I benchmarked this with ~50GB of flushed YCSB data (92 tablets of varying sizes) on a single node with 4 data directories and a separate WAL/metadata directory. To set up, I let the server flush/compact for a while so bootstrap times wouldn't be dominated by reading WAL segments. I started the tserver, disabling its maintenance manager to avoid further IO, and waited for the tablets to get to a RUNNING state. The summed up the logged tablet bootstrap times are reported below: Run number: 1 2 3 Avg Reading from meta (s):26.430 24.143 20.826 23.800 Reading from blocks (s): 40.578 38.428 37.093 38.700 Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d --- M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/rowset_metadata.cc M src/kudu/tablet/rowset_metadata.h M src/kudu/tserver/tablet_server-test.cc 7 files changed, 124 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/9372/4 -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] rowset metadata: cache min/max encoded keys
Hello David Ribeiro Alves, Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9372 to look at the new patch set (#3). Change subject: rowset_metadata: cache min/max encoded keys .. rowset_metadata: cache min/max encoded keys This patch adds a new flag rowset_metadata_store_keys that, when true, will indicate that Kudu should duplicate diskrowset min/max keys into the rowset metadata. Doing so allows Kudu to read the keys from tablet metadata and bootstrap tablets without having to fully open the CFileReaders for the key columns of each rowset. A small test is added to tablet_server-test that ensures we don't read any extraneous bytes when starting up the tablet server if we're reading keys from the rowset metadata. I benchmarked this with ~50GB of flushed YCSB data (92 tablets of varying sizes) on a single node with 4 data directories and a separate WAL/metadata directory. To set up, I let the server flush/compact for a while so bootstrap times wouldn't be dominated by reading WAL segments. I started the tserver, disabling its maintenance manager to avoid further IO, and waited for the tablets to get to a RUNNING state. The summed up the logged tablet bootstrap times are reported below: Run number: 1 2 3 Avg Reading from meta:26.430 24.143 20.826 23.800 Reading from blocks: 40.578 38.428 37.093 38.700 Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d --- M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/rowset_metadata.cc M src/kudu/tablet/rowset_metadata.h M src/kudu/tserver/tablet_server-test.cc 7 files changed, 124 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/9372/3 -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] rowset metadata: cache min/max encoded keys
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9372 ) Change subject: rowset_metadata: cache min/max encoded keys .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/9372/2/src/kudu/tablet/rowset_metadata.h File src/kudu/tablet/rowset_metadata.h: http://gerrit.cloudera.org:8080/#/c/9372/2/src/kudu/tablet/rowset_metadata.h@243 PS2, Line 243: // The min and max keys of the rowset. : std::string min_encoded_key_; > docs, here or on the setters Done -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 28 Mar 2018 19:15:32 + Gerrit-HasComments: Yes
[kudu-CR] rowset metadata: cache min/max encoded keys
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9372 ) Change subject: rowset_metadata: cache min/max encoded keys .. Patch Set 2: I think we could pretty easily benchmark this by loading a few hundred GB using YCSB and then measure startup time of the tserver after a drop_caches. I usually see many tens of seconds where stacks indicate it's opening cfile footers at startup in that setting. -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 07 Mar 2018 01:47:33 + Gerrit-HasComments: No
[kudu-CR] rowset metadata: cache min/max encoded keys
David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/9372 ) Change subject: rowset_metadata: cache min/max encoded keys .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9372/2/src/kudu/tablet/rowset_metadata.h File src/kudu/tablet/rowset_metadata.h: http://gerrit.cloudera.org:8080/#/c/9372/2/src/kudu/tablet/rowset_metadata.h@243 PS2, Line 243: std::string min_encoded_key_; : std::string max_encoded_key_; docs, here or on the setters -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Comment-Date: Wed, 21 Feb 2018 23:53:56 + Gerrit-HasComments: Yes
[kudu-CR] rowset metadata: cache min/max encoded keys
David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/9372 ) Change subject: rowset_metadata: cache min/max encoded keys .. Patch Set 2: before we spent too much time with this path I think we should briefly benchmark it in a cloud setting against: - whole footer caching - no caching Even though this strategy would avoid having to read the footers to build the trees, caching the whole footer still prevents 2 remote reads per file. Lemme know if I can help setting up a benchmark -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Comment-Date: Wed, 21 Feb 2018 23:50:09 + Gerrit-HasComments: No
[kudu-CR] rowset metadata: cache min/max encoded keys
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9372 to look at the new patch set (#2). Change subject: rowset_metadata: cache min/max encoded keys .. rowset_metadata: cache min/max encoded keys This patch adds a new flag rowset_metadata_store_keys that, when true, will indicate that Kudu should duplicate diskrowset min/max keys into the rowset metadata. Doing so allows Kudu to read the keys from tablet metadata and bootstrap tablets without having to fully open the CFileReaders for the key columns of each rowset. A small test is added to tablet_server-test that ensures we don't read any extraneous bytes when starting up the tablet server if we're reading keys from the rowset metadata. Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d --- M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/rowset_metadata.cc M src/kudu/tablet/rowset_metadata.h M src/kudu/tserver/tablet_server-test.cc 7 files changed, 123 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/9372/2 -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew WongGerrit-Reviewer: Kudu Jenkins
[kudu-CR] rowset metadata: cache min/max encoded keys
Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9372 Change subject: rowset_metadata: cache min/max encoded keys .. rowset_metadata: cache min/max encoded keys This patch adds a new flag rowset_metadata_store_keys that, when true, will indicate that Kudu should duplicate diskrowset min/max keys into the rowset metadata. Doing so allows Kudu to read the keys from tablet metadata and bootstrap tablets without having to fully open the CFileReaders for the key columns of each rowset. A small test is added to tablet_server-test that ensures we don't read any extraneous bytes when starting up the tablet server if we're reading keys from the rowset metadata. Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d --- M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/rowset_metadata.cc M src/kudu/tablet/rowset_metadata.h M src/kudu/tserver/tablet_server-test.cc 7 files changed, 121 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/9372/1 -- To view, visit http://gerrit.cloudera.org:8080/9372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I37d6f7160e3a7188753684e063963110f70e9b8d Gerrit-Change-Number: 9372 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Wong