Todd Lipcon has posted comments on this change. Change subject: KUDU-1549: compact LBM container metadata files at startup ......................................................................
Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/6826/2/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS2, Line 87: DEFINE_double(log_container_live_metadata_before_compact_ratio, 0.10, : "Desired ratio of live block metadata in log containers. If a " : "container's live to total block ratio dips below this value, " : "the container's metadata file will be compacted at startup."); > I'll tag it advanced. I don't think experimental is appropriate given that hmm, I guess i was thinking experimental because who knows if a ratio is even the right metric? It's pretty internal-facing, so seems like the kind of thing we may never want to allow a user to tweak? (is it reasonable to have something be experimental forever?) As for a better value, I was thinking something like 0.5 -- it's worth compacting if we think we can reduce startup time by at least 50%, perhaps? Do we know how to map a metadata file size/block-count to a time, approximately? ie is the timing O(size) or O(record count)? http://gerrit.cloudera.org:8080/#/c/6826/2/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: PS2, Line 295: // 3. Containers in 'low_live_block_containers' will have their metadata : // files compacted. : > I went back and forth on this. I do like the idea of not making any on-disk hmm, trying to follow this suggestion. So, you're saying that since we already have the list of live blocks for each container, we could just dump that out rather than rereading the original block *records*? Also maybe I misread the original code. Is the vector<BlockRecordPB> here just the remaining _live_ blocks? If so, maybe it's not so bad, since it's bounded at 2x the normal "steady state" usage of the LBM, and we're always assuming that is a small fraction of overall TS heap. In the first read through I was thinking this woudl be _all_ records, not just te live ones. -- To view, visit http://gerrit.cloudera.org:8080/6826 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I981f7d9e7eb96fb40cef30ad96c5960b72d07756 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
