Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12239 )
Change subject: [fs]: LBM moves half-present containers out of the way ...................................................................... Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/12239/1/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/12239/1/src/kudu/fs/log_block_manager-test.cc@1911 PS1, Line 1911: MetricRegistry new_registry; : scoped_refptr<MetricEntity> new_entity = : METRIC_ENTITY_server.Instantiate(&new_registry, "test"); Can't you reuse 'entity' and 'registry' below? http://gerrit.cloudera.org:8080/#/c/12239/1/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12239/1/src/kudu/fs/log_block_manager.cc@755 PS1, Line 755: // Check that both the metadata and data files exist and have valid lengths. It would be good to decompose this entire section into a separate function whose goal is to sanity check the container files. http://gerrit.cloudera.org:8080/#/c/12239/1/src/kudu/fs/log_block_manager.cc@774 PS1, Line 774: // Handle a half-present container whose data file is missing. Open What about half-present containers whose metadata file is missing? I can't think of a safe way to handle this: 1. Check if the data file has 0 length. Not useful because a full container will have a 10G length. 2. Check if the data occupies 0 bytes on disk. Not reliable because we may have had multiple concurrent deletions at the time of the crash, some of which were going to punch holes in the data file (but didn't). In your runs of dense_node-itest, have you seen cases where the metadata file went missing? Maybe we should rename these orphaned data files? Would probably have to beef up FsReport and Repair() to handle that case, since we wouldn't want to do it in read-only filesystems. http://gerrit.cloudera.org:8080/#/c/12239/1/src/kudu/fs/log_block_manager.cc@779 PS1, Line 779: set<BlockId> live_blocks; An unordered_set would be more efficient for this use case. http://gerrit.cloudera.org:8080/#/c/12239/1/src/kudu/fs/log_block_manager.cc@780 PS1, Line 780: unique_ptr<RandomAccessFile> metadata_reader; : RETURN_NOT_OK_CONTAINER_DISK_FAILURE(block_manager->env()->NewRandomAccessFile( : metadata_path, &metadata_reader)); : ReadablePBContainerFile pb_reader(std::move(metadata_reader)); : RETURN_NOT_OK_CONTAINER_DISK_FAILURE(pb_reader.Open()); Please consolidate this with the equivalent code in LogBlockManager::ProcessRecords. http://gerrit.cloudera.org:8080/#/c/12239/1/src/kudu/fs/log_block_manager.cc@801 PS1, Line 801: if ((read_status.IsEndOfFile() && live_blocks.empty())) { What if read_status is not EndOfFile? Seems like that'll cause us to proceed as if the container is fine on L817. -- To view, visit http://gerrit.cloudera.org:8080/12239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2a4223d0e6385a7f7776f8fe2679a3bb16d832ab Gerrit-Change-Number: 12239 Gerrit-PatchSet: 1 Gerrit-Owner: helifu <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 18 Jan 2019 19:35:09 +0000 Gerrit-HasComments: Yes
