Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12075 )
Change subject: KUDU-2636: LBM supports deleting dead and full containers ...................................................................... Patch Set 5: (4 comments) Looks like you have a TSAN data race to figure out too. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1979 PS4, Line 1979: for (auto& name : names) { > i have no idea :( Maybe define a helper function like this? void LogBlockManager::DeleteDeadContainersFromDisk(DataDir* dir, const vector<string>& names) { int64_t deleted_metadata_bytes = 0; for (const auto& n : names) { string data_file_name = StrCat(n, kContainerDataFileSuffix); string metadata_file_name = StrCat(n, kContainerMetadataFileSuffix); uint64_t metadata_size; Status s = env_->GetFileSize(metadata_file_name, &metadata_size); if (s.ok()) { deleted_metadata_bytes += metadata_size; } else { WARN_NOT_OK_LBM_DISK_FAILURE(s, "Could not get size of dead container metadata file " + metadata_file_name); } WARN_NOT_OK_LBM_DISK_FAILURE(file_cache_.DeleteFile(data_file_name), "Could not delete dead container data file " + data_file_name); WARN_NOT_OK_LBM_DISK_FAILURE(file_cache_.DeleteFile(metadata_file_name), "Could not delete dead container metadata file " + metadata_file_name); } if (!names.empty()) { WARN_NOT_OK_LBM_DISK_FAILURE(env_->SyncDir(dir->dir()), "Could not sync data directory"); LOG(INFO) << Substitute("Deleted $0 dead containers ($1 metadata bytes)", dead_containers.size(), deleted_metadata_bytes); } } You could call it here on L1979 and it could replace the code on L2553-2586. Make sure to preserve the useful comments though. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1980 PS4, Line 1980: string data_file_name = StrCat(name, kContainerDataFileSuffix); > done. Looks like you missed this. http://gerrit.cloudera.org:8080/#/c/12075/4/src/kudu/fs/log_block_manager.cc@1983 PS4, Line 1983: file_cache_.Invalidate(metadata_file_name); : WARN_NOT_OK_LBM_DISK_FAILURE(env_->DeleteFi > The function 'FileCache<FileType>::DeleteFile' does not delete file directl That's true, but the call to RemoveFullContainersUnlocked on L1975 will have deleted the LogBlockContainer instances from memory, which means the associated descriptors from the FileCache will have been destroyed. So if we were to call FileCache::DeleteFile here to delete the container's files, I think we'll end up in the "There is no outstanding descriptor" case, evict the fd from the cache, and delete the file. Does that sound right to you? It's been a while since I've looked at the FileCache code so maybe I missed something. Anyway, if you agree, we don't need the Invalidate calls and file_cache_.DeleteFile should be safe. http://gerrit.cloudera.org:8080/#/c/12075/5/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/12075/5/src/kudu/fs/log_block_manager.cc@1237 PS5, Line 1237: // Due to the shared ownership of LogBlockDeletionTransaction, the : // destructor is responsible for destroying all registered blocks. : // This includes: : // 1. Punching holes in deleted blocks, and : // 2. Deleting dead containers outright. There's something wrong here with the formatting; looks like there's an extra empty line for each line of text. -- To view, visit http://gerrit.cloudera.org:8080/12075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7d3672ae3c3dd9acd489120d653c44a641537f10 Gerrit-Change-Number: 12075 Gerrit-PatchSet: 5 Gerrit-Owner: helifu <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Hao Hao <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu <[email protected]> Gerrit-Comment-Date: Tue, 18 Dec 2018 22:02:49 +0000 Gerrit-HasComments: Yes
