Andrew Wong has posted comments on this change. Change subject: disk failure: add persistent disk states ......................................................................
Patch Set 5: (44 comments) I know there are already comments on some of these, but I've moved some of the changes to other patches, since not all of them fall under "add persistent disk states". Namely: - changes to fs_manager will be moved to a separate change to handle failures at startup. - changes to ts_disk_failure-test are moved to https://gerrit.cloudera.org/#/c/7441/ - changes to disk_failure-itest are moved to https://gerrit.cloudera.org/#/c/7031/ http://gerrit.cloudera.org:8080/#/c/7270/2/src/kudu/fs/block_manager_util-test.cc File src/kudu/fs/block_manager_util-test.cc: Line 211: ps.mutable_all_uuids()->Reserve(kAllUuids.size()); > warning: local copy 'path_sets_copy' of the variable 'path_sets' is never m Done Line 220: EXPECT_NO_FATAL_FAILURE(RunCheckIntegrityTest(env_, path_sets_copy, > warning: local copy 'path_sets_copy' of the variable 'path_sets' is never m Done http://gerrit.cloudera.org:8080/#/c/7270/2/src/kudu/fs/block_manager_util.h File src/kudu/fs/block_manager_util.h: PS2, Line 62: , 'healthy_instance_' is set to false. : Status LoadF > This conflates file locking with our unlocked suffix convention for in-memo Done Line 64: > Nit: maybe rename to Flush() or FlushToDisk() to emphasize that the content Done Line 65: // Locks the instance metadata file, which must exist on-disk. Returns an > Why? Why not just always write out what's in metadata_? Done. As per the above comment about naming, I've called it FlushMetadataToDisk (for the private generalized call) and UpdateOnDisk (for the public call for just metadata_) http://gerrit.cloudera.org:8080/#/c/7270/2/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: > I don't understand the need to store lock_mode_ or to unlock an instance fi Thanks for the explanation. I misinterpreted the purpose of the lock mode based on it's usage in Open() Line 179: DataDir::DataDir(Env* env, > Belongs in status.h, I think. Only used once, removing. Line 221: > warning: redundant get() call on smart pointer [google-readability-redundan Done Line 288: > warning: the parameter 'metric_entity' is copied for each invocation but on Done Line 295: Shutdown(); > May as well store the AccessMode directly instead of adapting FsManager.rea This was here because I misunderstood the purpose of opening the DataDirManager in read_only mode. These shouldn't be necessary. Line 299: // We may be waiting here for a while on outstanding closures. > Why bother initializing path_set_ in this way? You need to load them all fr Done Line 347: } > If this fails, how can we proceed with CheckHolePunch and metadata.Create? Good point, here we can pass through without creating any instance, since the instance should fail to write anyway. PS2, Line 363: for (const string& dir : to_sync) { : Status s = env_->SyncDir(dir); : if (PREDICT_FALSE(!s.ok())) { : LOG(ERROR) << Substitute("Unable to synchronize directory $0", dir) << s.ToString(); : if (s.IsDiskFailure()) { : > Consolidate this. Done Line 375: } > If this fails, shouldn't it affect the validity of the metadata file on 'di If this does fail, the metadata file on 'dir' wouldn't be writable since it's on a failed disk. In this case, the failure should be caught when the file is loaded from disk. Line 377: // Success: don't delete any files. > Trailing whitespace. Done PS2, Line 390: // Open and lock the data dir's metadata instance file. : string instance_filename = JoinPathSegments(p, kI > I don't understand this. Oops, seems like a few things were left in the patch that shouldn't have been... Line 411: "Proceeding with failed data dir", instance_filename); > Why enforce this tight coupling? Was under the impression that it was something we enforce; since that doesn't appear to be the case, dropping it here. Thanks for the notes on locking! Line 436: } > Why bother creating a DataDir for a failed disk? The uuid_idx maps need to know about all disks (failed or not) because tablets' DataDirGroups' dirs' UUIDs are checked when read from disk (i.e. a tablet expects every directory its DataDirGroup to be known to the DataDirManager). Line 455: } > warning: the 'empty' method should be used to check for emptiness instead o Done Line 460: &path_set_, &updated_indices), Substitute("Could not verify integrity of files: $0", > There's no chance of failure here, so why not just update uuid_by_path_ dir Done Line 462: if (uuid_by_path_.empty()) { > Nit: add empty line before. Done Line 485: for (const auto& dd : dds) { > Update (and perhaps generalize, to avoid updating again). Done PS2, Line 510: if (idx > max_data_dirs) { : return Status::NotSupported( : Substitute("Block manager supports a maximum of $0 paths", max_data_dirs)); : } > Remove? Got some other log statements like this elsewhere. Done Line 766: } > ? Done http://gerrit.cloudera.org:8080/#/c/7270/2/src/kudu/fs/data_dirs.h File src/kudu/fs/data_dirs.h: Line 327: // lock_guard of dir_group_lock_. > warning: function 'kudu::fs::DataDirManager::UpdateInstanceFilesUnlocked' h Done http://gerrit.cloudera.org:8080/#/c/7270/2/src/kudu/fs/fs.proto File src/kudu/fs/fs.proto: PS2, Line 41: derlying fil > What's a "parent disk"? I think you're actually trying to get at the underl Done PS2, Line 60: > Be more specific here, like "inside the set" or whatever. Done PS2, Line 60: ion. > UUIDs (to agree with the plural PathSetPBs) Done Line 64: // useful when there are many such references, as the position is much shorter > How about "All paths...", and also mention that the order matches that of a Done PS2, Line 67: Globally > This is microseconds since the epoch? Can you clarify in the comment? Done PS2, Line 67: ifi > omit Done Line 70: repeated bytes all_uuids = 2; // Deprecated field used before Kudu 1.5.0. > There are a lot of repeated fields here whose orders all match. How about c Done http://gerrit.cloudera.org:8080/#/c/7270/2/src/kudu/fs/fs_manager.cc File src/kudu/fs/fs_manager.cc: Line 244: LOG(FATAL) << "Invalid block manager: " << FLAGS_block_manager; > What about some of these read calls? And the CleanTmpFiles()/CheckAndFixPer Since block_manager::Open() can now succeed even if there are disk failures, we should just let this pass here and let block_manager::Open() handle it. Line 281: << std::endl << SecureDebugString(*metadata_); > So what happens if we see disk failures across every single disk? Shouldn't Yeah, it should be fatal. This is checked is in CheckIntegrity, and in MarkDataDirFailed. Line 314: for (const string& root : canonicalized_all_fs_roots_) { > Should include s.ToString(). Below too. Done Line 356: } > Nit: move the second condition to a new line to better emphasize its simila Done http://gerrit.cloudera.org:8080/#/c/7270/2/src/kudu/integration-tests/disk_failure-itest.cc File src/kudu/integration-tests/disk_failure-itest.cc: Line 78 > I don't see this called with anything but the default argument. Done Line 362 > Probably need some NO_FATALS in here. Done Line 374 > Got some whitespace here. Done Line 376 > If we're only injecting failures to one tserver, Done Line 383 > ASSERT_OK Done http://gerrit.cloudera.org:8080/#/c/7270/2/src/kudu/tserver/ts_disk_failure-test.cc File src/kudu/tserver/ts_disk_failure-test.cc: Line 154 > Can this be moved to the FLAGS_env_inject_eio_globs section? Done PS2, Line 163: : > Why this? Done PS2, Line 165: : : > Combine. Done -- To view, visit http://gerrit.cloudera.org:8080/7270 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifddf0817fe1a82044077f5544c400c88de20769f Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
