Adar Dembo has posted comments on this change. Change subject: disk failure: add persistent disk states ......................................................................
Patch Set 6: (29 comments) http://gerrit.cloudera.org:8080/#/c/7270/6/src/kudu/fs/block_manager_util-test.cc File src/kudu/fs/block_manager_util-test.cc: PS6, Line 65: ASSERT_EQ(kUuid, path_set.all_paths(0).uuid()); : ASSERT_EQ(PathHealthStatePB::HEALTHY, path_set.all_paths(0).health_state()); : ASSERT_EQ(kPath, path_set.all_paths(0).path()); Nit: maybe extract a "const PathPB& path" so you're not repeating path_set.all_paths(0) over and over? http://gerrit.cloudera.org:8080/#/c/7270/6/src/kudu/fs/block_manager_util.cc File src/kudu/fs/block_manager_util.cc: Line 53: healthy_instance_ = false; \ Maybe we should store the Status that made us unhealthy instead? Then we could return that in various places where we need to create a bad Status to reflect !healthy_instance(). Line 88: // Set up the path set. Should we still populate all_uuids, to support downgrades? Or does that not work? Line 91: new_path_set->set_timestamp_us(1); Shouldn't this be a value of now? PS6, Line 97: new_path_set->add_all_paths(); : *new_path_set->mutable_all_paths(i) = new_path; Can you combine: new_path_set->add_all_paths(std::move(new_path)) Or some such? Line 149: Status PathInstanceMetadataFile::Lock() { Do we expect disk failure errors in LockFile or UnlockFile? Should we use RETURN_NOT_OK_FAIL_INSTANCE_PREPEND there too? http://gerrit.cloudera.org:8080/#/c/7270/6/src/kudu/fs/block_manager_util.h File src/kudu/fs/block_manager_util.h: PS6, Line 109: PathInstanceMetadataPB* Pass by cref? Line 116: bool healthy_instance_; Nit: how about just healthy_ (and healthy() for the accessor)? http://gerrit.cloudera.org:8080/#/c/7270/6/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: Line 212: if (instance()->healthy_instance()) { A silent no-op if the instance isn't healthy seems error-prone. Can we make this crash instead? Line 256: metadata_file_->SetInstanceFailed(); This is weird; it's the DataDir that failed, not the metadata file. Why not just call DataDirManager::MarkDataDirFailed()? Line 359: uuid_by_path_.swap(uuid_by_path); Why do this before SyncDir()? Shouldn't it be safer to do after we're guaranteed there can be no failures? PS6, Line 406: if (!new_status.IsDiskFailure()) { : return new_status; : } Oh, so I think I get this. 1. For PathInstanceMetadataFile::Lock() and Unlock(), the disk failure is externalized (i.e. callers should check the returned status for disk failure and SetInstanceFailed() if it's detected. 2. For PathInstanceMetadataFile::Create() and LoadFromDisk(), the disk failure is internalized: those functions will SetInstanceFailed() themselves and return no error. This inconsistency is confusing; is there a particular reason for it? Line 419: if (PREDICT_TRUE(instance->healthy_instance())) { Why not create the pool either way? It's very low overhead and will mean you don't need if statements around the existence of the pool. Line 450: dd->mutable_instance()->SetInstanceFailed(); Here's another case where the PathInstanceMetadataFile disk failure is detected and effected externally. I think we should treat this consistently: either PathInstanceMetadataFile healthy is totally internal and not set by outside callers, or totally external and PathInstanceMetadataFile is "dumber". PS6, Line 460: &path_set_ Let's not make changes to member state until we can't fail anymore. PS6, Line 472: return Status::IOError(Substitute("Need to write $0 new instance files but filesystem opened " : "in read-only mode", updated_indices.size())); Why is this fatal? If we're in read-only mode, can't we move forward despite this inconsistency? PS6, Line 476: ; Nit: extra semicolon Line 487: dd->WaitOnClosures(); This should be safe regardless of dd health. Line 677: Status s = candidate->RefreshIsFull(DataDir::RefreshMode::EXPIRED_ONLY); Likewise. Line 724: Status s = e.second->RefreshIsFull(DataDir::RefreshMode::ALWAYS); Why aren't we returning RefreshIsFull errors anymore? Line 771: CHECK(AccessMode::READ_WRITE == mode_) << "Cannot handle disk failures; filesystem is " Use CHECK_EQ. Line 796: void DataDirManager::WritePathHealthStatesUnlocked() { Should DCHECK on the lock that needs to be held. Also, this function does IO, so we shouldn't hold any spinlocks during UpdateOnDisk(). http://gerrit.cloudera.org:8080/#/c/7270/6/src/kudu/fs/data_dirs.h File src/kudu/fs/data_dirs.h: PS6, Line 339: dirty_idx_list you mean 'instances'? PS6, Line 343: // This only affects the DataDirManager state, so it shouldn't be used if : // state out of its jurisdiction (e.g. tablets) needs to be updated. Not exactly sure what guidance this provides. Line 352: const AccessMode mode_; Nit: group with the other const members (i.e. remove the blank line separating them). http://gerrit.cloudera.org:8080/#/c/7270/6/src/kudu/fs/fs.proto File src/kudu/fs/fs.proto: Line 70: repeated bytes all_uuids = 2; // Deprecated field used before Kudu 1.5.0. Another convention is to rename deprecated proto fields by prepending "DEPRECATED_" to them. It's OK to rename fields; it's only their type and order that matter for backwards compatibility. PS6, Line 75: There is no PathSetPB with a UUID outside the set of UUIDs in : // 'all_paths' of the PathSetPB with the highest 'timestamp_us' This is a pretty dense sentence and is difficult to understand. Reword? http://gerrit.cloudera.org:8080/#/c/7270/6/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: Line 1493: continue; Won't this lead to a crash when we look up this dd's limit in block_limits_by_data_dir_ (see LogBlockContainer's constructor)? Line 1538: continue; This won't increment 'i'; are we missing some test coverage here? Seems like that should have been caught by something. -- To view, visit http://gerrit.cloudera.org:8080/7270 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifddf0817fe1a82044077f5544c400c88de20769f Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
