Andrew Wong has posted comments on this change.

Change subject: KUDU-2135 (part 2): check integrity of new on-disk format
......................................................................


Patch Set 28:

(5 comments)

Note: another significant portion of this patch has been pushed into another 
(see https://gerrit.cloudera.org/#/c/8048/)

This patch is reviewable, but should not be merged until tooling is written to 
change the path set instance files.

http://gerrit.cloudera.org:8080/#/c/7270/26/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

Line 543:     InsertOrDie(&idx_by_uuid, uuid, idx);
> This is confusing w.r.t. failed_data_dirs.
Renamed this to unassigned_dirs; in this case they're equivalent (a failed dir 
will be unassigned), but this provides better context for what we're going to 
do with them.


PS26, Line 633: 
              :   // Finally, initialize the 'fullness' status of the data 
directories.
              :   for (const auto& dd : data_dirs_) {
              :     uint16_t uuid_idx;
              :     CHECK(FindUuidIndexByDataDir(dd.get(), &uuid_idx));
              :     if (ContainsKey(failed_data_dirs_, uuid_idx)) {
              :       continue;
              :     }
              :     Status refresh_status = 
dd->RefreshIsFull(DataDir::RefreshMode::ALWAYS);
              :     if (PREDICT_FALSE(!refresh_status.ok())) {
              :       if (refresh_status.IsDiskFailure()) {
              :         RETURN_NOT_OK(MarkDataDirFailed(uuid_idx, 
refresh_status.ToString()));
              :         continue;
              :       }
              :       return refresh_status;
              :    
> Why was this moved all the way down here? You could use instance health to 
The latter. This is safeguarding if we fail during the fullness refresh.


Line 744:   } else {
> Why crash and not return an IOError, letting a caller higher in the stack c
Done


Line 908: 
> Hmm, what if we just skipped WritePathHealthStates? Would that be incorrect
I suppose it wouldn't be incorrect; the server could continue running without 
marking on-disk that the disk has failed.


Line 953:   if (read_only_) {
> Can you add a comment explaining what this is serializing and why?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifddf0817fe1a82044077f5544c400c88de20769f
Gerrit-PatchSet: 28
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes

Reply via email to