Mike Percy has posted comments on this change.

Change subject: disk failure: don't fail CHECKs for disk failures
......................................................................


Patch Set 2:

(4 comments)

I didn't finish getting through this whole patch yet but here are some high 
level comments.

I feel like I'm little out of the loop. Is there a design doc or a header 
comment or something to summarize how disk error detection and handling works?

Currently, there are cases that we didn't handle to be "safe" without certain 
durability guarantees. So it needs to be a little clearer how we are handling 
those, and I guess it either needs to be explicit error handling in each case 
(rollback, etc) or a way to mark the tablet as permanently failed / dead 
without affecting the rest of the server.

http://gerrit.cloudera.org:8080/#/c/7442/2/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

Line 696:   // TODO(unknown): need to figure out what to do with error handling 
if this
This should be TODO(todd).

Keeping this TODO makes me nervous. Is this handled or not? :)


http://gerrit.cloudera.org:8080/#/c/7442/2/src/kudu/tablet/local_tablet_writer.h
File src/kudu/tablet/local_tablet_writer.h:

Line 104:     if (tablet_->IsDataInFailedDir()) {
This sort of feels out of place. See other comments


http://gerrit.cloudera.org:8080/#/c/7442/2/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

Line 803: void Tablet::ApplyRowOperations(WriteTransactionState* tx_state) {
Seems like this should return Status now?


http://gerrit.cloudera.org:8080/#/c/7442/2/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

Line 157:   void ApplyRowOperations(WriteTransactionState* tx_state);
Seems like this has a new (sort of awkward) contract now since you have to 
check some flags after it returns. That should at least be documented but more 
likely I think this should return a Status on failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/7442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I109635a54268b9db741b2ae9ea3e9f1fe072d0a8
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes

Reply via email to