Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11249 )

Change subject: WIP KUDU-2469 pt 2: fail replicas on CFile corruption
......................................................................


Patch Set 5:

(6 comments)

Still a WIP since I'd like to maybe add more docs to IOContext and how it 
should be used and add more tests, but just checking this in.

http://gerrit.cloudera.org:8080/#/c/11249/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11249/2//COMMIT_MSG@18
PS2, Line 18:
            :
            :
            :
            :
> I read through the comment in error_manager.h but still don't understand wh
I tried clarifying this in a separate patch: 
https://gerrit.cloudera.org/c/11303/


http://gerrit.cloudera.org:8080/#/c/11249/2/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

http://gerrit.cloudera.org:8080/#/c/11249/2/src/kudu/cfile/cfile_reader.cc@81
PS2, Line 81: DEFINE_double(cfile_inject_corruption, 0,
> One of the reasons I didn't do this while writing the checksum code  was du
Ah nice, thanks for the pointer. Seems pretty useful for unit testing, though I 
think I'll still add this since I find it a bit easier to use in more 
integration-y tests. I agree it's kind of clunky in terms of the number of 
locations, but IMO it also makes it easier to get broader coverage of errors.


http://gerrit.cloudera.org:8080/#/c/11249/2/src/kudu/cfile/cfile_reader.cc@227
PS2, Line 227:
> Should we use a macro for this pattern? Something similar to the HANDLE_DIS
Done


http://gerrit.cloudera.org:8080/#/c/11249/2/src/kudu/cfile/cfile_reader.cc@348
PS2, Line 348:   if (PREDICT_FALSE(checksum_value != expected_checksum)) {
> Can we assert that this exists?
Done


http://gerrit.cloudera.org:8080/#/c/11249/2/src/kudu/fs/error_manager.h
File src/kudu/fs/error_manager.h:

http://gerrit.cloudera.org:8080/#/c/11249/2/src/kudu/fs/error_manager.h@76
PS2, Line 76:   return _s; \
> Do we need a new handler type? Or is TABLET good enough to re-use?
I tried clarifying this in this patch: https://gerrit.cloudera.org/c/11303/
Basically, the error handling should be based on the type of error, not the 
error handling callback. Sure, we want to fail a single tablet, but the TABLET 
handler is being used elsewhere, and changing the callback would have effects 
elsewhere.


http://gerrit.cloudera.org:8080/#/c/11249/2/src/kudu/fs/error_manager.h@103
PS2, Line 103:
> safety
Ack



--
To view, visit http://gerrit.cloudera.org:8080/11249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I63d541443bc68c83fd0ca6d51315143fee04d50f
Gerrit-Change-Number: 11249
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 29 Aug 2018 03:18:07 +0000
Gerrit-HasComments: Yes

Reply via email to