Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11249 )
Change subject: KUDU-2469 pt 2: fail replicas on CFile corruption ...................................................................... Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/bloomfile.cc File src/kudu/cfile/bloomfile.cc: http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/bloomfile.cc@257 PS9, Line 257: return Status::Corruption("bloom file is compressed (compression not supported)", > I suppose the type of corruption is interesting. Yeah, these ones are pretty fuzzy. I think a few of these `return Corruption`s could probably be DCHECKs or CHECKs, but returning an error is more conservative. If anything, I can just omit this handling since it's a corruption that we wouldn't expect, and it's probably worth surfacing as a bug, rather than handling it. http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/bloomfile.cc@344 PS9, Line 344: RETURN_NOT_OK(ParseBlockHeader(io_context, dblk_data.data(), &hdr, &bloom_data)); > Can we rely on the the above reader_->ReadBlock to handle any corruption is Hrm, that's a good point. As long as checksums are doing their job, all the other `return Corruption` paths _should_ be indicative of a bug. I wonder if this extra layer of protection is valuable at all. http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/cfile_reader.cc@185 PS9, Line 185: RETURN_NOT_OK(ReadAndParseFooter(io_context)); > Can we handle header and footer corruption here with RETURN_NOT_OK_HANDLE_C I'm not too keen on that approach. You're right that that's what we did for EIO handling, but I think this approach makes it much clearer where handling is happening, _and_ makes it easier to say that all the places that should be covered _are_ covered. That approach circumvents this extra plumbing, but I'm not sure it reduces the number of places we need to cover. With current handling, we have to handle at every `return Corruption` site. Relying on RETURN_NOT_OK_HANDLE_CORRUPTION, we would have to wrap every call to a function that may return a corruption. -- To view, visit http://gerrit.cloudera.org:8080/11249 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I63d541443bc68c83fd0ca6d51315143fee04d50f Gerrit-Change-Number: 11249 Gerrit-PatchSet: 9 Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Comment-Date: Thu, 30 Aug 2018 17:54:25 +0000 Gerrit-HasComments: Yes
