Grant Henke has posted comments on this change. Change subject: WIP: KUDU-463. Add checksumming to cfile ......................................................................
Patch Set 10: (23 comments) http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: Line 301: RETURN_NOT_OK(block_->ReadV(off, results)); > I think it would be a little less obtrusive to have ReadResult construct an My goal was to match the pattern and variables used in the Read function. Especially since this is essentially a version of Read for multiple Slices. http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/cfile/cfile_reader.h File src/kudu/cfile/cfile_reader.h: Line 210: bool has_checksums_; > This will be padded to 8 bytes, which is actually quite a bit given how man Done http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/fs/block_manager-test.cc File src/kudu/fs/block_manager-test.cc: Line 262: gscoped_ptr<uint8_t[]> scratch1(new uint8_t[size1]); > Nit: use unique_ptr. I mainly just followed the "pattern" from a few lines above. Happy to change it. Line 267: ReadResult { > Surprised that this style of struct initialization works; isn't it C99 but Will change this. http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/fs/block_manager.h File src/kudu/fs/block_manager.h: Line 162: // Does not modify 'result' on error (but may modify 'scratch'). > You mean results? I mean the result field in the results. I will make this more clear. Line 163: virtual Status ReadV(uint64_t offset, vector<ReadResult> results) const = 0; > nit: Consider making this vector<ReadResult>* perhaps, based on my comments Looking into this. Adar mentioned it too. http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/fs/fs-test-util.h File src/kudu/fs/fs-test-util.h: Line 72: virtual Status ReadV(uint64_t offset, vector<ReadResult> results) const OVERRIDE { > This is a header so we retain namespace qualifications (i.e. this should be Done http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/util/crc.cc File src/kudu/util/crc.cc: Line 50: uint64_t crc_tmp = static_cast<uint64_t>(crc32); > Nit: extra space here. Done http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/util/crc.h File src/kudu/util/crc.h: Line 38: uint32_t Crc32c(const void* data, size_t length, uint32_t crc32); > Maybe previous_crc32 to be more clear? Done Line 38: uint32_t Crc32c(const void* data, size_t length, uint32_t crc32); > Maybe name it RollingCrc32()? Also, needs a unit test. I though it was nice to have the same name as the original function since it works in tandem with it. ie: uint32_t crc = Crc32c(data, length); crc = Crc32c(data, length, crc); http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/util/env.h File src/kudu/util/env.h: Line 371: struct ReadResult { > Should provide a short high-level struct document too. Done Line 403: virtual Status ReadV(uint64_t offset, std::vector<ReadResult> results) const = 0; > I'm finding this to be somewhat confusing. I agree. I had a hard time naming the struct too. I am not sure ReadEntry is perfect either, but it's definitely better. Will change and re-organize ReadResult. Line 536: // TODO (GH): Document > Yep. :) I was waiting until it was decided if I should allow short reads here or not. I will move the readfully logic here and document accordingly. http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/util/env_posix.cc File src/kudu/util/env_posix.cc: Line 175: ssize_t preadv(int fd, const struct iovec *iovec, int count, off_t offset) { > Should be const struct iovec* Done Line 180: return read_bytes; > Nit: if you're going to elide the braces, merge this line into the conditio Done Line 182: break; > Don't we need to update total_read_bytes with our partial read results befo Done Line 334: virtual Status ReadV(uint64_t offset, vector<ReadResult> results) const OVERRIDE { > Could you add TRACE_EVENT1() calls to ReadV() and Read()? Not sure why they It doesn't look like any read or write calls have this now. Since these are called frequently is there some overhead to worry about? Line 341: return iovec { > Again, surprised this kind of initialization works in C++11. Done Line 353: // Adjust slice sizes based on bytes read for short reads > As we discussed, let's make short reads an error, for Read() too. Do it as Will do. Line 355: for (ReadResult& readResult : results) { > Nit: auto is fine for short-scoped types like this one. Done http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/util/env_util.cc File src/kudu/util/env_util.cc: Line 196: size_t n = accumulate(results.begin(), results.end(), static_cast<size_t>(0), > nit: name this req_len ? I can rename this. Often when in a new codebase I try to match the style that exists. Given this is a "mirror" of ReadFully above I used similar names. Line 207: // Calculate the read amount of data > Nit: Use punctuation like periods at the end of comment sentences per https Done http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/util/env_util.h File src/kudu/util/env_util.h: Line 66: Status ReadVFully(RandomAccessFile* file, uint64_t offset, std::vector<ReadResult> results); > If you end up solving the partial read stuff directly in RandomAccessFile, I will move this into the lower level read methods. -- To view, visit http://gerrit.cloudera.org:8080/6630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant Henke <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
