Adar Dembo has posted comments on this change.

Change subject: WIP: KUDU-463. Add checksumming to cfile
......................................................................


Patch Set 10:

(21 comments)

I didn't review the checksumming stuff yet, just looked at the rest and the 
plumbing.

Since the ReadV() stuff turned out to be pretty substantial, could you move it 
into its own commit? Would be nice to finish reviewing it separately.

http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/cfile/cfile_reader.h
File src/kudu/cfile/cfile_reader.h:

Line 210:   bool has_checksums_;
This will be padded to 8 bytes, which is actually quite a bit given how many 
CFileReaders we instantiate. Perhaps we can calculate this on the fly instead, 
using a mask on footer_'s features?


http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/fs/block_manager-test.cc
File src/kudu/fs/block_manager-test.cc:

PS10, Line 262: gscoped_ptr
Nit: use unique_ptr.

Actually, these scratch spaces are tiny; can you just allocate them on the 
stack?


Line 267:       ReadResult {
Surprised that this style of struct initialization works; isn't it C99 but not 
C++11 compatible?


http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

Line 38: class Env;
No need for this anymore.

Though if you decide to pass the vector by pointer as I suggested in env.h, you 
can omit the env.h include, forward declare ReadResult, and continue to forward 
declare Env.


PS10, Line 162: result
You mean results?


http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/fs/fs-test-util.h
File src/kudu/fs/fs-test-util.h:

PS10, Line 72: vector
This is a header so we retain namespace qualifications (i.e. this should be 
std::vector).


http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/util/crc.cc
File src/kudu/util/crc.cc:

PS10, Line 50:  
Nit: extra space here.


http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/util/crc.h
File src/kudu/util/crc.h:

PS10, Line 38: crc32
Maybe previous_crc32 to be more clear?


http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/util/env.h
File src/kudu/util/env.h:

Line 371: struct ReadResult {
Should provide a short high-level struct document too.


PS10, Line 403: results
I'm finding this to be somewhat confusing.

Part of the issue is with the name ReadResult: the majority of stuff in it 
aren't "results"; they're actually input to the function. Maybe ReadEntry or 
ReadVEntry would be better?

Part of the issue is that the Slice* means an extra level of indirection in the 
setup that callers have to do. Perhaps they can be regular Slices, that are 
default-initialized by the caller and set by the callee? That means 'results' 
should be passed by pointer, but I think that's a clearer model anyway, since 
we pass by pointer things that the function is expected to modify.

Also, could you reorganize ReadResult so that the pure IN parameters precede 
IN/OUT and OUT? Meaning, the order should probably be length, scratch, then 
result.


Line 536:   // TODO (GH): Document
Yep. :)


http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/util/env_posix.cc
File src/kudu/util/env_posix.cc:

PS10, Line 175: const struct iovec *
Should be const struct iovec*


Line 180:       return read_bytes;
Nit: if you're going to elide the braces, merge this line into the conditional 
line. This makes it harder to accidentally mess up the conditional if another 
line is added.


Line 182:       break;
Don't we need to update total_read_bytes with our partial read results before 
breaking?


Line 334:   virtual Status ReadV(uint64_t offset, vector<ReadResult> results) 
const OVERRIDE {
Could you add TRACE_EVENT1() calls to ReadV() and Read()? Not sure why they 
were missing in the first place.


PS10, Line 340: ReadResult
Can be cref?


Line 341:       return iovec {
Again, surprised this kind of initialization works in C++11.


PS10, Line 347: results.size()
arraylength(iov) is more idiomatic.


Line 353:     // Adjust slice sizes based on bytes read for short reads
As we discussed, let's make short reads an error, for Read() too. Do it as a 
separate patch and layer this one on top of it.


PS10, Line 355: ReadResult
Nit: auto is fine for short-scoped types like this one.


http://gerrit.cloudera.org:8080/#/c/6630/10/src/kudu/util/env_util.h
File src/kudu/util/env_util.h:

Line 66: Status ReadVFully(RandomAccessFile* file, uint64_t offset, 
std::vector<ReadResult> results);
If you end up solving the partial read stuff directly in RandomAccessFile, this 
won't be necessary.

But if it is, please also doc it.


-- 
To view, visit http://gerrit.cloudera.org:8080/6630
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6756834cd7f27af258797a3654a95244abeb0976
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Grant Henke <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes

Reply via email to