Grant Henke has posted comments on this change. Change subject: Reduce arguments in file Read API ......................................................................
Patch Set 3: (8 comments) http://gerrit.cloudera.org:8080/#/c/6770/3/src/kudu/util/env.h File src/kudu/util/env.h: Line 355: virtual Status Read(Slice* result) = 0; > I think the primary motivation for this refactor was the new ReadV() API th Before the patch that made the read calls "read fully" there was a length IN and a length OUT parameter. The length OUT parameter was in the OUT slice. Now they are essentially the same. Currently the Read() API essentially requires 3 parameters: - offset (IN): position to start reading from - length (IN/OUT): the amount of data to read - scratch (OUT): memory to read into The ReadV() API will similarly need: - offset (IN): position to start reading from - results (IN/OUT): A vector of: - length (IN): the amount of data to read - scratch (OUT): memory to read into Since Slice is commonly used today to encapsulate a scratch buffer and define it's relevant length I thought it would work as an efficent IN/OUT parameter for both READ() and READV(). This results in the following API: - Status Read(uint64_t offset, Slice* result) - Status ReadV(uint64_t offset, vector<Slice>* results) > in the case where we might be reading from a memory-mapped file, we wouldn't > necessarily want to use the buffer referenced in result.data; we would likely > want to reset the Slice to reference mapped memory for zero-copy. Yes this wouldn't necessarily work well for that. In that case a scratch buffer isn't needed at all. It doesn't look like this API handled that use case well in general since the Slice didn't "own" the data and required a memcpy to make sure the data stuck around. Line 376: // Read up to "result.size" from the file starting at "offset". > Why "up to"? Isn't guaranteed to read exactly that amount, or fail? Technically if it failed with an IO error after a short read the scratch buffer would contain a partial result. Though the error status would indicated the result is invalid. Line 382: // possible to read exactly 'length' bytes, an IOError is returned. > This note should probably exist in the other Read() implementations too. Or It's it on all the "bottom" level reads that any other read wraps. Do you think that is enough? Line 504: // Read up to "result.size" from the file starting at "offset". > Same question about "up to". See earlier response. http://gerrit.cloudera.org:8080/#/c/6770/3/src/kudu/util/env_posix.cc File src/kudu/util/env_posix.cc: Line 278: req, offset)); > Nit: I think result->length() is actually more correct here. Yes, the EOF o I think I agree. But the offset is being edited in the loop too. I would need to keep and reference the original offset too. Should I do that? Line 308: *result = Slice(result->mutable_data(), r); > Wouldn't result->truncate(r) be simpler? Good idea. Done. http://gerrit.cloudera.org:8080/#/c/6770/3/src/kudu/util/pb_util-internal.cc File src/kudu/util/pb_util-internal.cc: Line 51: } > I don't think this is possible anymore, is it? Done http://gerrit.cloudera.org:8080/#/c/6770/3/src/kudu/util/pb_util.cc File src/kudu/util/pb_util.cc: Line 183: Status NonShortRead(T* file, uint64_t offset, Slice* result); > Is the indirection via NonShortRead() still needed? Or can we just call rea Done -- To view, visit http://gerrit.cloudera.org:8080/6770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3e4df425a7387c28eb428e851a9f8c25167755d4 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant Henke <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
