[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. cfile: change BlockBuilder API to yield a vector of Slices When blocks are appended to cfiles at the IO layer, we already have the ability to write multiple slices using a vectored IO. Prior to this patch, the BlockBuilder API was restricted to returning a single slice, whereas it would be more convenient in some cases to be able to return multiple slices (eg separating the header from the data). This new functionality is used by BinaryDictBlockBuilder to avoid an extra copy in Finish(). Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Reviewed-on: http://gerrit.cloudera.org:8080/15042 Reviewed-by: Andrew Wong Tested-by: Kudu Jenkins --- M src/kudu/cfile/binary_dict_block.cc M src/kudu/cfile/binary_dict_block.h M src/kudu/cfile/binary_plain_block.cc M src/kudu/cfile/binary_plain_block.h M src/kudu/cfile/binary_prefix_block.cc M src/kudu/cfile/binary_prefix_block.h M src/kudu/cfile/block_encodings.h M src/kudu/cfile/bshuf_block.cc M src/kudu/cfile/bshuf_block.h M src/kudu/cfile/cfile_writer.cc M src/kudu/cfile/encoding-test.cc M src/kudu/cfile/plain_bitmap_block.h M src/kudu/cfile/plain_block.h M src/kudu/cfile/rle_block.h 14 files changed, 115 insertions(+), 107 deletions(-) Approvals: Andrew Wong: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 8 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 08 Apr 2020 23:03:14 + Gerrit-HasComments: No
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Bankim Bhavsar has uploaded a new patch set (#7) to the change originally created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. cfile: change BlockBuilder API to yield a vector of Slices When blocks are appended to cfiles at the IO layer, we already have the ability to write multiple slices using a vectored IO. Prior to this patch, the BlockBuilder API was restricted to returning a single slice, whereas it would be more convenient in some cases to be able to return multiple slices (eg separating the header from the data). This new functionality is used by BinaryDictBlockBuilder to avoid an extra copy in Finish(). Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db --- M src/kudu/cfile/binary_dict_block.cc M src/kudu/cfile/binary_dict_block.h M src/kudu/cfile/binary_plain_block.cc M src/kudu/cfile/binary_plain_block.h M src/kudu/cfile/binary_prefix_block.cc M src/kudu/cfile/binary_prefix_block.h M src/kudu/cfile/block_encodings.h M src/kudu/cfile/bshuf_block.cc M src/kudu/cfile/bshuf_block.h M src/kudu/cfile/cfile_writer.cc M src/kudu/cfile/encoding-test.cc M src/kudu/cfile/plain_bitmap_block.h M src/kudu/cfile/plain_block.h M src/kudu/cfile/rle_block.h 14 files changed, 115 insertions(+), 107 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/15042/7 -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 7 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/15042/6/src/kudu/cfile/encoding-test.cc File src/kudu/cfile/encoding-test.cc: http://gerrit.cloudera.org:8080/#/c/15042/6/src/kudu/cfile/encoding-test.cc@367 PS6, Line 367: with unsigned types > nit: Isn't this instantiated by IntEncodingTest with both signed and unsign Aah right. Changed to using C++11 std::is_signed instead. No tidy bot warnings. http://gerrit.cloudera.org:8080/#/c/15042/6/src/kudu/cfile/encoding-test.cc@373 PS6, Line 373: char, short > nit: I might be missing something, but isn't this only called with int type It's called with shorter types namely uint8, int16 etc. Updated comment. -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 08 Apr 2020 02:01:33 + Gerrit-HasComments: Yes
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/15042/6/src/kudu/cfile/encoding-test.cc File src/kudu/cfile/encoding-test.cc: http://gerrit.cloudera.org:8080/#/c/15042/6/src/kudu/cfile/encoding-test.cc@367 PS6, Line 367: with unsigned types nit: Isn't this instantiated by IntEncodingTest with both signed and unsigned ints? http://gerrit.cloudera.org:8080/#/c/15042/6/src/kudu/cfile/encoding-test.cc@373 PS6, Line 373: char, short nit: I might be missing something, but isn't this only called with int types? Why might we see 'char'? -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 07 Apr 2020 19:20:14 + Gerrit-HasComments: Yes
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Bankim Bhavsar has uploaded a new patch set (#6) to the change originally created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. cfile: change BlockBuilder API to yield a vector of Slices When blocks are appended to cfiles at the IO layer, we already have the ability to write multiple slices using a vectored IO. Prior to this patch, the BlockBuilder API was restricted to returning a single slice, whereas it would be more convenient in some cases to be able to return multiple slices (eg separating the header from the data). This new functionality is used by BinaryDictBlockBuilder to avoid an extra copy in Finish(). Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db --- M src/kudu/cfile/binary_dict_block.cc M src/kudu/cfile/binary_dict_block.h M src/kudu/cfile/binary_plain_block.cc M src/kudu/cfile/binary_plain_block.h M src/kudu/cfile/binary_prefix_block.cc M src/kudu/cfile/binary_prefix_block.h M src/kudu/cfile/block_encodings.h M src/kudu/cfile/bshuf_block.cc M src/kudu/cfile/bshuf_block.h M src/kudu/cfile/cfile_writer.cc M src/kudu/cfile/encoding-test.cc M src/kudu/cfile/plain_bitmap_block.h M src/kudu/cfile/plain_block.h M src/kudu/cfile/rle_block.h 14 files changed, 116 insertions(+), 106 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/15042/6 -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. Patch Set 5: > Patch Set 5: Verified-1 > > Build Failed > > http://jenkins.kudu.apache.org/job/kudu-gerrit/21289/ : FAILURE TSAN test error: TabletServerDiskError/TabletServerDiskErrorITest.TestFailDuringScanWorkload/0 This test doesn't show up as flaky on the dashboard. At the same TSAN build that includes a change on top of this didn't fail https://gerrit.cloudera.org/c/15043/ Looks like some delay but not sure whether that's the reason for failure. I0403 19:02:27.387187 1261 disk_failure-itest.cc:271] Currently has 9 failed tablets /home/jenkins-slave/workspace/kudu-master/1/src/kudu/integration-tests/disk_failure-itest.cc:272: Failure Expected: num_failed Which is: 10 To be equal to: failed_on_ts Which is: 9 /home/jenkins-slave/workspace/kudu-master/1/src/kudu/util/test_util.cc:349: Failure Failed Timed out waiting for assertion to pass. /home/jenkins-slave/workspace/kudu-master/1/src/kudu/integration-tests/disk_failure-itest.cc:313: Failure -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 03 Apr 2020 22:00:21 + Gerrit-HasComments: No
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Bankim Bhavsar has uploaded a new patch set (#5) to the change originally created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. cfile: change BlockBuilder API to yield a vector of Slices When blocks are appended to cfiles at the IO layer, we already have the ability to write multiple slices using a vectored IO. Prior to this patch, the BlockBuilder API was restricted to returning a single slice, whereas it would be more convenient in some cases to be able to return multiple slices (eg separating the header from the data). This new functionality is used by BinaryDictBlockBuilder to avoid an extra copy in Finish(). Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db --- M src/kudu/cfile/binary_dict_block.cc M src/kudu/cfile/binary_dict_block.h M src/kudu/cfile/binary_plain_block.cc M src/kudu/cfile/binary_plain_block.h M src/kudu/cfile/binary_prefix_block.cc M src/kudu/cfile/binary_prefix_block.h M src/kudu/cfile/block_encodings.h M src/kudu/cfile/bshuf_block.cc M src/kudu/cfile/bshuf_block.h M src/kudu/cfile/cfile_writer.cc M src/kudu/cfile/encoding-test.cc M src/kudu/cfile/plain_bitmap_block.h M src/kudu/cfile/plain_block.h M src/kudu/cfile/rle_block.h 14 files changed, 121 insertions(+), 107 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/15042/5 -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. Patch Set 4: Code-Review+2 For some reason, IWYU is still not happy: http://jenkins.kudu.apache.org/job/kudu-gerrit/20221/BUILD_TYPE=IWYU/artifact/build/latest/test-logs/iwyu.log -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 22 Jan 2020 01:42:16 + Gerrit-HasComments: No
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15042 to look at the new patch set (#4). Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. cfile: change BlockBuilder API to yield a vector of Slices When blocks are appended to cfiles at the IO layer, we already have the ability to write multiple slices using a vectored IO. Prior to this patch, the BlockBuilder API was restricted to returning a single slice, whereas it would be more convenient in some cases to be able to return multiple slices (eg separating the header from the data). This new functionality is used by BinaryDictBlockBuilder to avoid an extra copy in Finish(). Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db --- M src/kudu/cfile/binary_dict_block.cc M src/kudu/cfile/binary_dict_block.h M src/kudu/cfile/binary_plain_block.cc M src/kudu/cfile/binary_plain_block.h M src/kudu/cfile/binary_prefix_block.cc M src/kudu/cfile/binary_prefix_block.h M src/kudu/cfile/block_encodings.h M src/kudu/cfile/bshuf_block.cc M src/kudu/cfile/bshuf_block.h M src/kudu/cfile/cfile_writer.cc M src/kudu/cfile/encoding-test.cc M src/kudu/cfile/plain_bitmap_block.h M src/kudu/cfile/plain_block.h M src/kudu/cfile/rle_block.h 14 files changed, 107 insertions(+), 100 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/15042/4 -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc File src/kudu/cfile/binary_dict_block.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@85 PS1, Line 85: data_slices.insert(data_slices.begin(), Slice(buffer_)); : *slices = std::move(data_slices); > I don't think reserve() can necessarily help because data_builder_->Finish( Yep, reserve() could help in the alternative approach that I suggested, but not here. I think it's not a big deal anyway, especially given the way how std::vector grows internally (i.e. reserve() might help only in certain cases on 2x size boundaries or alike). http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc File src/kudu/cfile/encoding-test.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc@127 PS1, Line 127: contiguous_buf_ > that's true. I think adding that CHECK will break things, though, since thi SGMT -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 Jan 2020 22:27:52 + Gerrit-HasComments: Yes
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc File src/kudu/cfile/binary_dict_block.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@85 PS1, Line 85: vector data_slices; : data_builder_->Finish(ordinal_pos > yep, it seems the same since and it's called just once here. at least, mayb I don't think reserve() can necessarily help because data_builder_->Finish() will overwrite (move on top of) data_slices. http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc File src/kudu/cfile/encoding-test.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc@127 PS1, Line 127: contiguous_buf_ > Ah, indeed. that's true. I think adding that CHECK will break things, though, since this gets called multiple times in the same test suite. I'll just add a comment that says that the slice is only valid until the next call to the same function -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 Jan 2020 21:32:47 + Gerrit-HasComments: Yes
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc File src/kudu/cfile/binary_dict_block.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@85 PS1, Line 85: vector data_slices; : data_builder_->Finish(ordinal_pos > isn't that equivalent from a performnace perspective? in either case, you'r yep, it seems the same since and it's called just once here. at least, maybe add appropriate data_slices.reserve()? http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc File src/kudu/cfile/cfile_writer.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc@399 PS1, Line 399: ck(null_hea > I think given Slice is a simple value type there isn't any benefit to movin yup, indeed http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc File src/kudu/cfile/encoding-test.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc@127 PS1, Line 127: contiguous_buf_ > nope, because the Slice doesn't own the pointed-to memory, so it has to out Ah, indeed. Does it mean the callers of this method should expect some strange results if calling this method twice in a row if they keep the returned results at the upper level? Maybe, it make sense to protect against that replacing contiguous_buf_.clear() with CHECK(contiguous_buf_.empty()) ? -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 Jan 2020 18:28:46 + Gerrit-HasComments: Yes
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15042 to look at the new patch set (#2). Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. cfile: change BlockBuilder API to yield a vector of Slices When blocks are appended to cfiles at the IO layer, we already have the ability to write multiple slices using a vectored IO. Prior to this patch, the BlockBuilder API was restricted to returning a single slice, whereas it would be more convenient in some cases to be able to return multiple slices (eg separating the header from the data). This new functionality is used by BinaryDictBlockBuilder to avoid an extra copy in Finish(). Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db --- M src/kudu/cfile/binary_dict_block.cc M src/kudu/cfile/binary_dict_block.h M src/kudu/cfile/binary_plain_block.cc M src/kudu/cfile/binary_plain_block.h M src/kudu/cfile/binary_prefix_block.cc M src/kudu/cfile/binary_prefix_block.h M src/kudu/cfile/block_encodings.h M src/kudu/cfile/bshuf_block.cc M src/kudu/cfile/bshuf_block.h M src/kudu/cfile/cfile_writer.cc M src/kudu/cfile/encoding-test.cc M src/kudu/cfile/plain_bitmap_block.h M src/kudu/cfile/plain_block.h M src/kudu/cfile/rle_block.h 14 files changed, 103 insertions(+), 98 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/15042/2 -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. Patch Set 1: (9 comments) http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc File src/kudu/cfile/binary_dict_block.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@65 PS1, Line 65: void BinaryDictBlockBuilder::Reset() { > Is it necessary to clear the buffer_ member as well? nope, because it's always completely overwritten in Finish(). I'll rename it to header_buf http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@85 PS1, Line 85: data_slices.insert(data_slices.begin(), Slice(buffer_)); : *slices = std::move(data_slices); > Inserting into the beginning of a vector is not the best option from the pe isn't that equivalent from a performnace perspective? in either case, you're ending up copying all of the data_slices once (in your case to append to ret, in my case when inserting at the beginning). http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.h File src/kudu/cfile/binary_plain_block.h: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.h@85 PS1, Line 85: static const size_t kHeaderSize = sizeof(uint32_t) * 3; > nit: would it benefit from adding constexpr ? Done http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.cc File src/kudu/cfile/binary_plain_block.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.cc@59 PS1, Line 59: buffer_.resize(kHeaderSize); : buffer_.reserve(options_->storage_attributes.cfile_block_size); > Would it be more optimal to switch these two lines? Done http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/bshuf_block.h File src/kudu/cfile/bshuf_block.h: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/bshuf_block.h@53 PS1, Line 53: using std::vector; > Is it really needed? yea not sure how it got there, thanks http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc File src/kudu/cfile/cfile_writer.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc@399 PS1, Line 399: null_bitmap > nit: wrap into std::move() ? I think given Slice is a simple value type there isn't any benefit to moving vs copying (otherwise I think clang-tidy would complain here) http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc@401 PS1, Line 401: for (const auto& s : data_slices) { : v.emplace_back(s); : } > nit: consider instead Done http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc File src/kudu/cfile/encoding-test.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc@127 PS1, Line 127: contiguous_buf_ > Is it possible to use just a local variable instead this member field and m nope, because the Slice doesn't own the pointed-to memory, so it has to outlast the function call scope http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/rle_block.h File src/kudu/cfile/rle_block.h: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/rle_block.h@78 PS1, Line 78: buf_ > Slice(buf_) ? Or the compiler is smart enough to wrap it into a Slice itse yea, Slice has an implicit constructor for faststring apparently -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 16 Jan 2020 06:12:59 + Gerrit-HasComments: Yes
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/15042 ) Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. Patch Set 1: (9 comments) http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc File src/kudu/cfile/binary_dict_block.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@65 PS1, Line 65: void BinaryDictBlockBuilder::Reset() { Is it necessary to clear the buffer_ member as well? http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@85 PS1, Line 85: data_slices.insert(data_slices.begin(), Slice(buffer_)); : *slices = std::move(data_slices); Inserting into the beginning of a vector is not the best option from the performance perspective. Maybe, replace with: vector ret{ Slice(buffer_) }; std::move(data_slices.begin(), data_slices.end(), std::back_inserter(ret)); *slices = std::move(ret); http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.h File src/kudu/cfile/binary_plain_block.h: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.h@85 PS1, Line 85: static const size_t kHeaderSize = sizeof(uint32_t) * 3; nit: would it benefit from adding constexpr ? http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.cc File src/kudu/cfile/binary_plain_block.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.cc@59 PS1, Line 59: buffer_.resize(kHeaderSize); : buffer_.reserve(options_->storage_attributes.cfile_block_size); Would it be more optimal to switch these two lines? http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/bshuf_block.h File src/kudu/cfile/bshuf_block.h: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/bshuf_block.h@53 PS1, Line 53: using std::vector; Is it really needed? In general, adding 'using' into headers is not a good idea. http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc File src/kudu/cfile/cfile_writer.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc@399 PS1, Line 399: null_bitmap nit: wrap into std::move() ? http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc@401 PS1, Line 401: for (const auto& s : data_slices) { : v.emplace_back(s); : } nit: consider instead std::move(data_slices.begin(), data_slices.end(), std::back_inserter(v)); http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc File src/kudu/cfile/encoding-test.cc: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc@127 PS1, Line 127: contiguous_buf_ Is it possible to use just a local variable instead this member field and make this method static? http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/rle_block.h File src/kudu/cfile/rle_block.h: http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/rle_block.h@78 PS1, Line 78: buf_ Slice(buf_) ? Or the compiler is smart enough to wrap it into a Slice itself? -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Thu, 16 Jan 2020 01:50:57 + Gerrit-HasComments: Yes
[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices
Hello Alexey Serbin, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/15042 to review the following change. Change subject: cfile: change BlockBuilder API to yield a vector of Slices .. cfile: change BlockBuilder API to yield a vector of Slices When blocks are appended to cfiles at the IO layer, we already have the ability to write multiple slices using a vectored IO. Prior to this patch, the BlockBuilder API was restricted to returning a single slice, whereas it would be more convenient in some cases to be able to return multiple slices (eg separating the header from the data). This new functionality is used by BinaryDictBlockBuilder to avoid an extra copy in Finish(). Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db --- M src/kudu/cfile/binary_dict_block.cc M src/kudu/cfile/binary_dict_block.h M src/kudu/cfile/binary_plain_block.cc M src/kudu/cfile/binary_plain_block.h M src/kudu/cfile/binary_prefix_block.cc M src/kudu/cfile/binary_prefix_block.h M src/kudu/cfile/block_encodings.h M src/kudu/cfile/bshuf_block.cc M src/kudu/cfile/bshuf_block.h M src/kudu/cfile/cfile_writer.cc M src/kudu/cfile/encoding-test.cc M src/kudu/cfile/plain_bitmap_block.h M src/kudu/cfile/plain_block.h M src/kudu/cfile/rle_block.h 14 files changed, 99 insertions(+), 96 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/15042/1 -- To view, visit http://gerrit.cloudera.org:8080/15042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db Gerrit-Change-Number: 15042 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Alexey Serbin