[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-04-08 Thread Andrew Wong (Code Review)
Andrew Wong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..

cfile: change BlockBuilder API to yield a vector of Slices

When blocks are appended to cfiles at the IO layer, we already have the
ability to write multiple slices using a vectored IO. Prior to this
patch, the BlockBuilder API was restricted to returning a single slice,
whereas it would be more convenient in some cases to be able to return
multiple slices (eg separating the header from the data).

This new functionality is used by BinaryDictBlockBuilder to avoid an
extra copy in Finish().

Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Reviewed-on: http://gerrit.cloudera.org:8080/15042
Reviewed-by: Andrew Wong 
Tested-by: Kudu Jenkins
---
M src/kudu/cfile/binary_dict_block.cc
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/binary_plain_block.h
M src/kudu/cfile/binary_prefix_block.cc
M src/kudu/cfile/binary_prefix_block.h
M src/kudu/cfile/block_encodings.h
M src/kudu/cfile/bshuf_block.cc
M src/kudu/cfile/bshuf_block.h
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/encoding-test.cc
M src/kudu/cfile/plain_bitmap_block.h
M src/kudu/cfile/plain_block.h
M src/kudu/cfile/rle_block.h
14 files changed, 115 insertions(+), 107 deletions(-)

Approvals:
  Andrew Wong: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 8
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-04-08 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 08 Apr 2020 23:03:14 +
Gerrit-HasComments: No


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-04-08 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has uploaded a new patch set (#7) to the change originally 
created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..

cfile: change BlockBuilder API to yield a vector of Slices

When blocks are appended to cfiles at the IO layer, we already have the
ability to write multiple slices using a vectored IO. Prior to this
patch, the BlockBuilder API was restricted to returning a single slice,
whereas it would be more convenient in some cases to be able to return
multiple slices (eg separating the header from the data).

This new functionality is used by BinaryDictBlockBuilder to avoid an
extra copy in Finish().

Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
---
M src/kudu/cfile/binary_dict_block.cc
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/binary_plain_block.h
M src/kudu/cfile/binary_prefix_block.cc
M src/kudu/cfile/binary_prefix_block.h
M src/kudu/cfile/block_encodings.h
M src/kudu/cfile/bshuf_block.cc
M src/kudu/cfile/bshuf_block.h
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/encoding-test.cc
M src/kudu/cfile/plain_bitmap_block.h
M src/kudu/cfile/plain_block.h
M src/kudu/cfile/rle_block.h
14 files changed, 115 insertions(+), 107 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/15042/7
-- 
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-04-07 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15042/6/src/kudu/cfile/encoding-test.cc
File src/kudu/cfile/encoding-test.cc:

http://gerrit.cloudera.org:8080/#/c/15042/6/src/kudu/cfile/encoding-test.cc@367
PS6, Line 367: with unsigned types
> nit: Isn't this instantiated by IntEncodingTest with both signed and unsign
Aah right. Changed to using C++11 std::is_signed instead. No tidy bot 
warnings.


http://gerrit.cloudera.org:8080/#/c/15042/6/src/kudu/cfile/encoding-test.cc@373
PS6, Line 373:  char, short
> nit: I might be missing something, but isn't this only called with int type
It's called with shorter types namely uint8, int16 etc. Updated comment.



--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 08 Apr 2020 02:01:33 +
Gerrit-HasComments: Yes


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-04-07 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15042/6/src/kudu/cfile/encoding-test.cc
File src/kudu/cfile/encoding-test.cc:

http://gerrit.cloudera.org:8080/#/c/15042/6/src/kudu/cfile/encoding-test.cc@367
PS6, Line 367: with unsigned types
nit: Isn't this instantiated by IntEncodingTest with both signed and unsigned 
ints?


http://gerrit.cloudera.org:8080/#/c/15042/6/src/kudu/cfile/encoding-test.cc@373
PS6, Line 373:  char, short
nit: I might be missing something, but isn't this only called with int types? 
Why might we see 'char'?



--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 07 Apr 2020 19:20:14 +
Gerrit-HasComments: Yes


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-04-06 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has uploaded a new patch set (#6) to the change originally 
created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..

cfile: change BlockBuilder API to yield a vector of Slices

When blocks are appended to cfiles at the IO layer, we already have the
ability to write multiple slices using a vectored IO. Prior to this
patch, the BlockBuilder API was restricted to returning a single slice,
whereas it would be more convenient in some cases to be able to return
multiple slices (eg separating the header from the data).

This new functionality is used by BinaryDictBlockBuilder to avoid an
extra copy in Finish().

Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
---
M src/kudu/cfile/binary_dict_block.cc
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/binary_plain_block.h
M src/kudu/cfile/binary_prefix_block.cc
M src/kudu/cfile/binary_prefix_block.h
M src/kudu/cfile/block_encodings.h
M src/kudu/cfile/bshuf_block.cc
M src/kudu/cfile/bshuf_block.h
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/encoding-test.cc
M src/kudu/cfile/plain_bitmap_block.h
M src/kudu/cfile/plain_block.h
M src/kudu/cfile/rle_block.h
14 files changed, 116 insertions(+), 106 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/15042/6
-- 
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-04-03 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..


Patch Set 5:

> Patch Set 5: Verified-1
>
> Build Failed
>
> http://jenkins.kudu.apache.org/job/kudu-gerrit/21289/ : FAILURE

TSAN test error: 
TabletServerDiskError/TabletServerDiskErrorITest.TestFailDuringScanWorkload/0

This test doesn't show up as flaky on the dashboard. At the same TSAN build 
that includes a change on top of this didn't fail 
https://gerrit.cloudera.org/c/15043/

Looks like some delay but not sure whether that's the reason for failure.

 I0403 19:02:27.387187  1261 disk_failure-itest.cc:271] Currently has 9 failed 
tablets
 
/home/jenkins-slave/workspace/kudu-master/1/src/kudu/integration-tests/disk_failure-itest.cc:272:
 Failure
  Expected: num_failed
  Which is: 10  
 To be equal to: failed_on_ts 
  Which is: 9  
/home/jenkins-slave/workspace/kudu-master/1/src/kudu/util/test_util.cc:349: 
Failure
Failed
Timed out waiting for assertion to pass.
/home/jenkins-slave/workspace/kudu-master/1/src/kudu/integration-tests/disk_failure-itest.cc:313:
 Failure


--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 03 Apr 2020 22:00:21 +
Gerrit-HasComments: No


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-04-03 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has uploaded a new patch set (#5) to the change originally 
created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..

cfile: change BlockBuilder API to yield a vector of Slices

When blocks are appended to cfiles at the IO layer, we already have the
ability to write multiple slices using a vectored IO. Prior to this
patch, the BlockBuilder API was restricted to returning a single slice,
whereas it would be more convenient in some cases to be able to return
multiple slices (eg separating the header from the data).

This new functionality is used by BinaryDictBlockBuilder to avoid an
extra copy in Finish().

Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
---
M src/kudu/cfile/binary_dict_block.cc
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/binary_plain_block.h
M src/kudu/cfile/binary_prefix_block.cc
M src/kudu/cfile/binary_prefix_block.h
M src/kudu/cfile/block_encodings.h
M src/kudu/cfile/bshuf_block.cc
M src/kudu/cfile/bshuf_block.h
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/encoding-test.cc
M src/kudu/cfile/plain_bitmap_block.h
M src/kudu/cfile/plain_block.h
M src/kudu/cfile/rle_block.h
14 files changed, 121 insertions(+), 107 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/15042/5
-- 
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-01-21 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..


Patch Set 4: Code-Review+2

For some reason, IWYU is still not happy: 
http://jenkins.kudu.apache.org/job/kudu-gerrit/20221/BUILD_TYPE=IWYU/artifact/build/latest/test-logs/iwyu.log


--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 22 Jan 2020 01:42:16 +
Gerrit-HasComments: No


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-01-21 Thread Todd Lipcon (Code Review)
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15042

to look at the new patch set (#4).

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..

cfile: change BlockBuilder API to yield a vector of Slices

When blocks are appended to cfiles at the IO layer, we already have the
ability to write multiple slices using a vectored IO. Prior to this
patch, the BlockBuilder API was restricted to returning a single slice,
whereas it would be more convenient in some cases to be able to return
multiple slices (eg separating the header from the data).

This new functionality is used by BinaryDictBlockBuilder to avoid an
extra copy in Finish().

Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
---
M src/kudu/cfile/binary_dict_block.cc
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/binary_plain_block.h
M src/kudu/cfile/binary_prefix_block.cc
M src/kudu/cfile/binary_prefix_block.h
M src/kudu/cfile/block_encodings.h
M src/kudu/cfile/bshuf_block.cc
M src/kudu/cfile/bshuf_block.h
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/encoding-test.cc
M src/kudu/cfile/plain_bitmap_block.h
M src/kudu/cfile/plain_block.h
M src/kudu/cfile/rle_block.h
14 files changed, 107 insertions(+), 100 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/15042/4
--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-01-16 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc
File src/kudu/cfile/binary_dict_block.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@85
PS1, Line 85:   data_slices.insert(data_slices.begin(), Slice(buffer_));
:   *slices = std::move(data_slices);
> I don't think reserve() can necessarily help because data_builder_->Finish(
Yep, reserve() could help in the alternative approach that I suggested, but not 
here.  I think it's not a big deal anyway, especially given the way how 
std::vector grows internally (i.e. reserve() might help only in certain cases 
on 2x size boundaries or alike).


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc
File src/kudu/cfile/encoding-test.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc@127
PS1, Line 127: contiguous_buf_
> that's true. I think adding that CHECK will break things, though, since thi
SGMT



--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 Jan 2020 22:27:52 +
Gerrit-HasComments: Yes


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-01-16 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc
File src/kudu/cfile/binary_dict_block.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@85
PS1, Line 85:   vector data_slices;
:   data_builder_->Finish(ordinal_pos
> yep, it seems the same since and it's called just once here. at least, mayb
I don't think reserve() can necessarily help because data_builder_->Finish() 
will overwrite (move on top of) data_slices.


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc
File src/kudu/cfile/encoding-test.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc@127
PS1, Line 127: contiguous_buf_
> Ah, indeed.
that's true. I think adding that CHECK will break things, though, since this 
gets called multiple times in the same test suite. I'll just add a comment that 
says that the slice is only valid until the next call to the same function



--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 Jan 2020 21:32:47 +
Gerrit-HasComments: Yes


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-01-16 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc
File src/kudu/cfile/binary_dict_block.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@85
PS1, Line 85:   vector data_slices;
:   data_builder_->Finish(ordinal_pos
> isn't that equivalent from a performnace perspective? in either case, you'r
yep, it seems the same since and it's called just once here. at least, maybe 
add appropriate data_slices.reserve()?


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc
File src/kudu/cfile/cfile_writer.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc@399
PS1, Line 399: ck(null_hea
> I think given Slice is a simple value type there isn't any benefit to movin
yup, indeed


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc
File src/kudu/cfile/encoding-test.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc@127
PS1, Line 127: contiguous_buf_
> nope, because the Slice doesn't own the pointed-to memory, so it has to out
Ah, indeed.

Does it mean the callers of this method should expect some strange results if 
calling this method twice in a row if they keep the returned results at the 
upper level?  Maybe, it make sense to protect against that replacing 
contiguous_buf_.clear() with CHECK(contiguous_buf_.empty()) ?



--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 Jan 2020 18:28:46 +
Gerrit-HasComments: Yes


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-01-15 Thread Todd Lipcon (Code Review)
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15042

to look at the new patch set (#2).

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..

cfile: change BlockBuilder API to yield a vector of Slices

When blocks are appended to cfiles at the IO layer, we already have the
ability to write multiple slices using a vectored IO. Prior to this
patch, the BlockBuilder API was restricted to returning a single slice,
whereas it would be more convenient in some cases to be able to return
multiple slices (eg separating the header from the data).

This new functionality is used by BinaryDictBlockBuilder to avoid an
extra copy in Finish().

Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
---
M src/kudu/cfile/binary_dict_block.cc
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/binary_plain_block.h
M src/kudu/cfile/binary_prefix_block.cc
M src/kudu/cfile/binary_prefix_block.h
M src/kudu/cfile/block_encodings.h
M src/kudu/cfile/bshuf_block.cc
M src/kudu/cfile/bshuf_block.h
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/encoding-test.cc
M src/kudu/cfile/plain_bitmap_block.h
M src/kudu/cfile/plain_block.h
M src/kudu/cfile/rle_block.h
14 files changed, 103 insertions(+), 98 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/15042/2
--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-01-15 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..


Patch Set 1:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc
File src/kudu/cfile/binary_dict_block.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@65
PS1, Line 65: void BinaryDictBlockBuilder::Reset() {
> Is it necessary to clear the buffer_ member as well?
nope, because it's always completely overwritten in Finish(). I'll rename it to 
header_buf


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@85
PS1, Line 85:   data_slices.insert(data_slices.begin(), Slice(buffer_));
:   *slices = std::move(data_slices);
> Inserting into the beginning of a vector is not the best option from the pe
isn't that equivalent from a performnace perspective? in either case, you're 
ending up copying all of the data_slices once (in your case to append to ret, 
in my case when inserting at the beginning).


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.h
File src/kudu/cfile/binary_plain_block.h:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.h@85
PS1, Line 85:   static const size_t kHeaderSize = sizeof(uint32_t) * 3;
> nit: would it benefit from adding constexpr ?
Done


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.cc
File src/kudu/cfile/binary_plain_block.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.cc@59
PS1, Line 59:   buffer_.resize(kHeaderSize);
:   buffer_.reserve(options_->storage_attributes.cfile_block_size);
> Would it be more optimal to switch these two lines?
Done


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/bshuf_block.h
File src/kudu/cfile/bshuf_block.h:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/bshuf_block.h@53
PS1, Line 53: using std::vector;
> Is it really needed?
yea not sure how it got there, thanks


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc
File src/kudu/cfile/cfile_writer.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc@399
PS1, Line 399: null_bitmap
> nit: wrap into std::move() ?
I think given Slice is a simple value type there isn't any benefit to moving vs 
copying (otherwise I think clang-tidy would complain here)


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc@401
PS1, Line 401:   for (const auto& s : data_slices) {
 : v.emplace_back(s);
 :   }
> nit: consider instead
Done


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc
File src/kudu/cfile/encoding-test.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc@127
PS1, Line 127: contiguous_buf_
> Is it possible to use just a local variable instead this member field and m
nope, because the Slice doesn't own the pointed-to memory, so it has to outlast 
the function call scope


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/rle_block.h
File src/kudu/cfile/rle_block.h:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/rle_block.h@78
PS1, Line 78: buf_
> Slice(buf_) ?  Or the compiler is smart enough to wrap it into a Slice itse
yea, Slice has an implicit constructor for faststring apparently



--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 Jan 2020 06:12:59 +
Gerrit-HasComments: Yes


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-01-15 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15042 )

Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..


Patch Set 1:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc
File src/kudu/cfile/binary_dict_block.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@65
PS1, Line 65: void BinaryDictBlockBuilder::Reset() {
Is it necessary to clear the buffer_ member as well?


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_dict_block.cc@85
PS1, Line 85:   data_slices.insert(data_slices.begin(), Slice(buffer_));
:   *slices = std::move(data_slices);
Inserting into the beginning of a vector is not the best option from the 
performance perspective.  Maybe, replace with:

  vector ret{ Slice(buffer_) };
  std::move(data_slices.begin(), data_slices.end(), std::back_inserter(ret));
  *slices = std::move(ret);


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.h
File src/kudu/cfile/binary_plain_block.h:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.h@85
PS1, Line 85:   static const size_t kHeaderSize = sizeof(uint32_t) * 3;
nit: would it benefit from adding constexpr ?


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.cc
File src/kudu/cfile/binary_plain_block.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/binary_plain_block.cc@59
PS1, Line 59:   buffer_.resize(kHeaderSize);
:   buffer_.reserve(options_->storage_attributes.cfile_block_size);
Would it be more optimal to switch these two lines?


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/bshuf_block.h
File src/kudu/cfile/bshuf_block.h:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/bshuf_block.h@53
PS1, Line 53: using std::vector;
Is it really needed?

In general, adding 'using' into headers is not a good idea.


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc
File src/kudu/cfile/cfile_writer.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc@399
PS1, Line 399: null_bitmap
nit: wrap into std::move() ?


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/cfile_writer.cc@401
PS1, Line 401:   for (const auto& s : data_slices) {
 : v.emplace_back(s);
 :   }
nit: consider instead

  std::move(data_slices.begin(), data_slices.end(), std::back_inserter(v));


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc
File src/kudu/cfile/encoding-test.cc:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/encoding-test.cc@127
PS1, Line 127: contiguous_buf_
Is it possible to use just a local variable instead this member field and make 
this method static?


http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/rle_block.h
File src/kudu/cfile/rle_block.h:

http://gerrit.cloudera.org:8080/#/c/15042/1/src/kudu/cfile/rle_block.h@78
PS1, Line 78: buf_
Slice(buf_) ?  Or the compiler is smart enough to wrap it into a Slice itself?



--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 16 Jan 2020 01:50:57 +
Gerrit-HasComments: Yes


[kudu-CR] cfile: change BlockBuilder API to yield a vector of Slices

2020-01-15 Thread Todd Lipcon (Code Review)
Hello Alexey Serbin,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/15042

to review the following change.


Change subject: cfile: change BlockBuilder API to yield a vector of Slices
..

cfile: change BlockBuilder API to yield a vector of Slices

When blocks are appended to cfiles at the IO layer, we already have the
ability to write multiple slices using a vectored IO. Prior to this
patch, the BlockBuilder API was restricted to returning a single slice,
whereas it would be more convenient in some cases to be able to return
multiple slices (eg separating the header from the data).

This new functionality is used by BinaryDictBlockBuilder to avoid an
extra copy in Finish().

Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
---
M src/kudu/cfile/binary_dict_block.cc
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/binary_plain_block.h
M src/kudu/cfile/binary_prefix_block.cc
M src/kudu/cfile/binary_prefix_block.h
M src/kudu/cfile/block_encodings.h
M src/kudu/cfile/bshuf_block.cc
M src/kudu/cfile/bshuf_block.h
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/encoding-test.cc
M src/kudu/cfile/plain_bitmap_block.h
M src/kudu/cfile/plain_block.h
M src/kudu/cfile/rle_block.h
14 files changed, 99 insertions(+), 96 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/15042/1
--
To view, visit http://gerrit.cloudera.org:8080/15042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifc7a5f148a4a43cedac2428f4c1a18d0f93a10db
Gerrit-Change-Number: 15042
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin