[kudu-CR] [location awareness] Assign locations to registering tablet servers

2018-08-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/5 )

Change subject: [location_awareness] Assign locations to registering tablet 
servers
..


Patch Set 4:

(5 comments)

LGTM, just a few nits

http://gerrit.cloudera.org:8080/#/c/5/4/src/kudu/master/master_path_handlers.cc
File src/kudu/master/master_path_handlers.cc:

http://gerrit.cloudera.org:8080/#/c/5/4/src/kudu/master/master_path_handlers.cc@117
PS4, Line 117: shared_ptr
nit: go a bit further and use auto here?  Feel free to ignore, though.


http://gerrit.cloudera.org:8080/#/c/5/4/src/kudu/master/ts_descriptor-test.cc
File src/kudu/master/ts_descriptor-test.cc:

http://gerrit.cloudera.org:8080/#/c/5/4/src/kudu/master/ts_descriptor-test.cc@86
PS4, Line 86:   const string location = "/foo-bar0/BAAZ.9-quux";
nit: maybe, add underscore as well


http://gerrit.cloudera.org:8080/#/c/5/4/src/kudu/master/ts_descriptor-test.cc@86
PS4, Line 86:   const string location = "/foo-bar0/BAAZ.9-quux";
nit: maybe, add underscore as well


http://gerrit.cloudera.org:8080/#/c/5/4/src/kudu/master/ts_descriptor-test.cc@103
PS4, Line 103: "/foo$", // Contains the illegal character '$'.
nit: maybe, add "\" \"" just in case


http://gerrit.cloudera.org:8080/#/c/5/4/src/kudu/master/ts_descriptor.cc
File src/kudu/master/ts_descriptor.cc:

http://gerrit.cloudera.org:8080/#/c/5/4/src/kudu/master/ts_descriptor.cc@113
PS4, Line 113: StripTrailingWhitespace
nit: maybe, use StripWhiteSpace() to trim both leading the trailing whitespaces?



--
To view, visit http://gerrit.cloudera.org:8080/5
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5eb98823ab7b3b8141b8630196c29c1ebf8e6878
Gerrit-Change-Number: 5
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Fengling Wang 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Wed, 08 Aug 2018 05:37:08 +
Gerrit-HasComments: Yes


[kudu-CR] python: get new pandas dependency working on Python 2.6

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11149 )

Change subject: python: get new pandas dependency working on Python 2.6
..

python: get new pandas dependency working on Python 2.6

pandas 0.18 dropped support for Python 2.6, so we need to downgrade our
dependency. Moreover, if we don't explicitly install a 2.6-compatible
version of numpy, we'll get the newest one courtesy of the pandas
installation.

Change-Id: I18dc108ffbe18c74167f6c761d4194293eae76db
Reviewed-on: http://gerrit.cloudera.org:8080/11149
Reviewed-by: Andrew Wong 
Tested-by: Kudu Jenkins
---
M build-support/jenkins/build-and-test.sh
M python/requirements.txt
M python/setup.py
3 files changed, 28 insertions(+), 5 deletions(-)

Approvals:
  Andrew Wong: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I18dc108ffbe18c74167f6c761d4194293eae76db
Gerrit-Change-Number: 11149
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-2528: thirdparty downloads should be more robust

2018-08-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11152 )

Change subject: KUDU-2528: thirdparty downloads should be more robust
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic970afdb7ec23d3e04b2d7a9bf77d32774ca4d57
Gerrit-Change-Number: 11152
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 08 Aug 2018 05:01:49 +
Gerrit-HasComments: No


[kudu-CR] Add support for Java dist-tests on Jenkins

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10954 )

Change subject: Add support for Java dist-tests on Jenkins
..


Patch Set 5:

> I am not sure we should merge this yet given we don't have flaky support for 
> Java tests meaning they are not being retried by dist-test. I either need to 
> support them in the flaky dashboard (best case) or tell dist-test to always 
> retry them 3 times.
>
> Because Java tests to-date have been retried 3 times, they seam pretty flaky.

Rather than gate this on flaky test tracking, could we retry 3 times as you 
suggested so we can turn this on now?


--
To view, visit http://gerrit.cloudera.org:8080/10954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I20105d3f12fd8c3f95d9fe30fe1b63138614041a
Gerrit-Change-Number: 10954
Gerrit-PatchSet: 5
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 08 Aug 2018 04:55:59 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2528: thirdparty downloads should be more robust

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11152 )

Change subject: KUDU-2528: thirdparty downloads should be more robust
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11152/1/thirdparty/download-thirdparty.sh
File thirdparty/download-thirdparty.sh:

http://gerrit.cloudera.org:8080/#/c/11152/1/thirdparty/download-thirdparty.sh@76
PS1, Line 76: curl --retry 3 -L -O
> Does it make sense to add --fail option as well?  IIRC in case if the serve
I'll remove the file just in case (though I wouldn't expect it to be there).

I don't think --fail makes sense though; it sounds like it yields net less 
information in the output, plus even if curl "succeeded" and produced a file 
containing an HTTP error code, we'll fail to unzip/untar it below and will 
retry anyway.


http://gerrit.cloudera.org:8080/#/c/11152/1/thirdparty/download-thirdparty.sh@78
PS1, Line 78: rm -f "$FILENAME"
> Does it make sense to pause between attempts here?  In case of transient er
Sure, I'll throw in a pause.



--
To view, visit http://gerrit.cloudera.org:8080/11152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic970afdb7ec23d3e04b2d7a9bf77d32774ca4d57
Gerrit-Change-Number: 11152
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 08 Aug 2018 04:28:05 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2528: thirdparty downloads should be more robust

2018-08-07 Thread Adar Dembo (Code Review)
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11152

to look at the new patch set (#2).

Change subject: KUDU-2528: thirdparty downloads should be more robust
..

KUDU-2528: thirdparty downloads should be more robust

Commit 37f3a95d8 added --retry to our curl invocations, but it turns out
that this only retries failures that curl deems to be "transient". And if
curl exits with a non-zero status, the entire script grinds to a halt
because of set -e.

This patch expands the class of retriable errors by parking the curl
statement in a pipeline, which means it is excused from set -e. After that,
the existing fetch_and_expand retry loop takes over.

Change-Id: Ic970afdb7ec23d3e04b2d7a9bf77d32774ca4d57
---
M thirdparty/download-thirdparty.sh
1 file changed, 10 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/11152/2
--
To view, visit http://gerrit.cloudera.org:8080/11152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic970afdb7ec23d3e04b2d7a9bf77d32774ca4d57
Gerrit-Change-Number: 11152
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] python: copy pandas dependency into requirements.txt

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11146 )

Change subject: python: copy pandas dependency into requirements.txt
..

python: copy pandas dependency into requirements.txt

This preserves the invariant introduced by commit c93f08393 that if you run
'pip install -r requirements.txt', there won't be any additional downloads
when 'python setup.py test' is run later. The invariant was briefly broken
by commit 997ad765f.

Change-Id: Iadb42535956461fbcb41adc67f5e4c73c310463d
Reviewed-on: http://gerrit.cloudera.org:8080/11146
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong 
---
M python/requirements.txt
M python/setup.py
2 files changed, 6 insertions(+), 0 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Andrew Wong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/11146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iadb42535956461fbcb41adc67f5e4c73c310463d
Gerrit-Change-Number: 11146
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] python: copy pandas dependency into requirements.txt

2018-08-07 Thread Jordan Birdsell (Code Review)
Jordan Birdsell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11146 )

Change subject: python: copy pandas dependency into requirements.txt
..


Patch Set 2:

Ok by me. I think the right approach here would be to remove the test 
requirement from setup.py and add back in the skip test logic + the install of 
pandas in the jenkins script. This was the original approach but was objected 
to, but in light of this i think it makes sense to go with the original 
approach.


--
To view, visit http://gerrit.cloudera.org:8080/11146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadb42535956461fbcb41adc67f5e4c73c310463d
Gerrit-Change-Number: 11146
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 08 Aug 2018 04:20:06 +
Gerrit-HasComments: No


[kudu-CR] python: copy pandas dependency into requirements.txt

2018-08-07 Thread Jordan Birdsell (Code Review)
Jordan Birdsell has removed a vote on this change.

Change subject: python: copy pandas dependency into requirements.txt
..


Removed Code-Review-1 by Jordan Birdsell 
--
To view, visit http://gerrit.cloudera.org:8080/11146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Iadb42535956461fbcb41adc67f5e4c73c310463d
Gerrit-Change-Number: 11146
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10983 )

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..


Patch Set 12:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/cfile/cfile_reader.h
File src/kudu/cfile/cfile_reader.h:

http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/cfile/cfile_reader.h@348
PS12, Line 348:   const std::string& GetCurrentValue();
Add 'const' modifier for this method.


http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/cfile/cfile_reader.h@472
PS12, Line 472:   // Value currently pointed to by validx_iter_.
  :   // Currently, it is assumed that 
CFileSet::Iterator::SkipToNextScan(size_t *remaining)
  :   // has exclusive access to use this variable.
Any chance to make update this to keep better level of encapsulation?

If not, maybe add TODO here to be more aware of this caveat.


http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/cfile/cfile_reader.cc@792
PS12, Line 792: DCHECK(!prepared_blocks_.empty());
What happens if this does not hold in release build?

I'm curious why that would not be an option to return non-OK status in that 
case in addition to DCHECK() ?


http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:

http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@19
PS12, Line 19: #include 
nit: please use '#include ' instead and place that among other C++-style 
include directives below, keeping the list sorted alphabetically.


http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@76
PS12, Line 76: true
Per our discussion with Mike today, I thought the idea was to have this 
disabled in first commit before more-like-real-world testing is done.

Am I missing something?


http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@435
PS12, Line 435: std::unordered_map
nit: you could use 'auto' here.


http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@444
PS12, Line 444: col_id =
Could find_column() return -1 here?


http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@462
PS12, Line 462:   if (nonfirst_key_pred_exists) {
  : use_skip_scan_ = true;
  :
  : // Store the predicate column id.
  : skip_scan_predicate_column_id_ = min_col_id;
  :
  : // Store the predicate value.
  : skip_scan_predicate_value_ = pred_value;
  :   }
nit: why not to move this under the same scope where the 
nonfirst_key_pred_exists set to 'true' but after the 'if (col_id < min_col_id)' 
closure?  Doing so, you could get rid of the nonfirst_key_pred_exists, but 
don't forget to add a comment about the semantics of those updates.


http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@542
PS12, Line 542:  *
nit: stick the asterisk to the type of the parameter.


http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/index_skipscan-test.cc
File src/kudu/tablet/index_skipscan-test.cc:

http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/index_skipscan-test.cc@92
PS12, Line 92: s
what it was any other error but IsAlreadyPresent?


http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/index_skipscan-test.cc@270
PS12, Line 270: int schema_num = static_cast(GetParam());
Does it make sense to add additional dimension to the test: whether the 
skip-scan feature enabled?  In that way we can make sure that the predicates 
return the same result set in both cases, and that seems to be a good sanity 
check.

For an example of combining multiple parameters in a Cartesian product see 
various tests containing text 'testing::Combine'


http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/index_skipscan-test.cc@280
PS12, Line 280: ASSERT_NO_FATAL_FAILURE
nit: you could use NO_FATALS() shortcut instead for better readability


http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/index_skipscan-test.cc@321
PS12, Line 321: case kThreePK: {
  : // Test predicate on the third PK column.
  : ScanSpec spec;
  : Slice value_p3("2_p3");
  : auto pred_p3 = 
ColumnPredicate::Equality(schema_.column(2), _p3);
  : spec.AddPredicate(pred_p3);
  : vector results;
  : ASSERT_NO_FATAL_FAILURE(ScanTablet(, , 
"Exact match on P3"));
  : EXPECT_EQ(20, results.size());
  :   break;
  : }
nit: use the same layout of scopes and braces as in other cases (that's for 
better readability).



--
To view, visit 

[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-07 Thread Anupama Gupta (Code Review)
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10983

to look at the new patch set (#13).

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..

KUDU-1291. Efficiently support predicates on non-prefix key components

This patch implements index skip scan approach in the case when there
exists an equality predicate on any of the non-first
primary key columns.

This feature uses the following approach:
1. Seek at the first unique prefix key in the ad-hoc index tree.
2. Seek to the relevant entry corresponding to the unique prefix key
   found in 1. and the predicate value on the suffix key column.
3. To handle the case where multiple entries exist with respect to
   a unique prefix key, store the upper bound index of these entries.
4. Repeat steps 1-3, till all the unique prefix keys are scanned.

Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
---
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/common/encoded_key.cc
M src/kudu/common/encoded_key.h
M src/kudu/common/partial_row.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
A src/kudu/tablet/index_skipscan-test.cc
10 files changed, 1,018 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/13
--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 13
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] python: copy pandas dependency into requirements.txt

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11146 )

Change subject: python: copy pandas dependency into requirements.txt
..


Patch Set 2:

> I would like to discuss this before it’s committed. Pandas should not be a 
> requirement

Fair, but currently all of our el6-based tests are failing because we lost 
compatibility. Could we merge this and then discuss better alternatives 
afterwards?


--
To view, visit http://gerrit.cloudera.org:8080/11146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadb42535956461fbcb41adc67f5e4c73c310463d
Gerrit-Change-Number: 11146
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 08 Aug 2018 04:10:56 +
Gerrit-HasComments: No


[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-07 Thread Anupama Gupta (Code Review)
Hello Tidy Bot, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11067

to look at the new patch set (#26).

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..

KUDU-1291. Efficiently support predicates on non-prefix key components

This patch implements index skip scan approach in the case when there
exists an equality predicate on any of the non-first
primary key columns.

This feature uses the following approach:
1. Seek at the first unique prefix key in the ad-hoc index tree.
2. Seek to the relevant entry corresponding to the unique prefix key
   found in 1. and the predicate value on the suffix key column.
3. To handle the case where multiple entries exist with respect to
   a unique prefix key, store the upper bound index of these entries.
4. Repeat steps 1-3, till all the unique prefix keys are scanned.

Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d
---
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/common/encoded_key.cc
M src/kudu/common/encoded_key.h
M src/kudu/common/partial_row.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
A src/kudu/tablet/index_skipscan-test.cc
10 files changed, 1,018 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/26
--
To view, visit http://gerrit.cloudera.org:8080/11067
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d
Gerrit-Change-Number: 11067
Gerrit-PatchSet: 26
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-07 Thread Anupama Gupta (Code Review)
Hello Tidy Bot, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11067

to look at the new patch set (#25).

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..

KUDU-1291. Efficiently support predicates on non-prefix key components

This patch implements index skip scan approach in the case when there
exists an equality predicate on any of the non-first
primary key columns.

This feature uses the following approach:
1. Seek at the first unique prefix key in the ad-hoc index tree.
2. Seek to the relevant entry corresponding to the unique prefix key
   found in 1. and the predicate value on the suffix key column.
3. To handle the case where multiple entries exist with respect to
   a unique prefix key, store the upper bound index of these entries.
4. Repeat steps 1-3, till all the unique prefix keys are scanned.

Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d
---
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/common/encoded_key.cc
M src/kudu/common/encoded_key.h
M src/kudu/common/partial_row.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
A src/kudu/tablet/index_skipscan-test.cc
10 files changed, 1,016 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/25
--
To view, visit http://gerrit.cloudera.org:8080/11067
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d
Gerrit-Change-Number: 11067
Gerrit-PatchSet: 25
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2528: thirdparty downloads should be more robust

2018-08-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11152 )

Change subject: KUDU-2528: thirdparty downloads should be more robust
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11152/1/thirdparty/download-thirdparty.sh
File thirdparty/download-thirdparty.sh:

http://gerrit.cloudera.org:8080/#/c/11152/1/thirdparty/download-thirdparty.sh@76
PS1, Line 76: curl --retry 3 -L -O
Does it make sense to add --fail option as well?  IIRC in case if the server 
returns HTTP error code, curl would return 0 masking the problem if run without 
--fail.

Also, does it make sense to remove the target file (i.e. $FILENAME) in case of 
an error?  With --fail option, the curl would not create the target file in 
case of HTTP error, but if that were a low-level error then a partially 
downloaded file would be left.


http://gerrit.cloudera.org:8080/#/c/11152/1/thirdparty/download-thirdparty.sh@78
PS1, Line 78: continue
Does it make sense to pause between attempts here?  In case of transient errors 
curl would do exponential back-off on its own while retrying, but in case of 
other errors I think we need to take care of that ourselves.  I don't think we 
need some advanced back-off, but at least pause for 5 seconds or alike.



--
To view, visit http://gerrit.cloudera.org:8080/11152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic970afdb7ec23d3e04b2d7a9bf77d32774ca4d57
Gerrit-Change-Number: 11152
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 08 Aug 2018 00:17:14 +
Gerrit-HasComments: Yes


[kudu-CR] python: copy pandas dependency into requirements.txt

2018-08-07 Thread Jordan Birdsell (Code Review)
Jordan Birdsell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11146 )

Change subject: python: copy pandas dependency into requirements.txt
..


Patch Set 2: Code-Review-1

I would like to discuss this before it’s committed. Pandas should not be a 
requirement


--
To view, visit http://gerrit.cloudera.org:8080/11146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadb42535956461fbcb41adc67f5e4c73c310463d
Gerrit-Change-Number: 11146
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 08 Aug 2018 00:08:55 +
Gerrit-HasComments: No


[kudu-CR] [location awareness] Assign locations to registering tablet servers

2018-08-07 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/5 )

Change subject: [location_awareness] Assign locations to registering tablet 
servers
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5/3/src/kudu/master/ts_descriptor.cc
File src/kudu/master/ts_descriptor.cc:

http://gerrit.cloudera.org:8080/#/c/5/3/src/kudu/master/ts_descriptor.cc@224
PS3, Line 224: ResolveSockaddr
> Last time I checked, tablet servers can be run with multiple RPC addresses,
I think we can avoid using DNS if we are willing to accept using the host from 
the first listed rpc address for the tablet server (and we could document this 
is how locations are resolved).

Location assignment can only change as often as the TS reregisters, which is 
only as often as the TS restarts.



--
To view, visit http://gerrit.cloudera.org:8080/5
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5eb98823ab7b3b8141b8630196c29c1ebf8e6878
Gerrit-Change-Number: 5
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Fengling Wang 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Wed, 08 Aug 2018 00:02:04 +
Gerrit-HasComments: Yes


[kudu-CR] python: get new pandas dependency working on Python 2.6

2018-08-07 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11149 )

Change subject: python: get new pandas dependency working on Python 2.6
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18dc108ffbe18c74167f6c761d4194293eae76db
Gerrit-Change-Number: 11149
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 07 Aug 2018 23:58:50 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blog: Getting Started with Kudu

2018-08-07 Thread Brock Noland (Code Review)
Hello Jordan Birdsell, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11136

to look at the new patch set (#3).

Change subject: Blog: Getting Started with Kudu
..

Blog: Getting Started with Kudu

Change-Id: I9503dca5e6e565be902c4090bf2996bd1960d763
---
A _posts/2018-08-06-getting-started-with-kudu-an-oreilly-title.md
A img/2018-08-06-getting-started-with-kudu-an-oreilly-title.gif
2 files changed, 57 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/11136/3
--
To view, visit http://gerrit.cloudera.org:8080/11136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9503dca5e6e565be902c4090bf2996bd1960d763
Gerrit-Change-Number: 11136
Gerrit-PatchSet: 3
Gerrit-Owner: Brock Noland 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] python: copy pandas dependency into requirements.txt

2018-08-07 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11146 )

Change subject: python: copy pandas dependency into requirements.txt
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadb42535956461fbcb41adc67f5e4c73c310463d
Gerrit-Change-Number: 11146
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 07 Aug 2018 23:46:32 +
Gerrit-HasComments: No


[kudu-CR] Implement BloomFilter Predicate in server side.

2018-08-07 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11100 )

Change subject: Implement BloomFilter Predicate in server side.
..


Patch Set 2:

(6 comments)

Interesting feature! I just did a pass over the "big picture." The biggest 
thing I'm unsure of is why we're pushing bloom filters into range predicates. 
It seems like they are completely separate predicates, so why combine them with 
ranges?

Also it'd be helpful in reviewing if you could add some comments around some 
the new classes regarding how they are will be used (e.g. how merging works, 
etc.), as well as a high level overview of the design in the commit message.

http://gerrit.cloudera.org:8080/#/c/11100/2/src/kudu/common/column_predicate.h
File src/kudu/common/column_predicate.h:

http://gerrit.cloudera.org:8080/#/c/11100/2/src/kudu/common/column_predicate.h@97
PS2, Line 97:   static ColumnPredicate Range(ColumnSchema column, const void* 
lower, const void* upper,
:std::vector* bfs 
= nullptr
I'm not sure range queries should be able to support BloomFilters. Is 
InBloomFilter ANDed with a Range not sufficient? If you add InBloomFilter as 
rank 4 in SelectivityRank() in column_predicate.cc, this should prioritize 
evaluating bloom filters first, and if it yields no rows, further range 
predicates should be skipped.

Regardless, it'd be helpful to add a comment on how 'bfs' works.


http://gerrit.cloudera.org:8080/#/c/11100/2/src/kudu/common/common.proto
File src/kudu/common/common.proto:

http://gerrit.cloudera.org:8080/#/c/11100/2/src/kudu/common/common.proto@351
PS2, Line 351: message BloomFilter {
One of the things that I think might be confusing in the future is there is 
this BloomFilter (should probably be renamed to BloomFilterPB), the BloomFilter 
in utili/bloom_filter.h, and BloomFilterInner. Somewhere in this patch (maybe 
the commit message), it's worth explaining what each one is and how it is 
expected to be used. Ex.

BloomFilterPB: protobuf that gets serialized and sent via predicates in scans
BloomFilterInner: in-memory representation of BloomFilterPB
BloomFilter: underlying bloom filter implementation used by the server to check 
the presence of rows


http://gerrit.cloudera.org:8080/#/c/11100/2/src/kudu/common/common.proto@352
PS2, Line 352: optional int32 nhash = 1;
 : // We currently use CityHash backend
 : optional bytes bloom_data = 2 [(kudu.REDACT) = true];
It's a bit unclear what these fields are. Mind adding comments explaining?

Also, since this is a part of the public API, we'd like to make sure this is 
extendible in the future (e.g. allowing for different kinds of hashes, or 
different kinds of blooms). As such, maybe define an enum to specify the 
implementation details. See common.proto for an example.


http://gerrit.cloudera.org:8080/#/c/11100/2/src/kudu/common/wire_protocol.cc
File src/kudu/common/wire_protocol.cc:

http://gerrit.cloudera.org:8080/#/c/11100/2/src/kudu/common/wire_protocol.cc@412
PS2, Line 412: // Copies a predicate bloom filter data from 'bf_src' into 
'bf_dst'.
 : void CopyPredicateBloomDataToPB(const Slice& bf_src, string* 
bf_dst) {
 :   const void* src;
 :   size_t size;
 :   src = bf_src.data();
 :   size = bf_src.size();
 :   bf_dst->assign(reinterpret_cast(src), size);
 : }
This would be clearer if it took a BloomFilterInner instead of a Slice. Same 
with CopyPredicateBloomDataFromPB, but if it took BloomFilterPB as an 
out-parameter.


http://gerrit.cloudera.org:8080/#/c/11100/2/src/kudu/tablet/cfile_set-test.cc
File src/kudu/tablet/cfile_set-test.cc:

http://gerrit.cloudera.org:8080/#/c/11100/2/src/kudu/tablet/cfile_set-test.cc@104
PS2, Line 104: int nrows, BloomFilterBuilder* bf1_contain,
 :BloomFilterBuilder* bf1_exclude,
 :BloomFilterBuilder* bf2_contain,
 :BloomFilterBuilder* bf2_exclude)
Can you comment what these variables do and how they should be used? Same with 
GetBloomFilterResult().


http://gerrit.cloudera.org:8080/#/c/11100/2/src/kudu/tablet/cfile_set-test.cc@451
PS2, Line 451: TEST_F(TestCFileSet, TestBloomFilterPredicates) {
Would you mind adding a high-level summary of what this test is testing and how 
it accomplishes that? What are exclude vs contain, etc.?



--
To view, visit http://gerrit.cloudera.org:8080/11100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I62c2de42667d0255d94e19db773240f7f9ee636c
Gerrit-Change-Number: 11100
Gerrit-PatchSet: 2
Gerrit-Owner: ZhangYao 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-07 Thread Anupama Gupta (Code Review)
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10983

to look at the new patch set (#12).

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..

KUDU-1291. Efficiently support predicates on non-prefix key components

This patch implements index skip scan approach in the case when there
exists an equality predicate on any of the non-first
primary key columns.

This feature uses the following approach:
1. Seek at the first unique prefix key in the ad-hoc index tree.
2. Seek to the relevant entry corresponding to the unique prefix key
   found in 1. and the predicate value on the suffix key column.
3. To handle the case where multiple entries exist with respect to
   a unique prefix key, store the upper bound index of these entries.
4. Repeat steps 1-3, till all the unique prefix keys are scanned.

Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
---
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/common/encoded_key.cc
M src/kudu/common/encoded_key.h
M src/kudu/common/partial_row.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
A src/kudu/tablet/index_skipscan-test.cc
10 files changed, 1,017 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/12
--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 12
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-07 Thread Anupama Gupta (Code Review)
Hello Tidy Bot, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11067

to look at the new patch set (#24).

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..

KUDU-1291. Efficiently support predicates on non-prefix key components

This patch implements index skip scan approach in the case when there
exists an equality predicate on any of the non-first
primary key columns.

This feature uses the following approach:
1. Seek at the first unique prefix key in the ad-hoc index tree.
2. Seek to the relevant entry corresponding to the unique prefix key
   found in 1. and the predicate value on the suffix key column.
3. To handle the case where multiple entries exist with respect to
   a unique prefix key, store the upper bound index of these entries.
4. Repeat steps 1-3, till all the unique prefix keys are scanned.

Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d
---
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/common/encoded_key.cc
M src/kudu/common/encoded_key.h
M src/kudu/common/partial_row.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
A src/kudu/tablet/index_skipscan-test.cc
10 files changed, 1,017 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/24
--
To view, visit http://gerrit.cloudera.org:8080/11067
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d
Gerrit-Change-Number: 11067
Gerrit-PatchSet: 24
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2528: thirdparty downloads should be more robust

2018-08-07 Thread Adar Dembo (Code Review)
Hello Andrew Wong,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/11152

to review the following change.


Change subject: KUDU-2528: thirdparty downloads should be more robust
..

KUDU-2528: thirdparty downloads should be more robust

Commit 37f3a95d8 added --retry to our curl invocations, but it turns out
that this only retries failures that curl deems to be "transient". And if
curl exits with a non-zero status, the entire script grinds to a halt
because of set -e.

This patch expands the class of retriable errors by parking the curl
statement in a pipeline, which means it is excused from set -e. After that,
the existing fetch_and_expand retry loop takes over.

Change-Id: Ic970afdb7ec23d3e04b2d7a9bf77d32774ca4d57
---
M thirdparty/download-thirdparty.sh
1 file changed, 6 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/11152/1
--
To view, visit http://gerrit.cloudera.org:8080/11152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic970afdb7ec23d3e04b2d7a9bf77d32774ca4d57
Gerrit-Change-Number: 11152
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 


[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-07 Thread Anupama Gupta (Code Review)
Anupama Gupta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10983 )

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h
File src/kudu/cfile/cfile_reader.h:

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h@470
PS10, Line 470: Status StoreCurrentValue();
  :
> super-nit: in reading the name SetX(), based on other usages in the codebas
Done


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@353
PS10, Line 353:   // Whether the skip scan optimization has searched the 
current prefix for a predicate match
  :   // or whether the prefix has changed since its l
> How about change this to:
Done



--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 11
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Tue, 07 Aug 2018 23:25:16 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-07 Thread Anupama Gupta (Code Review)
Anupama Gupta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10983 )

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..


Patch Set 11:

(41 comments)

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h
File src/kudu/cfile/cfile_reader.h:

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h@59
PS10, Line 59: static const int kEncodedCompositeKeyMaxSize =
> Great question. This is from http://kudu.apache.org/docs/known_issues.html#
Refactored this name to: kEncodedCompositeKeyMaxSize as it makes more sense wrt 
our skip scan use case.


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h@339
PS10, Line 339: to store the value obtaine
> nit: note that the "current value" refers to the value after seeking.
Done


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/common/encoded_key.h
File src/kudu/common/encoded_key.h:

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/common/encoded_key.h@63
PS10, Line 63: ;
> nit: what does this do?
Created another function IncrementEncodedKeyColumns(..) for better separation 
of responsibility and documented this parameter.


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/common/partial_row.h
File src/kudu/common/partial_row.h:

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/common/partial_row.h@52
PS10, Line 52: mplate struct SliceTypeRowOps; // IWYU 
pragma: keep
 : template struct NumTypeRowOps;   // 
IWYU pragma: keep
> nit: not your fault, but mind removing the spaces here?
Done


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@207
PS10, Line 207: gscoped_ptr
> nit: if the change would be contained within cfile_set.cc, mind updating th
Noted. Sticking with gscoped_ptr for now since it's used outside cfile_set.cc 
(in encode_key.cc).


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@209
PS10, Line 209: refix key.
> nit: Since this isn't a variable name, maybe take out the underscore.
Alright. Changed here and elsewhere.


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@210
PS10, Line 210: um_prefix_
> nit: 'num_prefix_cols'
Done


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@210
PS10, Line 210:   // prefix key refers to the first "num_prefix_cols" columns 
of the current key.
  :   // current key is the key currently pointed to by 
validx_iter_ (prior to calling SeekToNextPrefixKey()).
  :   // 'cache_seeked_value' controls whether we will remember the 
next key seeked to in
  :   // this function.
> nit: can you reword this to clarify whether "current" refers to before or a
Done


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:

http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@76
PS9, Line 76: true
> Let's default this to false for the initial merge and do some testing to en
Done


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@80
PS9, Line 80:  "Max number of skip attempts the skip scan 
optimization should make before "
> Add:
Done


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@405
PS9, Line 405: e
> nit: extra space
Done


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@424
PS9, Line 424:
> nit: add spaces around your operators on this line, both the assignment = a
Done


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@430
PS9, Line 430: key_cols;
> nit: move this comment inside the 'if' to right below this line
Done


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@436
PS9, Line 436: nst auto& co
> nit: unnecessary parentheses
Done


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@437
PS9, Line 437:
> nit: put this comment on its own line
Done


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@444
PS9, Line 444: p);
> nit: unnecessary parentheses
Done


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@522
PS9, Line 522: << KUD
> nit: remove so much extra horizontal white space on this line
Done


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@531
PS9, Line 531: "
> s/or/which we consider to be/
Done


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@532
PS9, Line 532:
> currently-seeked key in the primary key index
For better clarity, changed this to -  "cached value obtained after the 
previous seek in the primary key index".


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@576
PS9, Line 576: }
> nit: just 

[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-07 Thread Anupama Gupta (Code Review)
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10983

to look at the new patch set (#11).

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..

KUDU-1291. Efficiently support predicates on non-prefix key components

This patch implements index skip scan approach in the case when there
exists an equality predicate on any of the non-first
primary key columns.

This feature uses the following approach:
1. Seek at the first unique prefix key in the ad-hoc index tree.
2. Seek to the relevant entry corresponding to the unique prefix key
   found in 1. and the predicate value on the suffix key column.
3. To handle the case where multiple entries exist with respect to
   a unique prefix key, store the upper bound index of these entries.
4. Repeat steps 1-3, till all the unique prefix keys are scanned.

Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
---
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/common/encoded_key.cc
M src/kudu/common/encoded_key.h
M src/kudu/common/partial_row.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
A src/kudu/tablet/index_skipscan-test.cc
10 files changed, 1,016 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/11
--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 11
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] python: get new pandas dependency working on Python 2.6

2018-08-07 Thread Adar Dembo (Code Review)
Hello Andrew Wong, Jordan Birdsell,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11149

to look at the new patch set (#2).

Change subject: python: get new pandas dependency working on Python 2.6
..

python: get new pandas dependency working on Python 2.6

pandas 0.18 dropped support for Python 2.6, so we need to downgrade our
dependency. Moreover, if we don't explicitly install a 2.6-compatible
version of numpy, we'll get the newest one courtesy of the pandas
installation.

Change-Id: I18dc108ffbe18c74167f6c761d4194293eae76db
---
M build-support/jenkins/build-and-test.sh
M python/requirements.txt
M python/setup.py
3 files changed, 28 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/11149/2
--
To view, visit http://gerrit.cloudera.org:8080/11149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I18dc108ffbe18c74167f6c761d4194293eae76db
Gerrit-Change-Number: 11149
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 


[kudu-CR] python: fix non-DECIMAL unit test breakage

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11148 )

Change subject: python: fix non-DECIMAL unit test breakage
..

python: fix non-DECIMAL unit test breakage

For reference, the table schema is:

  builder.add_column('key').type(kudu.int64).nullable(False)
  builder.add_column('unixtime_micros_val', type_=kudu.unixtime_micros, 
nullable=False)
  if kudu.CLIENT_SUPPORTS_DECIMAL:
builder.add_column('decimal_val', type_=kudu.decimal, precision=5, scale=2)
  builder.add_column('string_val', type_=kudu.string, 
compression=kudu.COMPRESSION_LZ4, encoding='prefix')
  builder.add_column('bool_val', type_=kudu.bool)
  builder.add_column('double_val', type_=kudu.double)
  builder.add_column('int8_val', type_=kudu.int8)
  builder.add_column('binary_val', type_='binary', 
compression=kudu.COMPRESSION_SNAPPY, encoding='prefix')
  builder.add_column('float_val', type_=kudu.float)
  builder.set_primary_keys(['key', 'unixtime_micros_val'])

Change-Id: Ife6cc90ebf229f876854da6cb3a8857c192a296f
Reviewed-on: http://gerrit.cloudera.org:8080/11148
Reviewed-by: Andrew Wong 
Reviewed-by: Grant Henke 
Tested-by: Kudu Jenkins
---
M python/kudu/tests/test_scanner.py
1 file changed, 6 insertions(+), 6 deletions(-)

Approvals:
  Andrew Wong: Looks good to me, approved
  Grant Henke: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ife6cc90ebf229f876854da6cb3a8857c192a296f
Gerrit-Change-Number: 11148
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] python: copy pandas dependency into requirements.txt

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11146 )

Change subject: python: copy pandas dependency into requirements.txt
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11146/1/python/setup.py
File python/setup.py:

http://gerrit.cloudera.org:8080/#/c/11146/1/python/setup.py@193
PS1, Line 193: Note: dependencies in tests_require should also be listed in
 : # requirements.txt so that dependencies aren't downloaded at 
test-time
 : # (when it's more difficult to override various pip 
installation options).
> Does this also apply to setuptools_scm and unittest2?
I don't know that the opposite of this statement (i.e. that things in 
requirements.txt should be in tests_require or install_requires) is necessarily 
true. I've left it alone since I don't fully understand it.


http://gerrit.cloudera.org:8080/#/c/11146/1/python/setup.py@202
PS1, Line 202: cython >= 0.21'
> Not related to this CR, but is it important that this doesn't match up with
Not really, since we guarantee installing Cythin via requirements.txt before 
this. I guess it could have an impact on deployed systems (i.e. if you run 'pip 
install kudu-python', pip will satisfy the contents of install_requires first), 
but I don't know if it really matters or not.



--
To view, visit http://gerrit.cloudera.org:8080/11146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadb42535956461fbcb41adc67f5e4c73c310463d
Gerrit-Change-Number: 11146
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 07 Aug 2018 22:24:55 +
Gerrit-HasComments: Yes


[kudu-CR] python: get new pandas dependency working on Python 2.6

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11149 )

Change subject: python: get new pandas dependency working on Python 2.6
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11149/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11149/1//COMMIT_MSG@11
PS1, Line 11: courtesy of the pandas
: installation
> To clarify, pip would install pandas 0.18 but numpy ++1.12, leading to the
That's correct. The problem is that pandas expresses a 'numpy >=1.7.0' 
dependency, which pip can (and does) satisfy using the latest numpy.

pip does support environment markers [1] to isolate particular dependencies to 
things like Python versions, but not all versions of pip support that. You can 
read more about it in commit 3d734055c.

1. https://www.python.org/dev/peps/pep-0508/#environment-markers


http://gerrit.cloudera.org:8080/#/c/11149/1/python/requirements.txt
File python/requirements.txt:

http://gerrit.cloudera.org:8080/#/c/11149/1/python/requirements.txt@11
PS1, Line 11: 0.19
> 0.18, right?
Whoops, yes.


http://gerrit.cloudera.org:8080/#/c/11149/1/python/setup.py
File python/setup.py:

http://gerrit.cloudera.org:8080/#/c/11149/1/python/setup.py@197
PS1, Line 197: 0.19
> 0.18 here too
Done



--
To view, visit http://gerrit.cloudera.org:8080/11149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18dc108ffbe18c74167f6c761d4194293eae76db
Gerrit-Change-Number: 11149
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Comment-Date: Tue, 07 Aug 2018 22:22:18 +
Gerrit-HasComments: Yes


[kudu-CR] python: get new pandas dependency working on Python 2.6

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/11149 )

Change subject: python: get new pandas dependency working on Python 2.6
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/11149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I18dc108ffbe18c74167f6c761d4194293eae76db
Gerrit-Change-Number: 11149
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 


[kudu-CR] python: get new pandas dependency working on Python 2.6

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11149 )

Change subject: python: get new pandas dependency working on Python 2.6
..


Patch Set 1: Verified+1

Overriding Jenkins, unrelated test failure.


--
To view, visit http://gerrit.cloudera.org:8080/11149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18dc108ffbe18c74167f6c761d4194293eae76db
Gerrit-Change-Number: 11149
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 07 Aug 2018 22:07:31 +
Gerrit-HasComments: No


[kudu-CR] python: get new pandas dependency working on Python 2.6

2018-08-07 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11149 )

Change subject: python: get new pandas dependency working on Python 2.6
..


Patch Set 1: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11149/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11149/1//COMMIT_MSG@11
PS1, Line 11: courtesy of the pandas
: installation
To clarify, pip would install pandas 0.18 but numpy ++1.12, leading to the 
incompatibility? I'm surprised pip doesn't take into account the python version?


http://gerrit.cloudera.org:8080/#/c/11149/1/python/requirements.txt
File python/requirements.txt:

http://gerrit.cloudera.org:8080/#/c/11149/1/python/requirements.txt@11
PS1, Line 11: 0.19
0.18, right?


http://gerrit.cloudera.org:8080/#/c/11149/1/python/setup.py
File python/setup.py:

http://gerrit.cloudera.org:8080/#/c/11149/1/python/setup.py@197
PS1, Line 197: 0.19
0.18 here too



--
To view, visit http://gerrit.cloudera.org:8080/11149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I18dc108ffbe18c74167f6c761d4194293eae76db
Gerrit-Change-Number: 11149
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 07 Aug 2018 21:40:19 +
Gerrit-HasComments: Yes


[kudu-CR] python: fix non-DECIMAL unit test breakage

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11148 )

Change subject: python: fix non-DECIMAL unit test breakage
..


Patch Set 1:

> I wonder if it'd be worth running tests with both KUDU_SUPPORTS_DECIMAL=true 
> and KUDU_SUPPORTS_DECIMAL=false, even if in a __int128-compatible 
> environment, sort of like what we do for hole-punching and FBM/LBM testing. 
> Though it seems there are only a few places where we are checking for this 
> support.

Yeah, parameterizing the test would make sense, but like you said, I'm hesitant 
to do that right now given the limited payoff.

I think it makes sense to do it once we have a stronger need for test 
parameterization.


--
To view, visit http://gerrit.cloudera.org:8080/11148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife6cc90ebf229f876854da6cb3a8857c192a296f
Gerrit-Change-Number: 11148
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 07 Aug 2018 21:31:36 +
Gerrit-HasComments: No


[kudu-CR] python: fix non-DECIMAL unit test breakage

2018-08-07 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11148 )

Change subject: python: fix non-DECIMAL unit test breakage
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife6cc90ebf229f876854da6cb3a8857c192a296f
Gerrit-Change-Number: 11148
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 07 Aug 2018 21:29:30 +
Gerrit-HasComments: No


[kudu-CR] python: fix non-DECIMAL unit test breakage

2018-08-07 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11148 )

Change subject: python: fix non-DECIMAL unit test breakage
..


Patch Set 1: Code-Review+2

I wonder if it'd be worth running tests with both KUDU_SUPPORTS_DECIMAL=true 
and KUDU_SUPPORTS_DECIMAL=false, even if in a __int128-compatible environment, 
sort of like what we do for hole-punching and FBM/LBM testing. Though it seems 
there are only a few places where we are checking for this support.


--
To view, visit http://gerrit.cloudera.org:8080/11148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife6cc90ebf229f876854da6cb3a8857c192a296f
Gerrit-Change-Number: 11148
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 07 Aug 2018 21:25:57 +
Gerrit-HasComments: No


[kudu-CR] Add const attribute to RpcContext::GetInboundSidecar()

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11150 )

Change subject: Add const attribute to RpcContext::GetInboundSidecar()
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11150
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1d959c40aa119389b09c9f6ac2afe7445066a38
Gerrit-Change-Number: 11150
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 07 Aug 2018 21:23:41 +
Gerrit-HasComments: No


[kudu-CR] Add const attribute to RpcContext::GetInboundSidecar()

2018-08-07 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11150


Change subject: Add const attribute to RpcContext::GetInboundSidecar()
..

Add const attribute to RpcContext::GetInboundSidecar()

Testing done: rpc-test

Change-Id: Ie1d959c40aa119389b09c9f6ac2afe7445066a38
---
M src/kudu/rpc/rpc_context.cc
M src/kudu/rpc/rpc_context.h
2 files changed, 2 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/50/11150/1
--
To view, visit http://gerrit.cloudera.org:8080/11150
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie1d959c40aa119389b09c9f6ac2afe7445066a38
Gerrit-Change-Number: 11150
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[kudu-CR] python: get new pandas dependency working on Python 2.6

2018-08-07 Thread Adar Dembo (Code Review)
Hello Andrew Wong, Jordan Birdsell,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/11149

to review the following change.


Change subject: python: get new pandas dependency working on Python 2.6
..

python: get new pandas dependency working on Python 2.6

pandas 0.18 dropped support for Python 2.6, so we need to downgrade our
dependency. Moreover, if we don't explicitly install a 2.6-compatible
version of numpy, we'll get the newest one courtesy of the pandas
installation.

Change-Id: I18dc108ffbe18c74167f6c761d4194293eae76db
---
M build-support/jenkins/build-and-test.sh
M python/requirements.txt
M python/setup.py
3 files changed, 28 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/11149/1
--
To view, visit http://gerrit.cloudera.org:8080/11149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I18dc108ffbe18c74167f6c761d4194293eae76db
Gerrit-Change-Number: 11149
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 


[kudu-CR] python: fix non-DECIMAL unit test breakage

2018-08-07 Thread Adar Dembo (Code Review)
Hello Andrew Wong, Jordan Birdsell,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/11148

to review the following change.


Change subject: python: fix non-DECIMAL unit test breakage
..

python: fix non-DECIMAL unit test breakage

For reference, the table schema is:

  builder.add_column('key').type(kudu.int64).nullable(False)
  builder.add_column('unixtime_micros_val', type_=kudu.unixtime_micros, 
nullable=False)
  if kudu.CLIENT_SUPPORTS_DECIMAL:
builder.add_column('decimal_val', type_=kudu.decimal, precision=5, scale=2)
  builder.add_column('string_val', type_=kudu.string, 
compression=kudu.COMPRESSION_LZ4, encoding='prefix')
  builder.add_column('bool_val', type_=kudu.bool)
  builder.add_column('double_val', type_=kudu.double)
  builder.add_column('int8_val', type_=kudu.int8)
  builder.add_column('binary_val', type_='binary', 
compression=kudu.COMPRESSION_SNAPPY, encoding='prefix')
  builder.add_column('float_val', type_=kudu.float)
  builder.set_primary_keys(['key', 'unixtime_micros_val'])

Change-Id: Ife6cc90ebf229f876854da6cb3a8857c192a296f
---
M python/kudu/tests/test_scanner.py
1 file changed, 6 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/11148/1
--
To view, visit http://gerrit.cloudera.org:8080/11148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ife6cc90ebf229f876854da6cb3a8857c192a296f
Gerrit-Change-Number: 11148
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 


[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-07 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10983 )

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..


Patch Set 10:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h
File src/kudu/cfile/cfile_reader.h:

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h@59
PS10, Line 59: static const int kPrimaryKeyMaxSize = 16*1024;
> nit: is this a hard limit or just a target? How is this enforced?
Great question. This is from 
http://kudu.apache.org/docs/known_issues.html#_primary_keys and we haven't 
investigated how it is enforced yet or whether there is another constant 
available for this, although I grepped around and didn't find one.


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h@473
PS10, Line 473:   // Currently, it is assumed that 
CFileSet::Iterator::SkipToNextScan(size_t *remaining)
  :   // has exclusive access to use this variable.
This breaks encapsulation. We should document something along these lines in 
CFileSet::Iterator itself with respect to the validx_iter_ variable; this class 
should not reference a separate "client" class that uses it, which would be 
especially surprising considering this is a private variable.


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@353
PS10, Line 353:   // To track whether the lower bound prefix key rolled over 
between the first and
  :   // second seek to determine the lower bound key.
How about change this to:

  // Whether the skip scan optimization has searched the current prefix for a 
predicate match or whether the prefix has changed since its last check.



--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 10
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Tue, 07 Aug 2018 19:43:24 +
Gerrit-HasComments: Yes


[kudu-CR] WIP: Create parallelized loader Spark job

2018-08-07 Thread Mike Percy (Code Review)
Mike Percy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11147


Change subject: WIP: Create parallelized loader Spark job
..

WIP: Create parallelized loader Spark job

This patch adds a new tool called LoadRandomData that will load random
data into a table in a parallelized manner. Perhaps the tool should be
named something else so it would be straightforward to add a sequential
inserter that supports a range-partitioned table.

Marked WIP because it could use more testing.

Change-Id: I4434c9f5d709154037386b4c7be94045df162267
---
A 
java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/RandomDataGenerator.java
M java/kudu-spark-tools/build.gradle
M java/kudu-spark-tools/pom.xml
A 
java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/LoadRandomData.scala
A 
java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/LoadRandomDataOptions.scala
A 
java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/LoadRandomDataTest.scala
6 files changed, 294 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/11147/1
--
To view, visit http://gerrit.cloudera.org:8080/11147
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4434c9f5d709154037386b4c7be94045df162267
Gerrit-Change-Number: 11147
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy 


[kudu-CR] tablet bootstrap: adjust mvcc safetime with no-ops

2018-08-07 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11142

to look at the new patch set (#2).

Change subject: tablet_bootstrap: adjust mvcc safetime with no-ops
..

tablet_bootstrap: adjust mvcc safetime with no-ops

Previously, during tablet bootstrap, a tablet would only update its MVCC
safetime based on write messages, as the timstamps in the write messages
are guaranteed to be serialized with respect to one another, by virtue
of being assigned in a single thread (the prepare thread) on the leader
replica.

>From this, we conclude that timestamps for write operations are
monotonically increasing in unison with opid. The same cannot
necessarily be said for timestamps of no-ops and change configs.

This is a conservative conclusion about assigned timestamps, and this
patch hinges on the fact that our Raft implementation ensures the
following sequence of events:

1. replica A becomes leader of Term N
2. leader A assigns a timestamp t1 to its no-op
3. leader A replicates the no-op to replicas B and C, asserting its
   leadership for Term N
4. leader A prepares a write and assigns it a timestamp t2. A assigns a
   higher timestamp than t1, as this step happens after Step 2
5. leader A replicates the write to replicas B and C, checking that it
   is leader for the current term

Given the above series of operations, within the same term, the no-op
used to assert leadership is always assigned a timestamp that must be
lower than any writes in that term. As such, the timestamps assigned to
no-ops can and should be used to bump safetime.

This patch updates tablet bootstrap to adjust MVCC safetime based on
no-ops seen in the WALs. A test is added asserting that this is true of
no-ops with respect to writes. I.e. all replicate messages must have
monotonically increasing OpIds and monotonically increasing timestamps.

A case in tablet_bootstrap-test depends on the ability for no-ops to
written out of timestamp order. To maintain this, and to keep this
functionality around (which may be useful for general timestamp
assignment testing), a flag has been added to the NoOpRequestPB
indicating whether or not its timestamp should be trusted to advance
safetime.

Additionally, I tweaked the artificial timestamps used in
raft_consensus-itest. These timestamps were previously very low and
would overlap with the real timestamp used by the leadership no-op.

Change-Id: I26deff32da8c990cb8a2ba220bb81858ddd6d73f
---
M src/kudu/consensus/consensus.proto
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/log_verifier.cc
M src/kudu/integration-tests/log_verifier.h
M src/kudu/integration-tests/raft_consensus-itest.cc
A src/kudu/integration-tests/timestamp_serialization-itest.cc
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_bootstrap.cc
8 files changed, 241 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/11142/2
--
To view, visit http://gerrit.cloudera.org:8080/11142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I26deff32da8c990cb8a2ba220bb81858ddd6d73f
Gerrit-Change-Number: 11142
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] [docs] Add changing master hostnames workflow

2018-08-07 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11058 )

Change subject: [docs] Add changing master hostnames workflow
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11058/2/docs/administration.adoc
File docs/administration.adoc:

http://gerrit.cloudera.org:8080/#/c/11058/2/docs/administration.adoc@718
PS2, Line 718: SELECT
nit: surround in backticks



--
To view, visit http://gerrit.cloudera.org:8080/11058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0df87d5e294d8b7bf5c7b8f94a63599ffd7ebe03
Gerrit-Change-Number: 11058
Gerrit-PatchSet: 2
Gerrit-Owner: Attila Bukor 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 07 Aug 2018 19:03:21 +
Gerrit-HasComments: Yes


[kudu-CR] java: remove flaky TestServerInfo.testConstructorNotSlow

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11125 )

Change subject: java: remove flaky TestServerInfo.testConstructorNotSlow
..

java: remove flaky TestServerInfo.testConstructorNotSlow

This test was added in commit 0b1adf1f6. At the time there was some concern
that the 500ms timeout could cause flakiness, but the reasoning was that the
InetAddress.getByName calls weren't going to do DNS lookups. However, I saw
the test timeout in precommit with the following stack trace:

  org.junit.runners.model.TestTimedOutException: test timed out after 500 
milliseconds
at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242)
at java.io.File.exists(File.java:819)
at sun.misc.URLClassPath$FileLoader.getResource(URLClassPath.java:1245)
at sun.misc.URLClassPath$FileLoader.findResource(URLClassPath.java:1212)
at sun.misc.URLClassPath.findResource(URLClassPath.java:188)
at java.net.URLClassLoader$2.run(URLClassLoader.java:569)
at java.net.URLClassLoader$2.run(URLClassLoader.java:567)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findResource(URLClassLoader.java:566)
at java.lang.ClassLoader.getResource(ClassLoader.java:1096)
at org.apache.log4j.helpers.Loader.getResource(Loader.java:110)
at org.apache.log4j.LogManager.(LogManager.java:107)
at org.slf4j.impl.Log4jLoggerFactory.(Log4jLoggerFactory.java:66)
at org.slf4j.impl.StaticLoggerBinder.(StaticLoggerBinder.java:72)
at 
org.slf4j.impl.StaticLoggerBinder.(StaticLoggerBinder.java:45)
at org.slf4j.LoggerFactory.bind(LoggerFactory.java:150)
at org.slf4j.LoggerFactory.performInitialization(LoggerFactory.java:124)
at org.slf4j.LoggerFactory.getILoggerFactory(LoggerFactory.java:412)
at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:357)
at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:383)
at org.apache.kudu.util.NetUtil.(NetUtil.java:42)
at org.apache.kudu.client.ServerInfo.(ServerInfo.java:58)
at 
org.apache.kudu.client.TestServerInfo.testConstructorNotSlow(TestServerInfo.java:35)

Per the trace, it can take over 500ms just to initialize the test and its
dependents. As such, I think it's flaky by design and should be removed.

Change-Id: Id7bbe47bc88e88d67ebdb199cf34723a4b69e57c
Reviewed-on: http://gerrit.cloudera.org:8080/11125
Reviewed-by: Andrew Wong 
Tested-by: Adar Dembo 
---
M java/kudu-client/src/test/java/org/apache/kudu/client/TestServerInfo.java
1 file changed, 0 insertions(+), 14 deletions(-)

Approvals:
  Andrew Wong: Looks good to me, approved
  Adar Dembo: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id7bbe47bc88e88d67ebdb199cf34723a4b69e57c
Gerrit-Change-Number: 11125
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Pavel Martynov 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] java: remove flaky TestServerInfo.testConstructorNotSlow

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11125 )

Change subject: java: remove flaky TestServerInfo.testConstructorNotSlow
..


Patch Set 2: Verified+1

Overriding Jenkins, unrelated test failure.


--
To view, visit http://gerrit.cloudera.org:8080/11125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7bbe47bc88e88d67ebdb199cf34723a4b69e57c
Gerrit-Change-Number: 11125
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Pavel Martynov 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 07 Aug 2018 18:56:07 +
Gerrit-HasComments: No


[kudu-CR] java: remove flaky TestServerInfo.testConstructorNotSlow

2018-08-07 Thread Adar Dembo (Code Review)
Adar Dembo has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/11125 )

Change subject: java: remove flaky TestServerInfo.testConstructorNotSlow
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/11125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: Id7bbe47bc88e88d67ebdb199cf34723a4b69e57c
Gerrit-Change-Number: 11125
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Pavel Martynov 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] java: remove flaky TestServerInfo.testConstructorNotSlow

2018-08-07 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11125 )

Change subject: java: remove flaky TestServerInfo.testConstructorNotSlow
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7bbe47bc88e88d67ebdb199cf34723a4b69e57c
Gerrit-Change-Number: 11125
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Pavel Martynov 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 07 Aug 2018 18:50:02 +
Gerrit-HasComments: No


[kudu-CR] tablet bootstrap: adjust mvcc safetime with no-ops

2018-08-07 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11142 )

Change subject: tablet_bootstrap: adjust mvcc safetime with no-ops
..


Patch Set 1:

Ah, I omitted the part where we add the timestamp_in_opid_order to the 
NoOpRequestPB, hence the broken tests. Let me update this a bit.


--
To view, visit http://gerrit.cloudera.org:8080/11142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26deff32da8c990cb8a2ba220bb81858ddd6d73f
Gerrit-Change-Number: 11142
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 07 Aug 2018 18:46:53 +
Gerrit-HasComments: No


[kudu-CR] python: copy pandas dependency into requirements.txt

2018-08-07 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11146 )

Change subject: python: copy pandas dependency into requirements.txt
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11146/1/python/setup.py
File python/setup.py:

http://gerrit.cloudera.org:8080/#/c/11146/1/python/setup.py@193
PS1, Line 193: Note: dependencies in tests_require should also be listed in
 : # requirements.txt so that dependencies aren't downloaded at 
test-time
 : # (when it's more difficult to override various pip 
installation options).
Does this also apply to setuptools_scm and unittest2?


http://gerrit.cloudera.org:8080/#/c/11146/1/python/setup.py@202
PS1, Line 202: cython >= 0.21'
Not related to this CR, but is it important that this doesn't match up with 
requirements.txt (cython ==0.26.1)?



--
To view, visit http://gerrit.cloudera.org:8080/11146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadb42535956461fbcb41adc67f5e4c73c310463d
Gerrit-Change-Number: 11146
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 07 Aug 2018 18:17:51 +
Gerrit-HasComments: Yes


[kudu-CR] python: copy pandas dependency into requirements.txt

2018-08-07 Thread Jordan Birdsell (Code Review)
Jordan Birdsell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11146 )

Change subject: python: copy pandas dependency into requirements.txt
..


Patch Set 1:

This was done intentionally. Pandas should be optional, not required.


--
To view, visit http://gerrit.cloudera.org:8080/11146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadb42535956461fbcb41adc67f5e4c73c310463d
Gerrit-Change-Number: 11146
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 07 Aug 2018 18:17:20 +
Gerrit-HasComments: No


[kudu-CR] python: copy pandas dependency into requirements.txt

2018-08-07 Thread Adar Dembo (Code Review)
Hello Jordan Birdsell, Andrew Wong,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/11146

to review the following change.


Change subject: python: copy pandas dependency into requirements.txt
..

python: copy pandas dependency into requirements.txt

This preserves the invariant introduced by commit c93f08393 that if you run
'pip install -r requirements.txt', there won't be any additional downloads
when 'python setup.py test' is run later. The invariant was briefly broken
by commit 997ad765f.

Change-Id: Iadb42535956461fbcb41adc67f5e4c73c310463d
---
M python/requirements.txt
M python/setup.py
2 files changed, 6 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/46/11146/1
--
To view, visit http://gerrit.cloudera.org:8080/11146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iadb42535956461fbcb41adc67f5e4c73c310463d
Gerrit-Change-Number: 11146
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Jordan Birdsell 


[kudu-CR] wip consensus: add RPC to replicate a NO OP

2018-08-07 Thread Andrew Wong (Code Review)
Andrew Wong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11143


Change subject: wip consensus: add RPC to replicate a NO_OP
..

wip consensus: add RPC to replicate a NO_OP

This patch adds an RPC to replicate a NO_OP message. Unlike NO_OPs that
get sent following leader elections, these NO_OPs do not have any
guarantees regarding the monotonicity of their timestamps with respect
to other operations.

In addition to being a nice tool to check that consensus is running on a
given server for a given tablet, this endpoint is useful in stressing
the assignment of timestamps for non-txn operations. Pinging consensus
on a leader will replicate

It's important to be able to determine whether or not a NO_OP message
was created using this endpoint directly, vs through the normal flow of
Raft's election algorithm, as timestamps assigned in the former are
serialized with respect to writes in the term the NO_OP was sent. As
such, a field has been added to the NoOpRequestPB indicating whether the
NO_OP was sent via this RPC or not.

WIP: would like to add more targetted testing

Change-Id: If98292503dc81cdd1290f1d96d8b4229ae56224f
---
M src/kudu/consensus/consensus.proto
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
5 files changed, 101 insertions(+), 21 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/11143/1
--
To view, visit http://gerrit.cloudera.org:8080/11143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If98292503dc81cdd1290f1d96d8b4229ae56224f
Gerrit-Change-Number: 11143
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 


[kudu-CR] wip consensus: atomicize timestamp/opid assignment

2018-08-07 Thread Andrew Wong (Code Review)
Andrew Wong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11144


Change subject: wip consensus: atomicize timestamp/opid assignment
..

wip consensus: atomicize timestamp/opid assignment

This patch atomicizes timestamp assignment with respect to OpIds by
moving the assignment of timestamps into the RaftConsensus module, right
before operations are appended to the consensus queue to be sent to the
peers.

This patch adds test cases for timestamp serialization:
1. a test that writes data while triggering many change configs
2. a test that writes data while barraging the PingConsensus endpoint to
   replicate a no-op, circumventing any kind of serialization

WIP: dependent on the previous patch (PingConsensus RPC)

Change-Id: Id347847d117509b0eadedcb4117b4e781f6e3654
---
M src/kudu/consensus/consensus_queue.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/integration-tests/timestamp_serialization-itest.cc
M src/kudu/tablet/transactions/transaction_driver.cc
4 files changed, 168 insertions(+), 35 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/44/11144/1
--
To view, visit http://gerrit.cloudera.org:8080/11144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id347847d117509b0eadedcb4117b4e781f6e3654
Gerrit-Change-Number: 11144
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 


[kudu-CR] tablet bootstrap: adjust mvcc safetime with no-ops

2018-08-07 Thread Andrew Wong (Code Review)
Andrew Wong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11142


Change subject: tablet_bootstrap: adjust mvcc safetime with no-ops
..

tablet_bootstrap: adjust mvcc safetime with no-ops

Previously, during tablet bootstrap, a tablet would only update its MVCC
safetime based on write messages, as the timstamps in the write messages
are guaranteed to be serialized with respect to one another, by virtue
of being assigned in a single thread (the prepare thread) on the leader
replica.

>From this, we conclude that timestamps for write operations are
monotonically increasing in unison with opid. The same cannot
necessarily be said for timestamps of no-ops and change configs.

This is a conservative conclusion about assigned timestamps, and this
patch hinges on the fact that our Raft implementation ensures the
following sequence of events:

1. replica A becomes leader of Term N
2. leader A assigns a timestamp t1 to its no-op
3. leader A replicates the no-op to replicas B and C, asserting its
   leadership for Term N
4. leader A prepares a write and assigns it a timestamp t2. A assigns a
   higher timestamp than t1, as this step happens after Step 2
5. leader A replicates the write to replicas B and C, checking that it
   is leader for the current term

Given the above series of operations, within the same term, the no-op
used to assert leadership is always assigned a timestamp that must be
lower than any writes in that term. As such, the timestamps assigned to
no-ops can and should be used to bump safetime.

This patch updates tablet bootstrap to adjust MVCC safetime based on
no-ops seen in the WALs. A test is added asserting that this is true of
no-ops with respect to writes. I.e. all replicate messages must have
monotonically increasing OpIds and monotonically increasing timestamps.

Change-Id: I26deff32da8c990cb8a2ba220bb81858ddd6d73f
---
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/log_verifier.cc
M src/kudu/integration-tests/log_verifier.h
A src/kudu/integration-tests/timestamp_serialization-itest.cc
M src/kudu/tablet/tablet_bootstrap.cc
5 files changed, 216 insertions(+), 13 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/11142/1
--
To view, visit http://gerrit.cloudera.org:8080/11142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I26deff32da8c990cb8a2ba220bb81858ddd6d73f
Gerrit-Change-Number: 11142
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 


[kudu-CR] [Java] Update readme with Gradle commands

2018-08-07 Thread Grant Henke (Code Review)
Grant Henke has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11135 )

Change subject: [Java] Update readme with Gradle commands
..

[Java] Update readme with Gradle commands

Updates the Java readme to use primarily Gradle
commands. Additionally changes the format
from Markdown to AsciiDoc to match our
other readme files.

Includes a couple small fixes to simplifiy Intellij
interaction as described in the readme.
- Adds generated sources for protobuf and avro files
- Uses the sources distribution in the wrapper

Change-Id: I349ac7259beb1f6b80b80265803a06ec532019d7
Reviewed-on: http://gerrit.cloudera.org:8080/11135
Tested-by: Kudu Jenkins
Reviewed-by: Grant Henke 
---
A java/README.adoc
D java/README.md
M java/gradle/protobuf.gradle
M java/gradle/wrapper.gradle
M java/gradle/wrapper/gradle-wrapper.properties
5 files changed, 190 insertions(+), 175 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Grant Henke: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/11135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I349ac7259beb1f6b80b80265803a06ec532019d7
Gerrit-Change-Number: 11135
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] [Java] Update readme with Gradle commands

2018-08-07 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11135 )

Change subject: [Java] Update readme with Gradle commands
..


Patch Set 3: Code-Review+2

(1 comment)

Carrying the +2 through the nit fix.

http://gerrit.cloudera.org:8080/#/c/11135/2/java/README.adoc
File java/README.adoc:

http://gerrit.cloudera.org:8080/#/c/11135/2/java/README.adoc@136
PS2, Line 136: a future releas
> Nit: in a future release (we'd probably remove it all at once, right?)
Done



--
To view, visit http://gerrit.cloudera.org:8080/11135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I349ac7259beb1f6b80b80265803a06ec532019d7
Gerrit-Change-Number: 11135
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Tue, 07 Aug 2018 14:39:41 +
Gerrit-HasComments: Yes


[kudu-CR] [Java] Update readme with Gradle commands

2018-08-07 Thread Grant Henke (Code Review)
Hello Mike Percy, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11135

to look at the new patch set (#3).

Change subject: [Java] Update readme with Gradle commands
..

[Java] Update readme with Gradle commands

Updates the Java readme to use primarily Gradle
commands. Additionally changes the format
from Markdown to AsciiDoc to match our
other readme files.

Includes a couple small fixes to simplifiy Intellij
interaction as described in the readme.
- Adds generated sources for protobuf and avro files
- Uses the sources distribution in the wrapper

Change-Id: I349ac7259beb1f6b80b80265803a06ec532019d7
---
A java/README.adoc
D java/README.md
M java/gradle/protobuf.gradle
M java/gradle/wrapper.gradle
M java/gradle/wrapper/gradle-wrapper.properties
5 files changed, 190 insertions(+), 175 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/11135/3
--
To view, visit http://gerrit.cloudera.org:8080/11135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I349ac7259beb1f6b80b80265803a06ec532019d7
Gerrit-Change-Number: 11135
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy