[kudu-CR] KUDU-2525: KuduTableInputFormat may halt before exhausting scan

2018-08-03 Thread zhangqianqiong (Code Review)
zhangqianqiong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/1 )

Change subject: KUDU-2525: KuduTableInputFormat may halt before exhausting scan
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/1/1/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java
File 
java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java:

http://gerrit.cloudera.org:8080/#/c/1/1/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java@66
PS1, Line 66:  * This input format generates one split per tablet and the only 
location for each split is that
> Could you try to synthesize this into a unit test? That'll help solidify th
I also thought It is OK in here without a unit test.
If possible, The TabletService.Scan() of the backend should add a unit test in 
case the buget_ms is out.


http://gerrit.cloudera.org:8080/#/c/1/1/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java@448
PS1, Line 448: // TODO(qqzhang) scanner.nextRows() sometimes returns an 
empty RowResultIterator,
> Thanks for the added content, but why TODO(...)? Again, we only use TODOs w
Thanks for your point, I re-edited the comments



--
To view, visit http://gerrit.cloudera.org:8080/1
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbfdd2efbd281e4d849917664b33e183e180bafd
Gerrit-Change-Number: 1
Gerrit-PatchSet: 2
Gerrit-Owner: zhangqianqiong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: zhangqianqiong 
Gerrit-Comment-Date: Sat, 04 Aug 2018 02:35:21 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2525: KuduTableInputFormat may halt before exhausting scan

2018-08-03 Thread zhangqianqiong (Code Review)
Hello Dan Burkert, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/1

to look at the new patch set (#3).

Change subject: KUDU-2525: KuduTableInputFormat may halt before exhausting scan
..

KUDU-2525: KuduTableInputFormat may halt before exhausting scan

Change-Id: Ifbfdd2efbd281e4d849917664b33e183e180bafd
---
M 
java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java
1 file changed, 9 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/1/3
--
To view, visit http://gerrit.cloudera.org:8080/1
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbfdd2efbd281e4d849917664b33e183e180bafd
Gerrit-Change-Number: 1
Gerrit-PatchSet: 3
Gerrit-Owner: zhangqianqiong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: zhangqianqiong 


[kudu-CR] [consensus] KUDU-2335 increment term on explicit step down

2018-08-03 Thread Alexey Serbin (Code Review)
Hello Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11122

to look at the new patch set (#3).

Change subject: [consensus] KUDU-2335 increment term on explicit step down
..

[consensus] KUDU-2335 increment term on explicit step down

Prior to this patch, the catalog manager could get a tablet report
from a former leader replica with empty leader UUID and old term.
A dedicated logic in the catalog manager's code (see section 7d(i))
would amend the empty leader UUID to replace it with the previous
leader's UUID.  As a result of those shenanigans, catalog manager
would interpret the incoming report as a report from a leader replica
that reports its own health status as UNKNOWN.

The TwoConcurrentRebalancers scenario of the recently introduced
ConcurrentRebalancersTest reproduces the issue pretty often
(about 1 in 100 runs failed), so it was easy to pin-point the problem.
Mike sketched the fix and I ran the new code via dist-test about 1K
times and verified the problem is gone.

As for the test coverage, in addition to the already mentioned
would-be-flaky TwoConcurrentRebalancers scenario, I modified
RaftConsensusElectionITest.LeaderStepDown to reliably catch regressions.

Additionally, I updated the TabletCopyITest.TestRejectRogueLeader
scenario not asking rogue leader to step down: otherwise it would
fail.

Change-Id: I4e1f1446176a78ba04e74dd1153f9048a32d8d5f
---
M src/kudu/consensus/raft_consensus.cc
M src/kudu/integration-tests/raft_consensus_election-itest.cc
M src/kudu/integration-tests/tablet_copy-itest.cc
3 files changed, 35 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/22/11122/3
--
To view, visit http://gerrit.cloudera.org:8080/11122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4e1f1446176a78ba04e74dd1153f9048a32d8d5f
Gerrit-Change-Number: 11122
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] memrowset: support iteration with is deleted virtual column

2018-08-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10990 )

Change subject: memrowset: support iteration with is_deleted virtual column
..


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10990/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10990/9//COMMIT_MSG@15
PS9, Line 15: t have a read default value.
:
: Change-Id: Ic6b053f5a369
> FWIW I think this would be fine, the non-nullable PK restrictions are kind
OK, I'll enforce non-nullability with a read default then.


http://gerrit.cloudera.org:8080/#/c/10990/9/src/kudu/tablet/memrowset.cc
File src/kudu/tablet/memrowset.cc:

http://gerrit.cloudera.org:8080/#/c/10990/9/src/kudu/tablet/memrowset.cc@408
PS9, Line 408:   // Enforce some properties on the virtual column that 
simplify our
> Is this missing a break?  Otherwise it's the last instance.
Oops, good catch.



--
To view, visit http://gerrit.cloudera.org:8080/10990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6b053f5a3696eb9d7c26b8e3d96752f4f87bcd8
Gerrit-Change-Number: 10990
Gerrit-PatchSet: 10
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Sat, 04 Aug 2018 00:24:18 +
Gerrit-HasComments: Yes


[kudu-CR] memrowset: support iteration with is deleted virtual column

2018-08-03 Thread Adar Dembo (Code Review)
Hello Mike Percy, Dan Burkert, Kudu Jenkins, Grant Henke, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10990

to look at the new patch set (#10).

Change subject: memrowset: support iteration with is_deleted virtual column
..

memrowset: support iteration with is_deleted virtual column

This patch rounds out the MemRowSet changes for incremental backups.
Taken together, it is now possible to iterate on a specific time range and
to learn which rows were deleted during that time range.

Data type based virtual columns show a wart here: the virtual column definition
itself does not specify whether it is nullable or has a read default value. To
simplify, we enforce that it must have a read default value.

Change-Id: Ic6b053f5a3696eb9d7c26b8e3d96752f4f87bcd8
---
M src/kudu/tablet/memrowset-test.cc
M src/kudu/tablet/memrowset.cc
M src/kudu/tablet/memrowset.h
3 files changed, 146 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/10990/10
--
To view, visit http://gerrit.cloudera.org:8080/10990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic6b053f5a3696eb9d7c26b8e3d96752f4f87bcd8
Gerrit-Change-Number: 10990
Gerrit-PatchSet: 10
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Add unstable client APIs to fetch HMS integration configuration status

2018-08-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11121 )

Change subject: Add unstable client APIs to fetch HMS integration configuration 
status
..


Patch Set 1:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/11121/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11121/1//COMMIT_MSG@12
PS1, Line 12: The new C++ API is utilized by the HMS tools to
: get the HMS connection information without having to use the 
GetFlags
: API, which requires admin privileges.
Do you anticipate a paranoid user wanting to restrict access to this 
information? At the moment it's available to anyone who can authenticate to 
Kudu, right?


http://gerrit.cloudera.org:8080/#/c/11121/1/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java:

http://gerrit.cloudera.org:8080/#/c/11121/1/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@298
PS1, Line 298:   private HiveMetastoreConfig hiveMetastoreConfig = null;
Document.


http://gerrit.cloudera.org:8080/#/c/11121/1/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@839
PS1, Line 839:   public Deferred getHiveMetastoreConfig() {
Is there any way to reuse the guts of exportAuthenticationCredentials here? 
Perhaps a generic helper whose type is either byte[] or HiveMetastoreConfig, 
with a passed-in callback which returns something different after the lookup in 
either case?


http://gerrit.cloudera.org:8080/#/c/11121/1/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@848
PS1, Line 848: .addCallback(new MasterLookupCB(masterTable, null, 1))
I know this was just copied from exportAuthenticationCredentials, but could you 
document the importance of null and 1?


http://gerrit.cloudera.org:8080/#/c/11121/1/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@852
PS1, Line 852: // Connecting to the cluster should have also 
fetched the
 : // authn data.
Not relevant for this use case.


http://gerrit.cloudera.org:8080/#/c/11121/1/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java@1579
PS1, Line 1579: synchronized (AsyncKuduClient.this) {
Could you rewrite this block so that the lock is only held around the reset of 
hiveMetastoreConfig?


http://gerrit.cloudera.org:8080/#/c/11121/1/java/kudu-client/src/main/java/org/apache/kudu/client/HiveMetastoreConfig.java
File 
java/kudu-client/src/main/java/org/apache/kudu/client/HiveMetastoreConfig.java:

http://gerrit.cloudera.org:8080/#/c/11121/1/java/kudu-client/src/main/java/org/apache/kudu/client/HiveMetastoreConfig.java@27
PS1, Line 27: @InterfaceStability.Evolving
Why Evolving and not Unstable?


http://gerrit.cloudera.org:8080/#/c/11121/1/java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKuduCluster.java
File 
java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKuduCluster.java:

http://gerrit.cloudera.org:8080/#/c/11121/1/java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKuduCluster.java@93
PS1, Line 93: FakeDNS.getInstance().install();
Is this needed to pacify the HMS? I think it's used in testKerberos for the KDC.


http://gerrit.cloudera.org:8080/#/c/11121/1/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

http://gerrit.cloudera.org:8080/#/c/11121/1/src/kudu/client/client-internal.cc@713
PS1, Line 713: hive_config
Can't this be null if there's no integration?


http://gerrit.cloudera.org:8080/#/c/11121/1/src/kudu/client/client.h
File src/kudu/client/client.h:

http://gerrit.cloudera.org:8080/#/c/11121/1/src/kudu/client/client.h@67
PS1, Line 67: bool GetHiveMetastoreSaslEnabled(const client::KuduClient&);
: std::string GetHiveMetastoreUris(const client::KuduClient&);
We've been pretty inconsistent on defining private C++ client APIs:
1. KuduClient::GetTablet is a public (in the C++ sense of the word) function 
but its implementation is not exported.
2. KuduTableAlterer::alter_external_catalogs is a private function with an 
exported implementation and friendship.
3. A few CLI functions aren't even part of the client API at all; they access 
the PIMPL'ed data members directly via friendship. These new APIs follow this 
approach.

AFAICT, all uses of approach #2 and #3 were added for HMS integration, so I 
don't feel too bad asking you to spend a little time making this more 
consistent. I prefer the approach taken by GetTablet, for a few reasons:
- It doesn't require friendship or forward declaration of external 
functions/classes.
- It's a bit harder to abuse (you need to rewrite the shared object instead of 
a plain text header).
- It doesn't require awkwardly passing KuduClient as a function argument.
- It also makes it easy to document 

[kudu-CR] [consensus] KUDU-2335 increment term on explicit step down

2018-08-03 Thread Alexey Serbin (Code Review)
Hello Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11122

to look at the new patch set (#2).

Change subject: [consensus] KUDU-2335 increment term on explicit step down
..

[consensus] KUDU-2335 increment term on explicit step down

Prior to this patch, the catalog manager could get a tablet report
from a former leader replica with empty leader UUID and old term.
A dedicated logic in the catalog manager's code (see section 7d(i))
would amend the empty leader UUID to replace it with the previous
leader's UUID.  As a result of those shenanigans, catalog manager
would interpret the incoming report as a report from a leader replica
that reports its own health status as UNKNOWN.

The TwoConcurrentRebalancers scenario of the recently introduced
ConcurrentRebalancersTest reproduces the issue pretty often
(about 1 in 100 runs failed), so it was easy to pin-point the problem.
Mike sketched the fix and I ran the new code via dist-test about 1K
times and verified the problem is gone.

As for the test coverage, in addition to the already mentioned
would-be-flaky TwoConcurrentRebalancers scenario, I modified
RaftConsensusElectionITest.LeaderStepDown to reliably catch regressions.

Change-Id: I4e1f1446176a78ba04e74dd1153f9048a32d8d5f
---
M src/kudu/consensus/raft_consensus.cc
M src/kudu/integration-tests/raft_consensus_election-itest.cc
2 files changed, 31 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/22/11122/2
--
To view, visit http://gerrit.cloudera.org:8080/11122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4e1f1446176a78ba04e74dd1153f9048a32d8d5f
Gerrit-Change-Number: 11122
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] [consensus] KUDU-2335 increment term on explicit step down

2018-08-03 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11122


Change subject: [consensus] KUDU-2335 increment term on explicit step down
..

[consensus] KUDU-2335 increment term on explicit step down

Prior to this patch, the catalog manager could get a tablet report
from a former leader replica with empty leader UUID and old term.
A dedicated logic in the catalog manager's code (see section 7d(i))
would amend the empty leader UUID to replace it with the previous
leader's UUID.  As a result of those shenanigans, catalog manager
would interpret the incoming report as a report from a leader replica
that reports its own health status as UNKNOWN.

The recently introduced integration test scenario for rebalancer
The TwoConcurrentRebalancers scenario of the recently introduced
test ConcurrentRebalancersTest reproduce the scenario pretty often
(about 1 in 100 runs), so it was easy to pin-point the problem.

As for the test coverage, in addition to TwoConcurrentRebalancers
scenario, the RaftConsensusElectionITest.LeaderStepDown scenario
has been updated to reliably catch regressions.

Change-Id: I4e1f1446176a78ba04e74dd1153f9048a32d8d5f
---
M src/kudu/consensus/raft_consensus.cc
M src/kudu/integration-tests/raft_consensus_election-itest.cc
2 files changed, 31 insertions(+), 14 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/22/11122/1
-- 
To view, visit http://gerrit.cloudera.org:8080/11122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4e1f1446176a78ba04e74dd1153f9048a32d8d5f
Gerrit-Change-Number: 11122
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] Add unstable client APIs to fetch HMS integration configuration status

2018-08-03 Thread Dan Burkert (Code Review)
Hello Hao Hao, Todd Lipcon,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/11121

to review the following change.


Change subject: Add unstable client APIs to fetch HMS integration configuration 
status
..

Add unstable client APIs to fetch HMS integration configuration status

The Impala/Kudu integration must know whether the Kudu cluster is
configured with the HMS integration enabled. This commit adds an
unstable API to the Java client and a private API to the Kudu client to
lookup this information. The new C++ API is utilized by the HMS tools to
get the HMS connection information without having to use the GetFlags
API, which requires admin privileges.

The patch required adding support for enabling the HMS integration to
the Java mini cluster.

Change-Id: Iddb89787ed35c41e85f1d9bf953c4c228dcafcdb
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
A java/kudu-client/src/main/java/org/apache/kudu/client/HiveMetastoreConfig.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java
M java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestMiniKuduCluster.java
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client.h
M src/kudu/hms/mini_hms.cc
M src/kudu/hms/mini_hms.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/tools/tool.proto
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_hms.cc
M src/kudu/tools/tool_action_test.cc
18 files changed, 268 insertions(+), 39 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/21/11121/1
--
To view, visit http://gerrit.cloudera.org:8080/11121
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iddb89787ed35c41e85f1d9bf953c4c228dcafcdb
Gerrit-Change-Number: 11121
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [location awareness] Assign locations to registering tablet servers

2018-08-03 Thread Will Berkeley (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/5

to look at the new patch set (#3).

Change subject: [location_awareness] Assign locations to registering tablet 
servers
..

[location_awareness] Assign locations to registering tablet servers

This patch introduces a location field maintained by the master for
each tablet server. The master determines the value of this field
whenever a tablet server registers. It does this by using an
external command, specified with the flag --location_mapping_cmd,
to produce a location from the hostname or IP address of the tablet
server.

To help with ad hoc testing, I also added the location field to the
/tablet-servers web page, and fixed a small oversight where the
table of tablet servers wasn't sorted, so its order changed depending
on the order tablet servers first registered in.

I also altered the DATA_FILES CMake function so that data files copied
to the build directory or to slaves by dist-test are executable as well
as readable. This was necessary for the new TSDescriptor test which
tests location assignment.

Change-Id: I5eb98823ab7b3b8141b8630196c29c1ebf8e6878
---
M CMakeLists.txt
M src/kudu/master/CMakeLists.txt
M src/kudu/master/master_path_handlers.cc
A src/kudu/master/testdata/first_argument.sh
A src/kudu/master/ts_descriptor-test.cc
M src/kudu/master/ts_descriptor.cc
M src/kudu/master/ts_descriptor.h
M www/tablet-servers.mustache
8 files changed, 319 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/5/3
--
To view, visit http://gerrit.cloudera.org:8080/5
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5eb98823ab7b3b8141b8630196c29c1ebf8e6878
Gerrit-Change-Number: 5
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] build: allow overriding of pip installation parameters

2018-08-03 Thread Adar Dembo (Code Review)
Hello Andrew Wong, Jordan Birdsell, Grant Henke,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/6

to review the following change.


Change subject: build: allow overriding of pip installation parameters
..

build: allow overriding of pip installation parameters

This commit introduces the PIP_INSTALL_FLAGS environment variable to
build-and-test.sh which can be used to override various pip installation
behaviors. For example, you can pass "-i " so that pip uses a different
installation URL to retrieve packages.

For this to work effectively, PIP_INSTALL_FLAGS must come after all of
build-and-test.sh's pip installation flags. That's because if the same flag
is passed multiple times on the command line, the last value "wins". To make
this consistent, I modified MVN_FLAGS and GRADLE_FLAGS to behave the same.

I also copied the pytest-timeout dependency from setup.py to
requirements.txt so that "python setup.py test" won't have to download
anything if it was preceeded with "pip install -r requirements.txt". It is
possible to configure setuptools with a custom index URL instead, but the
various[1] approaches[2] involve modifying files rather than passing
command line arguments.

1. 
https://setuptools.readthedocs.io/en/latest/easy_install.html#configuration-files
2. https://setuptools.readthedocs.io/en/latest/setuptools.html (search for 
dependency_links)

Change-Id: Ib09da4bb2e2cdcbf7ea4a8ab2c1089ab46d934fb
---
M build-support/jenkins/build-and-test.sh
M python/requirements.txt
2 files changed, 37 insertions(+), 23 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/16/6/1
--
To view, visit http://gerrit.cloudera.org:8080/6
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib09da4bb2e2cdcbf7ea4a8ab2c1089ab46d934fb
Gerrit-Change-Number: 6
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Jordan Birdsell 


[kudu-CR] [catalog manager] re-replication warning message update

2018-08-03 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/4 )

Change subject: [catalog_manager] re-replication warning message update
..

[catalog_manager] re-replication warning message update

Updated the warning message upon creating a table with replication
factor F when only (F - 1) tablet servers are alive.

Do not mention unsetting --raft_prepare_replacement_before_eviction
since the 3-4-3 replica management scheme allows to spawn a new
replica in-place in case of unrecoverable failure such as IO error
and falling behind WAL's ancient history mark.  So, the 3-2-3 scheme
is no better in that regard and would not be able to re-replicate
in case of a tablet server failure either.  Also, the 3-2-3 scheme
is deprecated at this point.

I also did few other petty changes around the warning message.

There are no functional changes in this patch.

Change-Id: I608ce179548ac25d660d713e7ce1d0e067dd4e02
Reviewed-on: http://gerrit.cloudera.org:8080/4
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy 
---
M src/kudu/master/catalog_manager.cc
1 file changed, 9 insertions(+), 14 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Mike Percy: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/4
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I608ce179548ac25d660d713e7ce1d0e067dd4e02
Gerrit-Change-Number: 4
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] [location awareness] Assign locations to registering tablet servers

2018-08-03 Thread Will Berkeley (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/5

to look at the new patch set (#2).

Change subject: [location_awareness] Assign locations to registering tablet 
servers
..

[location_awareness] Assign locations to registering tablet servers

This patch introduces a location field maintained by the master for
each tablet server. The master determines the value of this field
whenever a tablet server registers. It does this by using an
external command, specified with the flag --location_mapping_cmd,
to produce a location from the hostname or IP address of the tablet
server.

To help with ad hoc testing, I also added the location field to the
/tablet-servers web page, and fixed a small oversight where the
table of tablet servers wasn't sorted, so its order changed depending
on the order tablet servers first registered in.

I also altered the DATA_FILES CMake function so that data files copied
to the build directory or to slaves by dist-test are executable as well
as readable. This was necessary for the new TSDescriptor test which
tests location assignment.

Change-Id: I5eb98823ab7b3b8141b8630196c29c1ebf8e6878
---
M CMakeLists.txt
M src/kudu/master/CMakeLists.txt
M src/kudu/master/master_path_handlers.cc
A src/kudu/master/testdata/first_argument.sh
A src/kudu/master/ts_descriptor-test.cc
M src/kudu/master/ts_descriptor.cc
M src/kudu/master/ts_descriptor.h
M www/tablet-servers.mustache
8 files changed, 318 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/5/2
--
To view, visit http://gerrit.cloudera.org:8080/5
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5eb98823ab7b3b8141b8630196c29c1ebf8e6878
Gerrit-Change-Number: 5
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] [catalog manager] re-replication warning message update

2018-08-03 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/4 )

Change subject: [catalog_manager] re-replication warning message update
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/4
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I608ce179548ac25d660d713e7ce1d0e067dd4e02
Gerrit-Change-Number: 4
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Fri, 03 Aug 2018 20:29:02 +
Gerrit-HasComments: No


[kudu-CR] [location awareness] Assign locations to registering tablet servers

2018-08-03 Thread Will Berkeley (Code Review)
Will Berkeley has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/5


Change subject: [location_awareness] Assign locations to registering tablet 
servers
..

[location_awareness] Assign locations to registering tablet servers

This patch introduces a location field maintained by the master for
each tablet server. The master determines the value of this field
whenever a tablet server registers. It does this by using an
external command, specified with the flag --location_mapping_cmd,
to produce a location from the hostname or IP address of the tablet
server.

To help with ad hoc testing, I also added the location field to the
/tablet-servers web page, and fixed a small oversight where the
table of tablet servers wasn't sorted, so its order changed depending
on the order tablet servers first registered in.

I also altered the DATA_FILES CMake function so that data files copied
to the build directory or to slaves by dist-test are executable as well
as readable. This was necessary for the new TSDescriptor test which
tests location assignment.

Change-Id: I5eb98823ab7b3b8141b8630196c29c1ebf8e6878
---
M CMakeLists.txt
M src/kudu/master/CMakeLists.txt
M src/kudu/master/master_path_handlers.cc
A src/kudu/master/testdata/first_argument.sh
A src/kudu/master/ts_descriptor-test.cc
M src/kudu/master/ts_descriptor.cc
M src/kudu/master/ts_descriptor.h
M www/tablet-servers.mustache
8 files changed, 313 insertions(+), 44 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/5/1
--
To view, visit http://gerrit.cloudera.org:8080/5
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5eb98823ab7b3b8141b8630196c29c1ebf8e6878
Gerrit-Change-Number: 5
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley 


[kudu-CR] KUDU-2525: KuduTableInputFormat may halt before exhausting scan

2018-08-03 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/1 )

Change subject: KUDU-2525: KuduTableInputFormat may halt before exhausting scan
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/1/1/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java
File 
java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java:

http://gerrit.cloudera.org:8080/#/c/1/1/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java@66
PS1, Line 66:  * This input format generates one split per tablet and the only 
location for each split is that
> Could you try to synthesize this into a unit test? That'll help solidify th
Empty scan batches can happen in at least two ways:

* It's possible for the client to set the scan bytes limit to 0 when creating a 
new scanner, and then use something bigger for subsequent continue scan calls.  
In the case of the new scanner call it will be an empty row batch (see 
https://github.com/apache/kudu/blob/master/src/kudu/tserver/tablet_service.cc?utf8=%E2%9C%93#L1867).
  As far as I know you can't actually configure a Java scanner to do this, 
though.

* If a scan is filtering many rows it may not be able to find a single matching 
row before the internal 500ms timeout expires.  Unfortunately this is one the 
few completely hardcoded timeouts in Kudu, so we can't easily toggle it for a 
unit test: 
https://github.com/apache/kudu/blob/master/src/kudu/tserver/tablet_service.cc?utf8=%E2%9C%93#L1941

Neither of these are easy to reproduce in a unit tests easily.  My best idea 
would be to add a test-only flag to the tserver to 'inject' empty batches 
randomly.

That being said, because the Spark KuduRDD has effectively the exact same loop 
(https://github.com/apache/kudu/blob/master/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala#L124),
 I'd also be OK just landing this as-is without a unit test.



--
To view, visit http://gerrit.cloudera.org:8080/1
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbfdd2efbd281e4d849917664b33e183e180bafd
Gerrit-Change-Number: 1
Gerrit-PatchSet: 2
Gerrit-Owner: zhangqianqiong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: zhangqianqiong 
Gerrit-Comment-Date: Fri, 03 Aug 2018 18:09:32 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-03 Thread Anupama Gupta (Code Review)
Anupama Gupta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10983 )

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..


Patch Set 7:

(21 comments)

Thanks for the comments. Please review the changes.

http://gerrit.cloudera.org:8080/#/c/10983/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10983/5//COMMIT_MSG@7
PS5, Line 7: KUDU-1291
> nit: s/Kudu-1291/KUDU-1291/
Done


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/cfile/cfile_reader.h
File src/kudu/cfile/cfile_reader.h:

http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/cfile/cfile_reader.h@340
PS5, Line 340: o value index,
> document this parameter in the header file comment
Done


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.h
File src/kudu/cfile/cfile_reader.h:

http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.h@343
PS4, Line 343:bool *exact_match, bool set_current_value 
= false);
> nit: add const specifier for the method.  Also, consider returning const re
Done


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.cc@788
PS4, Line 788:
> what if prepared_blocks_ is empty?  If that the thing-that-cannot-be, add D
Done


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@248
PS4, Line 248: r_bound_idx wi
> style nit: here and elsewhere -- we tend to stick the asterisk and ampersan
Done


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:

http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@399
PS4, Line 399:
> If 'spec' instance is not modified here, why not to pass it as a const refe
Done


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@414
PS4, Line 414: bool nonfirst_key_pred_exists = false;
> Is copying necessary here?  Why not to use const reference?
Done


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@418
PS4, Line 418:
> nit: I don't think the parentheses are necessary here
Done


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:

http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@413
PS5, Line 413:
> nit: add punctuation to all your comments (add a period at the end of the s
Done


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@511
PS5, Line 511:
> this kind of api doc should go into the header file, not the implementation
Done


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@527
PS5, Line 527:
> since we are not using this, can we just make this /* exact_match= */ nullp
We cannot make it nullptr , because exact match is being de-referenced in 
functions called by SeekAtOrAfter. I have added a comment that exact_match 
value is not used after the current call.


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@565
PS5, Line 565: tig
> s/Get/Return/
Done


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@568
PS5, Line 568: const ColumnSchema  = cont_row.schema()->column(i);
> remove or VLOG(2)
Done


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@581
PS5, Line 581: rena2(1024);
> nit: use more descriptive variable names here
Done


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@588
PS5, Line 588: let_schema(
> how about rename this to: SkipToNextScan()
Done


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@602
PS5, Line 602: //store the row id that represents the last relevant entry 
(upper bound) wrt the
> I'm not comfortable merging this patch without this feature
Done


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@662
PS5, Line 662:
> exclusive upper bound
Done


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@668
PS5, Line 668:
 : bool exact
> what guarantees the skip_scan_upper_bound_idx_ has been set to upper_bound_
This is a valid point. I have  updated the skip_scan_upper_bound_idx_ value 
here now.


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@675
PS5, Line 675: ow
> wording nit: Check to see whether we have seeked backwards. If so, we need
Done


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@684
PS5, Line 684: // We didn't find an exact match for our lower bound, so 
re-seek.
> can we convert this to a while-loop instead of a do-while-loop, now that th
Done



[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-03 Thread Anupama Gupta (Code Review)
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10983

to look at the new patch set (#7).

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..

KUDU-1291. Efficiently support predicates on non-prefix key components

This patch implements index skip scan approach in the case when there
exists an equality predicate on any of the non-first
primary key columns.

This feature uses the following approach:
1. Seek at the first unique prefix key in the ad-hoc index tree.
2. Seek to the relevant entry corresponding to the unique prefix key
   found in 1. and the predicate value on the suffix key column.
3. To handle the case where multiple entries exist with respect to
   a unique prefix key, store the upper bound index of these entries.
4. Repeat steps 1-3, till all the unique prefix keys are scanned.

Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/common/encoded_key.cc
M src/kudu/common/encoded_key.h
M src/kudu/common/partial_row.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
A src/kudu/tablet/index_skipscan-test.cc
9 files changed, 920 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/7
--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 7
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components

2018-08-03 Thread Anupama Gupta (Code Review)
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10983

to look at the new patch set (#6).

Change subject: Kudu-1291 Efficiently support predicates on non-prefix key 
components
..

Kudu-1291 Efficiently support predicates on non-prefix key components

This patch implements index skip scan approach in the case when there
exists an equality predicate on any of the non-first
primary key columns.

This feature uses the following approach:
1. Seek at the first unique prefix key in the ad-hoc index tree.
2. Seek to the relevant entry corresponding to the unique prefix key
   found in 1. and the predicate value on the suffix key column.
3. To handle the case where multiple entries exist with respect to
   a unique prefix key, store the upper bound index of these entries.
4. Repeat steps 1-3, till all the unique prefix keys are scanned.

Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/common/encoded_key.cc
M src/kudu/common/encoded_key.h
M src/kudu/common/partial_row.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
A src/kudu/tablet/index_skipscan-test.cc
9 files changed, 920 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/6
--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 6
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components

2018-08-03 Thread Anupama Gupta (Code Review)
Hello Tidy Bot, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11067

to look at the new patch set (#15).

Change subject: Kudu-1291 Efficiently support predicates on non-prefix key 
components
..

Kudu-1291 Efficiently support predicates on non-prefix key components

This patch implements index skip scan approach in the case when there
exists an equality predicate on any of the non-first
primary key columns.

This feature uses the following approach:
1. Seek at the first unique prefix key in the ad-hoc index tree.
2. Seek to the relevant entry corresponding to the unique prefix key
   found in 1. and the predicate value on the suffix key column.
3. To handle the case where multiple entries exist with respect to
   a unique prefix key, store the upper bound index of these entries.
4. Repeat steps 1-3, till all the unique prefix keys are scanned.

Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/common/encoded_key.cc
M src/kudu/common/encoded_key.h
M src/kudu/common/partial_row.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
A src/kudu/tablet/index_skipscan-test.cc
9 files changed, 920 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/15
--
To view, visit http://gerrit.cloudera.org:8080/11067
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d
Gerrit-Change-Number: 11067
Gerrit-PatchSet: 15
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components

2018-08-03 Thread Anupama Gupta (Code Review)
Hello Tidy Bot, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11067

to look at the new patch set (#14).

Change subject: Kudu-1291 Efficiently support predicates on non-prefix key 
components
..

Kudu-1291 Efficiently support predicates on non-prefix key components

This patch implements index skip scan approach in the case when there
exists an equality predicate on any of the non-first
primary key columns.

This feature uses the following approach:
1. Seek at the first unique prefix key in the ad-hoc index tree.
2. Seek to the relevant entry corresponding to the unique prefix key
   found in 1. and the predicate value on the suffix key column.
3. To handle the case where multiple entries exist with respect to
   a unique prefix key, store the upper bound index of these entries.
4. Repeat steps 1-3, till all the unique prefix keys are scanned.

Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/common/encoded_key.cc
M src/kudu/common/encoded_key.h
M src/kudu/common/partial_row.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
A src/kudu/tablet/index_skipscan-test.cc
9 files changed, 925 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/14
--
To view, visit http://gerrit.cloudera.org:8080/11067
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d
Gerrit-Change-Number: 11067
Gerrit-PatchSet: 14
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2525: KuduTableInputFormat may halt before exhausting scan

2018-08-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/1 )

Change subject: KUDU-2525: KuduTableInputFormat may halt before exhausting scan
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/1/1/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java
File 
java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java:

http://gerrit.cloudera.org:8080/#/c/1/1/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java@66
PS1, Line 66:  * This input format generates one split per tablet and the only 
location for each split is that
> I can be sure that there is not an empty tablet in the table.
Could you try to synthesize this into a unit test? That'll help solidify the 
conditions that must be met for this to occur, and will also serve as a 
regression test.


http://gerrit.cloudera.org:8080/#/c/1/1/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java@448
PS1, Line 448: // TODO(qqzhang) scanner.nextRows() sometimes returns an 
empty RowResultIterator,
> Done
Thanks for the added content, but why TODO(...)? Again, we only use TODOs when 
there's some work left to be done.

See this Google Style Guide link for more details: 
https://google.github.io/styleguide/cppguide.html#TODO_Comments



--
To view, visit http://gerrit.cloudera.org:8080/1
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbfdd2efbd281e4d849917664b33e183e180bafd
Gerrit-Change-Number: 1
Gerrit-PatchSet: 2
Gerrit-Owner: zhangqianqiong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: zhangqianqiong 
Gerrit-Comment-Date: Fri, 03 Aug 2018 16:49:02 +
Gerrit-HasComments: Yes


[kudu-CR] [python] : Add Pandas Support to Scanner

2018-08-03 Thread Jordan Birdsell (Code Review)
Jordan Birdsell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10809 )

Change subject: [python] : Add Pandas Support to Scanner
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10809/7/python/kudu/client.pyx
File python/kudu/client.pyx:

http://gerrit.cloudera.org:8080/#/c/10809/7/python/kudu/client.pyx@1883
PS7, Line 1883: by yielding batches of tuples.
> I think this doc needs to be more explicit -- it actually generates _batche
Done



--
To view, visit http://gerrit.cloudera.org:8080/10809
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I90fab5c0c42448bcc17ea22be37420c6ef2f4915
Gerrit-Change-Number: 10809
Gerrit-PatchSet: 8
Gerrit-Owner: Jordan Birdsell 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Fri, 03 Aug 2018 16:22:56 +
Gerrit-HasComments: Yes


[kudu-CR] [python] : Add Pandas Support to Scanner

2018-08-03 Thread Jordan Birdsell (Code Review)
Hello Will Berkeley, David Ribeiro Alves, Kudu Jenkins, Grant Henke, Todd 
Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10809

to look at the new patch set (#8).

Change subject: [python] : Add Pandas Support to Scanner
..

[python] : Add Pandas Support to Scanner

This patch adds support for converting scanner results to a
Pandas DataFrame. There are a few basic unit tests included
in this patch as well.

Change-Id: I90fab5c0c42448bcc17ea22be37420c6ef2f4915
---
M python/kudu/__init__.py
M python/kudu/client.pyx
M python/kudu/tests/test_scanner.py
M python/setup.py
4 files changed, 153 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/09/10809/8
--
To view, visit http://gerrit.cloudera.org:8080/10809
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I90fab5c0c42448bcc17ea22be37420c6ef2f4915
Gerrit-Change-Number: 10809
Gerrit-PatchSet: 8
Gerrit-Owner: Jordan Birdsell 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Jordan Birdsell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley