[kudu-CR] [c++ client] re-acquire authn token if expired

2017-05-15 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: [c++ client] re-acquire authn token if expired
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

Line 753:   if (leader_master_rpc_primary_creds_ || need_new_rpc_primary_creds) 
{
> I was having a hard time following this, so I tried to take a crack at redu
I asked Mike to look at this as well to have an additional independent reviewer 
:)

To Mike the reference thing was also confusing.  Also, both variants looked not 
so good to him, so I decided to go with the alternative that you proposed: if 
you trust that code better, let it be the trusted code.  I also found it doing 
the right thing, but I also spent some time to make sure it behaves like I want 
it to behave.


-- 
To view, visit http://gerrit.cloudera.org:8080/6648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] [c++ client] re-acquire authn token if expired

2017-05-15 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6648

to look at the new patch set (#18).

Change subject: [c++ client] re-acquire authn token if expired
..

[c++ client] re-acquire authn token if expired

Updated C++ client to re-acquire authn token if server responds with
FATAL_INVALID_AUTHENTICATION_TOKEN error on connection negotiation.
If that happens while negotiating a connection for a call, the
connection is closed and the client reconnects to the cluster to get
a new authn token.  After successfully re-acquiring authn token the
client retries the RPC call again.

Added corresponding integration test to cover authn token re-acquisition
for various scenarios.

Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5
---
M docs/known_issues.adoc
M docs/security.adoc
M src/kudu/client/batcher.cc
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/master_rpc.h
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/client/scanner-internal.h
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/authn_token_expire-itest.cc
M src/kudu/integration-tests/cluster_verifier.h
M src/kudu/rpc/client_negotiation.cc
M src/kudu/rpc/client_negotiation.h
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/negotiation.cc
M src/kudu/rpc/outbound_call.cc
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/reactor.h
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.h
M src/kudu/rpc/server_negotiation.cc
24 files changed, 752 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/6648/18
-- 
To view, visit http://gerrit.cloudera.org:8080/6648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5
Gerrit-PatchSet: 18
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Refactor ConsensusStatePB to hold committed and pending configs

2017-05-15 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: Refactor ConsensusStatePB to hold committed and pending configs
..


Patch Set 10: Code-Review+1

(1 comment)

Looks good to me, just one nit.

http://gerrit.cloudera.org:8080/#/c/6809/10/src/kudu/consensus/quorum_util.cc
File src/kudu/consensus/quorum_util.cc:

PS10, Line 199: 
nit: sorry to nitpick so hard on this, can we just align this opening quote 
with the opening quote on the above line? The GSG says it best: 
https://google.github.io/styleguide/cppguide.html#Function_Calls

If the arguments do not all fit on one line, they should be broken up onto 
multiple lines, with each subsequent line aligned with the first argument. Do 
not add spaces after the open paren or before the close paren:

  bool result = DoSomething(averyveryveryverylongargument1,
argument2, argument3);


-- 
To view, visit http://gerrit.cloudera.org:8080/6809
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4bc4bdd9752fc29a7ce2cefcdc95c4366f5353af
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] [rpc] introduce per-RPC credentials policy

2017-05-15 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6875

to look at the new patch set (#7).

Change subject: [rpc] introduce per-RPC credentials policy
..

[rpc] introduce per-RPC credentials policy

This patch introduces policy for RPC authentication credentials.  The
authentication credentials policy allows for control over the type of
client-side credentials used for making a remote procedure call.

The idea behind this change is simple: sometimes the server's behavior
depends on the type of client's credentials used to authenticate the
client to the server in the context of the remote procedure call.  If
the client expects some particular behavior from the server, it has to
explicitly specify the type of credentials it wants to use for the call.

One example of an RPC depending on the type of the specified credentials
is MasterService::ConnectToMaster().  It's impossible to receive an
authentication token from the master if calling that method over a
connection established with an authn token.  To get a new authn token
in that case, it's necessary to open a new connection to the master
using types of credentials other than authn token (e.g., Kerberos
credentials or TLS certificate will work).

In other words, derived/secondary authentication credentials
(such as authn token) can only be acquired if using the primary ones.
That's a crucial restriction to allow for enforcing expiration
of derived/secondary credentials.  With this patch a client has an
ability to re-acquire secondary authentication credentials (authn token)
regardless of the type of credentials used to established current
connection to Kudu master.

As a part of this patch, a new unit test is added to cover the new
functionality.

Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2
---
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/messenger.h
M src/kudu/rpc/negotiation.cc
M src/kudu/rpc/outbound_call.h
M src/kudu/rpc/proxy.cc
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/reactor.h
M src/kudu/rpc/rpc-test-base.h
M src/kudu/rpc/rpc-test.cc
M src/kudu/rpc/rpc_controller.cc
M src/kudu/rpc/rpc_controller.h
M src/kudu/rpc/rpc_stub-test.cc
13 files changed, 305 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/6875/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6875
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [c++ client] re-acquire authn token if expired

2017-05-15 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6648

to look at the new patch set (#17).

Change subject: [c++ client] re-acquire authn token if expired
..

[c++ client] re-acquire authn token if expired

Updated C++ client to re-acquire authn token if server responds with
FATAL_INVALID_AUTHENTICATION_TOKEN error on connection negotiation.
If that happens while negotiating a connection for a call, the
connection is closed and the client reconnects to the cluster to get
a new authn token.  After successfully re-acquiring authn token the
client retries the RPC call again.

Added corresponding integration test to cover authn token re-acquisition
for various scenarios.

Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5
---
M docs/known_issues.adoc
M docs/security.adoc
M src/kudu/client/batcher.cc
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/master_rpc.h
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/client/scanner-internal.h
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/authn_token_expire-itest.cc
M src/kudu/rpc/client_negotiation.cc
M src/kudu/rpc/client_negotiation.h
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/negotiation.cc
M src/kudu/rpc/outbound_call.cc
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/reactor.h
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.h
M src/kudu/rpc/server_negotiation.cc
23 files changed, 754 insertions(+), 142 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/6648/17
-- 
To view, visit http://gerrit.cloudera.org:8080/6648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [c++ client] re-acquire authn token if expired

2017-05-15 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: [c++ client] re-acquire authn token if expired
..


Patch Set 16:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/6648/16/docs/known_issues.adoc
File docs/known_issues.adoc:

PS16, Line 172: one week is now supported only for Kudu C++ client
> nit: I think for the context of limitations, it's better to be explicit abo
Done


http://gerrit.cloudera.org:8080/#/c/6648/16/docs/security.adoc
File docs/security.adoc:

PS16, Line 221: Java
> nit: insert "The"
Done


PS16, Line 223: yet 
> nit: we decided at some point to not use these time-related words in the do
Done


PS16, Line 223:  C++
> nit: the
Done


PS16, Line 224: client
> clients
Done


http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

PS16, Line 415: existring
> typo
Done


http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

PS16, Line 731: ANY_BUT_AUTHN_OTKEN
> I suggested the new name, so I'll go to bat for it.  Besides being a positi
ok.

ANY_CREDENTIALS and PRIMARY_CREDENTIALS, right?


Line 740:   DCHECK(creds_policy == CredentialsPolicy::ANY_CREDENTIALS ||
> Isn't this statically verified by the compiler?
I don't think it's going to be verified by compiler if somebody adds a new 
credentials policy, like SECONDARY_CREDENTIALS (meaning only secondary 
credentials may be used for connection negotiation).

This debug-only assert will catch the case when this piece of code is not 
updated after that.


Line 753:   if (leader_master_rpc_primary_creds_ || need_new_rpc_primary_creds) 
{
> I was having a hard time following this, so I tried to take a crack at redu
To me -- nope, it's not :)  I also spent some time trying to make sure that 
your code does what I want.  It also has a misleading comment on non-existent 
connections if the final 'else' part.

The two boolean flags store the fact whether it's necessary to create new RPC 
calls with appropriate credentials.


Line 761: scoped_refptr& rpc(
> This statement confusing.  I think it's copying the scoped_refptr to an unn
What would be that 'unnamed stack-allocated variable'?

It just makes a reference either to leader_master_rpc_primary_creds_ or to 
leader_master_rpc_any_creds_

I'm not sure I understand: what unbound variable are you talking about?


http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.h
File src/kudu/client/client-internal.h:

PS16, Line 165:   rpc::CredentialsPolicy creds_policy =
  :   rpc::CredentialsPolicy::ANY_CREDENTIALS);
> maybe it's just me, but I find this wrapping a bit odd. Maybe wrap all the 
I wrapped all parameters at +4 shift -- that looks cleaner (at least no 
additional line just for the default value).


PS16, Line 237:   scoped_refptr 
leader_master_rpc_any_creds_;
  :   std::vector 
leader_master_callbacks_any_creds_;
  :   scoped_refptr 
leader_master_rpc_primary_creds_;
  :   std::vector 
leader_master_callbacks_primary_creds_;
> Yes, the idea was to not wait for that re-connection again in that rare cas
Also, if we don't split the callbacks, how do we know the process of scheduling 
the retries converges at all?  I.e., consider the case when a retries are 
scheduled again and again, and the any-creds call always arrives first all the 
time, then comes the primary-creds call, and that so long?


-- 
To view, visit http://gerrit.cloudera.org:8080/6648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2001 Add UNDO size to tablet on-disk size

2017-05-15 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-2001 Add UNDO size to tablet on-disk size
..


Patch Set 5: Code-Review+1

lgtm modulo the commit message, will defer to Todd for final approval

-- 
To view, visit http://gerrit.cloudera.org:8080/6850
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I59001adadb9a768a464e7b2cf0f0a5df0ef5393a
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: No


[kudu-CR] KUDU-2001 Add UNDO size to tablet on-disk size

2017-05-15 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-2001 Add UNDO size to tablet on-disk size
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6850/5//COMMIT_MSG
Commit Message:

Line 12: DeltaTracker::EstimateOnDiskSizeForCompaction() that
> this got renamed in the latest rev?
Yeah, renamed to EstimateRedoDeltaOnDiskSize()

Also yeah, for Git the typical wrap length is 72 chars per 
https://stackoverflow.com/questions/2290016/git-commit-messages-50-72-formatting


http://gerrit.cloudera.org:8080/#/c/6850/5/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

Line 692:   return base_data_->EstimateOnDiskSize() + 
delta_tracker_->EstimateOnDiskSize();
> since this now includes the UNDO deltas, whereas before it didn't, will thi
The major delta compaction estimate is below on L742 which now calls into L683 
here so this looks right to me. I searched the code base for other uses of 
EstimateOnDiskSize() and I didn't find anything that could affect compaction 
policy.


-- 
To view, visit http://gerrit.cloudera.org:8080/6850
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I59001adadb9a768a464e7b2cf0f0a5df0ef5393a
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] [rpc] introduce per-RPC credentials policy

2017-05-15 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6875

to look at the new patch set (#6).

Change subject: [rpc] introduce per-RPC credentials policy
..

[rpc] introduce per-RPC credentials policy

This patch introduces policy for RPC authentication credentials.  The
authentication credentials policy allows for control over the type of
client-side credentials used for making a remote procedure call.

The idea behind this change is simple: sometimes the server's behavior
depends on the type of client's credentials used to authenticate the
client to the server in the context of the remote procedure call.  If
the client expects some particular behavior from the server, it has to
explicitly specify the type of credentials it wants to use for the call.

One example of an RPC depending on the type of the specified credentials
is MasterService::ConnectToMaster().  It's impossible to receive an
authentication token from the master if calling that method over a
connection established with an authn token.  To get a new authn token
in that case, it's necessary to open a new connection to the master
using types of credentials other than authn token (e.g., Kerberos
credentials or TLS certificate will work).

In other words, derived/secondary authentication credentials
(such as authn token) can only be acquired if using the primary ones.
That's a crucial restriction to allow for enforcing expiration
of derived/secondary credentials.  With this patch a client has an
ability to re-acquire secondary authentication credentials (authn token)
regardless of the type of credentials used to established current
connection to Kudu master.

As a part of this patch, a new unit test is added to cover the new
functionality.

Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2
---
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/messenger.h
M src/kudu/rpc/negotiation.cc
M src/kudu/rpc/outbound_call.h
M src/kudu/rpc/proxy.cc
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/reactor.h
M src/kudu/rpc/rpc-test-base.h
M src/kudu/rpc/rpc-test.cc
M src/kudu/rpc/rpc_controller.cc
M src/kudu/rpc/rpc_controller.h
M src/kudu/rpc/rpc_stub-test.cc
13 files changed, 305 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/6875/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6875
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [rpc] introduce per-RPC credentials policy

2017-05-15 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: [rpc] introduce per-RPC credentials policy
..


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/negotiation.cc
File src/kudu/rpc/negotiation.cc:

PS5, Line 163: SatisfiesCredentialsPolicy
> yea, I think my main complaint about this is that we're calling SatisfiesCr
We discussed this on-line (kudu-security Slack channel) and decided to add 
accessor for 'credentials_policy_' member.


Line 215:   
conn->set_remote_features(client_negotiation.take_server_features());
> I think we're being pessimistic here in a very certain case:
We discussed that on Slack kudu-security channel and the decision is to keep it 
as is (keep it simple).

The issue with AuthenticationType is that it's not known before actual 
negotiation has happened, but we need to know about the policy of any 
connection prior to that because we want to find appropriate connection in 
ReactorThread::FindOrStartConnection().

Also, we discussed 'upgrading' the policy de facto as soon as the connection 
has been negotiated, but that seems to bring some additional complexity to fix 
a very corner case when authentication token expires on a connection which has 
been negotiated using the ANY_CREDENTIALS policy, but the actually there were 
no authentication token available to use, so the connection ended up to be 
satisfying the PRIMARY_CREDENTIALS policy.


http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/reactor.cc
File src/kudu/rpc/reactor.cc:

PS5, Line 368:   for (auto it = range.first; it != range.second; ++it) {
 : const auto& c = it->second;
 : if (c->scheduled_for_shutdown()) {
 :   // This connection is not to be used for any new calls.
 :   continue;
 : }
 : if (!c->SatisfiesCredentialsPolicy(cred_policy)) {
 :   // Do not use a connection with a non-compliant 
credentials policy.
 :   // Instead, open a new one, while marking the former as 
scheduled for
 :   // shutdown. This process converges: any connection that 
satisfies the
 :   // PRIMARY_CREDENTIALS policy automatically satisfies the 
ANY_CREDENTIALS
 :   // policy as well. The idea is to keep only one usable 
connection identified
 :   // by the specified 'conn_id'.
 :   c->set_scheduled_for_shutdown();
 :   continue;
 : }
 : 
 : // If the test-only 'one-connection-per-RPC' mode is 
enabled, connections
 : // are re-established at every RPC call.
 : if (PREDICT_FALSE(FLAGS_rpc_reopen_outbound_connections)) {
 :   c->set_scheduled_for_shutdown();
 :   continue;
 : }
 :   }
 :   // Second pass: shutdown idle connections to the target 
destination if they
 :   // have been marked. Non-idle connections will be taken care 
of later by the
 :   // idle connection scanner.
 :   scoped_refptr found_conn;
 :   for (auto it = range.first; it != range.second;) {
 : const auto& c = it->second;
 : if (c->scheduled_for_shutdown()) {
 :   if (c->Idle()) {
 : DCHECK_EQ(Connection::CLIENT, c->direction());
 : c->Shutdown(Status::NetworkError("connection is closed 
due to non-reuse policy"));
 : it = client_conns_.erase(it);
 : continue;
 :   }
 : } else {
 :   DCHECK(!found_conn);
 :   found_conn = c;
 :   // Appropriate connection is found; continue further to 
take care of the
 :   // rest of connections marked for shutdown by the first 
pass above.
 : }
 : ++it;
 :   }
> I think these loops can be simplified by combining them:
I thought breaking this into two parts would give the code better readability, 
so it was intentional.  I see that's not the case.  Fine, I'll update this 
piece.


http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/rpc-test.cc
File src/kudu/rpc/rpc-test.cc:

PS5, Line 295: PRC
> RPC
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6875
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 

[kudu-CR] [c++ client] re-acquire authn token if expired

2017-05-15 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: [c++ client] re-acquire authn token if expired
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

PS16, Line 731: ANY_BUT_AUTHN_OTKEN
> You are right -- that's the old name (with an additional typo).  Now it's P
I suggested the new name, so I'll go to bat for it.  Besides being a positive 
label, which I think is always better, it leaves the door open for considering 
IPKI certs to be non-primary in the future.  We don't have any reason to do 
this at the moment, but I can imagine we might want to add a tserver->master 
API which is only valid when connecting via primary credentials.  In fact, 
requesting a signed cert should probably already be like this.


-- 
To view, visit http://gerrit.cloudera.org:8080/6648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] [c++ client] re-acquire authn token if expired

2017-05-15 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: [c++ client] re-acquire authn token if expired
..


Patch Set 16:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

PS16, Line 731: ANY_BUT_AUTHN_OTKEN
PRIMARY


Line 740:   DCHECK(creds_policy == CredentialsPolicy::ANY_CREDENTIALS ||
Isn't this statically verified by the compiler?


Line 753:   if (leader_master_rpc_primary_creds_ || need_new_rpc_primary_creds) 
{
I was having a hard time following this, so I tried to take a crack at reducing 
the number of branches, this is what I came up with (in particular the two 
boolean flags are removed, since I had trouble following those):

if (leader_master_rpc_primary_creds_) {
  // Piggy back on an existing connection with primary credentials.
  leader_master_callbacks_primary_creds_.push_back(cb);
} else if (leader_master_rpc_any_creds_ &&
   cred_policy == CredentialsPolicy::ANY_CREDENTIALS) {
  // Piggy back on an existing connection with any credentials.
  leader_master_callbacks_any_creds_.push_back(cb);
} else {
  // There are no existing connections, create a new one with the required 
policy.

  scoped_refptr rpc(
  new internal::ConnectToClusterRpc(
std::bind(::Data::ConnectedToClusterCb, this,
  std::placeholders::_1,
  std::placeholders::_2,
  std::placeholders::_3,
  creds_policy),
std::move(master_sockaddrs),
deadline,
client->default_rpc_timeout(),
messenger_,
creds_policy));

  switch (creds_policy) {
case PRIMARY_CREDENTIALS:
  leader_master_rpc_primary_creds_.reset(rpc.get());
  leader_master_callbacks_primary_creds_.push_back(cb);
  break;
case ANY_CREDENTIALS:
  leader_master_rpc_any_creds_.reset(rpc.get());
  leader_master_callbacks_any_creds_.push_back(cb);
  break;
  }
  l.unlock();
  rpc->SendRpc();
}

hopefully that's simpler?


Line 761: scoped_refptr& rpc(
This statement confusing.  I think it's copying the scoped_refptr to an unnamed 
stack allocated variable, and then taking a reference to that, but I'm not 
really sure.  I think what you want is 

scoped_refptr& rpc = need_new_rpc_primary_creds 
? leader_master_rpc_primary_creds_ : leader_master_rpc_any_creds_;

I really think we should forbid taking references to unbound variables, the 
lifetime semantics are just too confusing, and there's never a good reason to 
do it.


-- 
To view, visit http://gerrit.cloudera.org:8080/6648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] [c++ client] re-acquire authn token if expired

2017-05-15 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: [c++ client] re-acquire authn token if expired
..


Patch Set 16:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

PS16, Line 731: ANY_BUT_AUTHN_OTKEN
> this isn't the current name (though I actually like something like you have
You are right -- that's the old name (with an additional typo).  Now it's 
PRIMARY_CREDENTIALS.  I'll update this.

The new naming looks more generic.


http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.h
File src/kudu/client/client-internal.h:

PS16, Line 237:   scoped_refptr 
leader_master_rpc_any_creds_;
  :   std::vector 
leader_master_callbacks_any_creds_;
  :   scoped_refptr 
leader_master_rpc_primary_creds_;
  :   std::vector 
leader_master_callbacks_primary_creds_;
> is it really necessary to split the callbacks? eg if you try to request a C
Yes, the idea was to not wait for that re-connection again in that rare case.


-- 
To view, visit http://gerrit.cloudera.org:8080/6648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2001 Add UNDO size to tablet on-disk size

2017-05-15 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-2001 Add UNDO size to tablet on-disk size
..


Patch Set 5: Code-Review-1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6850/5//COMMIT_MSG
Commit Message:

Line 12: DeltaTracker::EstimateOnDiskSizeForCompaction() that
this got renamed in the latest rev?

nit: can you re-wrap this paragraph? seems like it's wrapped really small


http://gerrit.cloudera.org:8080/#/c/6850/5/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

Line 692:   return base_data_->EstimateOnDiskSize() + 
delta_tracker_->EstimateOnDiskSize();
since this now includes the UNDO deltas, whereas before it didn't, will this 
have a potential effect on compaction policy?

We need to be quite careful with such a change to make sure it doesn't have 
unintended consequences. (eg what if a DRS including its undos is now larger 
than the whole compaction budget, but those UNDOs are non-GCable?)


-- 
To view, visit http://gerrit.cloudera.org:8080/6850
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I59001adadb9a768a464e7b2cf0f0a5df0ef5393a
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1192 Periodically flush glog buffers from a thread

2017-05-15 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1192 Periodically flush glog buffers from a thread
..


Patch Set 5:

(3 comments)

Can you also add something like:
  google::SetCommandLineOptionWithMode("logbufsecs", "5",
   
google::FlagSettingMode::SET_FLAGS_DEFAULT);

in ParseCommandLineFlags() in flags.cc? I think changing our default to 5 secs 
instead of 30 is a good idea with this change, so that the messages are flushed 
a bit quicker.

http://gerrit.cloudera.org:8080/#/c/6853/5/src/kudu/util/async_logger.cc
File src/kudu/util/async_logger.cc:

PS5, Line 120: regardless there
"regardless whether there is"


http://gerrit.cloudera.org:8080/#/c/6853/5/src/kudu/util/logging-test.cc
File src/kudu/util/logging-test.cc:

Line 175:   // wait a little more than one wake up cycle
nit: capitalize sentence and end in '.'
nit: "wake-up"


PS5, Line 180: flush
nit: flushed


-- 
To view, visit http://gerrit.cloudera.org:8080/6853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4c6d440e9259efcf222530f13137f7de5bf00fc
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: William Li 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: William Li 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1549: compact LBM container metadata files at startup

2017-05-15 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged.

Change subject: KUDU-1549: compact LBM container metadata files at startup
..


KUDU-1549: compact LBM container metadata files at startup

Here's another patch to speed up LBM startup by reducing the amount of
metadata processed from disk. The idea is to find metadata files with lots
of "dead" blocks (i.e. CREATE + DELETE pairs) and to "compact" them by
rewriting the files without the dead blocks' records. The set of containers
to compact is determined by the ratio of live blocks to total blocks, and
there's a new gflag to configure that.

To make this work, I had to adjust the accounting in BlockCreated() yet
again. The new approach preserves next_block_offset_, but uses
fs_aligned_length() directly for byte-related stats. Without this change,
the aligned container stats (total_bytes and live_bytes_aligned) are
completely out of whack in a container whose metadata was compacted.

Like the dead container deletion patch, this one is quick and dirty in that
the new logic is unsuitable for real-time compaction. Also like that patch,
compaction in real-time would require non-trivial synchronization changes.

Testing is done in several places:
- BlockManagerStressTest: a "real" workload that'll compact some containers
  with additional inconsistencies injected by the LBMCorruptor.
- BlockManagerTest.TestMetadataOkayDespiteFailedWrites: filesystem errors
  are injected into metadata compaction.
- LogBlockManagerTest.TestCompactFullContainerMetadataAtStartup: explicit
  test for metadata compaction (and for the stat accounting fix).

Change-Id: I981f7d9e7eb96fb40cef30ad96c5960b72d07756
Reviewed-on: http://gerrit.cloudera.org:8080/6826
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon 
---
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/fs_report.cc
M src/kudu/fs/fs_report.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/util/env_posix.cc
8 files changed, 331 insertions(+), 44 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6826
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I981f7d9e7eb96fb40cef30ad96c5960b72d07756
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] log block manager: reopen container metadata writers at the OS level

2017-05-15 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged.

Change subject: log block manager: reopen container metadata writers at the OS 
level
..


log block manager: reopen container metadata writers at the OS level

Another patch I'm working on compacts LBM metadata files at startup by
writing compacted metadata into a temporary file, then overwriting the
existing metadata file with it. For this to work properly, "reopening" a
metadata file should actually reopen the file on the filesystem. This patch
does just that.

There should be no impact on any existing code; the reopening done after
truncating a partial record from a metadata file is just as happy if done at
the logical level (before) as at the physical level (now).

By popular demand, I've attached below a long explanation of Kudu's approach
to object initialization and thread safety that came up during code review.

Because we don't use exceptions within Kudu, the initialization of an object
is typically done in two phases:
 1. Constructor, for operations that cannot fail.
 2. Initializer function, for operations that may fail.

In most objects, the initializer function is named Init(), returns a Status
type, and needn't be thread safe because it is called right after the
constructor, before the object is "made public" to other threads.

Some objects offer two initializer functions: one to initialize a brand new
object and one to initialize an object with existing state. The idea is that
callers always use the constructor and then call one of the two initializer
functions, depending on their use case. WritablePBContainerFile is one such
object; Init() is for a brand new container file, and Open() is for a
container file that already exists on disk.

Previously, Reopen() served double duty: it was both the initializer
function for files with existing state, and it was an arbitrary function
(like Append()) that could be called at any time and by any thread. That
second responsibility is why it was thread safe. Now, OpenExisting() is just
the equivalent of CreateNew(), and so it makes sense for it to be just as
thread unsafe as CreateNew() is.

To be clear, the LBM was the only "arbitrary" caller to Reopen(), and the
LBM didn't _need_ it to be thread safe. But, I think it was net less
confusing for Reopen() to be thread safe (and thus equivalent in semantics
to Append()) than for it to be thread unsafe (and thus exceptional when
compared to Append(), Sync(), etc.).

Change-Id: I6029d499422f5a2db036d796c7e208f2af71c394
Reviewed-on: http://gerrit.cloudera.org:8080/6825
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves 
---
M src/kudu/fs/log_block_manager-test-util.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/util/pb_util-test.cc
M src/kudu/util/pb_util.cc
M src/kudu/util/pb_util.h
6 files changed, 33 insertions(+), 40 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Todd Lipcon: Looks good to me, but someone else must approve
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6825
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I6029d499422f5a2db036d796c7e208f2af71c394
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1549: compact LBM container metadata files at startup

2017-05-15 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1549: compact LBM container metadata files at startup
..


Patch Set 5: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6826/2/src/kudu/fs/fs_report.h
File src/kudu/fs/fs_report.h:

Line 127: // Note: the BlockRecordPB is passed by pointer so that it can be 
swapped
> I'm pining away for protobuf move constructors.
yea, we should add -DPROTO_EXPERIMENTAL_ENABLE_MOVE to our protobuf build now 
that we are on protobuf 3.3. I'll do that in a separate patch while I'm 
thinking of it.


http://gerrit.cloudera.org:8080/#/c/6826/2/src/kudu/fs/log_block_manager.h
File src/kudu/fs/log_block_manager.h:

PS2, Line 295:   //files compacted.
 :   //
 :  
> Correct: the LBM maintains a global map of LogBlocks, each of which is a li
ok, not too worried then, will cross fingers and we can address if it becomes 
problematic for some reason


-- 
To view, visit http://gerrit.cloudera.org:8080/6826
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I981f7d9e7eb96fb40cef30ad96c5960b72d07756
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1952 Remove round-robin for block placement

2017-05-15 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1952 Remove round-robin for block placement
..


Patch Set 30: Code-Review+1 Verified+1

Test failure is a known flake.

Todd/David, I think this is good to go as-is (and fixes KUDU-1952). Could one 
of you take another look?

-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 30
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-1549: compact LBM container metadata files at startup

2017-05-15 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1549: compact LBM container metadata files at startup
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6826/2/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS2, Line 87: DEFINE_double(log_container_live_metadata_before_compact_ratio, 
0.10,
:   "Desired ratio of live block metadata in log 
containers. If a "
:   "container's live to total block ratio dips below 
this value, "
:   "the container's metadata file will be compacted at 
startup.");
> Thanks for the research.
Yea, I think it would be good to capture the ideas above in a new JIRA. 
Definitely don't want to derail this work.


-- 
To view, visit http://gerrit.cloudera.org:8080/6826
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I981f7d9e7eb96fb40cef30ad96c5960b72d07756
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] [rpc] introduce per-RPC credentials policy

2017-05-15 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: [rpc] introduce per-RPC credentials policy
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/negotiation.cc
File src/kudu/rpc/negotiation.cc:

PS5, Line 163: SatisfiesCredentialsPolicy
> FWIW I like the PRIMARY name.  This logic may end up changing if we go the 
yea, I think my main complaint about this is that we're calling 
SatisfiesCredentialsPolicy on a connection that has not yet negotiated, which 
seems a bit odd. To me the name of the method implies that we're checking 
whether the negotiated authentication of a connection satistifes some policy, 
but here we are deciding _how_ to negotiate based on its configured policy.


-- 
To view, visit http://gerrit.cloudera.org:8080/6875
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] [c++ client] re-acquire authn token if expired

2017-05-15 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: [c++ client] re-acquire authn token if expired
..


Patch Set 16:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/6648/16/docs/known_issues.adoc
File docs/known_issues.adoc:

PS16, Line 172: one week is now supported only for Kudu C++ client
nit: I think for the context of limitations, it's better to be explicit about 
the negative cases. i.e "is not supported by the Java client" or "is only 
supported by the C++ client but not by the Java client" or somesuch.


http://gerrit.cloudera.org:8080/#/c/6648/16/docs/security.adoc
File docs/security.adoc:

PS16, Line 221: Java
nit: insert "The"


PS16, Line 223: yet 
nit: we decided at some point to not use these time-related words in the docs, 
since they imply some future timeline, but we aren't actually talking about 
that timeline.


PS16, Line 223:  C++
nit: the


PS16, Line 224: client
clients


http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

PS16, Line 415: existring
typo


http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

PS16, Line 731: ANY_BUT_AUTHN_OTKEN
this isn't the current name (though I actually like something like you have 
here better!)


http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.h
File src/kudu/client/client-internal.h:

PS16, Line 165:   rpc::CredentialsPolicy creds_policy =
  :   rpc::CredentialsPolicy::ANY_CREDENTIALS);
maybe it's just me, but I find this wrapping a bit odd. Maybe wrap all the 
params at the '(' instead so that the default param ends up further-indented 
from its param name?


PS16, Line 237:   scoped_refptr 
leader_master_rpc_any_creds_;
  :   std::vector 
leader_master_callbacks_any_creds_;
  :   scoped_refptr 
leader_master_rpc_primary_creds_;
  :   std::vector 
leader_master_callbacks_primary_creds_;
is it really necessary to split the callbacks? eg if you try to request a 
ConnectToCluster with primary-creds-only policy, but one is already running 
without that policy, the worst that happens is that the callback happens and, 
on your retry, you still don't have active tokens, and you have to reconnect 
again, right? i.e this could only be transient and only be transient once in a 
rare case


-- 
To view, visit http://gerrit.cloudera.org:8080/6648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] [test-workload] setter for admin operation timeout

2017-05-15 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged.

Change subject: [test-workload] setter for admin operation timeout
..


[test-workload] setter for admin operation timeout

Added a setter for client's default admin operation timeout.  It sets
corresponding property for the aggregated client builder.

Change-Id: I3b972c3edcc01b1a3c074958a5f2f0b3f64f608c
Reviewed-on: http://gerrit.cloudera.org:8080/6887
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert 
---
M src/kudu/integration-tests/test_workload.h
1 file changed, 4 insertions(+), 0 deletions(-)

Approvals:
  Dan Burkert: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6887
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I3b972c3edcc01b1a3c074958a5f2f0b3f64f608c
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] [tests] operations timeout for ClusterVerifier

2017-05-15 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged.

Change subject: [tests] operations timeout for ClusterVerifier
..


[tests] operations timeout for ClusterVerifier

Change-Id: I90828d0bd91809fe541782f4e5ded06cbae51873
Reviewed-on: http://gerrit.cloudera.org:8080/6888
Tested-by: Alexey Serbin 
Reviewed-by: Dan Burkert 
---
M src/kudu/integration-tests/cluster_verifier.cc
M src/kudu/integration-tests/cluster_verifier.h
2 files changed, 21 insertions(+), 10 deletions(-)

Approvals:
  Dan Burkert: Looks good to me, approved
  Alexey Serbin: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6888
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I90828d0bd91809fe541782f4e5ded06cbae51873
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 


[kudu-CR] [rpc] introduce per-RPC credentials policy

2017-05-15 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: [rpc] introduce per-RPC credentials policy
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/negotiation.cc
File src/kudu/rpc/negotiation.cc:

PS5, Line 163: SatisfiesCredentialsPolicy
> this seems a little hard to follow. This makes it sound like "Are primary c
FWIW I like the PRIMARY name.  This logic may end up changing if we go the 
route of storing the authentication type in the connection instead of the 
policy, in which case I think the policy will get passed in directly here, 
which would make the check more like your equality check.


-- 
To view, visit http://gerrit.cloudera.org:8080/6875
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs

2017-05-15 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1875: Refuse unauthenticated connections from publicly 
routable IP addrs
..


Patch Set 34:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/6514/34/src/kudu/rpc/server_negotiation.cc
File src/kudu/rpc/server_negotiation.cc:

PS34, Line 66: allows
nit: "to allow"


Line 95: vector* g_local_networks = new vector();
I think it's better to initialize this to nullptr, and then set it to the 
vector only after parsing/initializing the value. That way if we have a bug 
where we accidentally use it before it's been properly set, we'd get a crash 
instead of silent wrong results.

Also, please wrap this in an anonymous namespace so that the symbol doesn't get 
exported to other translation units


PS34, Line 209:   NegotiatePB request;
  :   RETURN_NOT_OK(RecvNegotiatePB(, _buf));
can you add a comment why we need to consume this? not quite following, since 
'request' is unused


PS34, Line 212:  or unauthenticated
this error can be more specific to unencrypted, right?


PS34, Line 729: unencrypted or unauthenticated
this one can be specific to unauthenticated


Line 914:   std::call_once(once, [] {
the initialization of g_local_networks could be moved inside the 'if' statement 
below on line 925. this would help us provide a workaround in the case that, 
for some reason, GetLocalNetworks crashes, we can configure explicit networks 
and avoid calling it at all.


PS34, Line 915: GetLocalNetworks
does this need WARN_NOT_OK or something?


PS34, Line 919: KUDU_C
can just be CHECK_OK (the 'KUDU_...' variants are only used in the context of 
client headers)


PS34, Line 926: = 
can use |= here instead of the || on line 928


http://gerrit.cloudera.org:8080/#/c/6514/34/src/kudu/util/net/net_util.h
File src/kudu/util/net/net_util.h:

PS34, Line 95: is 
nit: are


PS34, Line 113: vector Network
nit: "vector of"


-- 
To view, visit http://gerrit.cloudera.org:8080/6514
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e
Gerrit-PatchSet: 34
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1826 add propagated timestamp to sync client

2017-05-15 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6798

to look at the new patch set (#4).

Change subject: KUDU-1826 add propagated timestamp to sync client
..

KUDU-1826 add propagated timestamp to sync client

Updating and getting the propagated timestamp only exist as functions
of the AsyncKuduClient, but these functions are still useful to the
KuduClient.

This patch exposes AsyncKuduClient's updateLastPropagatedTimestamp()
and getLastPropagatedTimestamp() to the KuduClient. The static value
NO_TIMESTAMP is also added to the KuduClient so timestamps returned
by KuduClient.getLastPropagatedTimestamp() can be compared with
KuduClient.NO_TIMESTAMP.

Change-Id: Ibbfeb9526423f5acd0ef6482869d36cdd146e921
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
3 files changed, 51 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/98/6798/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibbfeb9526423f5acd0ef6482869d36cdd146e921
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] WIP KUDU-1826 add propagated timestamp to sync client

2017-05-15 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6798

to look at the new patch set (#3).

Change subject: WIP KUDU-1826 add propagated timestamp to sync client
..

WIP KUDU-1826 add propagated timestamp to sync client

Updating and getting the propagated timestamp only exist as functions
of the AsyncKuduClient, but these functions are still useful to the
KuduClient.

This patch exposes AsyncKuduClient's updateLastPropagatedTimestamp()
and getLastPropagatedTimestamp() to the KuduClient. The static value
NO_TIMESTAMP is also added to the KuduClient so timestamps returned
by KuduClient.getLastPropagatedTimestamp() can be compared with
KuduClient.NO_TIMESTAMP.

Change-Id: Ibbfeb9526423f5acd0ef6482869d36cdd146e921
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
3 files changed, 51 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/98/6798/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibbfeb9526423f5acd0ef6482869d36cdd146e921
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-1860: ksck doesn't identify tablets that are evicted but still in config

2017-05-15 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change.

Change subject: KUDU-1860: ksck doesn't identify tablets that are evicted but 
still in config
..


Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/6772/5/src/kudu/tools/ksck-test.cc
File src/kudu/tools/ksck-test.cc:

Line 70:   const Schema& schema,
> warning: parameter 'tablet_id' is unused [misc-unused-parameters]
Expected


Line 71:   const ChecksumOptions& options,
> warning: parameter 'schema' is unused [misc-unused-parameters]
Expected


Line 242:   Status RunKsck() {
> warning: non-const reference parameter 'replicas', make it const or use a p
Done


http://gerrit.cloudera.org:8080/#/c/6772/5/src/kudu/tools/ksck.cc
File src/kudu/tools/ksck.cc:

Line 798: } // namespace kudu
> error: cannot initialize a parameter of type 'FILE *' (aka '_IO_FILE *') wi
Done


Line 798: } // namespace kudu
> error: use of undeclared identifier 'setw'; did you mean 'getw'? [clang-dia
Done


Line 798: } // namespace kudu
> error: use of undeclared identifier 'setw'; did you mean 'getw'? [clang-dia
Done


http://gerrit.cloudera.org:8080/#/c/6772/5/src/kudu/tools/ksck.h
File src/kudu/tools/ksck.h:

Line 104: return term_;
> warning: value argument 'opid_index' can be moved to avoid copy [misc-move-
Done


http://gerrit.cloudera.org:8080/#/c/6772/5/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

Line 115: using kudu::consensus::LeaderStepDownResponsePB;
> warning: using decl 'RaftConfigPB' is unused [misc-unused-using-decls]
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] [tests] operations timeout for ClusterVerifier

2017-05-15 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: [tests] operations timeout for ClusterVerifier
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6888
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I90828d0bd91809fe541782f4e5ded06cbae51873
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-HasComments: No


[kudu-CR] KUDU-1860: ksck doesn't identify tablets that are evicted but still in config

2017-05-15 Thread Will Berkeley (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6772

to look at the new patch set (#6).

Change subject: KUDU-1860: ksck doesn't identify tablets that are evicted but 
still in config
..

KUDU-1860: ksck doesn't identify tablets that are evicted but still in config

This patch enhances ksck to gather consensus info from every tablet. It
compares this info with master and outputs the master's config and every
conflicting config, if there are any conflicts. To do this efficiently
it implements a new GetAllConsensusStates RPC that gathers info about
every replica's consensus state.

This will catch at least the two problems identified in KUDU-1860:
1. The leader has a pending config to remove a tablet, but it is not
committed so the master does not see this config. This can hide an
unhealthy tablet if, e.g., one pending config member is down and the
pending-to-be-kicked-out member is up, so 1/2 replicas are alive in
the leader's active config but the master thinks 2/3 are alive.
2. No replica is leader but the master believes there is a leader
because its cache is old and hasn't been updated.

Sample output showing #1:
https://gist.github.com/wdberkeley/f964439bb0c407d3769c2dd9a0959ca0

Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808
---
M src/kudu/consensus/consensus.proto
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck.h
M src/kudu/tools/ksck_remote.cc
M src/kudu/tools/ksck_remote.h
M src/kudu/tools/tool_action_cluster.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
9 files changed, 412 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/6772/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] [webui] Improvements for when there's many tablets & cleanup

2017-05-15 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: [webui] Improvements for when there's many tablets & cleanup
..


Patch Set 3:

OK well I'm in favor of 'write buffer memory usage'.  I just think it's really 
likely that people will look at the page, add up the amounts, and get confused 
when it's only 1/2 of the server memory usage.

-- 
To view, visit http://gerrit.cloudera.org:8080/6870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3904a4b0fbb6446615cd46c8a6f30f81c832c53
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: No


[kudu-CR] WIP Don't suicide on EIO

2017-05-15 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6773

to look at the new patch set (#11).

Change subject: WIP Don't suicide on EIO
..

WIP Don't suicide on EIO

Rather than suiciding when reaching an EIO, this patch adds a
mechanism that triggers error-handling in the form of a callback.
This handler is attached to the lowest non-env operations that may
result in EIO.
E.g. PosixRWFile::Write() is an env operation that may result in an
EIO. LogBlockContainer::WriteData()'s call to it must be wrapped in
the new error-handling macro KUDU_RETURN_OR_HANDLE_EIO. Thus, all
direct readers/writers of files must now implement EIO-handling code
and wrap disk IO in KUDU_RETURN_OR_HANDLE_EIO.

Also included is a new tablet data state called TABLET_DATA_CORRUPTED
in which tablet data will not be deleted until explicitly requested
or during a tablet copy.

This patch is marked WIP, as the correct behavior after a disk fails
is not yet included. For now, upon failure, the block managers will
keep track of the dead disks and will mark the appropriate tablet
peers as failed. Additionally, the error handling has been moved
between layers offline, so more cleanup is being done to align with
the above description.

Change-Id: I4c221a5ea97c7b7ec5f8d395b5527dc16e63f828
---
M src/kudu/consensus/consensus_peers.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
A src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/util/status.h
22 files changed, 354 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/73/6773/11
-- 
To view, visit http://gerrit.cloudera.org:8080/6773
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c221a5ea97c7b7ec5f8d395b5527dc16e63f828
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] [test-workload] setter for admin operation timeout

2017-05-15 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: [test-workload] setter for admin operation timeout
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6887
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3b972c3edcc01b1a3c074958a5f2f0b3f64f608c
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] WIP Allow external miniclusters to use many data dirs

2017-05-15 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6845

to look at the new patch set (#7).

Change subject: WIP Allow external miniclusters to use many data dirs
..

WIP Allow external miniclusters to use many data dirs

In order to test different disk configurations, it is becoming
increasingly important to have end-to-end testing with nodes backed
by multiple directories.

This patch adds this functionality to the ExternalMiniCluster class,
which now supports the 'num_dirs_per_tserver' option, and the
ExternalDaemon class, which now supports a list of data dirs instead
of a single data dir.

The ExternalMiniCluster will create the ExternalDaemon with a list of
data dirs that reflects 'num_dirs_per_tserver' by using the original
data dir name as a prefix and appending an integer (e.g. '/dir_name'
with 'num_dirs_per_tserver = 2' will result in '/dir_name-0' and
'/dir_name-1').

A new test called disk-failure-itest is added that exercises this.

WIP: EIO-handling patch must be completed for further testing to make
sense.

Change-Id: Id2f5def6980ad394c8558ad97ba830f1b0257332
---
M src/kudu/fs/file_block_manager.cc
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/disk-failure-itest.cc
M src/kudu/integration-tests/external_mini_cluster.cc
M src/kudu/integration-tests/external_mini_cluster.h
M src/kudu/integration-tests/ts_itest-base.h
M src/kudu/tablet/tablet_replica_mm_ops.cc
M src/kudu/util/path_util.cc
M src/kudu/util/path_util.h
M src/kudu/util/status.h
10 files changed, 282 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/45/6845/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6845
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id2f5def6980ad394c8558ad97ba830f1b0257332
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] Add fault injection of EIOs

2017-05-15 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6881

to look at the new patch set (#3).

Change subject: Add fault injection of EIOs
..

Add fault injection of EIOs

This patch adds the functionality to inject EIOs in env_posix based on
a flag-specified ECMAScript regex determining which paths should fail
and a flag-specified double indicating how often these failures should
occur.

Change-Id: I5190d5e7f296cf27b65f1ecd2c16f3315cc46a39
---
M src/kudu/util/env_posix.cc
M src/kudu/util/fault_injection.cc
M src/kudu/util/fault_injection.h
3 files changed, 93 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6881/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6881
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5190d5e7f296cf27b65f1ecd2c16f3315cc46a39
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] [webui] Improvements for when there's many tablets & cleanup

2017-05-15 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: [webui] Improvements for when there's many tablets & cleanup
..


Patch Set 3:

> Adar, do you have any opinion on 'write buffer memory usage' vs
 > plain 'memory usage'?

Not beyond bikeshedding, no. :)

-- 
To view, visit http://gerrit.cloudera.org:8080/6870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3904a4b0fbb6446615cd46c8a6f30f81c832c53
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: No


[kudu-CR] [webui] Improvements for when there's many tablets & cleanup

2017-05-15 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: [webui] Improvements for when there's many tablets & cleanup
..


Patch Set 3:

Adar, do you have any opinion on 'write buffer memory usage' vs plain 'memory 
usage'?

-- 
To view, visit http://gerrit.cloudera.org:8080/6870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3904a4b0fbb6446615cd46c8a6f30f81c832c53
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: No


[kudu-CR] [rpc] introduce per-RPC credentials policy

2017-05-15 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: [rpc] introduce per-RPC credentials policy
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/negotiation.cc
File src/kudu/rpc/negotiation.cc:

Line 215:   
conn->set_remote_features(client_negotiation.take_server_features());
I think we're being pessimistic here in a very certain case:

The connection is created with CredentialsPolicy::ANY, but there currently is 
no token.  In reality, that connection will end up using Kerberos if it's 
connecting to secure server.  In that case, we could 'upgrade' ANY to PRIMARY.

This could be done here by directly setting it on the connection, or you could 
store the negotiated AuthenticationType into the Connection instead of storing 
the CredentialsPolicy.  I think this may be a little cleaner in the long run.


http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/reactor.cc
File src/kudu/rpc/reactor.cc:

PS5, Line 368:   for (auto it = range.first; it != range.second; ++it) {
 : const auto& c = it->second;
 : if (c->scheduled_for_shutdown()) {
 :   // This connection is not to be used for any new calls.
 :   continue;
 : }
 : if (!c->SatisfiesCredentialsPolicy(cred_policy)) {
 :   // Do not use a connection with a non-compliant 
credentials policy.
 :   // Instead, open a new one, while marking the former as 
scheduled for
 :   // shutdown. This process converges: any connection that 
satisfies the
 :   // PRIMARY_CREDENTIALS policy automatically satisfies the 
ANY_CREDENTIALS
 :   // policy as well. The idea is to keep only one usable 
connection identified
 :   // by the specified 'conn_id'.
 :   c->set_scheduled_for_shutdown();
 :   continue;
 : }
 : 
 : // If the test-only 'one-connection-per-RPC' mode is 
enabled, connections
 : // are re-established at every RPC call.
 : if (PREDICT_FALSE(FLAGS_rpc_reopen_outbound_connections)) {
 :   c->set_scheduled_for_shutdown();
 :   continue;
 : }
 :   }
 :   // Second pass: shutdown idle connections to the target 
destination if they
 :   // have been marked. Non-idle connections will be taken care 
of later by the
 :   // idle connection scanner.
 :   scoped_refptr found_conn;
 :   for (auto it = range.first; it != range.second;) {
 : const auto& c = it->second;
 : if (c->scheduled_for_shutdown()) {
 :   if (c->Idle()) {
 : DCHECK_EQ(Connection::CLIENT, c->direction());
 : c->Shutdown(Status::NetworkError("connection is closed 
due to non-reuse policy"));
 : it = client_conns_.erase(it);
 : continue;
 :   }
 : } else {
 :   DCHECK(!found_conn);
 :   found_conn = c;
 :   // Appropriate connection is found; continue further to 
take care of the
 :   // rest of connections marked for shutdown by the first 
pass above.
 : }
 : ++it;
 :   }
I think these loops can be simplified by combining them:

scoped_refptr found_conn;
for (auto it = range.first; it != range.second;) {
  const auto& c = it->second;

  if (c->scheduled_for_shutdown() ||
  !c->SatisfiesCredentialsPolicy(cred_policy) ||
  PREDICT_FALSE(FLAGS_rpc_reopen_outbound_connections)) {

  // Shutdown idle connections. Non-idle connections will be taken care of
  // later by the idle connection scanner.
  if (c->Idle()) {
DCHECK_EQ(Connection::CLIENT, c->direction());
c->Shutdown(Status::NetworkError("connection is closed due to non-reuse 
policy"));
it = client_conns_.erase(it);
continue;
  }

  c->set_scheduled_for_shutdown();
  } else {
DCHECK(!found_conn);
found_conn = c;
  }
  ++it;
}


http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/rpc-test.cc
File src/kudu/rpc/rpc-test.cc:

PS5, Line 295: PRC
RPC


-- 
To view, visit http://gerrit.cloudera.org:8080/6875
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 

[kudu-CR] [webui] Improvements for when there's many tablets & cleanup

2017-05-15 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: [webui] Improvements for when there's many tablets & cleanup
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6870/3//COMMIT_MSG
Commit Message:

Line 29: Also I added a memory usage column to the detailed tablets
> That's definitely correct for the block cache, but I'll let Adar answer for
AFAIK WAL buffers aren't tracked by memtrackers. The log index cache is, but 
its tracker isn't parented to the tablet's tracker.


-- 
To view, visit http://gerrit.cloudera.org:8080/6870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3904a4b0fbb6446615cd46c8a6f30f81c832c53
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] [java client] removes duplicate checks at KuduScanToken

2017-05-15 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: [java client] removes duplicate checks at KuduScanToken
..


[java client] removes duplicate checks at KuduScanToken

KUDU-579 added fault tolerant scanner support at java
client. However,it missed TODO for KUDU-1040 and results
in duplicate code at KuduScanToken. This patch removes it.

Change-Id: I7e036d5fe333009a74294d27c54fe846cdc404e9
Reviewed-on: http://gerrit.cloudera.org:8080/6882
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon 
---
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
1 file changed, 0 insertions(+), 4 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6882
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I7e036d5fe333009a74294d27c54fe846cdc404e9
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [java client] removes duplicate checks at KuduScanToken

2017-05-15 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: [java client] removes duplicate checks at KuduScanToken
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6882
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7e036d5fe333009a74294d27c54fe846cdc404e9
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-1860: ksck doesn't identify tablets that are evicted but still in config

2017-05-15 Thread Will Berkeley (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6772

to look at the new patch set (#5).

Change subject: KUDU-1860: ksck doesn't identify tablets that are evicted but 
still in config
..

KUDU-1860: ksck doesn't identify tablets that are evicted but still in config

This patch enhances ksck to gather consensus info from every tablet. It
compares this info with master and outputs the master's config and every
conflicting config, if there are any conflicts. To do this efficiently
it implements a new GetAllConsensusStates RPC that gathers info about
every replica's consensus state.

This will catch at least the two problems identified in KUDU-1860:
1. The leader has a pending config to remove a tablet, but it is not
committed so the master does not see this config. This can hide an
unhealthy tablet if, e.g., one pending config member is down and the
pending-to-be-kicked-out member is up, so 1/2 replicas are alive in
the leader's active config but the master thinks 2/3 are alive.
2. No replica is leader but the master believes there is a leader
because its cache is old and hasn't been updated.

Sample output showing #1:
https://gist.github.com/wdberkeley/f964439bb0c407d3769c2dd9a0959ca0

Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808
---
M src/kudu/consensus/consensus.proto
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck.h
M src/kudu/tools/ksck_remote.cc
M src/kudu/tools/ksck_remote.h
M src/kudu/tools/tool_action_cluster.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
9 files changed, 408 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/6772/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] spark: add support for fault tolerant scanner

2017-05-15 Thread Dan Burkert (Code Review)
Dan Burkert has submitted this change and it was merged.

Change subject: spark: add support for fault tolerant scanner
..


spark: add support for fault tolerant scanner

This adds support to use fault tolerant scanner for spark job.
By default non fault tolerant scanner is used. To turn on fault
tolerant scanner, use job config: 'kudu.faultTolerantScan'.

Change-Id: I3f3192025ca5d74197600480fd3d040d70b4bbc2
Reviewed-on: http://gerrit.cloudera.org:8080/6782
Reviewed-by: Dan Burkert 
Tested-by: Kudu Jenkins
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala
M 
java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
4 files changed, 37 insertions(+), 8 deletions(-)

Approvals:
  Dan Burkert: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6782
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I3f3192025ca5d74197600480fd3d040d70b4bbc2
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] [webui] Improvements for when there's many tablets & cleanup

2017-05-15 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: [webui] Improvements for when there's many tablets & cleanup
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6870/3//COMMIT_MSG
Commit Message:

Line 13: 2. /tablets has a toggle-collapse detailed table of tablets.
> There's 4 possible elements in the tablets section
I see, thanks for the clarification.


Line 29: Also I added a memory usage column to the detailed tablets
> It's not too jargony if it's correct and avoids the number being misleading
That's definitely correct for the block cache, but I'll let Adar answer for WAL 
usage since I'm not sure.


-- 
To view, visit http://gerrit.cloudera.org:8080/6870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3904a4b0fbb6446615cd46c8a6f30f81c832c53
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1952 Remove round-robin for block placement

2017-05-15 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: KUDU-1952 Remove round-robin for block placement
..


Patch Set 29:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6636/24/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

PS24, Line 412: other
> nit: in other places in this file this kind of statement is written as 'if 
Done


PS24, Line 529:   data_dir_set.insert(group_indices.begin(), 
group_indices.end());
  :   for (auto& e : data_dir_by_uuid_idx_) {
  :
> nit: since it's C++11, consider
Neat! Done


PS24, Line 532: URN_
> nit: auto& ?
Done


http://gerrit.cloudera.org:8080/#/c/6636/24/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

PS24, Line 20: #include 
> nit: per code style this should be placed after std and before kudu headers
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 29
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1952 Remove round-robin for block placement

2017-05-15 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6636

to look at the new patch set (#29).

Change subject: KUDU-1952 Remove round-robin for block placement
..

KUDU-1952 Remove round-robin for block placement

This is the first of a multi-patch patchset to mitigate the effects of
single-disk failure. Throughout the code, the term "DataDir" refers to
a data directory, which is often mounted on a distinct disk. Thus,
"disks" and "data directories" will be used interchangeably.

This patch adds a mapping from tablet to a set of disks and uses it to
replace the existing round-robin placement of blocks. Tablets are
mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks
are placed randomly in directories within each tablet's DataDirGroup.

Tablet-to-group mappings are generated and stored as metadata upon
tablet creation, or upon tablet replacement during a tablet copy.
During group creation, disks are added to groups by randomly selecting
two available directories and selecting the one with fewer tablets on
it. This avoids pigeonholing new tablets to disks with relatively few
tablets, while still trending towards filling underloaded disks.

Groups are maintained when restarting the server, as they are flushed
with metadata, and are deleted upon tablet deletion.  When loading
tablet data from a previous version of Kudu, the tablet's metadata
will not have a DataDirGroup. One will be generated containing all
data directories, as the tablet's data may already be spread across
any number of disks.

As this patch only addresses block placement, it does not itself
mitigate single-disk failure. Given this, and given the tradeoff
between I/O and failure disk-failure tolerance, the default behavior
will be to spread tablet data across all disks.

Testing is done at the block manager level in block_manager-test and
log_block_manager-test, as well as in the new data_dirs-test.

A design doc can be found here:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
---
M src/kudu/cfile/bloomfile-test-base.h
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
A src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs.proto
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/delta_compaction-test.cc
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/deltamemstore-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
38 files changed, 991 insertions(+), 190 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/29
-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 29
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1952 Remove round-robin for block placement

2017-05-15 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-1952 Remove round-robin for block placement
..


Patch Set 26:

(4 comments)

Just skimmed through.  I need to get better understanding of the background to 
get more thoughtful feedback -- I hope I'll do one more pass this week.

http://gerrit.cloudera.org:8080/#/c/6636/24/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

PS24, Line 412: other
nit: in other places in this file this kind of statement is written as 'if 
(other != nullptr)'.  Consider unifying this among those places.


PS24, Line 529:   for (uint16_t uuid : group_indices) {
  : data_dir_set.insert(uuid);
  :   }
nit: since it's C++11, consider

data_dir_set.insert(group_indices.begin(), group_indices.end());


PS24, Line 532: auto
nit: auto& ?


http://gerrit.cloudera.org:8080/#/c/6636/24/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

PS24, Line 20: #include 
nit: per code style this should be placed after std and before kudu headers; 
like it is in data_dirs.cc

https://google.github.io/styleguide/cppguide.html#Names_and_Order_of_Includes


-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 26
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

2017-05-15 Thread eric-maynard (Code Review)
eric-maynard has posted comments on this change.

Change subject: KUDU-1422 [java client] modifiable error collector capacity
..


Patch Set 5:

The build error seems unrelated (?):

09:29:53   Error: chose Boost outside
09:29:53   
/home/jenkins-slave/workspace/kudu-master/0/thirdparty/installed/common:
09:29:53   
/home/jenkins-slave/workspace/kudu-master/0/thirdparty/installed/uninstrumented/include
09:29:53 
09:29:53 
09:29:53 -- Configuring incomplete, errors occurred!

-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: eric-maynard 
Gerrit-HasComments: No


[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

2017-05-15 Thread eric-maynard (Code Review)
eric-maynard has posted comments on this change.

Change subject: KUDU-1422 [java client] modifiable error collector capacity
..


Patch Set 5:

Added trimming when error queue capacity is decreased

-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: eric-maynard 
Gerrit-HasComments: No


[kudu-CR] KUDU-1952 Remove round-robin for block placement

2017-05-15 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6636

to look at the new patch set (#28).

Change subject: KUDU-1952 Remove round-robin for block placement
..

KUDU-1952 Remove round-robin for block placement

This is the first of a multi-patch patchset to mitigate the effects of
single-disk failure. Throughout the code, the term "DataDir" refers to
a data directory, which is often mounted on a distinct disk. Thus,
"disks" and "data directories" will be used interchangeably.

This patch adds a mapping from tablet to a set of disks and uses it to
replace the existing round-robin placement of blocks. Tablets are
mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks
are placed randomly in directories within each tablet's DataDirGroup.

Tablet-to-group mappings are generated and stored as metadata upon
tablet creation, or upon tablet replacement during a tablet copy.
During group creation, disks are added to groups by randomly selecting
two available directories and selecting the one with fewer tablets on
it. This avoids pigeonholing new tablets to disks with relatively few
tablets, while still trending towards filling underloaded disks.

Groups are maintained when restarting the server, as they are flushed
with metadata, and are deleted upon tablet deletion.  When loading
tablet data from a previous version of Kudu, the tablet's metadata
will not have a DataDirGroup. One will be generated containing all
data directories, as the tablet's data may already be spread across
any number of disks.

As this patch only addresses block placement, it does not itself
mitigate single-disk failure. Given this, and given the tradeoff
between I/O and failure disk-failure tolerance, the default behavior
will be to spread tablet data across all disks.

Testing is done at the block manager level in block_manager-test and
log_block_manager-test, as well as in the new data_dirs-test.

A design doc can be found here:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
---
M src/kudu/cfile/bloomfile-test-base.h
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
A src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs.proto
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/delta_compaction-test.cc
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/deltamemstore-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
38 files changed, 992 insertions(+), 190 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/28
-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 28
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [webui] Improvements for when there's many tablets & cleanup

2017-05-15 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change.

Change subject: [webui] Improvements for when there's many tablets & cleanup
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6870/3//COMMIT_MSG
Commit Message:

Line 13: 2. /tablets has a toggle-collapse detailed table of tablets.
> I'm not sure this is an improvement, since there is no content under the ta
There's 4 possible elements in the tablets section
1. summary of live tablets
2. table of live tablets
3. summary of tombstoned tablet
4. table of tombstoned tablets
so having the tablets tables collapsed shows the 1-2 summaries. Previously, if 
there were tombstoned tablets, info about them would be hidden below the live 
tablets table.


Line 29: Also I added a memory usage column to the detailed tablets
> I'm concerned it may be misleading to call this 'memory usage', since it on
It's not too jargony if it's correct and avoids the number being misleading. 
But I'm not familiar with how the mem trackers are working-- is it only 
tracking the write buffers because block caches, logs, etc are owned by 
different trackers?


-- 
To view, visit http://gerrit.cloudera.org:8080/6870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3904a4b0fbb6446615cd46c8a6f30f81c832c53
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs

2017-05-15 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1875: Refuse unauthenticated connections from publicly 
routable IP addrs
..


Patch Set 34:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/6514/32/src/kudu/rpc/server_negotiation.cc
File src/kudu/rpc/server_negotiation.cc:

Line 60: DEFINE_string(trusted_subnets,
> We discussed offline about this, just going to write up what we decided:
Done


PS32, Line 65: bnets of all local network interfaces w
> I think this sentence will read better as "Set it to '0.0.0.0/0' to allow u
Done


Line 76: 
> It's typical to make the iteration variable a 'const auto&', which means th
Done


PS32, Line 212: rypted
> nit: publicly
Done


Line 728: if (!IsTrustedConnection(addr)) {
> Same here
Done


Line 916:   });
> You can probably just wrap this in a KUDU_CHECK_OK call.  It shouldn't ever
Done


http://gerrit.cloudera.org:8080/#/c/6514/32/src/kudu/util/net/net_util.h
File src/kudu/util/net/net_util.h:

Line 96: // (same as network byte order).
> add period to end of sentence.
Done


http://gerrit.cloudera.org:8080/#/c/6514/32/src/kudu/util/net/sockaddr.h
File src/kudu/util/net/sockaddr.h:

Line 73:   // the default auto-generated copy constructor is fine here
> What do you think about defining this operator on Network instead of Sockad
Done


Line 76: };
> I'm not sure it's worth defining this, since it's easily done with std::any
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6514
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e
Gerrit-PatchSet: 34
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] [test-workload] setter for admin operation timeout

2017-05-15 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6887

Change subject: [test-workload] setter for admin operation timeout
..

[test-workload] setter for admin operation timeout

Added a setter for client's default admin operation timeout.  It sets
corresponding property for the aggregated client builder.

Change-Id: I3b972c3edcc01b1a3c074958a5f2f0b3f64f608c
---
M src/kudu/integration-tests/test_workload.h
1 file changed, 4 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/6887/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6887
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3b972c3edcc01b1a3c074958a5f2f0b3f64f608c
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [rpc] introduce per-RPC credentials policy

2017-05-15 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6875

to look at the new patch set (#5).

Change subject: [rpc] introduce per-RPC credentials policy
..

[rpc] introduce per-RPC credentials policy

This patch introduces policy for RPC authentication credentials.  The
authentication credentials policy allows for control over the type of
client-side credentials used for making a remote procedure call.

The idea behind this change is simple: sometimes the server's behavior
depends on the type of client's credentials used to authenticate the
client to the server in the context of the remote procedure call.  If
the client expects some particular behavior from the server, it has to
explicitly specify the type of credentials it wants to use for the call.

One example of an RPC depending on the type of the specified credentials
is MasterService::ConnectToMaster().  It's impossible to receive an
authentication token from the master if calling that method over a
connection established with an authn token.  To get a new authn token
in that case, it's necessary to open a new connection to the master
using types of credentials other than authn token (e.g., Kerberos
credentials or TLS certificate will work).

In other words, derived/secondary authentication credentials
(such as authn token) can only be acquired if using the primary ones.
That's a crucial restriction to allow for enforcing expiration
of derived/secondary credentials.  With this patch a client has an
ability to re-acquire secondary authentication credentials (authn token)
regardless of the type of credentials used to established current
connection to Kudu master.

As a part of this patch, a new unit test is added to cover the new
functionality.

Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2
---
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/messenger.h
M src/kudu/rpc/negotiation.cc
M src/kudu/rpc/outbound_call.h
M src/kudu/rpc/proxy.cc
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/reactor.h
M src/kudu/rpc/rpc-test-base.h
M src/kudu/rpc/rpc-test.cc
M src/kudu/rpc/rpc_controller.cc
M src/kudu/rpc/rpc_controller.h
M src/kudu/rpc/rpc_stub-test.cc
13 files changed, 291 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/6875/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6875
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [c++ client] re-acquire authn token if expired

2017-05-15 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6648

to look at the new patch set (#16).

Change subject: [c++ client] re-acquire authn token if expired
..

[c++ client] re-acquire authn token if expired

Updated C++ client to re-acquire authn token if server responds with
FATAL_INVALID_AUTHENTICATION_TOKEN error on connection negotiation.
If that happens while negotiating a connection for a call, the
connection is closed and the client reconnects to the cluster to get
a new authn token.  After successfully re-acquiring authn token the
client retries the RPC call again.

Added corresponding integration test to cover authn token re-acquisition
for various scenarios.

Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5
---
M docs/known_issues.adoc
M docs/security.adoc
M src/kudu/client/batcher.cc
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/master_rpc.h
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/client/scanner-internal.h
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/authn_token_expire-itest.cc
M src/kudu/rpc/client_negotiation.cc
M src/kudu/rpc/client_negotiation.h
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/negotiation.cc
M src/kudu/rpc/outbound_call.cc
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/reactor.h
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.h
M src/kudu/rpc/server_negotiation.cc
23 files changed, 755 insertions(+), 143 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/6648/16
-- 
To view, visit http://gerrit.cloudera.org:8080/6648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [tests] operations timeout for ClusterVerifier

2017-05-15 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6888

Change subject: [tests] operations timeout for ClusterVerifier
..

[tests] operations timeout for ClusterVerifier

Change-Id: I90828d0bd91809fe541782f4e5ded06cbae51873
---
M src/kudu/integration-tests/cluster_verifier.cc
M src/kudu/integration-tests/cluster_verifier.h
2 files changed, 21 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/6888/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6888
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I90828d0bd91809fe541782f4e5ded06cbae51873
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 


[kudu-CR] KUDU-1952 Remove round-robin for block placement

2017-05-15 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6636

to look at the new patch set (#27).

Change subject: KUDU-1952 Remove round-robin for block placement
..

KUDU-1952 Remove round-robin for block placement

This is the first of a multi-patch patchset to mitigate the effects of
single-disk failure. Throughout the code, the term "DataDir" refers to
a data directory, which is often mounted on a distinct disk. Thus,
"disks" and "data directories" will be used interchangeably.

This patch adds a mapping from tablet to a set of disks and uses it to
replace the existing round-robin placement of blocks. Tablets are
mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks
are placed randomly in directories within each tablet's DataDirGroup.

Tablet-to-group mappings are generated and stored as metadata upon
tablet creation, or upon tablet replacement during a tablet copy.
During group creation, disks are added to groups by randomly selecting
two available directories and selecting the one with fewer tablets on
it. This avoids pigeonholing new tablets to disks with relatively few
tablets, while still trending towards filling underloaded disks.

Groups are maintained when restarting the server, as they are flushed
with metadata, and are deleted upon tablet deletion.  When loading
tablet data from a previous version of Kudu, the tablet's metadata
will not have a DataDirGroup. One will be generated containing all
data directories, as the tablet's data may already be spread across
any number of disks.

As this patch only addresses block placement, it does not itself
mitigate single-disk failure. Given this, and given the tradeoff
between I/O and failure disk-failure tolerance, the default behavior
will be to spread tablet data across all disks.

Testing is done at the block manager level in block_manager-test and
log_block_manager-test, as well as in the new data_dirs-test.

A design doc can be found here:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
---
M src/kudu/cfile/bloomfile-test-base.h
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/fs/CMakeLists.txt
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.h
A src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/fs.proto
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/delta_compaction-test.cc
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/deltamemstore-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
38 files changed, 987 insertions(+), 190 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/27
-- 
To view, visit http://gerrit.cloudera.org:8080/6636
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b
Gerrit-PatchSet: 27
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon