[kudu-CR] WIP: clock: add a built-in NTP client implementation

2017-08-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: WIP: clock: add a built-in NTP client implementation
..


Patch Set 6:

did a rebase and another rev with a few more docs but still not a high quality 
patch. needs another rev or two before review.

-- 
To view, visit http://gerrit.cloudera.org:8080/7477
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieb9eee9f0334236f39617492bd6f01304d1a0255
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] WIP: clock: add a built-in NTP client implementation

2017-08-17 Thread Todd Lipcon (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7477

to look at the new patch set (#6).

Change subject: WIP: clock: add a built-in NTP client implementation
..

WIP: clock: add a built-in NTP client implementation

This adds a built-in stripped-down implementation of NTP without any
reliance on kernel features. This should hopefully make it easier for
users to configure NTP even if they don't have root, and also can
maintain better clock error than the system implementation, since we can
prioritize low error bounds rather than low jitter.

TODO: needs a lot more docs, testing, etc

Change-Id: Ieb9eee9f0334236f39617492bd6f01304d1a0255
---
M src/kudu/clock/CMakeLists.txt
A src/kudu/clock/builtin_ntp.cc
A src/kudu/clock/builtin_ntp.h
M src/kudu/clock/hybrid_clock.cc
A src/kudu/clock/ntp-test.cc
5 files changed, 919 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/77/7477/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7477
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ieb9eee9f0334236f39617492bd6f01304d1a0255
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1942. Kerberos fails to log in on hostnames with capital letters

2017-08-17 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: KUDU-1942. Kerberos fails to log in on hostnames with capital 
letters
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7693/2//COMMIT_MSG
Commit Message:

PS2, Line 12: With this change I was able to set up a working cluster on my 
laptop
: with a capitalized hostname, where before it would fail as 
described in
: the JIRA.
> the problem is that in the EMC context, we have no hostnames. So, the bind 
Oh right, my bad. I forgot that the conversion to lower case applies to the 
'hostname' and not the principal flag.


-- 
To view, visit http://gerrit.cloudera.org:8080/7693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5ef65dd827459476a2d225d8e3f7c80ff2fdf627
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2032 (part 1): pass pre-resolution hostname into RPC proxies

2017-08-17 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: KUDU-2032 (part 1): pass pre-resolution hostname into RPC 
proxies
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7687/4/src/kudu/master/ts_descriptor.cc
File src/kudu/master/ts_descriptor.cc:

Line 246:   RETURN_NOT_OK(ResolveSockaddr(, ));
In cases like this, wouldn't addr.ToStringWithoutPort() be the same as 'host'?

Are we just trying to avoid another reverse DNS lookup by doing this?


http://gerrit.cloudera.org:8080/#/c/7687/4/src/kudu/util/net/sockaddr.cc
File src/kudu/util/net/sockaddr.cc:

PS4, Line 102:   char str[INET_ADDRSTRLEN];
 :   ::inet_ntop(AF_INET, _.sin_addr, str, INET_ADDRSTRLEN);
 :   return StringPrintf("%s:%d", str, port());
return StringPrintf("%s:%d", ToStringWithoutPort(), port());


-- 
To view, visit http://gerrit.cloudera.org:8080/7687
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I96fb3c73382f0be6e30e29ae2e7176be42f3bb98
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] disk failure: add persistent disk states

2017-08-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: disk failure: add persistent disk states
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7270/15//COMMIT_MSG
Commit Message:

Line 11: failed disk may be partially written and thus should not be used.
I dont quite follow this logic. If we crashed in the middle of writing a block 
(due to an IO error on thedisk or whatever) then the metadata flush which 
points to the half-written block would not have happened yet. Thus, we don't 
have any partially-written data which is referred to by any metadata.

The bigger argument to persisting the failure status seems to be that, if a 
disk failed, it probably is either read-only, unmounted, or likely to continue 
producing IOErrors after a restart, but we'd still like to start up the server, 
right?

One concern about this is that, in the case of a transient fixable error like 
"out of space" it would certainly be nice to be able to restart without losing 
data. (eg consider the case when some rogue job fills up disks on many nodes at 
once, so we have simultaneous crash of several)


-- 
To view, visit http://gerrit.cloudera.org:8080/7270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifddf0817fe1a82044077f5544c400c88de20769f
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2032 (part 1): pass pre-resolution hostname into RPC proxies

2017-08-17 Thread Todd Lipcon (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7687

to look at the new patch set (#4).

Change subject: KUDU-2032 (part 1): pass pre-resolution hostname into RPC 
proxies
..

KUDU-2032 (part 1): pass pre-resolution hostname into RPC proxies

This modifies the constructor of RPC proxies (generated and otherwise)
to take the remote hostname in addition to the existing resolved
Sockaddr parameter. The hostname is then passed into the ConnectionId
object, and plumbed through to the SASL client in place of the IP
address that was used previously.

The patch changes all of the construction sites of Proxy to fit the new
interface. In most of the test cases, we don't have real hostnames, so
we just use the dotted-decimal string form of the remote Sockaddr, which
matches the existing behavior.

In the real call sites, we have actual host names typically specified by
the user, and in those cases we'll need to pass those into the proxy. In
a few cases, they were conveniently available in the same function that
creates the proxy. In others, they are relatively far away, so this
patch just uses the dotted-decimal string and leaves TODOs.

In the case that Kerberos is not configured, this change should have no
effect since the hostname is ignored by SASL "plain". In the case that
Kerberos is configured with 'rdns=true', they also have no effect,
because the krb5 library will resolve and reverse the hostname from the
IP as it did before. In the case that 'rdns=false', this moves us one
step closer to fixing KUDU-2032 by getting a hostname into the SASL
library.

I verified that, if I set 'rdns = false' on a Kerberized client, I'm now
able to run  'kudu master status ' successfully where it would not
before. This tool uses a direct proxy instantiation where the hostname
was easy to plumb in. 'kudu table list ' still does not work because
it uses the client, which wasn't convenient to plumb quite yet.

Given that this makes incremental improvement towards fixing the issue
without any regression, and is already a fairly wide patch, I hope to
commit this and then address the remaining plumbing in a separate patch.

Change-Id: I96fb3c73382f0be6e30e29ae2e7176be42f3bb98
---
M src/kudu/client/client-internal.cc
M src/kudu/client/master_rpc.cc
M src/kudu/client/meta_cache.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/exactly_once_writes-itest.cc
M src/kudu/integration-tests/external_mini_cluster.cc
M src/kudu/integration-tests/internal_mini_cluster.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/master_cert_authority-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/integration-tests/table_locations-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/master/master-test.cc
M src/kudu/master/master.cc
M src/kudu/master/sys_catalog-test.cc
M src/kudu/master/ts_descriptor.cc
M src/kudu/master/ts_descriptor.h
M src/kudu/rpc/connection.h
M src/kudu/rpc/connection_id.cc
M src/kudu/rpc/connection_id.h
M src/kudu/rpc/exactly_once_rpc-test.cc
M src/kudu/rpc/mt-rpc-test.cc
M src/kudu/rpc/negotiation.cc
M src/kudu/rpc/protoc-gen-krpc.cc
M src/kudu/rpc/proxy.cc
M src/kudu/rpc/proxy.h
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/rpc-bench.cc
M src/kudu/rpc/rpc-test.cc
M src/kudu/rpc/rpc_stub-test.cc
M src/kudu/tools/ksck_remote.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tserver/heartbeater.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_service-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_server_test_util.cc
M src/kudu/util/net/net_util-test.cc
M src/kudu/util/net/sockaddr.cc
M src/kudu/util/net/sockaddr.h
42 files changed, 234 insertions(+), 134 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/7687/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7687
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I96fb3c73382f0be6e30e29ae2e7176be42f3bb98
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2032 (part 2): propagate master hostnames into client

2017-08-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-2032 (part 2): propagate master hostnames into client
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7692/2/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

Line 708:   // TODO: Do address resolution asynchronously as well.
> warning: missing username/bug in TODO [google-readability-todo]
Done


http://gerrit.cloudera.org:8080/#/c/7692/2/src/kudu/client/master_rpc.cc
File src/kudu/client/master_rpc.cc:

PS2, Line 92: /
> nit: remove
Done


http://gerrit.cloudera.org:8080/#/c/7692/2/src/kudu/client/master_rpc.h
File src/kudu/client/master_rpc.h:

Line 22: #include 
> warning: #includes are not sorted properly [llvm-include-order]
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7692
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie5838f22c96f757124112b505825a53740468ce1
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] disk failure: add persistent disk states

2017-08-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: disk failure: add persistent disk states
..


Patch Set 15:

(30 comments)

Made it further in. Todd should review this too, since he was in that meeting 
we had to discuss this approach (a month or so ago).

http://gerrit.cloudera.org:8080/#/c/7270/15/src/kudu/fs/block_manager_util-test.cc
File src/kudu/fs/block_manager_util-test.cc:

Line 151: TEST_F(KuduTest, CheckIntegrity) {
There's a lot to process here. In total, do you get full coverage of every 
single path out of PathInstanceMetadataFile::CheckIntegrity? And full coverage 
of the various "update these instance files" branches too?


http://gerrit.cloudera.org:8080/#/c/7270/6/src/kudu/fs/block_manager_util.cc
File src/kudu/fs/block_manager_util.cc:

Line 88: 
> Good point, that should work.
I don't see this change in PS15.


http://gerrit.cloudera.org:8080/#/c/7270/15/src/kudu/fs/block_manager_util.cc
File src/kudu/fs/block_manager_util.cc:

PS15, Line 54: CloneAndPrepend(msg)
CloneAndPrepend() is work; can we defer it to if !s.ok() and we want to return 
the error?


PS15, Line 69: Status::OK()
This is the default value anyway; you can omit health_status_ from the list.


PS15, Line 121:   // If the write fails, don't update the timestamp.
  :   uint64_t original_timestamp = pb->path_set().timestamp_us();
  :   auto reset_timestamp = MakeScopedCleanup([&] {
  : 
pb->mutable_path_set()->set_timestamp_us(original_timestamp);
  :   });
This doesn't seem that important, but OK.


PS15, Line 137:   // Some instances on failed disks (e.g. those that failed to 
canonicalize
  :   // their path) are prefixed with illegal whitespaces to 
signify failure.
As we discussed offline, rather than modifying the path to indicate a failure 
in canonicalization, let's pass that information along separately (perhaps by 
passing the raw canonicalization status of each path). Then we won't need this 
hack, and we probably won't need to sanitize here at all?


PS15, Line 197:   // Identify the instance that is the most up-to-date and 
check that there
  :   // are no duplicate UUIDs among the healthy input instances.
Should add that the old invariant was that all instances have to have identical 
PathSetPBs. We're relaxing that invariant now and adding timestamps to 
PathSetPBs to identify the most up-to-date one. So if we find ourselves without 
any timestamps, it's OK to pick a path_set arbitrarily to use as the "main 
instance".


PS15, Line 205: max_timestamp
Nit: add the units (max_timestamp_us?).


Line 208:   const PathSetPB path_set = instances[i]->metadata()->path_set();
Store a cref; no need to make a copy.


Line 218: updated_instances->insert(instances[i]);
Can we modify a temporary set first, so that if we return an error we avoid 
touching 'updated_instances'?


Line 231:   int path_set_uuids_size;
Isn't this always the same as main_uuids.size() after L232-244?

Ah, I see, you're protecting against duplicates on L247-257. Maybe add a 
comment here to explain?


Line 232:   if (main_path_set->all_paths_size() == 0) {
It's rather easy to forget that this means "legacy". Perhaps store it in an 
intermediate "bool is_legacy" or some such?


PS15, Line 234: 
main_uuids.insert(main_path_set->deprecated_all_uuids().begin(),
  :   
main_path_set->deprecated_all_uuids().end());
  : path_set_uuids_size = 
main_path_set->deprecated_all_uuids_size();
Nit: flip the order so it's the same as L239-243.


Line 240: DCHECK_GT(path_set_uuids_size, 0);
I think this is trivially correct (by virtue of being in this else statement).


Line 242:   main_uuids.insert(path.uuid());
Should this InsertOrDie? Or can we expect duplicates?


PS15, Line 265: depcreated
deprecated


PS15, Line 267: main_path_updated
main_path_set_updated?


PS15, Line 302: main_path_set->mutable_all_paths(i)
How about using pb_at_i here (though make it a pointer, not a cref).


PS15, Line 325: path
path_set


http://gerrit.cloudera.org:8080/#/c/7270/15/src/kudu/fs/data_dirs-test.cc
File src/kudu/fs/data_dirs-test.cc:

PS15, Line 61: // Use log block mode. This should not have much effect on the 
directory
 : // manager itself, but log blocks are expected to be more common 
in production.
 : static const char* kBlockType = "log";
This is persisted (and compared) in PathInstanceMetadataFile, so it doesn't 
matter what value it is. May as well just pass "test" or something in the one 
place where you use it.


PS15, Line 94:   vector GetDirNames(int num_dirs) {
 : vector ret;
 : for (int i = 0; i < num_dirs; i++) {
 :   string dir_name = Substitute("$0-$1", kDirNamePrefix, i);
 :   ret.push_back(GetTestPath(dir_name));
 :   

[kudu-CR] KUDU-1726: Avoid fsync-per-block in tablet copy

2017-08-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1726: Avoid fsync-per-block in tablet copy
..


Patch Set 1:

(8 comments)

Mike should probably review this too.

http://gerrit.cloudera.org:8080/#/c/7701/1//COMMIT_MSG
Commit Message:

PS1, Line 9: with
Nit: into


PS1, Line 11: table
tablet


PS1, Line 11: sync
synced


PS1, Line 13: I did a manual test to copy tablet with size of ~37GB.
: With this change, the operation time dropped down from ~718s
: to ~523s.
We talked about this in person a little bit, but could you provide more details 
about the testing set up? Please specify how many tservers were involved. If 
just one, specify the OS, the filesystem, the disk setup, etc.


http://gerrit.cloudera.org:8080/#/c/7701/1/src/kudu/tserver/tablet_copy_client-test.cc
File src/kudu/tserver/tablet_copy_client-test.cc:

Line 196: TEST_F(TabletCopyClientTest, TestDownloadAllBlocks) {
Maybe as part of this test we could verify that the number of fsyncs due to 
block downloading was some number we expect? We don't currently have metrics 
for fsyncs but that's something we could feasibly add (to the LogBlockManager 
or perhaps even to the file implementations in env_posix.cc).


http://gerrit.cloudera.org:8080/#/c/7701/1/src/kudu/tserver/tablet_copy_client.h
File src/kudu/tserver/tablet_copy_client.h:

Line 27: #include "kudu/fs/block_manager.h"
Why do you need this? Can't you forward declare BlockTransaction?


PS1, Line 164: belong
belonging


Line 177:   // Data block is opened with options so that it will fsync() on 
close.
Update this (and maybe the entire comment).


-- 
To view, visit http://gerrit.cloudera.org:8080/7701
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7534699f589c7060ffe32d7ac67546476cf21e76
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout

2017-08-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-2065, KUDU-2011: Release sidecars on cancellation or 
timeout
..


KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout

Previously, when an outbound call times out after transmission
of the request has begun but not finished, the RPC layer will
continue to send the entire payload (including any sidecars) to
the remote destination until the promised number of bytes specified
in the header are sent. This is problematic for the users of the
RPC layer as there is no well defined point in which the sidecars
are no longer referenced by the RPC layer. The only model in which
this would work is for the caller to either transfer or share ownership
of the sidecars with the RPC layer. This has caused some life cycle
issues with row batches in Impala before (e.g. IMPALA-5093).

This change fixes the problem above by modifying the RPC protocol
to allow a mid-transmission RPC call to be cancelled. Specifically,
a new footer is added to all outbound call requests. It contains
a flag, when true, indicates the outbound call was cancelled after
it has started but before it finished. The server will inspect this
flag and ignore inbound calls with this flag set. This footer enables
the caller to relinquish references to the sidecars early when an
outbound call is cancelled or timed-out. Once the call is cancelled
or timed-out, the RPC layer will send the remainder of the bytes for
the request with some dummy bytes. Since the server always ignores
cancelled call requests, it's okay to send random bytes.

To avoid breaking compatibility, a new RPC feature flag REQUEST_FOOTERS
is introduced in this change. During connection negotiation, the
client will always include this flag in its feature set. A server which
supports parsing footer will include REQUEST_FOOTERS in its feature set
if it sees the client also supports it. A client will only send the
footer if the remote server has the RPC feature flag REQUEST_FOOTERS.

Change-Id: I5c8e294090279649082eebc4f6cfb6fe858bbbfc
Reviewed-on: http://gerrit.cloudera.org:8080/7599
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
---
M docs/design-docs/rpc.md
M src/kudu/rpc/blocking_ops.cc
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/constants.cc
M src/kudu/rpc/inbound_call.cc
M src/kudu/rpc/inbound_call.h
M src/kudu/rpc/outbound_call.cc
M src/kudu/rpc/outbound_call.h
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/rpc-test-base.h
M src/kudu/rpc/rpc-test.cc
M src/kudu/rpc/rpc_header.proto
M src/kudu/rpc/rtest.proto
M src/kudu/rpc/serialization.cc
M src/kudu/rpc/serialization.h
M src/kudu/rpc/server_negotiation.cc
M src/kudu/rpc/transfer.cc
M src/kudu/rpc/transfer.h
19 files changed, 532 insertions(+), 159 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7599
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I5c8e294090279649082eebc4f6cfb6fe858bbbfc
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2032 (part 2): propagate master hostnames into client

2017-08-17 Thread Todd Lipcon (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7692

to look at the new patch set (#3).

Change subject: KUDU-2032 (part 2): propagate master hostnames into client
..

KUDU-2032 (part 2): propagate master hostnames into client

This changes the client code to remember the user-specified master
addresses and propagate them into the creation of master proxies.

It's not possible to reproduce the necessary DNS configurations in a
minicluster test, but with this patch I am now able to use 'kudu perf
loadgen' against a Kerberized cluster even when my local krb5.conf has
rdns=false.

Change-Id: Ie5838f22c96f757124112b505825a53740468ce1
---
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/master_rpc.h
M src/kudu/integration-tests/external_mini_cluster.cc
5 files changed, 57 insertions(+), 40 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/7692/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7692
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie5838f22c96f757124112b505825a53740468ce1
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2032 (part 1): pass pre-resolution hostname into RPC proxies

2017-08-17 Thread Todd Lipcon (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7687

to look at the new patch set (#3).

Change subject: KUDU-2032 (part 1): pass pre-resolution hostname into RPC 
proxies
..

KUDU-2032 (part 1): pass pre-resolution hostname into RPC proxies

This modifies the constructor of RPC proxies (generated and otherwise)
to take the remote hostname in addition to the existing resolved
Sockaddr parameter. The hostname is then passed into the ConnectionId
object, and plumbed through to the SASL client in place of the IP
address that was used previously.

The patch changes all of the construction sites of Proxy to fit the new
interface. In most of the test cases, we don't have real hostnames, so
we just use the dotted-decimal string form of the remote Sockaddr, which
matches the existing behavior.

In the real call sites, we have actual host names typically specified by
the user, and in those cases we'll need to pass those into the proxy. In
a few cases, they were conveniently available in the same function that
creates the proxy. In others, they are relatively far away, so this
patch just uses the dotted-decimal string and leaves TODOs.

In the case that Kerberos is not configured, this change should have no
effect since the hostname is ignored by SASL "plain". In the case that
Kerberos is configured with 'rdns=true', they also have no effect,
because the krb5 library will resolve and reverse the hostname from the
IP as it did before. In the case that 'rdns=false', this moves us one
step closer to fixing KUDU-2032 by getting a hostname into the SASL
library.

I verified that, if I set 'rdns = false' on a Kerberized client, I'm now
able to run  'kudu master status ' successfully where it would not
before. This tool uses a direct proxy instantiation where the hostname
was easy to plumb in. 'kudu table list ' still does not work because
it uses the client, which wasn't convenient to plumb quite yet.

Given that this makes incremental improvement towards fixing the issue
without any regression, and is already a fairly wide patch, I hope to
commit this and then address the remaining plumbing in a separate patch.

Change-Id: I96fb3c73382f0be6e30e29ae2e7176be42f3bb98
---
M src/kudu/client/client-internal.cc
M src/kudu/client/master_rpc.cc
M src/kudu/client/meta_cache.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/integration-tests/create-table-stress-test.cc
M src/kudu/integration-tests/exactly_once_writes-itest.cc
M src/kudu/integration-tests/external_mini_cluster.cc
M src/kudu/integration-tests/internal_mini_cluster.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/master_cert_authority-itest.cc
M src/kudu/integration-tests/master_replication-itest.cc
M src/kudu/integration-tests/security-itest.cc
M src/kudu/integration-tests/table_locations-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/master/master-test.cc
M src/kudu/master/master.cc
M src/kudu/master/sys_catalog-test.cc
M src/kudu/master/ts_descriptor.cc
M src/kudu/master/ts_descriptor.h
M src/kudu/rpc/connection.h
M src/kudu/rpc/connection_id.cc
M src/kudu/rpc/connection_id.h
M src/kudu/rpc/exactly_once_rpc-test.cc
M src/kudu/rpc/mt-rpc-test.cc
M src/kudu/rpc/negotiation.cc
M src/kudu/rpc/protoc-gen-krpc.cc
M src/kudu/rpc/proxy.cc
M src/kudu/rpc/proxy.h
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/rpc-bench.cc
M src/kudu/rpc/rpc-test.cc
M src/kudu/rpc/rpc_stub-test.cc
M src/kudu/tools/ksck_remote.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tserver/heartbeater.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_service-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_server_test_util.cc
M src/kudu/util/net/net_util-test.cc
M src/kudu/util/net/sockaddr.cc
M src/kudu/util/net/sockaddr.h
42 files changed, 232 insertions(+), 133 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/7687/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7687
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I96fb3c73382f0be6e30e29ae2e7176be42f3bb98
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2032 (part 1): pass pre-resolution hostname into RPC proxies

2017-08-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-2032 (part 1): pass pre-resolution hostname into RPC 
proxies
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7687/2/src/kudu/master/ts_descriptor.cc
File src/kudu/master/ts_descriptor.cc:

PS2, Line 219: addrs.size()
> nit: while you are at it, would it make sense to update this to
Done


http://gerrit.cloudera.org:8080/#/c/7687/2/src/kudu/master/ts_descriptor.h
File src/kudu/master/ts_descriptor.h:

PS2, Line 129: std::string* host
> nit: would it make sense to have the second parameter optional, i.e. change
it's a private method and both places we call it, we need the hostname, so not 
sure the value of making it optional.


http://gerrit.cloudera.org:8080/#/c/7687/2/src/kudu/rpc/connection_id.h
File src/kudu/rpc/connection_id.h:

PS2, Line 67:   
> nit: const?  Or it's not feasible because of id0 = id1 assignments somewher
yea, not feasible due to assignment in proxy.cc


http://gerrit.cloudera.org:8080/#/c/7687/2/src/kudu/tools/ksck_remote.cc
File src/kudu/tools/ksck_remote.cc:

PS2, Line 61:   tserver::TabletServer::kDefaultPort, ));
:   generic_proxy_.reset(new server::GenericServiceProxy(
:   messenger_, addresses[0], host_port_.host()));
:   ts_proxy_.reset(new tserver::TabletServerServiceProxy(
:   messenger_, addresses[0], host_port_.host()));
:   consensus_proxy_.reset(new consensus::ConsensusServiceProxy(
:   messenger_, addresses[0], host_port_.host()));
> style nit: maybe, introduce variables (references) to address[0] and host_p
Done


http://gerrit.cloudera.org:8080/#/c/7687/2/src/kudu/util/net/sockaddr.cc
File src/kudu/util/net/sockaddr.cc:

PS2, Line 110: string(
> nit: consider dropping this for brevity:
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7687
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I96fb3c73382f0be6e30e29ae2e7176be42f3bb98
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1942. Kerberos fails to log in on hostnames with capital letters

2017-08-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1942. Kerberos fails to log in on hostnames with capital 
letters
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7693/2//COMMIT_MSG
Commit Message:

PS2, Line 12: With this change I was able to set up a working cluster on my 
laptop
: with a capitalized hostname, where before it would fail as 
described in
: the JIRA.
> This may not be the best way, but it is possible to do it through the exter
the problem is that in the EMC context, we have no hostnames. So, the bind 
hosts are just IPs with no suitable reverse lookup, and thus we have to set the 
kerberos principal specifically (in the function you linked) rather than using 
the FQDN. So, the case of a capitalized FQDN isn't really testable.


-- 
To view, visit http://gerrit.cloudera.org:8080/7693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5ef65dd827459476a2d225d8e3f7c80ff2fdf627
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-871. Support tombstoned voting

2017-08-17 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/integration-tests/tombstoned_voting-itest.cc
File src/kudu/integration-tests/tombstoned_voting-itest.cc:

PS6, Line 195:   // Ask TS1 for a vote that should be denied (old last-logged 
opid).
 :   s = itest::RequestVote(ts1_ets, tablet_id, "B", current_term, 
old_opid,
 :  /*ignore_live_leader=*/ true, 
/*is_pre_election=*/ false, kTimeout);
> Maybe I'm misunderstanding. It sounds like you're asking for a test of re-v
my request was to have a directed test where the last logged op id of the 
requestor is greater than the tombstoned replica's and not just precisely its 
last logged opid, but that should still be granted, assuming this will be the 
common case since the replica is tombstoned and doesn't actually increase it's 
last logged op id.

moreover I was also wondering if there would be a problem if the last logged op 
id of the tombstoned replica turns back due to its log being deleted (and it 
being restarted, for instance)


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2101 Include a table summary at the bottom

2017-08-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-2101 Include a table summary at the bottom
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7707/1/src/kudu/tools/ksck.cc
File src/kudu/tools/ksck.cc:

PS1, Line 222: push_back
emplace_back(std::move(ts))


Line 231:   // The table summary is in alphabetical order by name.
is this best or should we put the unhealthy tables at the bottom, so that when 
you have more than a page full of tables the bad ones stick out?


PS1, Line 239:   vector names;
 :   vector statuses;
 :   vector total_tablets;
 :   vector num_healthy;
 :   vector num_underreplicated;
 :   vector num_unhealthy;
is using the 'AddRow' interface not fewer lines of code / easier to follow here?


PS1, Line 255: UNHEALTHY
I find "UNHEALTHY" to be a little confusing. Should we say "unavailable" or 
something to be more precise? I was a little confused when I saw output that 
said "0 under-replicated but 1 unhealthy". Is under-replicated not unhealthy in 
some sense?


-- 
To view, visit http://gerrit.cloudera.org:8080/7707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1913352e3a1f91b4bb07e2f5001c8cc94d5155d4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1489: move tablet metadata

2017-08-17 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7617

to look at the new patch set (#2).

Change subject: KUDU-1489: move tablet metadata
..

KUDU-1489: move tablet metadata

Currently tablet metadata are placed in the first specified data root.
This patch allows users to specify that metadata be spread across all
data roots.

To accomplish this, servers will scan meta and cmeta directories across
all roots and note where tablets currently are. When the meta or cmeta
directory is needed, if a more appropriate directory exists, it is
provided as an  alternate. With striping, an appropriate directory for
a tablet's metadata is any directory in the tablet's directory group.

Changes to the tablet lifecycle with striping:
- During DataDirManager::Open(), scan through all the directories and
  map all the tablets' current meta/cmeta directories.
- During calls to ListTabletIds(), scan through all directories and
  update the tablets' current directories.
- If there are duplicates (not an expected case, but may happen in the
  case of an unexpected shutdown), only one is kept.
- When writing the metas (e.g. ReplaceSuperBlockUnlocked()), check to
  see whether the metadata dir is appropriate, if not, write it to a
  different dir.
- The DataDirManager is now in charge of creating the metadata
  directories.

Testing is done in fs_manager-test to ensure that tablet metadata gets
striped and removed as needed.

Change-Id: Ia5308f8d4b4b738f353f511e2f1a2634b675a60a
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/integration-tests/external_mini_cluster_fs_inspector.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
10 files changed, 654 insertions(+), 168 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/7617/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7617
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5308f8d4b4b738f353f511e2f1a2634b675a60a
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

2017-08-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: consensus: Improve contract for API to fetch last-logged op id
..


Patch Set 1:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

Line 143:   log_cache_.Init(queue_state_.last_appended);
mind adding a TODO to merge LogCache::Init into the LogCache ctor, given we now 
have all the necessary parameters at construction time?


http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

Line 245:   queue.get(),
in the case that one of the RETURN_NOT_OK calls below fails, then you'll end up 
with peer_manager_ constructed with a pointer to a destructed 'queue'. I think 
you may need to also defer storage of peer_manager_ until down below as well.


Line 250:   pending->SetInitialCommittedOpId(info.last_committed_id);
I believe it was on purpose that this was called after the appending of the 
pending ops to the PendingRound structure below.


PS1, Line 1492: opt_local_last_logged_opid) ? 
you can use .value_or(MinimumOpId())


Line 1493:  : 
MinimumOpId();
is this right? I thought, if we don't know our own local op id, then it would 
be unsafe to vote ever, no? ie it should be MaximumOpId or something?


Line 2226:   if (!queue_ || !pending_) return boost::none;
think it's worth a comment here explaining the cases where we could hit this 
case


PS1, Line 2234: default:
  :   return boost::none;
is there another valid value to pass here? not FATAL?


http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

Line 134:   if (last_logged_opid && last_logged_opid->term() > caller_term) {
same question as in the Vote case -- is it correct to allow replacement if our 
last opid is 'unknown'?


Line 237: if (last_logged_opid && last_logged_opid->term() > 
remote_cstate_->current_term()) {
same


http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 460: if (last_logged_opid) {
same question


Line 474: int64_t last_logged_term = opt_last_logged_opid->term();
is this guaranteed to be non-none?


-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] [iwyu] first pass

2017-08-17 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: [iwyu] first pass
..


Patch Set 18:

(41 comments)

Good job on trimming the pragmas, looking much better now.

http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

Line 31: #include  // IWYU pragma: keep
I think function could be forward declared.


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/client/client-unittest.cc
File src/kudu/client/client-unittest.cc:

Line 36: 
extra space


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/client/error-internal.cc
File src/kudu/client/error-internal.cc:

Line 23: #include "kudu/client/write_op.h"   // IWYU pragma: keep
KuduWriteOperation could perhaps be forward declared?


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/client/shared_ptr.h
File src/kudu/client/shared_ptr.h:

Line 42: #include  // IWYU pragma: export
I don't think this pragma is correct (or the one below): this file is only 
about defining the kudu::client::sp::* symbols.  If a file is using 
std::shared_ptr, i'd expect it to have to include  normally.


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/codegen/jit_wrapper.cc
File src/kudu/codegen/jit_wrapper.cc:

Line 22: #include "kudu/codegen/jit_wrapper.h"
I think it was correct to list this one first.


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/common/key_encoder.cc
File src/kudu/common/key_encoder.cc:

Line 25: #include "kudu/util/faststring.h" // IWYU pragma: keep
forward declare?


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/common/schema.h
File src/kudu/common/schema.h:

Line 1: // or more contributor license agreements.  See the NOTICE file
Typo ^


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/common/wire_protocol.h
File src/kudu/common/wire_protocol.h:

Line 29: class optional;
I'm curious - is this preferred to optional_fwd.hpp?


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/consensus/mt-log-test.cc
File src/kudu/consensus/mt-log-test.cc:

Line 33: 
extra space


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

Line 74: 
extra space


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/experiments/rwlock-perf.cc
File src/kudu/experiments/rwlock-perf.cc:

Line 29: #include 
This one's suspicious - I think we've gotten rid of all boost shared_ptr usage?


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/fs/block_manager.cc
File src/kudu/fs/block_manager.cc:

Line 18: 
extra space


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

Line 33: #include "kudu/gutil/callback_forward.h"  // IWYU pragma: keep
I don't think it needs callback.h and callback_forward.h


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/gutil/strings/join.h
File src/kudu/gutil/strings/join.h:

Line 19: //
extra line


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/integration-tests/cluster_itest_util.h
File src/kudu/integration-tests/cluster_itest_util.h:

Line 49: class optional;
same - consider optional_fwd.hpp


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/integration-tests/test_workload.cc
File src/kudu/integration-tests/test_workload.cc:

Line 37: //#include "kudu/integration-tests/internal_mini_cluster.h"
leftover?


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

Line 53: class optional;
same - consider optional_fwd.hpp


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/rpc/reactor.h
File src/kudu/rpc/reactor.h:

Line 27: #include  // IWYU pragma: keep
might be possible to forward declare


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/rpc/service_queue.h
File src/kudu/rpc/service_queue.h:

Line 37: class optional;
same -condisder optional_fwd.hpp


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/security/cert.h
File src/kudu/security/cert.h:

Line 32: template 
same


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/security/init.h
File src/kudu/security/init.h:

Line 22: template 
same


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/server/generic_service.cc
File src/kudu/server/generic_service.cc:

Line 26: #include // for LOG, LogMessage, 
COMPACT_GOO...
I think this is self explanatory


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

Line 45: class optional;
same


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/tablet/delta_stats.h
File src/kudu/tablet/delta_stats.h:

Line 25: #include "kudu/common/schema.h" // IWYU pragma: keep
Is this just for ColumnId?  Might be able to forward declare


http://gerrit.cloudera.org:8080/#/c/4738/18/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

Line 41: // IWYU pragma: no_include "kudu/util/monotime.h"
Could this be replaced with a 

[kudu-CR] KUDU-1407: reassign failed tablets

2017-08-17 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
..


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7440/11/src/kudu/integration-tests/ts_tablet_manager-itest.cc
File src/kudu/integration-tests/ts_tablet_manager-itest.cc:

Line 76: using tserver::TSTabletManager;
> warning: using decl 'TSTabletManager' is unused [misc-unused-using-decls]
Done


http://gerrit.cloudera.org:8080/#/c/7440/11/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 620:   // TODO: There's actually a race here between the check and 
shutdown, but
> warning: missing username/bug in TODO [google-readability-todo]
Done


Line 753: replica->SetFailed(s);
> error: no member named 'SetFailed' in 'kudu::tablet::TabletReplica' [clang-
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1407: reassign failed tablets

2017-08-17 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#12).

Change subject: KUDU-1407: reassign failed tablets
..

KUDU-1407: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted; they are not evicted and reassigned. If a tablet
fails to bootstrap, it will sit, responding to heartbeats, doing nothing
else. This patch ensures failed tablets will be reassigned.

As the tablets are not used, rather than directly setting replicas to
FAILED, an error is first recorded and the TabletReplica::Shutdown(), leaving
the final state as FAILED. A replica can no longer leave the FAILED
state (calls to Shutdown() leave it FAILED). The tserver response
generated by FAILED tablets is now TABLET_FAILED. Upon receiving this, a
leader will immediately evict the peer.

Prior to this patch, a tablet was marked FAILED if its WAL or metadata
failed to delete (after already shutting down). If this occurs, there
may be an inconsistency on-disk. This has been made fatal.

Testing is done in a few places:
- raft_consensus-itest is updated to ensure that tablets that fail to
  bootstrap are evicted and replaced.
- tablet_server-test is also updated to ensure that, instead of
  TABLET_NOT_RUNNING, TABLET_FAILED is returned by failed tablets.
- a test is added to ts_tablet_manager-itest is added to test that
  failed tablets while running are evicted and replaced.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
14 files changed, 167 insertions(+), 67 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/12
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 19:

(14 comments)

I think I reviewed everything, but I'm curious to see how this will evolve 
following my suggestions.

http://gerrit.cloudera.org:8080/#/c/7207/19//COMMIT_MSG
Commit Message:

PS19, Line 22: :
Nit: can just omit this.


http://gerrit.cloudera.org:8080/#/c/7207/19/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 245:   typedef int CloseFlags;
Don't need this; just change "enum CloseFlag" to "enum CloseFlags".


Line 499:   int64_t next_sync_offset() const { return next_sync_offset_.Load(); 
}
Don't need this; it's only ever used internally (by LogBlockContainer).


Line 582:   // The offset of the container that next sync should start with.
Can we get by with a boolean? Seems like we could set it in RoundUpContainer 
when the next_block_offset_ increases and clear it when we sync.


PS19, Line 594:   // Protects sync data operation to avoid unnecessary I/O.
  :   Mutex lock_;
I don't understand what protection the lock offers. Furthermore, 
PosixRWFile::Sync() already keeps track of whether a sync is actually necessary.

Dan and I had a conversation about PosixRWFile::Sync. The use of pending_sync_ 
as an optimization is unsafe as it can cause the "losing" thread in a race to 
return early under the assumption that the data has been made durable when it 
hasn't (yet). I filed KUDU-2102 for this; feel free to tackle that next if you 
like.


Line 906:   auto cleanup = MakeScopedCleanup([&]() {
You should add a note here about mutating result_status on failure, like in 
LogWritableBlock::DoClose().


Line 939:   int64_t offset = mode == ROUND_UP ? next_block_offset() : 
block_offset;
I don't understand this, and it's making me nervous.

Let's simplify by exposing and using LogWritableBlock's offset. Hopefully we 
can do it without modifying WritableBlock itself (may need to down_cast earlier 
and deal with LogWritableBlocks through and through).


PS19, Line 955:   if (mode == ROUND_UP) {
  : WARN_NOT_OK(TruncateDataToNextBlockOffset(),
  : "could not truncate excess preallocated space");
  : 
  : if (full() && block_manager()->metrics()) {
  :   block_manager()->metrics()->full_containers->Increment();
  : }
  :   }
Here's another case where, even though FinishBlock() can be called 
concurrently, we would never want to call TruncateDataToNextBlockOffset() 
concurrently, and the ROUND_UP check prevents us from doing that. But it's 
tricky to follow 'mode' through the various call chains to prove this guarantee.


Line 1076: if (next_sync_offset() < cur_next_block_offset) {
These two accesses to next_sync_offset_ aren't atomic. Use StoreMax() instead.

But since you can make this a bool, a simple CAS should do the trick. Look at 
how PosixRWFile uses pending_sync_.


PS19, Line 1173:   int64_t new_next_block_offset = KUDU_ALIGN_UP(
   :   block_offset + block_length,
   :   instance()->filesystem_block_size_bytes());
   :   if (PREDICT_FALSE(new_next_block_offset < 
next_block_offset())) {
   : LOG(WARNING) << Substitute(
   : "Container $0 unexpectedly tried to lower its next 
block offset "
   : "(from $1 to $2), ignoring",
   : ToString(), next_block_offset(), 
new_next_block_offset);
   :   } else {
   : next_block_offset_.Store(new_next_block_offset);
   :   }
I keep circling back to this as a potential synchronization issue. The 
comparison and subsequent assignment to next_block_offset_ is not atomic. It 
didn't need to be prior to your change because BlockCreated() was guaranteed to 
be called by just one thread at a time. It may not need to be now, but I'm 
finding it really tough (based on tracing through the code paths) to guarantee 
that, due to the proliferation of RoundMode in various deep methods.

I think this would be safer if we used StoreMax() instead. And maybe 
total_bytes_ and total_blocks_ should be atomic too.


Line 1188: int64_t LogBlockContainer::fs_aligned_length(int64_t block_offset,
Surely we needn't duplicate this from LogBlock; if you need it at the container 
level (for operations where LogBlocks may not exist), can't we move it instead?

In general please try a little harder to avoid duplicating code, especially 
code with large comments like this.


PS19, Line 1464: if ((flags & SYNC) &&
   : (state_ == CLEAN || state_ == DIRTY || state_ == 
FINALIZED)) {
   :   VLOG(3) << "Syncing block " << id();
   : 
   :   // TODO(unknown): Sync just this block's dirty data.
   

[kudu-CR] rpc: some small cleanup in ConnectionId

2017-08-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: rpc: some small cleanup in ConnectionId
..


rpc: some small cleanup in ConnectionId

Change-Id: I0788052f8c943ef102f3f551a85a8b219c65c361
Reviewed-on: http://gerrit.cloudera.org:8080/7686
Reviewed-by: Alexey Serbin 
Tested-by: Kudu Jenkins
---
M src/kudu/rpc/connection_id.cc
M src/kudu/rpc/connection_id.h
M src/kudu/rpc/proxy.cc
M src/kudu/rpc/user_credentials.cc
M src/kudu/rpc/user_credentials.h
5 files changed, 13 insertions(+), 36 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7686
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I0788052f8c943ef102f3f551a85a8b219c65c361
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout

2017-08-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-2065, KUDU-2011: Release sidecars on cancellation or 
timeout
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7599
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5c8e294090279649082eebc4f6cfb6fe858bbbfc
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-1407: reassign failed tablets

2017-08-17 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
..


Patch Set 7:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/client/scanner-internal.cc
File src/kudu/client/scanner-internal.cc:

PS7, Line 232: case tserver::TabletServerErrorPB::TABLET_FAILED: // fall-through
> the difference seems to be that TABLET_NOT_FOUND forces a new metadata fetc
Ah, thanks for pointing that out. It seems like the the staleness needs to be 
refreshed when there is a config change. In this case, even though we know a 
config change is impending, it may not have happened yet, but it's a fair point 
that we may still want to retry.

If we assume the time between a failure and the config change (maybe a bit 
longer than the heartbeat interval) is shorter than the time between the client 
staleness check and it doing the actual metadata re-fetch, this isn't too big a 
deal (and still not a big deal otherwise). Went with your suggestion of 
TABLET_NOT_FOUND.


http://gerrit.cloudera.org:8080/#/c/7440/8/src/kudu/consensus/consensus_peers.cc
File src/kudu/consensus/consensus_peers.cc:

Line 291: return;
> warning: do not use 'else' after 'return' [readability-else-after-return]
Done


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

PS10, Line 618: ::str
> nit remove
Done


PS10, Line 623: 
> nit: explain that otherwise this notification is ignored
Done


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS10, Line 233: turn;
> doesn't this have to include FAILED too?
Done


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

PS10, Line 176: SetupErrorAndRespond(resp->mutable_error(), s, 
TabletServerErrorPB::TABLET_NOT_RUNNING,
> likely slightly cleaner to do:
Done


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS10, Line 659: usMessage("De
> more informative status message
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1407: reassign failed tablets

2017-08-17 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#11).

Change subject: KUDU-1407: reassign failed tablets
..

KUDU-1407: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted; they are not evicted and reassigned. If a tablet
fails to bootstrap, it will sit, responding to heartbeats, doing nothing
else. This patch ensures failed tablets will be reassigned.

As the tablets are not used, rather than directly setting replicas to
FAILED, an error is first recorded and the TabletReplica::Shutdown(), leaving
the final state as FAILED. A replica can no longer leave the FAILED
state (calls to Shutdown() leave it FAILED). The tserver response
generated by FAILED tablets is now TABLET_FAILED. Upon receiving this, a
leader will immediately evict the peer.

Prior to this patch, a tablet was marked FAILED if its WAL or metadata
failed to delete (after already shutting down). If this occurs, there
may be an inconsistency on-disk. This has been made fatal.

Testing is done in a few places:
- raft_consensus-itest is updated to ensure that tablets that fail to
  bootstrap are evicted and replaced.
- tablet_server-test is also updated to ensure that, instead of
  TABLET_NOT_RUNNING, TABLET_FAILED is returned by failed tablets.
- a test is added to ts_tablet_manager-itest is added to test that
  failed tablets while running are evicted and replaced.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
14 files changed, 161 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/11
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-871. Support tombstoned voting

2017-08-17 Thread Mike Percy (Code Review)
Hello David Ribeiro Alves, Todd Lipcon, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6960

to look at the new patch set (#7).

Change subject: KUDU-871. Support tombstoned voting
..

KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add Stop() method to TabletReplica + Consensus lifecycle.
  * Includes new STOPPED state.
  * Tombstoning a replica should call Stop().
  * Deleting a replica should call Shutdown().
* Add positive and negative tests for tombstoned voting.
* Add a stress test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.
* Fix several tests that expected tombstoned tablets to be SHUTDOWN when
  now they are STOPPED.
* Get rid of tablet::FAILED state, instead go to SHUTDOWN with an error.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/delete_tablet-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
A src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver-path-handlers.cc
M src/kudu/util/make_shared.h
33 files changed, 1,397 insertions(+), 345 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] consensus: Tablet copy should clear last-logged opid from superblock

2017-08-17 Thread Mike Percy (Code Review)
Hello David Ribeiro Alves, Todd Lipcon, Alexey Serbin,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/7718

to review the following change.

Change subject: consensus: Tablet copy should clear last-logged opid from 
superblock
..

consensus: Tablet copy should clear last-logged opid from superblock

Keeping around irrelevant information is bad hygiene. Plus, we don't
want to disk tombstoned voting using the wrong last-logged opid from the
superblock while there is a last-logged opid stored in the WAL.

Change-Id: Iaa84d59c63222e9ddb05dca492f9ecd47b5c63ea
---
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
3 files changed, 66 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/7718/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7718
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iaa84d59c63222e9ddb05dca492f9ecd47b5c63ea
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

2017-08-17 Thread Mike Percy (Code Review)
Hello David Ribeiro Alves, Todd Lipcon, Alexey Serbin,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/7717

to review the following change.

Change subject: consensus: Improve contract for API to fetch last-logged op id
..

consensus: Improve contract for API to fetch last-logged op id

It's important that we differentiate between when we have a known
last-logged op and when we don't actually know what it is or whether we
have ever appended something to the local WAL. This applies both to the
TABLET_DATA_READY case, where this information is stored in the WAL, and
the TABLET_DATA_TOMBSTONED case, where this information is stored in the
superblock.

Cases where we are unable to determine the last-logged OpId from the WAL
when a replica is in TABLET_DATA_READY state:

* Early in the tablet replica lifecycle (before Raft is started).
* When a replica encounters an error during initialization.

Cases where we are unable to determine the last-logged OpId from the
TabletMetadata when a replica is in TABLET_DATA_TOMBSTONED state:

* If the replica was tombstoned while in a failed state.

Included in this patch are the following API improvements:

1. Delete Log::GetLatestEntryOpId(). Previously, this method would only
   return something other than MinimumOpId() if a log entry had been
   appended during the object's lifetime. It is abandoned in favor of
   RaftConsensus::GetLastOpId(RECEIVED_OPID) which delegates to
   PeerMessageQueue::GetLastOpIdInLog().
2. Merge PeerMessageQueue::Init() into the PeerMessageQueue constructor.
   This allows us to remove one lifecycle state and allows us to
   guarantee that, once the queue is constructed, we can always get a
   valid last-logged opid from it (see #1).
3. Make TabletMetadata::tombstone_last_logged_opid() return a
   boost::optional. We need to clearly differentiate between when
   we know the last-logged opid and when we don't. We also consider
   MinimumOpId() to be equal to boost::none at superblock load time,
   since previous versions of Kudu may have written (0,0) into the
   TabletMetadata 'tombstone_last_logged_opid' field.

Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
---
M src/kudu/consensus/consensus_peers-test.cc
M src/kudu/consensus/consensus_queue-test.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/mt-log-test.cc
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
18 files changed, 200 insertions(+), 224 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/7717/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [docs] Deprecate Java 7 and Spark 1

2017-08-17 Thread Jean-Daniel Cryans (Code Review)
Jean-Daniel Cryans has posted comments on this change.

Change subject: [docs] Deprecate Java 7 and Spark 1
..


Patch Set 1:

Oh and https://github.com/apache/kudu/tree/master/java#system-requirements

-- 
To view, visit http://gerrit.cloudera.org:8080/7699
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I050df790ca51729cc99866b9cd75933c3c907a4c
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] [docs] Deprecate Java 7 and Spark 1

2017-08-17 Thread Jean-Daniel Cryans (Code Review)
Jean-Daniel Cryans has posted comments on this change.

Change subject: [docs] Deprecate Java 7 and Spark 1
..


Patch Set 1:

Can you add a note to 
http://kudu.apache.org/docs/installation.html#build_java_client ?

-- 
To view, visit http://gerrit.cloudera.org:8080/7699
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I050df790ca51729cc99866b9cd75933c3c907a4c
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-1407: reassign failed tablets

2017-08-17 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
..


Patch Set 10:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/client/scanner-internal.cc
File src/kudu/client/scanner-internal.cc:

PS7, Line 232: case tserver::TabletServerErrorPB::TABLET_FAILED: // fall-through
> Hrm, maybe, but I'm keeping this as is for now. Reasoning here was that bef
the difference seems to be that TABLET_NOT_FOUND forces a new metadata fetch 
(as well as blacklisting the server) whereas TABLET_NOT_RUNNING only 
blacklists. not sure whether it wouldn't be best to re-fetch the metadata since 
the tablet would have to be forcibly re-replicated. up to you.


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

PS10, Line 618: std::
nit remove


PS10, Line 623: Use the current term to ensure the peer will be evicted.
nit: explain that otherwise this notification is ignored


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS10, Line 233: state_ == QUIESCING || state_ == SHUTDOWN
doesn't this have to include FAILED too?


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

PS10, Line 176: SetupErrorAndRespond(resp->mutable_error(), s, 
TabletServerErrorPB::TABLET_FAILED, context);
likely slightly cleaner to do:
TabletServerErrorPB::Code code = TabletServerErrorPB::TABLET_NOT_RUNNING;
if (state == tablet::FAILED) {
  s = s.CloneAndAppend((*replica)->error().ToString());
  code = TabletServerErrorPB::TABLET_FAILED;
}
SetupErrorAndRespond(resp->mutable_error(), s, code, context);


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS10, Line 659: s.ToString();
more informative status message


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] [docs] Deprecate Java 7 and Spark 1

2017-08-17 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: [docs] Deprecate Java 7 and Spark 1
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7699
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I050df790ca51729cc99866b9cd75933c3c907a4c
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout

2017-08-17 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: KUDU-2065, KUDU-2011: Release sidecars on cancellation or 
timeout
..


Patch Set 5:

The verification failure may be infra related. Not sure if I have the 
permission to request a re-run.

13:52:46 F0817 20:52:08.278630 31810 master_main.cc:68] Check failed: _s.ok() 
Bad status: Network error: error binding socket to 0.0.0.0:40300: Address 
already in use (error 98)
13:52:46 *** Check failure stack trace: ***

-- 
To view, visit http://gerrit.cloudera.org:8080/7599
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5c8e294090279649082eebc4f6cfb6fe858bbbfc
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout

2017-08-17 Thread Michael Ho (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7599

to look at the new patch set (#5).

Change subject: KUDU-2065, KUDU-2011: Release sidecars on cancellation or 
timeout
..

KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout

Previously, when an outbound call times out after transmission
of the request has begun but not finished, the RPC layer will
continue to send the entire payload (including any sidecars) to
the remote destination until the promised number of bytes specified
in the header are sent. This is problematic for the users of the
RPC layer as there is no well defined point in which the sidecars
are no longer referenced by the RPC layer. The only model in which
this would work is for the caller to either transfer or share ownership
of the sidecars with the RPC layer. This has caused some life cycle
issues with row batches in Impala before (e.g. IMPALA-5093).

This change fixes the problem above by modifying the RPC protocol
to allow a mid-transmission RPC call to be cancelled. Specifically,
a new footer is added to all outbound call requests. It contains
a flag, when true, indicates the outbound call was cancelled after
it has started but before it finished. The server will inspect this
flag and ignore inbound calls with this flag set. This footer enables
the caller to relinquish references to the sidecars early when an
outbound call is cancelled or timed-out. Once the call is cancelled
or timed-out, the RPC layer will send the remainder of the bytes for
the request with some dummy bytes. Since the server always ignores
cancelled call requests, it's okay to send random bytes.

To avoid breaking compatibility, a new RPC feature flag REQUEST_FOOTERS
is introduced in this change. During connection negotiation, the
client will always include this flag in its feature set. A server which
supports parsing footer will include REQUEST_FOOTERS in its feature set
if it sees the client also supports it. A client will only send the
footer if the remote server has the RPC feature flag REQUEST_FOOTERS.

Change-Id: I5c8e294090279649082eebc4f6cfb6fe858bbbfc
---
M docs/design-docs/rpc.md
M src/kudu/rpc/blocking_ops.cc
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/constants.cc
M src/kudu/rpc/inbound_call.cc
M src/kudu/rpc/inbound_call.h
M src/kudu/rpc/outbound_call.cc
M src/kudu/rpc/outbound_call.h
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/rpc-test-base.h
M src/kudu/rpc/rpc-test.cc
M src/kudu/rpc/rpc_header.proto
M src/kudu/rpc/rtest.proto
M src/kudu/rpc/serialization.cc
M src/kudu/rpc/serialization.h
M src/kudu/rpc/server_negotiation.cc
M src/kudu/rpc/transfer.cc
M src/kudu/rpc/transfer.h
19 files changed, 532 insertions(+), 159 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/7599/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7599
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5c8e294090279649082eebc4f6cfb6fe858bbbfc
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2101 Include a table summary at the bottom

2017-08-17 Thread Will Berkeley (Code Review)
Will Berkeley has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7707

Change subject: KUDU-2101 Include a table summary at the bottom
..

KUDU-2101 Include a table summary at the bottom

This add a table summary to the bottom of ksck. For each table, the
table contains a status and information about the total number of
tablets and the number of healthy, under-replicated, and unhealthy
tablets. A tablet with consensus mismatch is counted as unhealthy, to
make the table easier for less experienced users to understand.

See ksck-test for sample output.

Change-Id: I1913352e3a1f91b4bb07e2f5001c8cc94d5155d4
---
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck.h
3 files changed, 163 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7707/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I1913352e3a1f91b4bb07e2f5001c8cc94d5155d4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 


[kudu-CR] disk failure: add persistent disk states

2017-08-17 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7270

to look at the new patch set (#15).

Change subject: disk failure: add persistent disk states
..

disk failure: add persistent disk states

This patch adds a new version of the path set so failed disks are not
used the next time the server is run. This is needed, as data on a
failed disk may be partially written and thus should not be used.

To accomplish this, disk states, path names, and a timestamp are added
to the path set. Additionally, if any disks are failed when loading
instances from disk, an 'unhealthy' instance with no path set metadata
will be created instead of failing the startup process.

CheckIntegrity() is updated to accomodate this. Rather than
comparing all instances against an agreed-upon set of UUIDs, the single
path set with the largest timestamp is used as the main one to compare
against (if only old path sets are available, the first healthy one is
used). If no instances are healthy, CheckIntegrity() will fail, as all
disks are failed.

Additionally, the notion of a unhealthy instance is added to allow
startup in the presence of disk failures, e.g. in failing to
canonicalize or in failing to read a path instance.

The main path set is checked to ensure its integrity (e.g. no
duplicates), after which it is upgraded with the extra metadata if
needed. It is then checked to ensure it is consistent with the healthy
instances (e.g. having the right UUIDs and paths).

Testing is done in a new iteration of CheckIntegrity(). Further testing
is done in data_dirs-test to ensure the directory manager can be opened
with failed disks. Testing is also added to fs_manager-test to ensure
the FS layout can be loaded with a failed directory.

Some notes:
- Disk failures during FS layout creation are not tolerated. In these
  cases, there is presumably no data on the server anyway, so operators
  should easily be able to fix their cluster.
- In the case of a server restart where all healthy disks fail to start
  up and some known failed disks start working again, the server will
  successfully start up with the bad disks (may have partially-written
  data).
- If there are any unhealthy instances when upgrading the path sets (i.e.
  adding disk states, paths, timestamp), a complete mapping of UUIDs to
  paths will not be available, and CheckIntegrity() will fail.
- The main path set's disk states are updated to reflect the failure of
  the instances. Since data on a failed disk cannot be trusted, disks
  that are successfully read from but are already marked FAILED by the
  main path set are not marked HEALTHY.

This patch is a part of a series of patches to handle disk failures. To
see how this fits, see 2.6 in:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: Ifddf0817fe1a82044077f5544c400c88de20769f
---
M src/kudu/fs/block_manager_util-test.cc
M src/kudu/fs/block_manager_util.cc
M src/kudu/fs/block_manager_util.h
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs.proto
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
12 files changed, 1,048 insertions(+), 199 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/70/7270/15
-- 
To view, visit http://gerrit.cloudera.org:8080/7270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifddf0817fe1a82044077f5544c400c88de20769f
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Add tablet state summary metrics and fix KUDU-2044

2017-08-17 Thread Will Berkeley (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7618

to look at the new patch set (#4).

Change subject: Add tablet state summary metrics and fix KUDU-2044
..

Add tablet state summary metrics and fix KUDU-2044

This patch adds metrics for the number of tablets in each state
(i.e. the number of tablets for each value of TabletStatePB). To
avoid contention in the tablet manager, the frequency with which
the state numbers are computed is limited. Requests that come
too soon for an update will receive cached counts. The maximum
frequency of updates is controlled by a new flag.

It also addresses KUDU-2044: tombstoned tablets will no longer
produce metrics in /metrics output. A metric entity can now be
marked as hidden, so it will not print to /metrics, and
tablet metric entities are marked as hidden when the tablet is
tombstoned, and un-hidden if and when the tablet is revived.

Change-Id: I8c82987ffe4f37e8af95972bde97841e44c521d9
---
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/util/metrics.cc
M src/kudu/util/metrics.h
7 files changed, 307 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/7618/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c82987ffe4f37e8af95972bde97841e44c521d9
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-1942. Kerberos fails to log in on hostnames with capital letters

2017-08-17 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: KUDU-1942. Kerberos fails to log in on hostnames with capital 
letters
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7693/2//COMMIT_MSG
Commit Message:

PS2, Line 12: With this change I was able to set up a working cluster on my 
laptop
: with a capitalized hostname, where before it would fail as 
described in
: the JIRA.
> I wasn't sure how to mock out this stuff in a reasonable way that would acc
This may not be the best way, but it is possible to do it through the external 
mini cluster test:

https://github.com/apache/kudu/blob/master/src/kudu/integration-tests/external_mini_cluster.cc#L664


-- 
To view, visit http://gerrit.cloudera.org:8080/7693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5ef65dd827459476a2d225d8e3f7c80ff2fdf627
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1942. Kerberos fails to log in on hostnames with capital letters

2017-08-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1942. Kerberos fails to log in on hostnames with capital 
letters
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7693/2//COMMIT_MSG
Commit Message:

PS2, Line 12: With this change I was able to set up a working cluster on my 
laptop
: with a capitalized hostname, where before it would fail as 
described in
: the JIRA.
> Is it worth adding a small test to catch a regression, if any?
I wasn't sure how to mock out this stuff in a reasonable way that would 
accurately capture the whole end-to-end including behavior of the krb5 lib, 
reverse DNS, etc.


-- 
To view, visit http://gerrit.cloudera.org:8080/7693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5ef65dd827459476a2d225d8e3f7c80ff2fdf627
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1942. Kerberos fails to log in on hostnames with capital letters

2017-08-17 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: KUDU-1942. Kerberos fails to log in on hostnames with capital 
letters
..


Patch Set 2: Code-Review+1

LGTM after Alexey's comments have been addressed.

-- 
To view, visit http://gerrit.cloudera.org:8080/7693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5ef65dd827459476a2d225d8e3f7c80ff2fdf627
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[kudu-CR] Add tablet state summary metrics and fix KUDU-2044

2017-08-17 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change.

Change subject: Add tablet state summary metrics and fix KUDU-2044
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7618/2//COMMIT_MSG
Commit Message:

PS2, Line 11: The numbers are computed by the heartbeater. There's two reasons
: for this:
: 1. the heartbeater was already computing the number of RUNNING
: and BOOTSTRAPPING tablets by holding the appropriate lock and
: iterating over the tablet map
: 2. the alternative to having some thread periodically iterate
: over the tablet map is to increment and decrement the metrics
: when tablets transition states. This is error-prone,
: particularly if new states are added, and mistakes will
: accumulate until the metric is worse than useless.
> I'm not that worried about missing some places where calls to TransitionFro
Done


http://gerrit.cloudera.org:8080/#/c/7618/2/src/kudu/util/metrics.h
File src/kudu/util/metrics.h:

Line 533:   std::atomic hidden_;
> The rest of the file uses util/atomic.h for atomics; please do the same her
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c82987ffe4f37e8af95972bde97841e44c521d9
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] security: only lookup hostname if HOST substitution is required

2017-08-17 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: security: only lookup hostname if _HOST substitution is required
..


Patch Set 2: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7694
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5de8647d6cf63ea70d880fa530fa289e8bae24fe
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[kudu-CR] Add tablet state summary metrics and fix KUDU-2044

2017-08-17 Thread Will Berkeley (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7618

to look at the new patch set (#3).

Change subject: Add tablet state summary metrics and fix KUDU-2044
..

Add tablet state summary metrics and fix KUDU-2044

This patch adds metrics for the number of tablets in each state
(i.e. the number of tablets for each value of TabletStatePB). To
avoid contention in the tablet manager, the frequency with which
the state numbers are computed is limited. Requests that come
too soon for an update will receive cached counts. The maximum
frequency of updates is controlled by a new flag.

It also addresses KUDU-2044: tombstoned tablets will no longer
produce metrics in /metrics output. A metric entity can now be
marked as hidden, so it will not print to /metrics, and
tablet metric entities are marked as hidden when the tablet is
tombstoned, and un-hidden if and when the tablet is revived.

Change-Id: I8c82987ffe4f37e8af95972bde97841e44c521d9
---
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/util/metrics.cc
M src/kudu/util/metrics.h
7 files changed, 305 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/7618/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c82987ffe4f37e8af95972bde97841e44c521d9
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] [gitignore] added *.autosave

2017-08-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: [gitignore] added *.autosave
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7702
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I99457244f007c757eb1a5a0aba235d8e33e4460e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] [gitignore] added *.autosave

2017-08-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: [gitignore] added *.autosave
..


[gitignore] added *.autosave

Change-Id: I99457244f007c757eb1a5a0aba235d8e33e4460e
Reviewed-on: http://gerrit.cloudera.org:8080/7702
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon 
---
M .gitignore
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7702
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I99457244f007c757eb1a5a0aba235d8e33e4460e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [java] Update outdated dependencies

2017-08-17 Thread Jean-Daniel Cryans (Code Review)
Jean-Daniel Cryans has submitted this change and it was merged.

Change subject: [java] Update outdated dependencies
..


[java] Update outdated dependencies

Includes the following version updates:
- Avro 1.8.1 -> 1.8.2
- Flume 1.6.0 -> 1.7.0
- Gradle 4.0.1 -> 4.1.0
- Jepsen 0.1.3 -> 0.1.5
- Many build plugin versions

Also moves and sorts the versions alphabetically.

Change-Id: I38ea098d3c4543916c43ddb76d4e127fee93ece6
Reviewed-on: http://gerrit.cloudera.org:8080/7647
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo 
Reviewed-by: Jean-Daniel Cryans 
---
M java/gradle/buildscript.gradle
M java/gradle/dependencies.gradle
M java/gradle/wrapper/gradle-wrapper.properties
M java/gradlew
M java/kudu-flume-sink/pom.xml
M java/kudu-jepsen/pom.xml
M java/kudu-spark-tools/pom.xml
M java/kudu-spark/pom.xml
M java/pom.xml
9 files changed, 42 insertions(+), 38 deletions(-)

Approvals:
  Jean-Daniel Cryans: Looks good to me, approved
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7647
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I38ea098d3c4543916c43ddb76d4e127fee93ece6
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] [iwyu] first pass

2017-08-17 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4738

to look at the new patch set (#18).

Change subject: [iwyu] first pass
..

[iwyu] first pass

Updated C++ source files in accordance with some of the recommendations
from include-what-you-use (IWYU) tool:
  * remove unused header files
  * add missing header files
  * use forward declarations when possible

As a result, time of compilation reduced ~10% if building with GNU make
in parallel with 48 jobs (make -j48) at ve0518.halxg.cloudera.com:

prior:
  real1m46.782s
  user47m1.633s
  sys 3m10.678s

after:
  real1m36.867s
  user42m17.340s
  sys 2m54.117s

The next step is automating the checks, so IWYU check would run
automatically (like the LINT build).

NOTE:
  As of now, not all recommendations from the tool are applied yet,
  especially for the test-related code.  That's because some
  recommendations produced by IWYU are incorrect.  Basically, every
  suggestion required manual inspection and judgment, at least it
  was necessary to verify the result code at least compiles.  As a
  result, the process of applying those recommendations is tedious
  and time consuming.  Hopefully, IWYU will get better in the future.
  Meanwhile, we can address the rest of the recommendations file-by-file
  or so in the short run.

Change-Id: I6923091be36a1c03fac34dd73e6b17e17ac9ceaa
---
M src/kudu/benchmarks/rle.cc
M src/kudu/benchmarks/tpch/line_item_tsv_importer.h
M src/kudu/benchmarks/tpch/rpc_line_item_dao-test.cc
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/benchmarks/tpch/rpc_line_item_dao.h
M src/kudu/benchmarks/tpch/tpch1.cc
M src/kudu/benchmarks/tpch/tpch_real_world.cc
M src/kudu/benchmarks/wal_hiccup.cc
M src/kudu/cfile/binary_dict_block.cc
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/binary_plain_block.h
M src/kudu/cfile/binary_prefix_block.cc
M src/kudu/cfile/binary_prefix_block.h
M src/kudu/cfile/block_cache-test.cc
M src/kudu/cfile/block_cache.cc
M src/kudu/cfile/block_cache.h
M src/kudu/cfile/block_compression.cc
M src/kudu/cfile/block_compression.h
M src/kudu/cfile/bloomfile-test-base.h
M src/kudu/cfile/bloomfile-test.cc
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/bshuf_block.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/cfile/cfile_util.cc
M src/kudu/cfile/cfile_util.h
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/cfile/encoding-test.cc
M src/kudu/cfile/index-test.cc
M src/kudu/cfile/index_block.cc
M src/kudu/cfile/index_block.h
M src/kudu/cfile/index_btree.cc
M src/kudu/cfile/index_btree.h
M src/kudu/cfile/mt-bloomfile-test.cc
M src/kudu/cfile/type_encodings.cc
M src/kudu/cfile/type_encodings.h
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test-util.cc
M src/kudu/client/client-test-util.h
M src/kudu/client/client-test.cc
M src/kudu/client/client-unittest.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/client_builder-internal.h
M src/kudu/client/error-internal.cc
M src/kudu/client/error-internal.h
M src/kudu/client/error_collector.cc
M src/kudu/client/error_collector.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/master_rpc.h
M src/kudu/client/meta_cache.cc
M src/kudu/client/meta_cache.h
M src/kudu/client/partitioner-internal.cc
M src/kudu/client/partitioner-internal.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/replica-internal.cc
M src/kudu/client/replica-internal.h
M src/kudu/client/resource_metrics.cc
M src/kudu/client/resource_metrics.h
M src/kudu/client/samples/sample.cc
M src/kudu/client/scan_batch.cc
M src/kudu/client/scan_batch.h
M src/kudu/client/scan_configuration.cc
M src/kudu/client/scan_configuration.h
M src/kudu/client/scan_predicate.cc
M src/kudu/client/scan_predicate.h
M src/kudu/client/scan_token-internal.cc
M src/kudu/client/scan_token-internal.h
M src/kudu/client/scan_token-test.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/client/scanner-internal.h
M src/kudu/client/schema-internal.h
M src/kudu/client/schema.cc
M src/kudu/client/schema.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/shared_ptr.h
M src/kudu/client/table-internal.cc
M src/kudu/client/table-internal.h
M src/kudu/client/table_alterer-internal.cc
M src/kudu/client/table_alterer-internal.h
M src/kudu/client/table_creator-internal.cc
M src/kudu/client/table_creator-internal.h
M src/kudu/client/tablet-internal.h
M src/kudu/client/tablet_server-internal.cc
M src/kudu/client/tablet_server-internal.h
M src/kudu/client/value-internal.h
M src/kudu/client/value.cc
M src/kudu/client/value.h
M src/kudu/client/write_op.cc
M 

[kudu-CR] KUDU-2032 (part 2): propagate master hostnames into client

2017-08-17 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-2032 (part 2): propagate master hostnames into client
..


Patch Set 2:

(1 comment)

looks good to me, just a nit

http://gerrit.cloudera.org:8080/#/c/7692/2/src/kudu/client/master_rpc.cc
File src/kudu/client/master_rpc.cc:

PS2, Line 92: /
nit: remove


-- 
To view, visit http://gerrit.cloudera.org:8080/7692
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie5838f22c96f757124112b505825a53740468ce1
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] disk failure: add persistent disk states

2017-08-17 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7270

to look at the new patch set (#14).

Change subject: disk failure: add persistent disk states
..

disk failure: add persistent disk states

This patch adds a new version of the path set so failed disks are not
used the next time the server is run. This is needed, as data on a
failed disk may be partially written and thus should not be used.

To accomplish this, disk states, path names, and a timestamp are added
to the path set. Additionally, if any disks are failed when loading
instances from disk, an 'unhealthy' instance with no path set metadata
will be created instead of failing the startup process.

CheckIntegrity() is updated to accomodate this. Rather than
comparing all instances against an agreed-upon set of UUIDs, the single
path set with the largest timestamp is used as the main one to compare
against (if only old path sets are available, the first healthy one is
used). If no instances are healthy, CheckIntegrity() will fail, as all
disks are failed.

Additionally, the notion of a unhealthy instance is added to allow
startup in the presence of disk failures, e.g. in failing to
canonicalize or in failing to read a path instance.

The main path set is checked to ensure its integrity (e.g. no
duplicates), after which it is upgraded with the extra metadata if
needed. It is then checked to ensure it is consistent with the healthy
instances (e.g. having the right UUIDs and paths).

Testing is done in a new iteration of CheckIntegrity(). Further testing
is done in data_dirs-test to ensure the directory manager can be opened
with failed disks. Testing is also added to fs_manager-test to ensure
the FS layout can be loaded with a failed directory.

Some notes:
- In the case of a server restart where all healthy disks fail to start
  up and some known failed disks start working again, the server will
  successfully start up with the bad disks (may have partially-written
  data).
- If there are any unhealthy instances when upgrading the path sets (i.e.
  adding disk states, paths, timestamp), a complete mapping of UUIDs to
  paths will not be available, and CheckIntegrity() will fail.
- The main path set's disk states are updated to reflect the failure of
  the instances. Since data on a failed disk cannot be trusted, disks
  that are successfully read from but are already marked FAILED by the
  main path set are not marked HEALTHY.

This patch is a part of a series of patches to handle disk failures. To
see how this fits, see 2.6 in:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: Ifddf0817fe1a82044077f5544c400c88de20769f
---
M src/kudu/fs/block_manager_util-test.cc
M src/kudu/fs/block_manager_util.cc
M src/kudu/fs/block_manager_util.h
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs.proto
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
12 files changed, 1,055 insertions(+), 193 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/70/7270/14
-- 
To view, visit http://gerrit.cloudera.org:8080/7270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifddf0817fe1a82044077f5544c400c88de20769f
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [gitignore] added *.autosave

2017-08-17 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7702

Change subject: [gitignore] added *.autosave
..

[gitignore] added *.autosave

Change-Id: I99457244f007c757eb1a5a0aba235d8e33e4460e
---
M .gitignore
1 file changed, 2 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/7702/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7702
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I99457244f007c757eb1a5a0aba235d8e33e4460e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [iwyu] first pass

2017-08-17 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4738

to look at the new patch set (#17).

Change subject: [iwyu] first pass
..

[iwyu] first pass

Updated C++ source files in accordance with some of the recommendations
from include-what-you-use (IWYU) tool:
  * remove unused header files
  * add missing header files
  * use forward declarations when possible

As a result, time of compilation reduced ~10% if building with GNU make
in parallel with 48 jobs (make -j48) at ve0518.halxg.cloudera.com:

prior:
  real1m46.782s
  user47m1.633s
  sys 3m10.678s

after:
  real1m36.867s
  user42m17.340s
  sys 2m54.117s

The next step is automating the checks, so IWYU check would run
automatically (like the LINT build).

NOTE:
  As of now, not all recommendations from the tool are applied yet,
  especially for the test-related code.  That's because some
  recommendations produced by IWYU are incorrect.  Basically, every
  suggestion required manual inspection and judgment, at least it
  was necessary to verify the result code at least compiles.  As a
  result, the process of applying those recommendations is tedious
  and time consuming.  Hopefully, IWYU will get better in the future.
  Meanwhile, we can address the rest of the recommendations file-by-file
  or so in the short run.

Change-Id: I6923091be36a1c03fac34dd73e6b17e17ac9ceaa
---
M src/kudu/benchmarks/rle.cc
M src/kudu/benchmarks/tpch/line_item_tsv_importer.h
M src/kudu/benchmarks/tpch/rpc_line_item_dao-test.cc
M src/kudu/benchmarks/tpch/rpc_line_item_dao.cc
M src/kudu/benchmarks/tpch/rpc_line_item_dao.h
M src/kudu/benchmarks/tpch/tpch1.cc
M src/kudu/benchmarks/tpch/tpch_real_world.cc
M src/kudu/benchmarks/wal_hiccup.cc
M src/kudu/cfile/binary_dict_block.cc
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/binary_plain_block.h
M src/kudu/cfile/binary_prefix_block.cc
M src/kudu/cfile/binary_prefix_block.h
M src/kudu/cfile/block_cache-test.cc
M src/kudu/cfile/block_cache.cc
M src/kudu/cfile/block_cache.h
M src/kudu/cfile/block_compression.cc
M src/kudu/cfile/block_compression.h
M src/kudu/cfile/bloomfile-test-base.h
M src/kudu/cfile/bloomfile-test.cc
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/bshuf_block.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/cfile/cfile_util.cc
M src/kudu/cfile/cfile_util.h
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/cfile/encoding-test.cc
M src/kudu/cfile/index-test.cc
M src/kudu/cfile/index_block.cc
M src/kudu/cfile/index_block.h
M src/kudu/cfile/index_btree.cc
M src/kudu/cfile/index_btree.h
M src/kudu/cfile/mt-bloomfile-test.cc
M src/kudu/cfile/type_encodings.cc
M src/kudu/cfile/type_encodings.h
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test-util.cc
M src/kudu/client/client-test-util.h
M src/kudu/client/client-test.cc
M src/kudu/client/client-unittest.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/client_builder-internal.h
M src/kudu/client/error-internal.cc
M src/kudu/client/error-internal.h
M src/kudu/client/error_collector.cc
M src/kudu/client/error_collector.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/master_rpc.h
M src/kudu/client/meta_cache.cc
M src/kudu/client/meta_cache.h
M src/kudu/client/partitioner-internal.cc
M src/kudu/client/partitioner-internal.h
M src/kudu/client/predicate-test.cc
M src/kudu/client/replica-internal.cc
M src/kudu/client/replica-internal.h
M src/kudu/client/resource_metrics.cc
M src/kudu/client/resource_metrics.h
M src/kudu/client/samples/sample.cc
M src/kudu/client/scan_batch.cc
M src/kudu/client/scan_batch.h
M src/kudu/client/scan_configuration.cc
M src/kudu/client/scan_configuration.h
M src/kudu/client/scan_predicate.cc
M src/kudu/client/scan_predicate.h
M src/kudu/client/scan_token-internal.cc
M src/kudu/client/scan_token-internal.h
M src/kudu/client/scan_token-test.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/client/scanner-internal.h
M src/kudu/client/schema-internal.h
M src/kudu/client/schema.cc
M src/kudu/client/schema.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/shared_ptr.h
M src/kudu/client/table-internal.cc
M src/kudu/client/table-internal.h
M src/kudu/client/table_alterer-internal.cc
M src/kudu/client/table_alterer-internal.h
M src/kudu/client/table_creator-internal.cc
M src/kudu/client/table_creator-internal.h
M src/kudu/client/tablet-internal.h
M src/kudu/client/tablet_server-internal.cc
M src/kudu/client/tablet_server-internal.h
M src/kudu/client/value-internal.h
M src/kudu/client/value.cc
M src/kudu/client/value.h
M src/kudu/client/write_op.cc
M 

[kudu-CR] KUDU-1726: Avoid fsync-per-block in tablet copy

2017-08-17 Thread Hao Hao (Code Review)
Hao Hao has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7701

Change subject: KUDU-1726: Avoid fsync-per-block in tablet copy
..

KUDU-1726: Avoid fsync-per-block in tablet copy

This patch incorporates BlockTransaction API with tablet
copy, to avoid fsync() on each block during copying. Blocks
are sync to disk together once the table copy is complete.

I did a manual test to copy tablet with size of ~37GB.
With this change, the operation time dropped down from ~718s
to ~523s.

Change-Id: I7534699f589c7060ffe32d7ac67546476cf21e76
---
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
3 files changed, 27 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/01/7701/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7701
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7534699f589c7060ffe32d7ac67546476cf21e76
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 


[kudu-CR] KUDU-1726: Avoid fsync-per-block in tablet copy

2017-08-17 Thread Hao Hao (Code Review)
Hao Hao has abandoned this change.

Change subject: KUDU-1726: Avoid fsync-per-block in tablet copy
..


Abandoned

-- 
To view, visit http://gerrit.cloudera.org:8080/7700
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I73822754fa3537535478bdd9a017e1cdf5b03e9a
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-17 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#19).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that for LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost: --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 594 insertions(+), 310 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/19
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1726: Avoid fsync-per-block in tablet copy

2017-08-17 Thread Hao Hao (Code Review)
Hao Hao has uploaded a new patch set (#2).

Change subject: KUDU-1726: Avoid fsync-per-block in tablet copy
..

KUDU-1726: Avoid fsync-per-block in tablet copy

This patch incorporates BlockTransaction API with tablet
copy, to avoid fsync() on each block during copying. Blocks
are sync to disk together once the table copy is complete.

I did a manual test to copy tablet with size of ~37GB.
With this change, the operation time dropped down from ~718s
to ~523s.

Change-Id: I73822754fa3537535478bdd9a017e1cdf5b03e9a
---
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
3 files changed, 27 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/00/7700/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7700
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I73822754fa3537535478bdd9a017e1cdf5b03e9a
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-1726: Avoid fsync-per-block in tablet copy

2017-08-17 Thread Hao Hao (Code Review)
Hao Hao has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7700

Change subject: KUDU-1726: Avoid fsync-per-block in tablet copy
..

KUDU-1726: Avoid fsync-per-block in tablet copy

This patch incorporates BlockTransaction API with tablet
copy, to avoid fsync() on each block during copying. Blocks
are sync to disk together once the table copy is complete.

I did a manual test to copy tablet with size of ~37GB.
With this change, the operation time dropped down from ~718s
to ~523s.

Change-Id: I73822754fa3537535478bdd9a017e1cdf5b03e9a
---
A src/1943.patch
A src/deletion.patch
A src/inject_failure_test.patch
A src/inject_fault.patch
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
A src/reorder.patch
A src/tablet_copy.patch
9 files changed, 3,513 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/00/7700/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7700
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I73822754fa3537535478bdd9a017e1cdf5b03e9a
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao