[kudu-CR] tablet copy: Allow voting from crashed initial tablet copies

2017-09-05 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: tablet copy: Allow voting from crashed initial tablet copies
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7961/3/src/kudu/integration-tests/tablet_copy-itest.cc
File src/kudu/integration-tests/tablet_copy-itest.cc:

PS3, Line 1155: ASSERT_EVENTUALLY
By default, this times out in 30 seconds (which is exactly kTimeout for this 
test).  Would it make sense to set some custom timeout while awaiting for 
eventual success (i.e. multiple of kTimeout)?


PS3, Line 1161: ASSERT_OK
What it itest::AddServer() returns other than IllegalState, not-the-leader 
status?  Is there a risk to overlook some other bug in that case if using such 
this sort of 'blanket' retry?


PS3, Line 1187:   LOG(INFO) << "Restarting tablet servers. New replica TS UUID: 
" << new_replica_uuid
  : << ", tablet data state: "
  : << 
tablet::TabletDataState_Name(superblock.tablet_data_state())
  : << ", last-logged opid: " << 
superblock.tombstone_last_logged_opid();
nit: would it make sense to move this after ASSERT_OPID_EQ() and before 
ASSERT_OK(cluster_->master()->Restart()) ?


http://gerrit.cloudera.org:8080/#/c/7961/3/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

PS3, Line 401: RETURN_NOT_OK(meta_->ReplaceSuperBlock(*superblock_));
If not clearing the tombstone_last_logged_opid field in the superblock, would 
tombstone_last_logged_opid_ be still consistent after call of the 
DeleteTabletData() below?  Or it's not relevant here?


-- 
To view, visit http://gerrit.cloudera.org:8080/7961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idbc9cecac701c24ebe05759adbbb1fbe2b140506
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] [tests] fixed flake in delete table-itest

2017-09-05 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded a new patch set (#2).

Change subject: [tests] fixed flake in delete_table-itest
..

[tests] fixed flake in delete_table-itest

Fixed flake in DeleteTableITest.TestNoDeleteTombstonedTablets: if leader
election happened in the middle of the AddServer/RemoveServer sequence,
the test failed with error messages like the following:

  src/kudu/integration-tests/delete_table-itest.cc:1217: Failure
  Failed
  Bad status: Illegal state: Replica 37783b00d5d34ffe87953cb90fa60e7c \
is not leader of this config. Role: FOLLOWER.

If running tests against a DEBUG build, it were about 3 failures
per 1K run:
  http://dist-test.cloudera.org/job?job_id=aserbin.1504655654.10466

After the fix, there were no failures in multiple 1K runs:
  http://dist-test.cloudera.org/job?job_id=aserbin.1504670739.3127

Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
---
M src/kudu/integration-tests/delete_table-itest.cc
1 file changed, 106 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/7972/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7972
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] [tests] fixed flake in delete table-itest

2017-09-05 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7972

Change subject: [tests] fixed flake in delete_table-itest
..

[tests] fixed flake in delete_table-itest

Fixed flake in DeleteTableITest.TestNoDeleteTombstonedTablets: if leader
election happened in the middle of the AddServer/RemoveServer sequence,
the test failed with error messages like the following:

  src/kudu/integration-tests/delete_table-itest.cc:1217: Failure
  Failed
  Bad status: Illegal state: Replica 37783b00d5d34ffe87953cb90fa60e7c \
is not leader of this config. Role: FOLLOWER.

If running tests against a DEBUG build, it were about 3 failures
per 1K run:
  http://dist-test.cloudera.org/job?job_id=aserbin.1504655654.10466

After the fix, there were no failures in multiple 1K runs:
  http://dist-test.cloudera.org/job?job_id=aserbin.1504670739.3127

Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
---
M src/kudu/integration-tests/delete_table-itest.cc
1 file changed, 104 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/7972/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7972
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 


[kudu-CR] KUDU-1807: improve IsCreateInProgress performance

2017-09-05 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1807: improve IsCreateInProgress performance
..


Patch Set 2:

(11 comments)

> Is there any way we can add a debug-mode consistency check of some
 > sort that verifies that the map of counts maps the calculated
 > iteration? even if it's correct right now it seems like it may
 > regress and be really hard to debug at some point later. (iirc the
 > Java client just loops forever trying to fetch locations if it
 > thinks the table is in-progress)

I added a function to do this verification, and added a call to it in 
IsCreateInProgress. Let me know if you think I should call it elsewhere, or add 
something else.

http://gerrit.cloudera.org:8080/#/c/7957/2/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

Line 637:   // Then, commit the mutations.
> is there some race window here where a caller ends up asking for GetTableLo
You're right that there's a race window here, and that it can cause 
GetTableLocations to think that a tablet isn't yet running. However, all of our 
client implementations will treat this as a transient state and retry with some 
backoff.

Given the above, I wasn't sure whether this is worth fixing. What ultimately 
convinced me was that it's needed for the correctness of a "compare the tablet 
state count map contents against the actual state counts" function.


PS2, Line 663: set_state
> perhaps this should be named swap_state to indicate its return value better
Done


Line 665: state_changes_[tablet->table()][old_state]--;
> can we DCHECK here that this never goes negative?
On the contrary: it probably will be negative, because state_changes_ 
represents the deltas that will be applied rather than the absolute counts.

In any case, this is moot now because we're no longer aggregating state changes 
for tables.


Line 672:   const PersistentTabletInfo& data(
> nit: is this wrapping necessary?
I like to wrap at 80, and these two lines were 84 and 86 respectively. But I'll 
unwrap since you found it distracting.


Line 681:   PersistentTabletInfo* mutable_data(
> nit: same (wrapping)?
Done


PS2, Line 3697: unlocker_in
> maybe 'committer' is a better name
Done, though I preserved the 'in' and 'out' suffixes to help navigate between 
these functions and ProcessPendingAssignments.


PS2, Line 4468: SysTabletsEntryPB::RUNNING
> don't we have to worry about REPLACED and DELETED? or do those actually get
I tried to maintain the same semantics as the old implementation, which 
returned true when it found any non-RUNNING tablet.

A tablet switches to REPLACED in ProcessPendingAssignments but is implicitly 
removed from the map when AddRemoveTablets is used to add the actual 
replacement tablet. It's interesting that here the mutation is committed before 
AddRemoveTablets is called, so there's a very brief window where a REPLACED 
tablet could be visible. I'm guessing this is harmless because it also implies 
a CREATING tablet, which means the table is definitely still being created.

A tablet switches to DELETED either in DeleteTable or in AlterTable (when a 
range partition is dropped). In the former, the entire table switches to 
DELETED, which causes IsCreateInProgress callers to short-circuit. In the 
latter, a DELETED tablet is explicitly dropped via AddRemoveTablets. And unlike 
ProcessPendingAssignments, the mutation is committed after the tablet is 
dropped, so the DELETED tablet is never visible.


PS2, Line 4591:   DCHECK_GT(it->second, 0);
> can you DCHECK_GE() vs 'delta'?
Moot, Increment/Decrement now always add/remove just one.


http://gerrit.cloudera.org:8080/#/c/7957/2/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

PS2, Line 123: get_state
> unusual naming for a getter
It's because the three callers are all tablet->metadata().state().get_state(). 
So if this were just state(), the calls would be 
tablet->metadata().state().state(), which I found confusing.

I can still change this if you prefer.


PS2, Line 244: TabletStateMap
> how about TabletStateCountMap? otherwise I would expect this type to be key
Done


PS2, Line 318:   void 
IncrementTabletStateCountUnlocked(SysTabletsEntryPB::State state,
 :  int64_t delta = 1);
 :   void 
DecrementTabletStateCountUnlocked(SysTabletsEntryPB::State state,
 :  int64_t delta = 1);
> if you are taking counts here, it seems a little odd to have the two separa
My original implementation used a single method which allowed negative deltas, 
but then I saw that the implementation was pretty different for negative deltas 
(due to additional invariant enforcement), and the symmetry with the 
SchemaVersionCount methods looked nice.

Anyway, this is moot now because per-table state change batching is no longer a 
thing, so we always 

[kudu-CR] KUDU-1807: improve IsCreateInProgress performance

2017-09-05 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7957

to look at the new patch set (#3).

Change subject: KUDU-1807: improve IsCreateInProgress performance
..

KUDU-1807: improve IsCreateInProgress performance

For some inexplicable reason, IsCreateInProgress is invoked when servicing a
GetTableSchema RPC. While it's attractive to change that, it's also
untenable without affecting backwards compatibility. So instead, here's a
patch that adds in-memory caching to the master so that IsCreateInProgress
needn't acquire N tablet locks.

Just like schema_version_counts_, the cached state is a map of all tablet
states to the number of tablets currently in that state. For this particular
problem a single count of "number of creating tablets" might suffice, but I
liked the consistency with schema_version_counts_ and I found it easier to
handle the various corner cases when accounting for all states.

Because the map contents reflect persistent state, the locking semantics
differ somewhat from schema_version_counts_:
1. Tablet state changes are usually wrapped in a tablet metadata lock, so
   care must be taken to only update the state if the lock commits.
2. Further, the count must be updated with the tablet metadata lock held.
   Why? So that count updates are serialized in the same order as state
   changes themselves. Consider an example where two threads are trying to
   update a tablet's state (currently X) to Y and Z respectively. The count
   of state X begins at 1 (C_X=1) and the counts of states Y and Z are 0
   (C_Y=C_Z=0). There's no defined order so whichever state change happens
   second "wins". However, if the count updates were allowed outside the
   metadata locks, it would be possible for the order of operations to be
   X->Y, Y->Z, (C_Y-- C_Z++), (C_X-- C_Y++), causing C_Y to momentarily go
   negative. If count updates are restricted to inside the metadata locks, a
   state change order of X->Y, Y->Z forces the count update order to be
   (C_X-- C_Y++), (C_Y--, C_Z++).
3. Lastly, the count must also be updated with the tablet metadata lock in
   commit mode. Otherwise it's possible for the state change to "leak" to
   another thread via the cached map (and thus affect IsCreateInProgress)
   but not via the actual tablet state.

To accommodate these semantics, the ScopedTabletInfoCommitter was modified
to support tracking state changes. There are two notable (and confusing)
exceptions which are documented in the code.

Change-Id: Idca379bb1249e54c0b7fbb0df408ed829ab9fbd2
---
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/sys_catalog-test.cc
M src/kudu/util/cow_object.h
4 files changed, 261 insertions(+), 80 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/57/7957/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7957
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idca379bb1249e54c0b7fbb0df408ed829ab9fbd2
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2119. Fix failure in encoding-test

2017-09-05 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: KUDU-2119. Fix failure in encoding-test
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7967/1/src/kudu/cfile/encoding-test.cc
File src/kudu/cfile/encoding-test.cc:

PS1, Line 255: or '\0', but 
Will brought this up when we were discussing this earlier and thought I'd bring 
it up: do we miss out test coverage for binary blobs that do start with '\0' or 
are there other tests that cover?


-- 
To view, visit http://gerrit.cloudera.org:8080/7967
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic46d2a993235e560475d931c7b023eb5b4faf7c2
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] catalog manager: improve IsAlterInProgress performance

2017-09-05 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged.

Change subject: catalog_manager: improve IsAlterInProgress performance
..


catalog_manager: improve IsAlterInProgress performance

To mentally prepare for KUDU-1807, I decided to also tackle the O(n)
behavior of IsAlterInProgress. It's not nearly as bad as IsCreateInProgress
(since we aren't taking cow locks on each tablet) but is still not great
since IsAlterInProgress is called repeatedly by clients.

The solution involves a new map in TableInfo to track the schema versions
of all tablets. Keeping the map up-to-date requires additional work when
adding/removing tablets and when tablet reports include a schema change, but
these are infrequent operations relative to IsAlterInProgress so I figured
the trade off is worth it.

I'm not particularly happy about the lock acquisition in
set_reported_schema_version() but couldn't see a good way to avoid it.

I looped 1000 runs of alter_table-randomized-test. The ~5 failures I saw
were due to KUDU-1539, which I don't think this patch makes us any more or
less vulnerable to.

Change-Id: Id8c1f48c0febad038833edd555ee88f1db83249d
Reviewed-on: http://gerrit.cloudera.org:8080/7950
Reviewed-by: Dan Burkert 
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon 
---
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
2 files changed, 105 insertions(+), 22 deletions(-)

Approvals:
  Dan Burkert: Looks good to me, approved
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Id8c1f48c0febad038833edd555ee88f1db83249d
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] cow object: lazily copy dirty state

2017-09-05 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged.

Change subject: cow_object: lazily copy dirty state
..


cow_object: lazily copy dirty state

This addresses a long-standing TODO that ought to improve performance of
tablet reports in the steady-state: within the COWed objects, defer the
creation of the "dirty" copy until it's actually needed (i.e. the first call
to mutable_dirty()).

The deferral introduces new branches in dirty() and mutable_dirty(), but
given the paucity of those calls for any given lock instance, I think this
is a reasonable trade-off.

Change-Id: Ic00c573717e7cb8df1942ba0760726fd0961e1d6
Reviewed-on: http://gerrit.cloudera.org:8080/7951
Reviewed-by: Dan Burkert 
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon 
---
M src/kudu/master/catalog_manager.cc
M src/kudu/util/cow_object.h
2 files changed, 21 insertions(+), 15 deletions(-)

Approvals:
  Dan Burkert: Looks good to me, approved
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic00c573717e7cb8df1942ba0760726fd0961e1d6
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2119. Fix failure in encoding-test

2017-09-05 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-2119. Fix failure in encoding-test
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7967/1//COMMIT_MSG
Commit Message:

Line 16: The fix just replaces any '\0' characters in the random format string.
Does this also fix 
http://dist-test.cloudera.org:8080/diagnose?key=a32eeefe-906c-11e7-8752-0242ac11000f?
 That's a slightly different failure:

  [ RUN  ] TestEncoding.TestBinaryPrefixBlockBuilderRoundTrip
  I0902 22:56:26.483465  8078 test_util.cc:195] Using random seed: -1483122939
  I0902 22:56:26.483530  8078 encoding-test.cc:288] Block: 00: 1000 3930 
1000  ..90..
  /data/jenkins-workspace/kudu-workspace/src/kudu/cfile/encoding-test.cc:295: 
Failure
  Failed
  Bad status: Not found: no keys in data block


-- 
To view, visit http://gerrit.cloudera.org:8080/7967
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic46d2a993235e560475d931c7b023eb5b4faf7c2
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2134: Defer block transaction commit to the end of a tablet copy

2017-09-05 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-2134: Defer block transaction commit to the end of a 
tablet copy
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7966/1//COMMIT_MSG
Commit Message:

PS1, Line 10: expires
to expire


http://gerrit.cloudera.org:8080/#/c/7966/1/src/kudu/tserver/tablet_copy_client.h
File src/kudu/tserver/tablet_copy_client.h:

Line 125:   Status Commit(fs::BlockTransaction* transaction);
Why not do this as part of Finish? You could make the BlockTransaction part of 
TabletCopyClient to avoid passing it back and forth too.


-- 
To view, visit http://gerrit.cloudera.org:8080/7966
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4feabc08c2199aad8d08be56f09ac06924345f2b
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2119. Fix failure in encoding-test

2017-09-05 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-2119. Fix failure in encoding-test
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7967/1/src/kudu/cfile/encoding-test.cc
File src/kudu/cfile/encoding-test.cc:

PS1, Line 277: LOG(INFO) << "format string: " << format_string;
nit: is it important to have this information to be output during the test even 
in case of success?  If not, consider replaces this LOG(INFO) and LOG(INFO) at 
like 289 below with signle

SCOPED_TRACE(strings::Substitute("format string: $0; block: $1", format_string, 
HexDump(s)));


-- 
To view, visit http://gerrit.cloudera.org:8080/7967
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic46d2a993235e560475d931c7b023eb5b4faf7c2
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] tablet copy: Allow voting from crashed initial tablet copies

2017-09-05 Thread Mike Percy (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7961

to look at the new patch set (#3).

Change subject: tablet copy: Allow voting from crashed initial tablet copies
..

tablet copy: Allow voting from crashed initial tablet copies

The change in c35e4f0093dcdec625d1f647924d854c7bc9f3de missed an
important case, which is when the target tablet server in a tablet copy
operation crashes while in the middle of a tablet copy. In that case,
the 'tombstone_last_logged_opid' of 1.0 was not persisted in the
superblock. This manifested as a very flaky
TabletCopyITest.TestTabletCopyNewReplicaFailureCanVote test, which would
fail about 15% of the time.

With the change in this patch, that test now passes reliably:

http://dist-test.cloudera.org/job?job_id=mpercy.1504654266.31446

Change-Id: Idbc9cecac701c24ebe05759adbbb1fbe2b140506
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/ts_tablet_manager.cc
10 files changed, 90 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/7961/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idbc9cecac701c24ebe05759adbbb1fbe2b140506
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR](branch-1.2.x) KUDU-2083. Decrement running maintenance ops on failed prepare

2017-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7970

Change subject: KUDU-2083. Decrement running maintenance ops on failed prepare
..

KUDU-2083. Decrement running maintenance ops on failed prepare

There is currently a bug where we don't decrement the number of
running ops when an op->Prepare() fails. Although rare, when this
bug is hit, it will decrease the number of simultaneous mm ops
that can run until none can, causing the tserver to run OOM.

Change-Id: I8022bcd4c6470dfef2dece0cbefede916a752291
Reviewed-on: http://gerrit.cloudera.org:8080/7610
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
(cherry picked from commit b365ad0bd6372d459f547da0f1cb82e36148c541)
---
M src/kudu/util/maintenance_manager.cc
1 file changed, 1 insertion(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/70/7970/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7970
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8022bcd4c6470dfef2dece0cbefede916a752291
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.2.x
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 


[kudu-CR](branch-1.3.x) KUDU-2083. Decrement running maintenance ops on failed prepare

2017-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7969

Change subject: KUDU-2083. Decrement running maintenance ops on failed prepare
..

KUDU-2083. Decrement running maintenance ops on failed prepare

There is currently a bug where we don't decrement the number of
running ops when an op->Prepare() fails. Although rare, when this
bug is hit, it will decrease the number of simultaneous mm ops
that can run until none can, causing the tserver to run OOM.

Change-Id: I8022bcd4c6470dfef2dece0cbefede916a752291
Reviewed-on: http://gerrit.cloudera.org:8080/7610
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
(cherry picked from commit b365ad0bd6372d459f547da0f1cb82e36148c541)
---
M src/kudu/util/maintenance_manager.cc
1 file changed, 1 insertion(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/69/7969/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7969
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8022bcd4c6470dfef2dece0cbefede916a752291
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.3.x
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 


[kudu-CR](branch-1.4.x) KUDU-2083. Decrement running maintenance ops on failed prepare

2017-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7968

Change subject: KUDU-2083. Decrement running maintenance ops on failed prepare
..

KUDU-2083. Decrement running maintenance ops on failed prepare

There is currently a bug where we don't decrement the number of
running ops when an op->Prepare() fails. Although rare, when this
bug is hit, it will decrease the number of simultaneous mm ops
that can run until none can, causing the tserver to run OOM.

Change-Id: I8022bcd4c6470dfef2dece0cbefede916a752291
Reviewed-on: http://gerrit.cloudera.org:8080/7610
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
(cherry picked from commit b365ad0bd6372d459f547da0f1cb82e36148c541)
---
M src/kudu/util/maintenance_manager.cc
1 file changed, 1 insertion(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/7968/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8022bcd4c6470dfef2dece0cbefede916a752291
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.4.x
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 


[kudu-CR] KUDU-2119. Fix failure in encoding-test

2017-09-05 Thread Todd Lipcon (Code Review)
Hello Will Berkeley,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/7967

to review the following change.

Change subject: KUDU-2119. Fix failure in encoding-test
..

KUDU-2119. Fix failure in encoding-test

In commit d1f53cc32 I introduced randomization for the format string
used for the generated string data in this test. The random format
string could sometimes incorporate '\0' bytes, which, in the worst case,
could result in a string of length 0 or 1. This would then cause a later
assertion to fail that was checking that the encoded data be at least
two bytes per string.

The fix just replaces any '\0' characters in the random format string.

Change-Id: Ic46d2a993235e560475d931c7b023eb5b4faf7c2
---
M src/kudu/cfile/encoding-test.cc
1 file changed, 3 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/7967/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7967
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic46d2a993235e560475d931c7b023eb5b4faf7c2
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2130 (part 2): more fixes for ITClientStress

2017-09-05 Thread Dan Burkert (Code Review)
Dan Burkert has submitted this change and it was merged.

Change subject: KUDU-2130 (part 2): more fixes for ITClientStress
..


KUDU-2130 (part 2): more fixes for ITClientStress

This fixes some more race conditions in connection termination in the
same vein as part 1.  It also filters benign SSLException from being
returned back to callers.

Change-Id: Ic3f518513931c660fd93b4272d1b1fceb268f191
Reviewed-on: http://gerrit.cloudera.org:8080/7964
Reviewed-by: Alexey Serbin 
Tested-by: Kudu Jenkins
---
M java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
M java/kudu-client/src/test/java/org/apache/kudu/client/ITClientStress.java
2 files changed, 22 insertions(+), 8 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic3f518513931c660fd93b4272d1b1fceb268f191
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2134: Defer block transaction commit to the end of a tablet copy

2017-09-05 Thread Hao Hao (Code Review)
Hao Hao has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7966

Change subject: KUDU-2134: Defer block transaction commit to the end of a 
tablet copy
..

KUDU-2134: Defer block transaction commit to the end of a tablet copy

KUDU-2131 uncovered a regression that causes a tablet copy session
expires reliably if the copied tablet is large and the tserver already
has a high number of containers. Even though the issue is mitigated by
LIFO container selection, we can completely avoid it by deferring the
block transaction commit to the end of tablet copy seesion. There are
mainly two reasons we may still want to do so:
 1) If DownloadWALs fails there's no reason to pay the fsync() price
and commit all those blocks.
 2) While DownloadWALs is running the kernel has more time to eagerly
flush the blocks, so the fsync()s at the end could be cheaper.

This patch moves the block transaction commit out from FetchAll() and
commits all downloaded blocks before replacing the superblock.

Change-Id: I4feabc08c2199aad8d08be56f09ac06924345f2b
---
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tools/tool_action_local_replica.cc
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/ts_tablet_manager.cc
6 files changed, 58 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/7966/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7966
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I4feabc08c2199aad8d08be56f09ac06924345f2b
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 


[kudu-CR] Implement a lock table

2017-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: Implement a lock table
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7918/2/src/kudu/util/lock_table.h
File src/kudu/util/lock_table.h:

PS2, Line 40: T
I think calling this KEY or K would be clearer (at first wasn't sure what it 
was).

Also the docs should clarify what requirements there are of the 'T' type (eg 
naturally comparable with operator ==? must be hashable with std::hash? etc)


PS2, Line 46: std::shared_ptr
hrm, why is this shared instead of just a move-only type combined with 
boost::optional<> to indicate failure


Line 48: auto result = slots_.insert(std::move(key));
how about using InsertIfNotPresent? and not doing the std::move here but rather 
down below when instantiating the token?


PS2, Line 52: this->s
'this->' is redundant right?


PS2, Line 69: slots_
to me the word 'slots' indicates that multiple entries may fall into the same 
'slot' or something (ie that there are some fixed number of slots). Maybe just 
'locks' or 'keys_' or something?


Line 98:   const T& key() {
can be a const method


-- 
To view, visit http://gerrit.cloudera.org:8080/7918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0404478675e079ba147fbdb4b0bac9db0cdc87e5
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2130 (part 2): more fixes for ITClientStress

2017-09-05 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-2130 (part 2): more fixes for ITClientStress
..


Patch Set 2: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7964/1/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java:

Line 354: lock.lock();
> Yep, found the following in an ITClientStress log:
Aha, I see.  Thanks!


PS1, Line 427: error = new RecoverableExceptio
> Done
Thank you for adding the explanatory comment.


-- 
To view, visit http://gerrit.cloudera.org:8080/7964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3f518513931c660fd93b4272d1b1fceb268f191
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1807: improve IsCreateInProgress performance

2017-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1807: improve IsCreateInProgress performance
..


Patch Set 2:

(11 comments)

I only partially followed this patch... it's one of those ones that's hard to 
verify by looking at it visually.

Is there any way we can add a debug-mode consistency check of some sort that 
verifies that the map of counts maps the calculated iteration? even if it's 
correct right now it seems like it may regress and be really hard to debug at 
some point later. (iirc the Java client just loops forever trying to fetch 
locations if it thinks the table is in-progress)

http://gerrit.cloudera.org:8080/#/c/7957/2/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

Line 637:   // Then, commit the mutations.
is there some race window here where a caller ends up asking for 
GetTableLocations and it says that create is not in-progress, but in fact some 
of the tablets still haven't gotten to the Commit() step below, and hence are 
not returned?


PS2, Line 663: set_state
perhaps this should be named swap_state to indicate its return value better


Line 665: state_changes_[tablet->table()][old_state]--;
can we DCHECK here that this never goes negative?


Line 672:   const PersistentTabletInfo& data(
nit: is this wrapping necessary?


Line 681:   PersistentTabletInfo* mutable_data(
nit: same (wrapping)?


PS2, Line 3697: unlocker_in
maybe 'committer' is a better name
(same below)


PS2, Line 4468: SysTabletsEntryPB::RUNNING
don't we have to worry about REPLACED and DELETED? or do those actually get 
entirely removed?


PS2, Line 4591:   DCHECK_GT(it->second, 0);
can you DCHECK_GE() vs 'delta'?


http://gerrit.cloudera.org:8080/#/c/7957/2/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

PS2, Line 123: get_state
unusual naming for a getter


PS2, Line 244: TabletStateMap
how about TabletStateCountMap? otherwise I would expect this type to be keyed 
by tablet id like the TabletInfo one.


PS2, Line 318:   void 
IncrementTabletStateCountUnlocked(SysTabletsEntryPB::State state,
 :  int64_t delta = 1);
 :   void 
DecrementTabletStateCountUnlocked(SysTabletsEntryPB::State state,
 :  int64_t delta = 1);
if you are taking counts here, it seems a little odd to have the two separate 
methods instead of just having one, and allowing a negative 'delta'


-- 
To view, visit http://gerrit.cloudera.org:8080/7957
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idca379bb1249e54c0b7fbb0df408ed829ab9fbd2
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] cow object: lazily copy dirty state

2017-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: cow_object: lazily copy dirty state
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic00c573717e7cb8df1942ba0760726fd0961e1d6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] catalog manager: improve IsAlterInProgress performance

2017-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: catalog_manager: improve IsAlterInProgress performance
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id8c1f48c0febad038833edd555ee88f1db83249d
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-2130 (part 2): more fixes for ITClientStress

2017-09-05 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-2130 (part 2): more fixes for ITClientStress
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7964/1/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java:

Line 216: try {
> Would it make sense to add the short-circuiting
I'm not sure, but it couldn't hurt, so I'll add it.


Line 354: return;
> I'm curious -- did it happen in the wild that the message arrived in here b
Yep, found the following in an ITClientStress log:

21:47:36.158 [ERROR - New I/O worker #1119] (Connection.java:423) [peer 
master-127.23.143.1:64034] unexpected exception from downstream on [id: 
0xb7be1b31, /127.0.0.1:55649 :> /127.23.143.1:64034]
java.lang.IllegalStateException
at 
com.google.common.base.Preconditions.checkState(Preconditions.java:429)
at 
org.apache.kudu.client.Connection.messageReceived(Connection.java:350)
at 
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.apache.kudu.client.Connection.handleUpstream(Connection.java:236)


PS1, Line 427: assert !explicitlyDisconnected;
> I'm not sure it's good to have this assertion here -- this case covers not 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3f518513931c660fd93b4272d1b1fceb268f191
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2130 (part 2): more fixes for ITClientStress

2017-09-05 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7964

to look at the new patch set (#2).

Change subject: KUDU-2130 (part 2): more fixes for ITClientStress
..

KUDU-2130 (part 2): more fixes for ITClientStress

This fixes some more race conditions in connection termination in the
same vein as part 1.  It also filters benign SSLException from being
returned back to callers.

Change-Id: Ic3f518513931c660fd93b4272d1b1fceb268f191
---
M java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
M java/kudu-client/src/test/java/org/apache/kudu/client/ITClientStress.java
2 files changed, 22 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/7964/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic3f518513931c660fd93b4272d1b1fceb268f191
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] tablet copy: Allow voting from crashed initial tablet copies

2017-09-05 Thread Mike Percy (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7961

to look at the new patch set (#2).

Change subject: tablet copy: Allow voting from crashed initial tablet copies
..

tablet copy: Allow voting from crashed initial tablet copies

The change in c35e4f0093dcdec625d1f647924d854c7bc9f3de missed an
important case, which is when the target tablet server in a tablet copy
operation crashes while in the middle of a tablet copy. In that case,
the 'tombstone_last_logged_opid' of 1.0 was not persisted in the
superblock. This manifested as a very flaky
TabletCopyITest.TestTabletCopyNewReplicaFailureCanVote test, which would
fail about 15% of the time.

With the change in this patch, that test now passes reliably:

http://dist-test.cloudera.org/job?job_id=mpercy.1504654266.31446

Change-Id: Idbc9cecac701c24ebe05759adbbb1fbe2b140506
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/ts_tablet_manager.cc
10 files changed, 87 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/7961/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idbc9cecac701c24ebe05759adbbb1fbe2b140506
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2130 (part 2): more fixes for ITClientStress

2017-09-05 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-2130 (part 2): more fixes for ITClientStress
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7964/1/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java:

Line 216: try {
Would it make sense to add the short-circuiting

  if (state == State.TERMINATED) {
return;
  }

check here as well?  Or we have some guarantees from Netty that it would not be 
the case?


Line 354: return;
I'm curious -- did it happen in the wild that the message arrived in here but 
the connection has been terminated already?


PS1, Line 427: assert !explicitlyDisconnected;
I'm not sure it's good to have this assertion here -- this case covers not only 
SSLException case, as I can see.

Could you add a comment with reasoning behind adding this assertion if you 
think it's worth adding this assertion?


-- 
To view, visit http://gerrit.cloudera.org:8080/7964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3f518513931c660fd93b4272d1b1fceb268f191
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2130 (part 2): more fixes for ITClientStress

2017-09-05 Thread Dan Burkert (Code Review)
Hello Adar Dembo, Todd Lipcon, Alexey Serbin,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/7964

to review the following change.

Change subject: KUDU-2130 (part 2): more fixes for ITClientStress
..

KUDU-2130 (part 2): more fixes for ITClientStress

This fixes some more race conditions in connection termination in the
same vein as part 1.  It also filters benign SSLException from being
returned back to callers.

Change-Id: Ic3f518513931c660fd93b4272d1b1fceb268f191
---
M java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
M java/kudu-client/src/test/java/org/apache/kudu/client/ITClientStress.java
2 files changed, 17 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/7964/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic3f518513931c660fd93b4272d1b1fceb268f191
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2130: java client: handle termination during negotiation edge case

2017-09-05 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged.

Change subject: KUDU-2130: java client: handle termination during negotiation 
edge case
..


KUDU-2130: java client: handle termination during negotiation edge case

There was an edge case where a Connection can be terminated while negotiation
is completing. This would result in a scary looking log message:

  18:24:07.776 [DEBUG - New I/O worker #8112] (Connection.java:649) [peer 
master-127.32.133.1:64032] cleaning up while in state NEGOTIATING due to: 
connection disconnected
  18:24:07.781 [ERROR - New I/O worker #8112] (Connection.java:418) [peer 
master-127.32.133.1:64032] unexpected exception from downstream on [id: 
0xdd52bacc, /127.0.0.1:55318 :> /127.32.133.1:64032]
  java.lang.NullPointerException
 at org.apache.kudu.client.Connection.messageReceived(Connection.java:271)
  at 
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
  at org.apache.kudu.client.Connection.handleUpstream(Connection.java:236)
  at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
  at 
org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)

but in reality the error message is harmless; it just indicates that the
connection has been terminated while the connection's messageReceived handler
is clearing the message queue. This interruption is possible because of
82a8e9f99, which fixed a deadlock in Connection. This commit recognizes when
this race has occured, and early exits from messageReceived.

Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
Reviewed-on: http://gerrit.cloudera.org:8080/7960
Reviewed-by: Alexey Serbin 
Tested-by: Kudu Jenkins
---
M java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
1 file changed, 9 insertions(+), 4 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] tablet copy: Allow voting from crashed initial tablet copies

2017-09-05 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: tablet copy: Allow voting from crashed initial tablet copies
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7961/1//COMMIT_MSG
Commit Message:

PS1, Line 9: The change in c35e4f0093dcdec625d1f647924d854c7bc9f3de missed an
   : important case, which is when the target tablet server in a tablet 
copy
   : operation crashes while in the middle of a tablet copy. In that 
case,
   : the 'tombstone_last_logged_opid' of 1.0 was not persisted in the
   : superblock. This manifested as a very flaky
   : TabletCopyITest.TestTabletCopyNewReplicaFailureCanVote test, which 
would
   : fail about 15% of the time.
> can you clarify what effect this has outside of tests? ie is the change you
Right - the initial implementation was incomplete, so this is not a regression. 
It's an incremental improvement, so it is really a "nice to have" and not what 
we would typically consider a release blocker.


-- 
To view, visit http://gerrit.cloudera.org:8080/7961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idbc9cecac701c24ebe05759adbbb1fbe2b140506
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2130: java client: handle termination during negotiation edge case

2017-09-05 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-2130: java client: handle termination during negotiation 
edge case
..


Patch Set 3: Code-Review+2

Results prior to applying this patch:
  http://dist-test.cloudera.org//job?job_id=aserbin.1504646631.14226

Results after applying PS3 of this patch:
  http://dist-test.cloudera.org//job?job_id=aserbin.1504647755.7179

The results show the issue has been fixed.  The failures in the after-patch 
runs are attributed to SSLException in the log, not NullPointerException.

We can fix the SSLException issue separately.

-- 
To view, visit http://gerrit.cloudera.org:8080/7960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] tablet copy: Allow voting from crashed initial tablet copies

2017-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: tablet copy: Allow voting from crashed initial tablet copies
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7961/1//COMMIT_MSG
Commit Message:

PS1, Line 9: The change in c35e4f0093dcdec625d1f647924d854c7bc9f3de missed an
   : important case, which is when the target tablet server in a tablet 
copy
   : operation crashes while in the middle of a tablet copy. In that 
case,
   : the 'tombstone_last_logged_opid' of 1.0 was not persisted in the
   : superblock. This manifested as a very flaky
   : TabletCopyITest.TestTabletCopyNewReplicaFailureCanVote test, which 
would
   : fail about 15% of the time.
can you clarify what effect this has outside of tests? ie is the change you 
referenced just incomplete? (ie it was still an improvement but didn't improve 
all the cases it could have) or was it actually buggy such that it was a 
regression?

put another way, should this be a blocker for the 1.5 release, or is this an 
incremental improvement on top of what's there?


-- 
To view, visit http://gerrit.cloudera.org:8080/7961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idbc9cecac701c24ebe05759adbbb1fbe2b140506
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] tablet copy: Allow voting from crashed initial tablet copies

2017-09-05 Thread Mike Percy (Code Review)
Hello Todd Lipcon, Alexey Serbin,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/7961

to review the following change.

Change subject: tablet copy: Allow voting from crashed initial tablet copies
..

tablet copy: Allow voting from crashed initial tablet copies

The change in c35e4f0093dcdec625d1f647924d854c7bc9f3de missed an
important case, which is when the target tablet server in a tablet copy
operation crashes while in the middle of a tablet copy. In that case,
the 'tombstone_last_logged_opid' of 1.0 was not persisted in the
superblock. This manifested as a very flaky
TabletCopyITest.TestTabletCopyNewReplicaFailureCanVote test, which would
fail about 15% of the time.

With the change in this patch, that test now passes reliably:

http://dist-test.cloudera.org/job?job_id=mpercy.1504646763.18140

Change-Id: Idbc9cecac701c24ebe05759adbbb1fbe2b140506
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet_bootstrap-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/ts_tablet_manager.cc
10 files changed, 58 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/7961/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Idbc9cecac701c24ebe05759adbbb1fbe2b140506
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2130: java client: handle termination during negotiation edge case

2017-09-05 Thread Dan Burkert (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7960

to look at the new patch set (#3).

Change subject: KUDU-2130: java client: handle termination during negotiation 
edge case
..

KUDU-2130: java client: handle termination during negotiation edge case

There was an edge case where a Connection can be terminated while negotiation
is completing. This would result in a scary looking log message:

  18:24:07.776 [DEBUG - New I/O worker #8112] (Connection.java:649) [peer 
master-127.32.133.1:64032] cleaning up while in state NEGOTIATING due to: 
connection disconnected
  18:24:07.781 [ERROR - New I/O worker #8112] (Connection.java:418) [peer 
master-127.32.133.1:64032] unexpected exception from downstream on [id: 
0xdd52bacc, /127.0.0.1:55318 :> /127.32.133.1:64032]
  java.lang.NullPointerException
 at org.apache.kudu.client.Connection.messageReceived(Connection.java:271)
  at 
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
  at org.apache.kudu.client.Connection.handleUpstream(Connection.java:236)
  at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
  at 
org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)

but in reality the error message is harmless; it just indicates that the
connection has been terminated while the connection's messageReceived handler
is clearing the message queue. This interruption is possible because of
82a8e9f99, which fixed a deadlock in Connection. This commit recognizes when
this race has occured, and early exits from messageReceived.

Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
---
M java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
1 file changed, 9 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/7960/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2130: java client: handle termination during negotiation edge case

2017-09-05 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-2130: java client: handle termination during negotiation 
edge case
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-2130: java client: handle termination during negotiation edge case

2017-09-05 Thread Dan Burkert (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7960

to look at the new patch set (#2).

Change subject: KUDU-2130: java client: handle termination during negotiation 
edge case
..

KUDU-2130: java client: handle termination during negotiation edge case

There was an edge case where a Connection can be terminated while negotiation
is completing. This would result in a scary looking log message:

  18:24:07.776 [DEBUG - New I/O worker #8112] (Connection.java:649) [peer 
master-127.32.133.1:64032] cleaning up while in state NEGOTIATING due to: 
connection disconnected
  18:24:07.781 [ERROR - New I/O worker #8112] (Connection.java:418) [peer 
master-127.32.133.1:64032] unexpected exception from downstream on [id: 
0xdd52bacc, /127.0.0.1:55318 :> /127.32.133.1:64032]
  java.lang.NullPointerException
 at org.apache.kudu.client.Connection.messageReceived(Connection.java:271)
  at 
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
  at org.apache.kudu.client.Connection.handleUpstream(Connection.java:236)
  at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
  at 
org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)

but in reality the error message is harmless; it just indicates that the
connection has been terminated while the connection's messageReceived handler
is clearing the message queue. This interruption is possible because of
82a8e9f99, which fixed a deadlock in Connection. This commit recognizes when
this race has occured, and early exits from messageReceived.

Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
---
M java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
1 file changed, 9 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/7960/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2130: java client: handle termination during negotiation edge case

2017-09-05 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-2130: java client: handle termination during negotiation 
edge case
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7960/1/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java:

Line 264
> If you believe this is not the case, then the next line of Preconditions.ch
I've moved these to after the while loop.


Line 296
> I would update it to be:
I removed this because it was statically true; it comes after a while loop with 
the negation of this condition.


PS1, Line 270: tate != State.TERMINATED
> And adding an extra check
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] catalog manager: improve IsAlterInProgress performance

2017-09-05 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: catalog_manager: improve IsAlterInProgress performance
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id8c1f48c0febad038833edd555ee88f1db83249d
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] cow object: lazily copy dirty state

2017-09-05 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: cow_object: lazily copy dirty state
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic00c573717e7cb8df1942ba0760726fd0961e1d6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

2017-09-05 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
..


Patch Set 13:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

Line 690: Status DiskRowSet::CountRows(rowid_t *count) const {
> 1. I think you are right, but the pattern in the class is to use component_
On #1, I'm pretty certain that we need to acquire the lock to get a ref to 
base_data_, but what I was wondering was whether we need to hold 
component_lock_ while calling base_data_->OnDiskSize().

#2 is not done in rev 13, as far as I can tell.


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS10, Line 645: 
  : Status TabletRepli
> Adar had misgivings about this form. The comment was in PS10 but I'll repro
> TabletReplica::Shutdown() resets consensus_ with lock_ held.

How old was that comment? That is no longer the case.


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] cow object: lazily copy dirty state

2017-09-05 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: cow_object: lazily copy dirty state
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7951/1/src/kudu/util/cow_object.h
File src/kudu/util/cow_object.h:

Line 73: CHECK(dirty_state_);
> maybe not worth removing, but this check is now redundant.
Done


PS1, Line 97: DCHECK_NOTNULL
> likewise
Done


PS1, Line 104: DCHECK_NOTNULL
> likewise
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic00c573717e7cb8df1942ba0760726fd0961e1d6
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1807: improve IsCreateInProgress performance

2017-09-05 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7957

to look at the new patch set (#2).

Change subject: KUDU-1807: improve IsCreateInProgress performance
..

KUDU-1807: improve IsCreateInProgress performance

For some inexplicable reason, IsCreateInProgress is invoked when servicing a
GetTableSchema RPC. While it's attractive to change that, it's also
untenable without affecting backwards compatibility. So instead, here's a
patch that adds in-memory caching to the master so that IsCreateInProgress
needn't acquire N tablet locks.

Just like schema_version_counts_, the cached state is a map of all tablet
states to the number of tablets currently in that state. For this particular
problem a single count of "number of creating tablets" might suffice, but I
liked the consistency with schema_version_counts_ and I found it easier to
handle the various corner cases when accounting for all states.

Because the map contents reflect persistent state, the locking semantics
differ somewhat from schema_version_counts_:
1. Tablet state changes are usually wrapped in a tablet metadata lock, so
   care must be taken to only update the state if the lock commits.
2. Further, the count must be updated with the tablet metadata lock held.
   Why? So that count updates are serialized in the same order as state
   changes themselves. Consider an example where two threads are trying to
   update a tablet's state (currently X) to Y and Z respectively. The count
   of state X begins at 1 (C_X=1) and the counts of states Y and Z are 0
   (C_Y=C_Z=0). There's no defined order so whichever state change happens
   second "wins". However, if the count updates were allowed outside the
   metadata locks, it would be possible for the order of operations to be
   X->Y, Y->Z, (C_Y-- C_Z++), (C_X-- C_Y++), causing C_Y to momentarily go
   negative. If count updates are restricted to inside the metadata locks, a
   state change order of X->Y, Y->Z forces the count update order to be
   (C_X-- C_Y++), (C_Y--, C_Z++).

To accommodate these semantics, the ScopedTabletInfoCommitter was modified
to support tracking state changes. There are two notable (and confusing)
exceptions which are documented in the code.

Change-Id: Idca379bb1249e54c0b7fbb0df408ed829ab9fbd2
---
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
2 files changed, 225 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/57/7957/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7957
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idca379bb1249e54c0b7fbb0df408ed829ab9fbd2
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] catalog manager: improve IsAlterInProgress performance

2017-09-05 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: catalog_manager: improve IsAlterInProgress performance
..


Patch Set 2:

(3 comments)

> What's your thoughts on test coverage, are you assuming we have
 > enough coverage of this already?

My take was that since this patch doesn't change any externally-visible 
semantics, existing test coverage should suffice. That's also why I looped 
alter_table-randomized-test.

That said, I do think our stress testing of AlterTable is somewhat lacking. 
There's some coverage in master-stress-test, but that's about it. If you can 
suggest an interesting concurrent test or two, I would consider adding it.

 > Only thing I might like to see is
 > more debug-enabled 'sanity checks' that the versions map is
 > consistent in certain scenarios.  E.g. maybe add a check that if a
 > tablet is dropped and the # of tablets is 0, the versions map is
 > empty.

I added the one you suggested. If you have others, let me know.

http://gerrit.cloudera.org:8080/#/c/7950/2/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

Line 4381:   if (it->first == TabletInfo::NOT_YET_REPORTED) {
> Can't this check be skipped?
Done


Line 4389:   return it->first < version;
> I recommend casting version to an int64_t explicitly.  It's ambiguous which
Done


http://gerrit.cloudera.org:8080/#/c/7950/2/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

Line 161:   // Simple accessor for reported_schema_version_.
> Should this doc line be moved to be aove reported_schema_version()?
I was doing the stylistic thing where if an enum/typdef only applies to one 
method, I like to lump it in with the comment block for that method.

But since it's proven to be confusing here, I'll separate them.


-- 
To view, visit http://gerrit.cloudera.org:8080/7950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id8c1f48c0febad038833edd555ee88f1db83249d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] cow object: lazily copy dirty state

2017-09-05 Thread Adar Dembo (Code Review)
Hello Dan Burkert, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7951

to look at the new patch set (#2).

Change subject: cow_object: lazily copy dirty state
..

cow_object: lazily copy dirty state

This addresses a long-standing TODO that ought to improve performance of
tablet reports in the steady-state: within the COWed objects, defer the
creation of the "dirty" copy until it's actually needed (i.e. the first call
to mutable_dirty()).

The deferral introduces new branches in dirty() and mutable_dirty(), but
given the paucity of those calls for any given lock instance, I think this
is a reasonable trade-off.

Change-Id: Ic00c573717e7cb8df1942ba0760726fd0961e1d6
---
M src/kudu/master/catalog_manager.cc
M src/kudu/util/cow_object.h
2 files changed, 21 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/7951/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic00c573717e7cb8df1942ba0760726fd0961e1d6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] catalog manager: improve IsAlterInProgress performance

2017-09-05 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7950

to look at the new patch set (#3).

Change subject: catalog_manager: improve IsAlterInProgress performance
..

catalog_manager: improve IsAlterInProgress performance

To mentally prepare for KUDU-1807, I decided to also tackle the O(n)
behavior of IsAlterInProgress. It's not nearly as bad as IsCreateInProgress
(since we aren't taking cow locks on each tablet) but is still not great
since IsAlterInProgress is called repeatedly by clients.

The solution involves a new map in TableInfo to track the schema versions
of all tablets. Keeping the map up-to-date requires additional work when
adding/removing tablets and when tablet reports include a schema change, but
these are infrequent operations relative to IsAlterInProgress so I figured
the trade off is worth it.

I'm not particularly happy about the lock acquisition in
set_reported_schema_version() but couldn't see a good way to avoid it.

I looped 1000 runs of alter_table-randomized-test. The ~5 failures I saw
were due to KUDU-1539, which I don't think this patch makes us any more or
less vulnerable to.

Change-Id: Id8c1f48c0febad038833edd555ee88f1db83249d
---
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
2 files changed, 105 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/50/7950/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id8c1f48c0febad038833edd555ee88f1db83249d
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-2130: java client: handle termination during negotiation edge case

2017-09-05 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-2130: java client: handle termination during negotiation 
edge case
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7960/1/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java:

PS1, Line 270: queuedMessages.isEmpty()
> i.e., consider changing to 
And adding an extra check

if (state == State.TERMINATED) {
  return;
}

just after the while() loop


-- 
To view, visit http://gerrit.cloudera.org:8080/7960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2130: java client: handle termination during negotiation edge case

2017-09-05 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-2130: java client: handle termination during negotiation 
edge case
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7960/1/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java:

Line 296
I would update it to be:

assert queuedMessages == null || queuedMessages.isEmpty();


PS1, Line 270: queuedMessages.isEmpty()
> I think NPE could still happen here.  It would be necessary to check for st
i.e., consider changing to 

while (state != State.TERMINATED && !queuedMessages.isEmpty()) {
  ...
}


-- 
To view, visit http://gerrit.cloudera.org:8080/7960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2130: java client: handle termination during negotiation edge case

2017-09-05 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-2130: java client: handle termination during negotiation 
edge case
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7960/1/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java:

Line 264
If you believe this is not the case, then the next line of 
Preconditions.checkState(inflightMessages.isEmpty()); should be updated to be

Preconditions.checkState(inflightMessages == null || inflightMessages.isEmpty())


PS1, Line 270: queuedMessages.isEmpty()
I think NPE could still happen here.  It would be necessary to check for state 
= State.TERMINATED first.


-- 
To view, visit http://gerrit.cloudera.org:8080/7960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2130: java client: handle termination during negotiation edge case

2017-09-05 Thread Dan Burkert (Code Review)
Hello Adar Dembo, Todd Lipcon, Alexey Serbin,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/7960

to review the following change.

Change subject: KUDU-2130: java client: handle termination during negotiation 
edge case
..

KUDU-2130: java client: handle termination during negotiation edge case

There was an edge case where a Connection can be terminated while negotiation
is completing. This would result in a scary looking log message:

  18:24:07.781 [ERROR - New I/O worker #8112] (Connection.java:418) [peer 
master-127.32.133.1:64032] unexpected exception from downstream on [id: 
0xdd52bacc, /127.0.0.1:55318 :> /127.32.133.1:64032]
  java.lang.NullPointerException
 at org.apache.kudu.client.Connection.messageReceived(Connection.java:271)
  at 
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
  at org.apache.kudu.client.Connection.handleUpstream(Connection.java:236)
  at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
  at 
org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)

but in reality the error message is harmless; it just indicates that the
connection has been terminated while the connection's messageReceived handler
is clearing the message queue. This interruption is possible because of
82a8e9f99, which fixed a deadlock in Connection. This commit recognizes when
this race has occured, and early exits from messageReceived.

Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
---
M java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
1 file changed, 6 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/7960/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3e9d4a6535ae82973743e4ac1071de0aac92b08b
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [tidy] updated the 'tidy' target

2017-09-05 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: [tidy] updated the 'tidy' target
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7947/2/build-support/clang_tidy_gerrit.py
File build-support/clang_tidy_gerrit.py:

Line 169: print >>sys.stderr, "--rev-range works only with --no-gerrit"
> Because I wanted that to work also for non-committed changes.  Specifying '
oh, and it seems the description for the '--rev-range' is outdated -- I didn't 
reflect the fact it picks up local changes.


-- 
To view, visit http://gerrit.cloudera.org:8080/7947
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2f8db9076d5113ac44fb8200d00965f120126ac4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] [tidy] updated the 'tidy' target

2017-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: [tidy] updated the 'tidy' target
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7947/2/build-support/clang_tidy_gerrit.py
File build-support/clang_tidy_gerrit.py:

Line 169: print >>sys.stderr, "--rev-range works only with --no-gerrit"
I'm a little late to the party, but why not just look for ".." in the revision 
string, and if so, use 'git diff', otherwise use 'git show'?


-- 
To view, visit http://gerrit.cloudera.org:8080/7947
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2f8db9076d5113ac44fb8200d00965f120126ac4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] kudu-tool-test: find path to false at runtime

2017-09-05 Thread Dan Burkert (Code Review)
Dan Burkert has submitted this change and it was merged.

Change subject: kudu-tool-test: find path to false at runtime
..


kudu-tool-test: find path to false at runtime

On macos the false binary is in /usr/bin/false.

Change-Id: I2e2cd346b48c61cfe27fc77ee3cabb681be94ab2
Reviewed-on: http://gerrit.cloudera.org:8080/7959
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
---
M src/kudu/tools/kudu-tool-test.cc
1 file changed, 3 insertions(+), 1 deletion(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7959
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I2e2cd346b48c61cfe27fc77ee3cabb681be94ab2
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] kudu-tool-test: find path to false at runtime

2017-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: kudu-tool-test: find path to false at runtime
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7959
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2e2cd346b48c61cfe27fc77ee3cabb681be94ab2
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] kudu-tool-test: find path to false at runtime

2017-09-05 Thread Dan Burkert (Code Review)
Hello Todd Lipcon,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/7959

to review the following change.

Change subject: kudu-tool-test: find path to false at runtime
..

kudu-tool-test: find path to false at runtime

On macos the false binary is in /usr/bin/false.

Change-Id: I2e2cd346b48c61cfe27fc77ee3cabb681be94ab2
---
M src/kudu/tools/kudu-tool-test.cc
1 file changed, 3 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/59/7959/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7959
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I2e2cd346b48c61cfe27fc77ee3cabb681be94ab2
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] rpc: allow setting --rpc tls min protocol on older RHEL versions

2017-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: rpc: allow setting --rpc_tls_min_protocol on older RHEL versions
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7821/2/src/kudu/security/tls_context.cc
File src/kudu/security/tls_context.cc:

PS2, Line 71: 0x1000U
> Here is a fun thing I discovered:
That's an interesting discovery, Henry. Let's try to avoid diverting the code 
between Impala and Kudu here, so if you go with runtime detection we should do 
the same.


-- 
To view, visit http://gerrit.cloudera.org:8080/7821
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic61f31788d63072fae609c6a2186e52d5e2467b7
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

2017-09-05 Thread Will Berkeley (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#13).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
..

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
24 files changed, 261 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/13
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] cow object: lazily copy dirty state

2017-09-05 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: cow_object: lazily copy dirty state
..


Patch Set 1: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7951/1/src/kudu/util/cow_object.h
File src/kudu/util/cow_object.h:

Line 73: CHECK(dirty_state_);
maybe not worth removing, but this check is now redundant.


PS1, Line 97: DCHECK_NOTNULL
likewise


PS1, Line 104: DCHECK_NOTNULL
likewise


-- 
To view, visit http://gerrit.cloudera.org:8080/7951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic00c573717e7cb8df1942ba0760726fd0961e1d6
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] catalog manager: improve IsAlterInProgress performance

2017-09-05 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: catalog_manager: improve IsAlterInProgress performance
..


Patch Set 2:

(3 comments)

What's your thoughts on test coverage, are you assuming we have enough coverage 
of this already?  Only thing I might like to see is more debug-enabled 'sanity 
checks' that the versions map is consistent in certain scenarios.  E.g. maybe 
add a check that if a tablet is dropped and the # of tablets is 0, the versions 
map is empty.

http://gerrit.cloudera.org:8080/#/c/7950/2/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

Line 4381:   if (it->first == TabletInfo::NOT_YET_REPORTED) {
Can't this check be skipped?


Line 4389:   return it->first < version;
I recommend casting version to an int64_t explicitly.  It's ambiguous which one 
is getting cast right now, and I think it's brittle to have it cast to 
uint32_t, even though 'it->first' should be in range given the check above.


http://gerrit.cloudera.org:8080/#/c/7950/2/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

Line 161:   // Simple accessor for reported_schema_version_.
Should this doc line be moved to be aove reported_schema_version()?


-- 
To view, visit http://gerrit.cloudera.org:8080/7950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id8c1f48c0febad038833edd555ee88f1db83249d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] feat: add task interruption checking for RowIterator

2017-09-05 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: feat: add task interruption checking for RowIterator
..


Patch Set 5:

Yep, looks like it was just a flaky test.  Thanks caiconghui!

-- 
To view, visit http://gerrit.cloudera.org:8080/7753
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b4284f2c0a40cd7ba8cf2b76e0403592552c814
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: caiconghui 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: caiconghui 
Gerrit-HasComments: No


[kudu-CR] feat: add task interruption checking for RowIterator

2017-09-05 Thread Dan Burkert (Code Review)
Dan Burkert has submitted this change and it was merged.

Change subject: feat: add task interruption checking for RowIterator
..


feat: add task interruption checking for RowIterator

Once we the submit a spark job to spark cluster, we cannot cancel or
stop the kudu-spark job immediately before checking task interruption,
we can kill the job immediately by clicking the "kill" link on
the spark Web UI.

Change-Id: I0b4284f2c0a40cd7ba8cf2b76e0403592552c814
Reviewed-on: http://gerrit.cloudera.org:8080/7753
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert 
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala
1 file changed, 3 insertions(+), 0 deletions(-)

Approvals:
  Dan Burkert: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7753
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I0b4284f2c0a40cd7ba8cf2b76e0403592552c814
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: caiconghui 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: caiconghui 


[kudu-CR] feat: add task interruption checking for RowIterator

2017-09-05 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: feat: add task interruption checking for RowIterator
..


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7753
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b4284f2c0a40cd7ba8cf2b76e0403592552c814
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: caiconghui 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: caiconghui 
Gerrit-HasComments: No


[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

2017-09-05 Thread Will Berkeley (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#12).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
..

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
24 files changed, 259 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/12
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

2017-09-05 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
..


Patch Set 11:

Odd...I compiled and ran some tests just before pushing this to review. And 
TSAN worked.

-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: No


[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

2017-09-05 Thread Will Berkeley (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6968

to look at the new patch set (#11).

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
..

KUDU-1755 Part 2: Improve tablet on-disk size metric

This adds WAL segment and cmeta sizes to the tablet on-disk
metric. It also fixes the tablet superblock on disk size,
which was being underestimated because the size was measured
as the PB size instead of counting the container file the PB
is stored in.

It also exposes a second metric, Tablet Data Size On Disk, which
measures just the size of data blocks and does not include
metadata or WAL segments.

Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
---
M src/kudu/consensus/consensus_meta-test.cc
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
24 files changed, 258 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/6968/11
-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-1755 Part 2: Improve tablet on-disk size metric

2017-09-05 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change.

Change subject: KUDU-1755 Part 2: Improve tablet on-disk size metric
..


Patch Set 10:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/consensus/consensus_meta.h
File src/kudu/consensus/consensus_meta.h:

Line 153: return on_disk_size_;
> Can we say this is thread-safe and use an atomic?
Done


PS10, Line 241: uint64_t
> how about atomic and int64_t because there is just no way we are going to n
Done


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

PS10, Line 922: CHECK_OK
> Do we need this CHECK? Also, can we make this safe to run even if log_ is s
Done


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/consensus/log.h
File src/kudu/consensus/log.h:

PS10, Line 201: TotalSize
> maybe name this OnDiskSize() for consistency?
Done


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

Line 2645:   UniqueLock lock(lock_);
> If we make cmeta_->on_disk_size() thread-safe, then we can avoid taking the
Done


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

Line 690:   shared_lock l(component_lock_);
> 1. Can we just take a ref to base_data_ instead of holding the component_lo
1. I think you are right, but the pattern in the class is to use 
component_lock_ whenever doing anything with base_data_. Is there a reason to 
prefer taking a ref over acquiring the lock briefly?
2. Done.


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

Line 115:   virtual Status DebugDump(std::vector *lines = NULL) = 
0;
> warning: default arguments on virtual or override methods are prohibited [g
Pre-existing; ignored.


http://gerrit.cloudera.org:8080/#/c/6968/10/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS10, Line 377: OnDiskSizeNoCMetaUnlocked
> why the "no cmeta" version? could use a comment
I think that was a mistake, a leftover from a previous version. There's no need 
for that method at all now.


PS10, Line 645:   auto consensus = consensus_;
  :   if (consensus) {
> because consensus_ is set-once, this can be written as:
Adar had misgivings about this form. The comment was in PS10 but I'll reproduce 
it here for convenience:

"This isn't safe; you need to take a local ref of consensus_ and then test it 
for nullitude. Or you need to make the call with lock_ held, which guarantees 
that consensus_ isn't modified.

Okay, I went and read the comments in TabletReplica which talk about how 
consensus_ is a "set-once" object. That seems bogus to me, though admittedly 
I'm not an expert on this code. TabletReplica::Shutdown() resets consensus_ 
with lock_ held. Can we guarantee that when Shutdown() is called, no other 
thread (besides the one calling Shutdown()) will invoke a TabletReplica 
function? I don't see how we can (hence the fragility). And if we can't, then 
consensus_ access needs to be atomic, either by taking a local ref then 
operating on it, or by acquiring lock_ first."


PS10, Line 656: size_t
> Can we use int64_t instead of size_t? We are never going to have 64-bit len
Done


PS10, Line 660:   if (tablet_) {
  : ret += tablet_->OnDiskSize();
  :   }
  :   // WAL segments.
  :   if (state_ == RUNNING) {
  : ret += log_->TotalSize();
  :   }
> Instead of doing this work while holding TabletReplica::lock_, can this be 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83f8aae0a544abbd19f5a34520b4828b89b6315
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1807: improve IsCreateInProgress performance

2017-09-05 Thread Adar Dembo (Code Review)
Hello Dan Burkert, Todd Lipcon,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/7957

to review the following change.

Change subject: KUDU-1807: improve IsCreateInProgress performance
..

KUDU-1807: improve IsCreateInProgress performance

For some inexplicable reason, IsCreateInProgress is invoked when servicing a
GetTableSchema RPC. While it's attractive to change that, it's also
untenable without affecting backwards compatibility. So instead, here's a
patch that adds in-memory caching to the master so that IsCreateInProgress
needn't acquire N tablet locks.

Just like schema_version_counts_, the cached state is a map of all tablet
states to the number of tablets currently in that state. For this particular
problem a single count of "number of creating tablets" might suffice, but I
liked the consistency with schema_version_counts_ and I found it easier to
handle the various corner cases when accounting for all states.

Because the map contents reflect persistent state, the locking semantics
differ somewhat from schema_version_counts_:
1. Tablet state changes are usually wrapped in a tablet metadata lock, so
   care must be taken to only update the state if the lock commits.
2. Further, the count must be updated with the tablet metadata lock held.
   Why? So that count updates are serialized in the same order as state
   changes themselves. Consider an example where two threads are trying to
   update a tablet's state (currently X) to Y and Z respectively. The count
   of state X begins at 1 (C_X=1) and the counts of states Y and Z are 0
   (C_Y=C_Z=0). There's no defined order so whichever state change happens
   second "wins". However, if the count updates were allowed outside the
   metadata locks, it would be possible for the order of operations to be
   X->Y, Y->Z, (C_Y-- C_Z++), (C_X-- C_Y++), causing C_Y to momentarily go
   negative. If count updates are restricted to inside the metadata locks, a
   state change order of X->Y, Y->Z forces the count update order to be
   (C_X-- C_Y++), (C_Y--, C_Z++).

To accommodate these semantics, the ScopedTabletInfoCommitter was modified
to support tracking state changes. There are two notable (and confusing)
exceptions which are documented in the code.

Change-Id: Idca379bb1249e54c0b7fbb0df408ed829ab9fbd2
---
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
2 files changed, 224 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/57/7957/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7957
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Idca379bb1249e54c0b7fbb0df408ed829ab9fbd2
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Todd Lipcon