[kudu-CR] KUDU-1726: Avoid fsync-per-block in tablet copy

2017-08-23 Thread Hao Hao (Code Review)
Hello Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7701

to look at the new patch set (#3).

Change subject: KUDU-1726: Avoid fsync-per-block in tablet copy
..

KUDU-1726: Avoid fsync-per-block in tablet copy

This patch incorporates BlockTransaction API into tablet
copy, to avoid fsync() on each block during copying. Blocks
are synced to disk together once the tablet copy is complete.
It also introduces a new block manager metric 'total_disk_sync'
to keep track of block data disk synchronization count.

I did a manual test to copy tablet with size of ~37GB from one
tablet server to another on el6 with ext4.
'kudu remote_replica copy c53ffbc2ede84b6d9af2da607024d131
localhost:3334 localhost:3335'
With this change, the operation time dropped down from ~718s
to ~523s.

Change-Id: I7534699f589c7060ffe32d7ac67546476cf21e76
---
M src/kudu/fs/block_manager_metrics.cc
M src/kudu/fs/block_manager_metrics.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
7 files changed, 72 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/01/7701/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7701
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7534699f589c7060ffe32d7ac67546476cf21e76
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] KUDU-1894 fixed deadlock in client.Connection

2017-08-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-1894 fixed deadlock in client.Connection
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7765/4/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java:

Line 524: sendCallToWire(msg);
> Even if Channels.write() is async, I think there is an underlying queue in 
Another thought is that the sequence anyway could go non-ascending order when 
the call_id counter overflows.


-- 
To view, visit http://gerrit.cloudera.org:8080/7765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1a0edc3e3accbcff60f2cde641ee470312bbd27a
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] subprocess: even more robust fix for asynchronous signals

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged.

Change subject: subprocess: even more robust fix for asynchronous signals
..


subprocess: even more robust fix for asynchronous signals

Another x1000 loop of raft_consensus-itest yielded the following failure in
TestElectPendingVoter:

  /data/1/adar/kudu/src/kudu/integration-tests/raft_consensus-itest.cc:2035: 
Failure
  Value of: s.IsTimedOut()
Actual: false
  Expected: true
  Expected AddServer() to time out. Result: OK

The fix from commit 27435da doesn't work well for AddServer because it's not
an idempotent operation. Instead, let's take a more robust tack and add some
waiting behavior directly into Subprocess, using the /proc//stat
process state as a guide.

Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10
Reviewed-on: http://gerrit.cloudera.org:8080/7561
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
---
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
M src/kudu/util/subprocess.h
4 files changed, 113 insertions(+), 11 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1726: Avoid fsync-per-block in tablet copy

2017-08-23 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1726: Avoid fsync-per-block in tablet copy
..


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7701/1//COMMIT_MSG
Commit Message:

PS1, Line 9: into
> Nit: into
Done


PS1, Line 11: sync
> synced
Done


PS1, Line 13: to keep track of block data disk synchronization count.
: 
: I did a m
> We talked about this in person a little bit, but could you provide more det
Done


http://gerrit.cloudera.org:8080/#/c/7701/1/src/kudu/tserver/tablet_copy_client-test.cc
File src/kudu/tserver/tablet_copy_client-test.cc:

Line 196: 
> I like this idea
Done


http://gerrit.cloudera.org:8080/#/c/7701/1/src/kudu/tserver/tablet_copy_client.h
File src/kudu/tserver/tablet_copy_client.h:

Line 27: #include "kudu/gutil/macros.h"
> Why do you need this? Can't you forward declare BlockTransaction?
Done


PS1, Line 164: ogging
> belonging
Done


Line 177:   // Download a single block.
> Update this (and maybe the entire comment).
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7701
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7534699f589c7060ffe32d7ac67546476cf21e76
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1726: Avoid fsync-per-block in tablet copy

2017-08-23 Thread Hao Hao (Code Review)
Hello Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7701

to look at the new patch set (#2).

Change subject: KUDU-1726: Avoid fsync-per-block in tablet copy
..

KUDU-1726: Avoid fsync-per-block in tablet copy

This patch incorporates BlockTransaction API into tablet
copy, to avoid fsync() on each block during copying. Blocks
are synced to disk together once the tablet copy is complete.
It also introduced a new block manager metric 'total_sync_disk'
to keep track of block data disk synchronization count.

I did a manual test to copy tablet with size of ~37GB from one
tablet server to another on el6 with ext4.
'kudu remote_replica copy c53ffbc2ede84b6d9af2da607024d131
localhost:3334 localhost:3335'
With this change, the operation time dropped down from ~718s
to ~523s.

Change-Id: I7534699f589c7060ffe32d7ac67546476cf21e76
---
M src/kudu/fs/block_manager_metrics.cc
M src/kudu/fs/block_manager_metrics.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
7 files changed, 72 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/01/7701/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7701
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7534699f589c7060ffe32d7ac67546476cf21e76
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] KUDU-1894 fixed deadlock in client.Connection

2017-08-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-1894 fixed deadlock in client.Connection
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7765/4/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java:

Line 524: sendCallToWire(msg);
> Yes, that can happen now.  But I don't see how it could break any assumptio
Even if Channels.write() is async, I think there is an underlying queue in 
Netty. However, I don't see how we are about to see a problem here in case of 
concurrent calls after this change.


-- 
To view, visit http://gerrit.cloudera.org:8080/7765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1a0edc3e3accbcff60f2cde641ee470312bbd27a
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2039 Fix the table count in the /tables page of master webUI

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged.

Change subject: KUDU-2039 Fix the table count in the /tables page of master 
webUI
..


KUDU-2039 Fix the table count in the /tables page of master webUI

The table count that was displayed on the webUI does not eliminate
the ones that are deleted. Added a check to see if the table's state
is running and then updated the count.

Change-Id: I03c2fa1e3f26a9927d4b748a49dd5cee22d8f9d5
Reviewed-on: http://gerrit.cloudera.org:8080/7568
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
---
M src/kudu/master/master-path-handlers.cc
1 file changed, 3 insertions(+), 2 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7568
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I03c2fa1e3f26a9927d4b748a49dd5cee22d8f9d5
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Abhishek Talluri 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1894 fixed deadlock in client.Connection

2017-08-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-1894 fixed deadlock in client.Connection
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7765/4/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java:

PS4, Line 274:  while (true) {
 :   List queued = 
Preconditions.checkNotNull(queuedMessages);
 :   if (queued.isEmpty()) {
 : break;
> even though we'll lose the Preconditions check here, I think this would be 
Done


PS4, Line 290: lock.unlock();
 :   needs_unlock = false;
 :   // Send out the enqueued messages while not holding 
the lock. This is to avoid
 :   // deadlock if channelDisconnected/channelClosed event 
happens and cleanup() is called.
 :   for (final QueuedMessage qm : queued) {
 : sendCallToWire(qm.message);
 :   }
 :   lock.lock();
 :   needs_unlock = true;
> I think it's slightly more idiomatic to do:
I like it better, thanks!


Line 304: queuedMessages = null;
> can you add an 'assert queuedMessages.empty()' here?
Done


Line 507:   Preconditions.checkState(state == State.READY);
> seems more like an assert to me
Done


Line 524: sendCallToWire(msg);
> does this now risk that calls come across the wire in non-ascending callId 
Yes, that can happen now.  But I don't see how it could break any assumptions 
in our code.  Also, since we were not controlling the ordering of the messages 
at the Netty level (Channels.write() sends the message asynchronously) even 
before this change, I don't expect anything else to emerge here.


-- 
To view, visit http://gerrit.cloudera.org:8080/7765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1a0edc3e3accbcff60f2cde641ee470312bbd27a
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1894 fixed deadlock in client.Connection

2017-08-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded a new patch set (#5).

Change subject: KUDU-1894 fixed deadlock in client.Connection
..

KUDU-1894 fixed deadlock in client.Connection

Due to the reverse order of acquiring of the Connection.lock and some
lower-level Netty locks, Connection.enqueueMesage() could deadlock if
a ChannelDisconnected/ChannelClosed event arrived in the middle.

The idea is to not hold the Connection.lock while invoking the
Connection.sendCallToWire() method and, overall, avoid doing any
calls to Netty while holding that lock.

To test the changes, I ran the ITClient test via dist-test apllying
Todd's WIP patch on top: https://gerrit.cloudera.org/#/c/7579/

The test passed 3572 out of 3572 times:
  http://dist-test.cloudera.org/job?job_id=aserbin.1503526685.24101
  http://dist-test.cloudera.org/job?job_id=aserbin.1503527340.4787
  http://dist-test.cloudera.org/job?job_id=aserbin.1503527848.6921

Change-Id: I1a0edc3e3accbcff60f2cde641ee470312bbd27a
---
M java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
1 file changed, 54 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/7765/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1a0edc3e3accbcff60f2cde641ee470312bbd27a
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] disk failure: don't open tablets on failed disks

2017-08-23 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: disk failure: don't open tablets on failed disks
..


Patch Set 6:

I retriggered this one too

-- 
To view, visit http://gerrit.cloudera.org:8080/7766
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 6:

I retriggered the build

-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR](branch-1.3.x) log block manager: use unsigned int for next block id

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: log block manager: use unsigned int for next_block_id_
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.3.x
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] KUDU-871. Support tombstoned voting

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-871. Support tombstoned voting
..


KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add Stop() method to TabletReplica + Consensus lifecycle.
  * Includes new STOPPED state.
  * Tombstoning a replica should call Stop().
  * Deleting a replica should call Shutdown().
* Persist ConsensusMetadata before returning from
  TabletCopyClient::Start() because we need cmeta to Init()
  RaftConsensus, which happens when registering the replica in
  TSTabletManager.
* TSTabletManager::DeleteTablet() should not consider FAILED == deleted,
  since we no longer destroy RaftConsensus when tombstoning a replica.
* Add positive and negative tests for tombstoned voting.
* Add a stress test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.
* Fix several tests that expected tombstoned tablets to be SHUTDOWN when
  now they are STOPPED.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Reviewed-on: http://gerrit.cloudera.org:8080/6960
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin 
Reviewed-by: Todd Lipcon 
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
A src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver-path-handlers.cc
M src/kudu/util/make_shared.h
27 files changed, 1,439 insertions(+), 296 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Alexey Serbin: Looks good to me, but someone else must approve
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR](branch-1.4.x) log block manager: use unsigned int for next block id

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: log block manager: use unsigned int for next_block_id_
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7797
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.4.x
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] KUDU-871. Support tombstoned voting

2017-08-23 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
..


Patch Set 13:

I did an analysis of failed raft_consensus-itest failures before and after this 
patch. There was no change in the trend, except that 
RaftConsensusITest.TestAutoCreateReplica stopped being flaky. :)

I looked into it, and I think it was just coincidence. That test doesn't even 
have failure detection enabled, plus it doesn't tombstone any tablets.

I think this patch is ready to ship.

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-871. Support tombstoned voting

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
..


Patch Set 13: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] [java] KUDU-2103 Canonicalize hostnames in client

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: [java] KUDU-2103 Canonicalize hostnames in client
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7757/4/java/kudu-client/src/test/java/org/apache/kudu/client/FakeDNS.java
File java/kudu-client/src/test/java/org/apache/kudu/client/FakeDNS.java:

PS4, Line 56:   public synchronized void addForwardResolution(String hostname, 
InetAddress ip) {
: forwardResolutions.put(hostname, ip);
:   }
: 
:   public synchronized void addReverseResolution(InetAddress ip, 
String hostname) {
: reverseResolutions.put(ip, hostname);
:   }
sorry, one more nit I missed: we always define all members up front before any 
methods, so these below down below (probably below getInstance() for best 
organization)

Did you see my note about checkstyle? I think it would probably flag this.


-- 
To view, visit http://gerrit.cloudera.org:8080/7757
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77204a185c1ab8e21cc833afb645543cbc0e340f
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Attila Bukor 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 6:

Failure seems to be a bunch of clock sync errors in DEBUG mode.

-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 6: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] subprocess: even more robust fix for asynchronous signals

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: subprocess: even more robust fix for asynchronous signals
..


Patch Set 7: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR](branch-1.2.x) log block manager: use unsigned int for next block id

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7799

Change subject: log block manager: use unsigned int for next_block_id_
..

log block manager: use unsigned int for next_block_id_

KUDU-1538 introduced 'next_block_id_' to keep track of unique block
ID that should be used for block creation. Currently, it is defined
as int64_t. However, it could be updated based on the value of
'max_block_id' which is uint64_t. Since block IDs are defined as
uint64_t both on disk (fs.proto) and in memory (block_id.h), it
makes more sense to treat 'next_block_id_' as uint64_t rather than
to convert it correctly to int64_t everywhere.

This patch changes the type of 'next_block_id_' to uint64_t to
avoid overflow due to conversion of unsigned int to int, which
can result in the reuse of an existing block ID. It does not add
a standalone test case because the failure is most likely to occur
in specific test-only scenarios.

Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Reviewed-on: http://gerrit.cloudera.org:8080/7796
Reviewed-by: Adar Dembo 
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
(cherry picked from commit b37bde72c47370293827a306532d9be15b1c5f40)
---
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager.h
2 files changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/7799/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.2.x
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Hao Hao 


[kudu-CR](branch-1.4.x) log block manager: use unsigned int for next block id

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7797

Change subject: log block manager: use unsigned int for next_block_id_
..

log block manager: use unsigned int for next_block_id_

KUDU-1538 introduced 'next_block_id_' to keep track of unique block
ID that should be used for block creation. Currently, it is defined
as int64_t. However, it could be updated based on the value of
'max_block_id' which is uint64_t. Since block IDs are defined as
uint64_t both on disk (fs.proto) and in memory (block_id.h), it
makes more sense to treat 'next_block_id_' as uint64_t rather than
to convert it correctly to int64_t everywhere.

This patch changes the type of 'next_block_id_' to uint64_t to
avoid overflow due to conversion of unsigned int to int, which
can result in the reuse of an existing block ID. It does not add
a standalone test case because the failure is most likely to occur
in specific test-only scenarios.

Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Reviewed-on: http://gerrit.cloudera.org:8080/7796
Reviewed-by: Adar Dembo 
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
(cherry picked from commit b37bde72c47370293827a306532d9be15b1c5f40)
---
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager.h
2 files changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/7797/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7797
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.4.x
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Hao Hao 


[kudu-CR](branch-1.3.x) log block manager: use unsigned int for next block id

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7798

Change subject: log block manager: use unsigned int for next_block_id_
..

log block manager: use unsigned int for next_block_id_

KUDU-1538 introduced 'next_block_id_' to keep track of unique block
ID that should be used for block creation. Currently, it is defined
as int64_t. However, it could be updated based on the value of
'max_block_id' which is uint64_t. Since block IDs are defined as
uint64_t both on disk (fs.proto) and in memory (block_id.h), it
makes more sense to treat 'next_block_id_' as uint64_t rather than
to convert it correctly to int64_t everywhere.

This patch changes the type of 'next_block_id_' to uint64_t to
avoid overflow due to conversion of unsigned int to int, which
can result in the reuse of an existing block ID. It does not add
a standalone test case because the failure is most likely to occur
in specific test-only scenarios.

Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Reviewed-on: http://gerrit.cloudera.org:8080/7796
Reviewed-by: Adar Dembo 
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
(cherry picked from commit b37bde72c47370293827a306532d9be15b1c5f40)
---
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager.h
2 files changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/98/7798/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.3.x
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Hao Hao 


[kudu-CR] KUDU-1894 fixed deadlock in client.Connection

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1894 fixed deadlock in client.Connection
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7765/4/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java
File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java:

PS4, Line 274:  while (true) {
 :   List queued = 
Preconditions.checkNotNull(queuedMessages);
 :   if (queued.isEmpty()) {
 : break;
even though we'll lose the Preconditions check here, I think this would be 
easier to follow with just a : while (!queuedMessages.empty()) { ...}

in terms of the exception, it'll throw an NPE either way, so I dont think it's 
much of a loss.


PS4, Line 290: lock.unlock();
 :   needs_unlock = false;
 :   // Send out the enqueued messages while not holding 
the lock. This is to avoid
 :   // deadlock if channelDisconnected/channelClosed event 
happens and cleanup() is called.
 :   for (final QueuedMessage qm : queued) {
 : sendCallToWire(qm.message);
 :   }
 :   lock.lock();
 :   needs_unlock = true;
I think it's slightly more idiomatic to do:

lock.unlock();
try {
} finally {
  lock.lock();
}

for the 'unlocked' periodic here, even though in some rare circumstance in 
which there is an exception thrown, it would cause an extra lock/unlock


Line 304: queuedMessages = null;
can you add an 'assert queuedMessages.empty()' here?


Line 507:   Preconditions.checkState(state == State.READY);
seems more like an assert to me


Line 524: sendCallToWire(msg);
does this now risk that calls come across the wire in non-ascending callId 
order? Does that break any assumptions elsewhere in the code?


-- 
To view, visit http://gerrit.cloudera.org:8080/7765
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1a0edc3e3accbcff60f2cde641ee470312bbd27a
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 6: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] disk failure: don't open tablets on failed disks

2017-08-23 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: disk failure: don't open tablets on failed disks
..


Patch Set 6: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7766
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] log block manager: use unsigned int for next block id

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged.

Change subject: log block manager: use unsigned int for next_block_id_
..


log block manager: use unsigned int for next_block_id_

KUDU-1538 introduced 'next_block_id_' to keep track of unique block
ID that should be used for block creation. Currently, it is defined
as int64_t. However, it could be updated based on the value of
'max_block_id' which is uint64_t. Since block IDs are defined as
uint64_t both on disk (fs.proto) and in memory (block_id.h), it
makes more sense to treat 'next_block_id_' as uint64_t rather than
to convert it correctly to int64_t everywhere.

This patch changes the type of 'next_block_id_' to uint64_t to
avoid overflow due to conversion of unsigned int to int, which
can result in the reuse of an existing block ID. It does not add
a standalone test case because the failure is most likely to occur
in specific test-only scenarios.

Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Reviewed-on: http://gerrit.cloudera.org:8080/7796
Reviewed-by: Adar Dembo 
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
---
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager.h
2 files changed, 2 insertions(+), 2 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] subprocess: even more robust fix for asynchronous signals

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: subprocess: even more robust fix for asynchronous signals
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7561/6/src/kudu/util/subprocess-test.cc
File src/kudu/util/subprocess-test.cc:

Line 302:   ASSERT_EVENTUALLY([&]{
> why do these need EVENTUALLY? isn't the point of the changes you made to Ki
Oops. An earlier iteration of the patch used kill(2) directly, and so 
ASSERT_EVENTUALLY was relevant. I'll remove it.


-- 
To view, visit http://gerrit.cloudera.org:8080/7561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] subprocess: even more robust fix for asynchronous signals

2017-08-23 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7561

to look at the new patch set (#7).

Change subject: subprocess: even more robust fix for asynchronous signals
..

subprocess: even more robust fix for asynchronous signals

Another x1000 loop of raft_consensus-itest yielded the following failure in
TestElectPendingVoter:

  /data/1/adar/kudu/src/kudu/integration-tests/raft_consensus-itest.cc:2035: 
Failure
  Value of: s.IsTimedOut()
Actual: false
  Expected: true
  Expected AddServer() to time out. Result: OK

The fix from commit 27435da doesn't work well for AddServer because it's not
an idempotent operation. Instead, let's take a more robust tack and add some
waiting behavior directly into Subprocess, using the /proc//stat
process state as a guide.

Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10
---
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
M src/kudu/util/subprocess.h
4 files changed, 113 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/7561/7
-- 
To view, visit http://gerrit.cloudera.org:8080/7561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] subprocess: even more robust fix for asynchronous signals

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: subprocess: even more robust fix for asynchronous signals
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7561/6/src/kudu/util/subprocess-test.cc
File src/kudu/util/subprocess-test.cc:

Line 302:   ASSERT_EVENTUALLY([&]{
why do these need EVENTUALLY? isn't the point of the changes you made to Kill 
that, as soon as kill returns, you're guaranteed it's in the right state? 
similarly when Start() returns it should definitely be running


-- 
To view, visit http://gerrit.cloudera.org:8080/7561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] subprocess: even more robust fix for asynchronous signals

2017-08-23 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7561

to look at the new patch set (#6).

Change subject: subprocess: even more robust fix for asynchronous signals
..

subprocess: even more robust fix for asynchronous signals

Another x1000 loop of raft_consensus-itest yielded the following failure in
TestElectPendingVoter:

  /data/1/adar/kudu/src/kudu/integration-tests/raft_consensus-itest.cc:2035: 
Failure
  Value of: s.IsTimedOut()
Actual: false
  Expected: true
  Expected AddServer() to time out. Result: OK

The fix from commit 27435da doesn't work well for AddServer because it's not
an idempotent operation. Instead, let's take a more robust tack and add some
waiting behavior directly into Subprocess, using the /proc//stat
process state as a guide.

Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10
---
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
M src/kudu/util/subprocess.h
3 files changed, 113 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/7561/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] log block manager: use unsigned int for next block id

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: log block manager: use unsigned int for next_block_id_
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 30: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 30
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] Give more context on errors reading cfiles

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: Give more context on errors reading cfiles
..


Give more context on errors reading cfiles

Recently, a user reported an error loading a tablet in which the only
reported message was "bad magic". This wasn't useful for pinpointing the
id of the bad block or what might have happened to cause the problem.

This patch adds more context info in such circumstances: we now include
DebugString output for "bad magic" errors as well as the block ID in all
cases.

The unit test is updated to check that block IDs show up in all
Corruption status messages.

Change-Id: I0bda5688a020032c512235ee574cb3e53c7872af
Reviewed-on: http://gerrit.cloudera.org:8080/7620
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
---
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/index_btree.cc
M src/kudu/tablet/deltafile.cc
4 files changed, 52 insertions(+), 19 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7620
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I0bda5688a020032c512235ee574cb3e53c7872af
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 30:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/29/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS29, Line 1229:  it would
> Seems like 'it' accidentally got dropped. This sentence should read as:
Oops, my bad...Thanks for catching it. Updated.


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 30
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#30).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 599 insertions(+), 434 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/30
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 30
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 29:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/29/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS29, Line 1229:  would be
Seems like 'it' accidentally got dropped. This sentence should read as:

  Here is the reasoning as to why it would be safe to do so.


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 29
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7784/5/src/kudu/fs/block_manager_util.cc
File src/kudu/fs/block_manager_util.cc:

PS5, Line 68: 
> nit: this is not needed, right?  Usually those macros are used as 
Done


http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h
File src/kudu/fs/block_manager_util.h:

PS4, Line 56:   // On success, either 'metadata_' is overwritten with the 
contents of the
:   // file, or, in the case of disk failure, returns OK and sets
:   // 'health_status_' to a non-OK value.
:   Status LoadFromDisk();
> I'm not sure I fully agree. For one the error message is confusing: "LOG(ER
Hrm, I'd prefer to keep this internal. If anything of the PIMF fails, it gets 
set to failed. I'll improve the error message though.


PS4, Line 56:   // On success, either 'metadata_' is overwritten with the 
contents of the
:   // file, or, in the case of disk failure, returns OK and sets
:   // 'health_status_' to a non-OK value.
:   Status LoadFromDisk();
> Andrew and I previously discussed this approach. You can see our comments h
Thanks for resurfacing this.


http://gerrit.cloudera.org:8080/#/c/7784/5/src/kudu/fs/data_dirs-test.cc
File src/kudu/fs/data_dirs-test.cc:

PS5, Line 244:   Status s = dd_manager_->MarkDataDirFailed(kNumDirs - 1);
 :   ASSERT_STR_CONTAINS(s.ToString(), "All data dirs have failed");
> As I see the call to CreateDataDirGroup() is removed.  Is that correct?  I 
Ah, good point, I suppose I can add it in again. Done


PS5, Line 363: // manager.
 : KuduTest::SetUp();
 :   }
 : 
> nit: consider making this method returning Status and then using ASSERT_OK(
Done


http://gerrit.cloudera.org:8080/#/c/7784/5/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

PS5, Line 1315:   ASSERT_NE(data_dir, test_dirs[failed_idx]);
> paranoid nit: does it make sense to check that the failed dir is exactly te
Sure, although there is test coverage for this in data_dirs-test


-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 29:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7207/28/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS28, Line 352: possibly sync
> possibly synchronizing
Done


PS28, Line 1227: cou
> could
Done


PS28, Line 1229: saf
> would be
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 29
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] disk failure: don't open tablets on failed disks

2017-08-23 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: disk failure: don't open tablets on failed disks
..


Patch Set 5:

(6 comments)

I'm using a cluster verifier, but it doesn't quite work if you mix around the 
block manager type. I've deparameterized the test for this reason.

http://gerrit.cloudera.org:8080/#/c/7766/4/src/kudu/integration-tests/disk_failure-itest.cc
File src/kudu/integration-tests/disk_failure-itest.cc:

Line 31: METRIC_DECLARE_gauge_uint64(data_dirs_failed);
> not used in an external minicluster test
As per our hipchat convo, this is needed to get metric.


Line 33: DECLARE_string(block_manager);
> same
Done


PS4, Line 42: 
> does 30 work?
Done


Line 55: StartCluster(GetClusterFlags(), {}, 3, 5);
> I dont think you need this when you can just use cluster_->tablet_server(i)
Done


PS4, Line 96:   ExternalTabletServer* ts = cluster_->tablet_server(0);
:   ASSERT_OK(cluster_->WaitForTabletsRunning(ts, kNumTablets, 
kAgreementTimeout));
: 
> how about:
I'm going to keep this here since it's just a way to ensure we don't restart 
before things are persisted. I'll use the cluster verifier below.


PS4, Line 116: }
 : 
> additionally, how about adding:
See above comment.


-- 
To view, visit http://gerrit.cloudera.org:8080/7766
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] disk failure: don't open tablets on failed disks

2017-08-23 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7766

to look at the new patch set (#6).

Change subject: disk failure: don't open tablets on failed disks
..

disk failure: don't open tablets on failed disks

Currently Kudu servers open the FS layout with failed disks. However,
the moment tablets attempt to bootstrap (i.e. open blocks, etc.), they
will attempt to read from the failed disk and fail. This can be avoided
by checking whether a tablet's disk group contains a failed disk before
attempting to read data from the tablet. If so, the tablet should be
marked as having an error so it can be reassigned.

The default behavior of the 'fs_target_data_dirs_per_tablet' flag is
updated to take into account disk state when assigning new directory
groups. This allows the tablet to be reassigned to a server without
being spread across a failed directory.

Testing is done by loading data into a cluster configured to use
multiple directories for data blocks, failing a single directory on one
of the tablet servers, and ensuring that the tablets with blocks on the
failed directory get re-replicated at startup time. The test uses a
cluster verifier to verify the healthy end-state of the cluster.
Necessary changes have been made to do this on a cluster comprising of
multiple data directories.

Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1
---
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/disk_failure-itest.cc
M src/kudu/integration-tests/external_mini_cluster_fs_inspector.cc
M src/kudu/tserver/ts_tablet_manager.cc
8 files changed, 185 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/7766/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7766
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Andrew Wong (Code Review)
Hello Mike Percy, Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7784

to look at the new patch set (#6).

Change subject: open FS layout in presence of disk failure
..

open FS layout in presence of disk failure

Currently, if a Kudu server starts up with a failed disk, the server
will crash. There are a number of reasons for this, but the pressing
one is that the path instance files may not be readable, meaning the
directories' UUIDs may not be available.

This patch changes this by introducing an "unhealthy" instance in-memory
for each instance that fails to lock, load, canonicalize, etc. Such
instances are ignored when it comes to checking the integrity of the FS
layout, and are simply marked failed by the directory manager.

Testing is done in data_dirs-test, log_block_manager-test, and
fs_manager-test to ensure failed directories do not impede the managers'
startups.

Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
---
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager_util-test.cc
M src/kudu/fs/block_manager_util.cc
M src/kudu/fs/block_manager_util.h
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
13 files changed, 645 insertions(+), 173 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/84/7784/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#29).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 599 insertions(+), 434 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/29
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 29
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h
File src/kudu/fs/block_manager_util.h:

PS4, Line 56:   // On success, either 'metadata_' is overwritten with the 
contents of the
:   // file, or, in the case of disk failure, returns OK and sets
:   // 'health_status_' to a non-OK value.
:   Status LoadFromDisk();
> Andrew and I previously discussed this approach. You can see our comments h
Even if internalizing the failure would be ok (which I still think it isn't, 
though I don't feel super strong about it), the error message that we're 
ignoring the failure is still wrong, since we arent'


-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] log block manager: use unsigned int for next block id

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: log block manager: use unsigned int for next_block_id_
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] log block manager: use unsigned int for next block id

2017-08-23 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: log block manager: use unsigned int for next_block_id_
..


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7796/2//COMMIT_MSG
Commit Message:

PS2, Line 12: Since block IDs are
> "Since block IDs..."
Done


PS2, Line 13: t
> drop this
Done


PS2, Line 14: her 
> rather than
Done


PS2, Line 15: it corre
> correctly to int64_t everywhere
Done


PS2, Line 18: unsigned 
> typo, should be unsigned
Done


PS2, Line 20: is 
> is
Done


PS2, Line 20: o occ
> to occur
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] log block manager: use unsigned int for next block id

2017-08-23 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7796

to look at the new patch set (#3).

Change subject: log block manager: use unsigned int for next_block_id_
..

log block manager: use unsigned int for next_block_id_

KUDU-1538 introduced 'next_block_id_' to keep track of unique block
ID that should be used for block creation. Currently, it is defined
as int64_t. However, it could be updated based on the value of
'max_block_id' which is uint64_t. Since block IDs are defined as
uint64_t both on disk (fs.proto) and in memory (block_id.h), it
makes more sense to treat 'next_block_id_' as uint64_t rather than
to convert it correctly to int64_t everywhere.

This patch changes the type of 'next_block_id_' to uint64_t to
avoid overflow due to conversion of unsigned int to int, which
can result in the reuse of an existing block ID. It does not add
a standalone test case because the failure is most likely to occur
in specific test-only scenarios.

Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
---
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager.h
2 files changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/96/7796/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7784/5/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

PS5, Line 1315: ASSERT_EQ(1, dd_manager_->GetFailedDataDirs().size());
paranoid nit: does it make sense to check that the failed dir is exactly 
test_dirs[failed_idx] ?


-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] log block manager: use unsigned int for next block id

2017-08-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: log block manager: use unsigned int for next_block_id_
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7796/2/src/kudu/fs/log_block_manager.h
File src/kudu/fs/log_block_manager.h:

PS2, Line 411: AtomicInt
> I'd prefer if we kept the scope of this patch limited, and we already use u
Sure, that makes sense.


-- 
To view, visit http://gerrit.cloudera.org:8080/7796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] add outputmetric collecting for the kudu data writing

2017-08-23 Thread caiconghui (Code Review)
caiconghui has posted comments on this change.

Change subject:  add outputmetric collecting for the kudu data writing
..


Patch Set 1:

@Kudu Jenkins, ok.I try to find the public api for collecting from spark, but 
failed. The scope of api for task metric collecting operation is private in 
spark package, other external data source cannot access the interfaces except 
they are integrated into the spark package, But now it seems impossible.

-- 
To view, visit http://gerrit.cloudera.org:8080/7754
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3ca4389090f25834d165d36b83f77f1cfa3f726
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: caiconghui 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: caiconghui 
Gerrit-HasComments: No


[kudu-CR] log block manager: use unsigned int for next block id

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: log block manager: use unsigned int for next_block_id_
..


Patch Set 2:

(1 comment)

The test failures are unrelated; some test machine clocks are unsynchronized 
for some reason.

http://gerrit.cloudera.org:8080/#/c/7796/2/src/kudu/fs/log_block_manager.h
File src/kudu/fs/log_block_manager.h:

PS2, Line 411: AtomicInt
> maybe, to be on the safe side, use std::atomic here to have well-
I'd prefer if we kept the scope of this patch limited, and we already use 
util/atomic throughout the block manager code.


-- 
To view, visit http://gerrit.cloudera.org:8080/7796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] log block manager: use unsigned int for next block id

2017-08-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: log block manager: use unsigned int for next_block_id_
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7796/2//COMMIT_MSG
Commit Message:

PS2, Line 18: unsinged 
typo, should be unsigned


http://gerrit.cloudera.org:8080/#/c/7796/2/src/kudu/fs/log_block_manager.h
File src/kudu/fs/log_block_manager.h:

PS2, Line 411: AtomicInt
maybe, to be on the safe side, use std::atomic here to have 
well-defined behavior in case of overflow?


-- 
To view, visit http://gerrit.cloudera.org:8080/7796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] gradle: target JRE 7 compatibility

2017-08-23 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change.

Change subject: gradle: target JRE 7 compatibility
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7785/1/java/gradle/compile.gradle
File java/gradle/compile.gradle:

Line 23:   targetCompatibility = javaSourceCompatibility
Since source and target versions will likely always match perhaps we should 
just change javaSourceCompatibility to javaCompatibility here and in the 
gradle.properties file.


-- 
To view, visit http://gerrit.cloudera.org:8080/7785
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib836a748f0f000b07460dd3141940655dbfb7a29
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] KUDU-871. Support tombstoned voting

2017-08-23 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
..


Patch Set 13:

Well, actually, let me hit raft_consensus-itest a little harder. It's possible 
that I made it flaky and hadn't previously noticed.

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-871. Support tombstoned voting

2017-08-23 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
..


Patch Set 13:

The new tests still look good. 100% reliable after 100 loops w/ TSAN + 8x 
stress:

tombstoned_voting-itest: 
http://dist-test.cloudera.org/job?job_id=mpercy.1503540656.2157
tombstoned_voting-stress-test: 
http://dist-test.cloudera.org/job?job_id=mpercy.1503540557.1392

If other tests get flaky, they'll show up on the flaky test dashboard and I'll 
fix them.

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] [cmake] introduce 'pb-gen' and 'krpc-gen' targets

2017-08-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: [cmake] introduce 'pb-gen' and 'krpc-gen' targets
..


Patch Set 4: Verified+1

unrelated flake in

AdminCliTest.TestMoveTablet

-- 
To view, visit http://gerrit.cloudera.org:8080/
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I48833098553ef7d3f4fae8b88079cb354b52bea1
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] [cmake] introduce 'pb-gen' and 'krpc-gen' targets

2017-08-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged.

Change subject: [cmake] introduce 'pb-gen' and 'krpc-gen' targets
..


[cmake] introduce 'pb-gen' and 'krpc-gen' targets

Introduced new targets to generate protobuf stubs and KRPC service/proxy
files.

The 'iwyu' target now depends on the newly introduced 'pb-gen' and
'krpc-gen' targets.  That's because IWYU needs to have all protobuf
and KRPC headers available when running.

Change-Id: I48833098553ef7d3f4fae8b88079cb354b52bea1
Reviewed-on: http://gerrit.cloudera.org:8080/
Reviewed-by: Adar Dembo 
Tested-by: Alexey Serbin 
---
M CMakeLists.txt
M cmake_modules/FindKRPC.cmake
M cmake_modules/FindProtobuf.cmake
3 files changed, 23 insertions(+), 0 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Alexey Serbin: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I48833098553ef7d3f4fae8b88079cb354b52bea1
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7784/5/src/kudu/fs/block_manager_util.cc
File src/kudu/fs/block_manager_util.cc:

PS5, Line 68: ;
nit: this is not needed, right?  Usually those macros are used as 

RETURN_NOT_OK_FAIL_INSTANCE_PREPEND(...);

so the semicolon is added at the call site.


http://gerrit.cloudera.org:8080/#/c/7784/5/src/kudu/fs/data_dirs-test.cc
File src/kudu/fs/data_dirs-test.cc:

PS5, Line 244:   Status s = dd_manager_->MarkDataDirFailed(kNumDirs - 1);
 :   ASSERT_STR_CONTAINS(s.ToString(), "All data dirs have failed");
As I see the call to CreateDataDirGroup() is removed.  Is that correct?  I 
mean, I would expect to see an attempt to do something with the object after 
marking all directories as failed (that looks like just a preparation for some 
meaningful action).


PS5, Line 363:   void OpenDataDirManager() {
 : ASSERT_OK(DataDirManager::OpenExistingForTests(env_, 
test_roots_,
 : DataDirManagerOptions(), _manager_));
 :   }
nit: consider making this method returning Status and then using ASSERT_OK() -- 
that way it would be more easier to find the place where it fails, if it does 
so.


-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] log block manager: use unsigned int for next block id

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: log block manager: use unsigned int for next_block_id_
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7796/2//COMMIT_MSG
Commit Message:

PS2, Line 12: Moreover, block IDs
"Since block IDs..."


PS2, Line 13: , so
drop this


PS2, Line 14: than
rather than


PS2, Line 15: properly
correctly to int64_t everywhere


PS2, Line 20: can
is


PS2, Line 20: occur
to occur


-- 
To view, visit http://gerrit.cloudera.org:8080/7796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 5: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 28:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7207/28/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS28, Line 352: synchronizing
possibly synchronizing


PS28, Line 1227: can
could


PS28, Line 1229:  is
would be


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 28
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-871. Support tombstoned voting

2017-08-23 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
..


Patch Set 13:

> lgtm. Did you do any last test loop of the relevant tests after the
 > latest round of changes?

Thanks for the review. I'll double-check that now and will post results. I did 
see some unrelated failures on some unrelated tests, but we've have a lot of 
code go in this week.

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] disk failure: don't open tablets on failed disks

2017-08-23 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: disk failure: don't open tablets on failed disks
..


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7766/4/src/kudu/integration-tests/disk_failure-itest.cc
File src/kudu/integration-tests/disk_failure-itest.cc:

Line 31: METRIC_DECLARE_gauge_uint64(data_dirs_failed);
not used in an external minicluster test


Line 33: DECLARE_string(block_manager);
same


PS4, Line 42: 120
does 30 work?


Line 55:   void SetupDefaultCluster(vector* 
ext_tservers) {
I dont think you need this when you can just use cluster_->tablet_server(i) to 
get what you are loading into this vector


PS4, Line 96:   // Ensure the tablets get to a running state.
:   
ASSERT_OK(cluster_->WaitForTabletsRunning(cluster_->tablet_server(0),
: kNumTablets, 
kAgreementTimeout));
how about:

  auto& ts0 = ts_map_[cluster_->tablet_server(0)->uuid()];
  string tablet_id;
  ASSERT_EVENTUALLY([&] {
vector tablet_ids;
ASSERT_OK(ListRunningTabletIds(ts0, kTimeout, _ids));
ASSERT_EQ(1, tablet_ids.size());
tablet_id = tablet_ids[0];
  }
  ASSERT_OK(WaitForServersToAgree(kTimeout, ts_map_, tablet_id, 
workload.batches_completed()));


PS4, Line 116:   
ASSERT_OK(cluster_->WaitForTabletsRunning(cluster_->tablet_server(0),
 : kNumTablets, 
kAgreementTimeout));
additionally, how about adding:

  ASSERT_OK(WaitForServersToAgree(kTimeout, ts_map_, tablet_id, 
workload.batches_completed()));


-- 
To view, visit http://gerrit.cloudera.org:8080/7766
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h
File src/kudu/fs/block_manager_util.h:

PS4, Line 56:   // On success, either 'metadata_' is overwritten with the 
contents of the
:   // file, or, in the case of disk failure, returns OK and sets
:   // 'health_status_' to a non-OK value.
:   Status LoadFromDisk();
> I'm not sure I fully agree. For one the error message is confusing: "LOG(ER
Andrew and I previously discussed this approach. You can see our comments here: 
https://gerrit.cloudera.org/#/c/7270/6/src/kudu/fs/data_dirs.cc@408. FWIW, I 
find the idea of "internalizing" the disk failure to PIMF to be reasonable.


-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] log block manager: use unsigned int for next block id

2017-08-23 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: log block manager: use unsigned int for next_block_id_
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7796/1//COMMIT_MSG
Commit Message:

Line 14: it makes more sense to treat 'next_block_id_' as uint64_t than to
> Should probably explain why we're switching it to uint64_t instead of switc
Done


PS1, Line 15: 
> conversion
Done


PS1, Line 16: 
> in the reuse of an existing block ID.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#28).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 599 insertions(+), 434 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/28
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 28
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] log block manager: use unsigned int for next block id

2017-08-23 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7796

to look at the new patch set (#2).

Change subject: log block manager: use unsigned int for next_block_id_
..

log block manager: use unsigned int for next_block_id_

KUDU-1538 introduced 'next_block_id_' to keep track of unique block
ID that should be used for block creation. Currently, it is defined
as int64_t. However, it could be updated based on the value of
'max_block_id' which is uint64_t. Moreover, block IDs are defined
as uint64_t both on disk (fs.proto) and in memory (block_id.h), so
it makes more sense to treat 'next_block_id_' as uint64_t than to
convert it properly.

This patch changes the type of 'next_block_id_' to uint64_t to
avoid overflow due to conversion of unsinged int to int, which
can result in the reuse of an existing block ID. It does not add
a standalone test case because the failure can most likely occur
in specific test-only scenarios.

Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
---
M src/kudu/fs/log_block_manager.h
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/96/7796/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h
File src/kudu/fs/block_manager_util.h:

PS4, Line 56:   // On success, either 'metadata_' is overwritten with the 
contents of the
:   // file, or, in the case of disk failure, returns OK and sets
:   // 'health_status_' to a non-OK value.
:   Status LoadFromDisk();
> Disk failure errors here would be ignored anyway. By adding this to the con
I'm not sure I fully agree. For one the error message is confusing: "LOG(ERROR) 
<< "Ignoring error with status: " << _s_prepended.ToString(); \" seems like 
we're ignoring the error when we really aren't. 
Then I'd expect that upon getting an OK here the metadata had been loaded, when 
it's not the case. Given there is a single call site I think it would make 
sense to be explicit. That is to return a non-ok status and to special case the 
isDiskFailure() status along with logging the error message there. Otherwise 
you're pushing logic from the call site onto this class which is a bit dirtier.


-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR](branch-1.3.x) [build-support] IWYU build configuration for Jenkins

2017-08-23 Thread Jean-Daniel Cryans (Code Review)
Jean-Daniel Cryans has submitted this change and it was merged.

Change subject: [build-support] IWYU build configuration for Jenkins
..


[build-support] IWYU build configuration for Jenkins

In this branch, the IWYU target is just a no-op.  The idea is that
we have common configuration matrix for gerrit pre-commit Jenkins jobs,
so in particular branches the IWYU is run for real (as of now, at least
in the main trunk).

Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e
Reviewed-on: http://gerrit.cloudera.org:8080/7750
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
(cherry picked from commit 8f6812047a2fe50dabfcb26093385b33f9960a7c)
Reviewed-on: http://gerrit.cloudera.org:8080/7779
Reviewed-by: Jean-Daniel Cryans 
---
M build-support/jenkins/build-and-test.sh
1 file changed, 15 insertions(+), 6 deletions(-)

Approvals:
  Jean-Daniel Cryans: Looks good to me, approved
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7779
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: branch-1.3.x
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR](branch-1.4.x) [build-support] IWYU build configuration for Jenkins

2017-08-23 Thread Jean-Daniel Cryans (Code Review)
Jean-Daniel Cryans has submitted this change and it was merged.

Change subject: [build-support] IWYU build configuration for Jenkins
..


[build-support] IWYU build configuration for Jenkins

In this branch, the IWYU target is just a no-op.  The idea is that
we have common configuration matrix for gerrit pre-commit Jenkins jobs,
so in particular branches the IWYU is run for real (as of now, at least
in the main trunk).

Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e
Reviewed-on: http://gerrit.cloudera.org:8080/7750
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
(cherry picked from commit 8f6812047a2fe50dabfcb26093385b33f9960a7c)
Reviewed-on: http://gerrit.cloudera.org:8080/7780
Reviewed-by: Jean-Daniel Cryans 
---
M build-support/jenkins/build-and-test.sh
1 file changed, 15 insertions(+), 6 deletions(-)

Approvals:
  Jean-Daniel Cryans: Looks good to me, approved
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7780
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: branch-1.4.x
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR](branch-1.3.x) [build-support] IWYU build configuration for Jenkins

2017-08-23 Thread Jean-Daniel Cryans (Code Review)
Jean-Daniel Cryans has posted comments on this change.

Change subject: [build-support] IWYU build configuration for Jenkins
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7779
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.3.x
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR](branch-1.2.x) [build-support] IWYU build configuration for Jenkins

2017-08-23 Thread Jean-Daniel Cryans (Code Review)
Jean-Daniel Cryans has posted comments on this change.

Change subject: [build-support] IWYU build configuration for Jenkins
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.2.x
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR](branch-1.4.x) [build-support] IWYU build configuration for Jenkins

2017-08-23 Thread Jean-Daniel Cryans (Code Review)
Jean-Daniel Cryans has posted comments on this change.

Change subject: [build-support] IWYU build configuration for Jenkins
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7780
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.4.x
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR](branch-1.2.x) [build-support] IWYU build configuration for Jenkins

2017-08-23 Thread Jean-Daniel Cryans (Code Review)
Jean-Daniel Cryans has submitted this change and it was merged.

Change subject: [build-support] IWYU build configuration for Jenkins
..


[build-support] IWYU build configuration for Jenkins

In this branch, the IWYU target is just a no-op.  The idea is that
we have common configuration matrix for gerrit pre-commit Jenkins jobs,
so in particular branches the IWYU is run for real (as of now, at least
in the main trunk).

Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e
Reviewed-on: http://gerrit.cloudera.org:8080/7750
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
(cherry picked from commit 8f6812047a2fe50dabfcb26093385b33f9960a7c)
Reviewed-on: http://gerrit.cloudera.org:8080/7778
Reviewed-by: Jean-Daniel Cryans 
---
M build-support/jenkins/build-and-test.sh
1 file changed, 15 insertions(+), 6 deletions(-)

Approvals:
  Jean-Daniel Cryans: Looks good to me, approved
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: branch-1.2.x
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] disk failure: don't open tablets on failed disks

2017-08-23 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: disk failure: don't open tablets on failed disks
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7766/4/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

PS4, Line 597: TODO(awong): currently the FsReport only reports on containers 
for the
 : // log block manager. Implement some sort of reporting for 
failed
 : // directories as well.
> mind creating a jira to track this and assigning to yourself?
Done


http://gerrit.cloudera.org:8080/#/c/7766/4/src/kudu/integration-tests/disk_failure-itest.cc
File src/kudu/integration-tests/disk_failure-itest.cc:

PS4, Line 104:   Substitute("--env_inject_eio_globs=$0", 
JoinPathSegments(failed_dir, "**")),
 :   "--env_inject_eio=1.0",
 :   "--suicide_on_eio=false",
> what if you just append to the current flags and not replace them all?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7766
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] disk failure: don't open tablets on failed disks

2017-08-23 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7766

to look at the new patch set (#5).

Change subject: disk failure: don't open tablets on failed disks
..

disk failure: don't open tablets on failed disks

Currently Kudu servers open the FS layout with failed disks. However,
the moment tablets attempt to bootstrap (i.e. open blocks, etc.), they
will attempt to read from the failed disk and fail. This can be avoided
by checking whether a tablet's disk group contains a failed disk before
attempting to read data from the tablet. If so, the tablet should be
marked as having an error so it can be reassigned.

The default behavior of the 'fs_target_data_dirs_per_tablet' flag is
updated to take into account disk state when assigning new directory
groups. This allows the tablet to be reassigned to a server without
being spread across a failed directory.

Testing is done by loading data into a cluster configured to use
multiple directories for data blocks, failing a single directory on one
of the tablet servers, and ensuring that the tablets with blocks on the
failed directory get re-replicated at startup time.

Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1
---
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/disk_failure-itest.cc
M src/kudu/tserver/ts_tablet_manager.cc
7 files changed, 201 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/7766/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7766
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 4:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/7784/3//COMMIT_MSG
Commit Message:

PS3, Line 20: mangers'
> looks like a typo
It is, indeed, "managers'," referring to the startups of the directory manager, 
log block manager, and fs manager.


http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.cc
File src/kudu/fs/block_manager_util.cc:

PS4, Line 189: it's
> nit its
Done


PS4, Line 189: them
> nit s/them/it
Done


http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h
File src/kudu/fs/block_manager_util.h:

PS4, Line 56:   // On success, either 'metadata_' is overwritten with the 
contents of the
:   // file, or, in the case of disk failure, returns OK and sets
:   // 'health_status_' to a non-OK value.
:   Status LoadFromDisk();
> this is weird. whats the reason for this not to return a non-ok status righ
Disk failure errors here would be ignored anyway. By adding this to the 
contract, we don't have to sully the call-site with disk-handling code, and we 
can keep the error handling (i.e. setting health_status_) internal.


PS4, Line 83:   // Whether or not the instance is healthy. If the instance file 
lives on a
:   // disk that has failed, this should return false.
:   bool healthy() const {
: return health_status_.ok();
:   }
> we usually don't have this extra getter (i.e. consider just doing pimf.heal
As per Mike's comment, I think it helps for readability.

Also, I get the feeling that we don't use this elsewhere (at least for 
tablets/replicas) because other components have multiple states. In this case, 
it's pretty cut-and-dry: healthy, not healthy.


PS4, Line 83:   // Whether or not the instance is healthy. If the instance file 
lives on a
:   // disk that has failed, this should return false.
:   bool healthy() const {
: return health_status_.ok();
:   }
> I had the same initial reaction but it seems to make the code that uses it 
Done


http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

Line 116: using std::make_pair;
> warning: using decl 'make_pair' is unused [misc-unused-using-decls]
Done


PS4, Line 794:   if (FindUuidByRoot(root, )) {
 : return FindUuidIndexByUuid(uuid, uuid_idx);
 :   }
 :   return false;
> there are no chances of races here, right?
Nope, the first call is guarded by a scoped lock, as is the second. These 
values don't change after they're first set in Open().


http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

PS4, Line 366: Finds the UUID for the given canonicalized root directory name.
> nit, for consistency append the same suffix as the previous new method (ret
Done


http://gerrit.cloudera.org:8080/#/c/7784/3/src/kudu/fs/fs_manager-test.cc
File src/kudu/fs/fs_manager-test.cc:

Line 279:   FLAGS_env_inject_eio = 0;
> nit: shouldn't need this due to flag saver in KuduTest unless I'm missing s
See Adar's comment.


Line 279:   FLAGS_env_inject_eio = 0;
> Unfortunately, KuduTest doesn't restore the flags until _after_ TearDown() 
Thanks for the explanation.


Line 302:   FLAGS_env_inject_eio = 0;
> same
See above.


http://gerrit.cloudera.org:8080/#/c/7784/3/src/kudu/fs/fs_manager.cc
File src/kudu/fs/fs_manager.cc:

PS3, Line 180:   set all_roots;
 :   all_roots.insert(wal_fs_root_);
 :   for (const string& data_fs_root : data_fs_roots_) {
 : all_roots.insert(data_fs_root);
 :   }
> nit: consider
Done


PS3, Line 228:   if (!ContainsKey(unique_roots, root.path)) {
 : canonicalized_data_fs_roots_.emplace_back(root);
 : canonicalized_all_fs_roots_.emplace_back(root);
 : InsertOrDie(_roots, root.path);
 :   }
> Consider using InsertIfNotPresent() instead to avoid double look-ups:
Done


PS3, Line 248: string wal_root
> Is it necessary to make a copy here?  Why not to use const reference here i
Done


PS3, Line 251: string meta_root
> ditto
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 

[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Andrew Wong (Code Review)
Hello Mike Percy, Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7784

to look at the new patch set (#5).

Change subject: open FS layout in presence of disk failure
..

open FS layout in presence of disk failure

Currently, if a Kudu server starts up with a failed disk, the server
will crash. There are a number of reasons for this, but the pressing
one is that the path instance files may not be readable, meaning the
directories' UUIDs may not be available.

This patch changes this by introducing an "unhealthy" instance in-memory
for each instance that fails to lock, load, canonicalize, etc. Such
instances are ignored when it comes to checking the integrity of the FS
layout, and are simply marked failed by the directory manager.

Testing is done in data_dirs-test, log_block_manager-test, and
fs_manager-test to ensure failed directories do not impede the managers'
startups.

Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
---
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager_util-test.cc
M src/kudu/fs/block_manager_util.cc
M src/kudu/fs/block_manager_util.h
M src/kudu/fs/data_dirs-test.cc
M src/kudu/fs/data_dirs.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/fs_manager-test.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
13 files changed, 635 insertions(+), 174 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/84/7784/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-871. Support tombstoned voting

2017-08-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
..


Patch Set 13: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-871. Support tombstoned voting

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
..


Patch Set 13: Code-Review+1

lgtm. Did you do any last test loop of the relevant tests after the latest 
round of changes?

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] [cmake] introduce 'pb-gen' and 'krpc-gen' targets

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: [cmake] introduce 'pb-gen' and 'krpc-gen' targets
..


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I48833098553ef7d3f4fae8b88079cb354b52bea1
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 4: Code-Review+1

Don't have anything to add to David/Mike's comments.

-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] [cmake] introduce 'pb-gen' and 'krpc-gen' targets

2017-08-23 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/

to look at the new patch set (#4).

Change subject: [cmake] introduce 'pb-gen' and 'krpc-gen' targets
..

[cmake] introduce 'pb-gen' and 'krpc-gen' targets

Introduced new targets to generate protobuf stubs and KRPC service/proxy
files.

The 'iwyu' target now depends on the newly introduced 'pb-gen' and
'krpc-gen' targets.  That's because IWYU needs to have all protobuf
and KRPC headers available when running.

Change-Id: I48833098553ef7d3f4fae8b88079cb354b52bea1
---
M CMakeLists.txt
M cmake_modules/FindKRPC.cmake
M cmake_modules/FindProtobuf.cmake
3 files changed, 23 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/77//4
-- 
To view, visit http://gerrit.cloudera.org:8080/
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48833098553ef7d3f4fae8b88079cb354b52bea1
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Give more context on errors reading cfiles

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: Give more context on errors reading cfiles
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7620
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0bda5688a020032c512235ee574cb3e53c7872af
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] server: fix crash when Shutdown is called before Init

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged.

Change subject: server: fix crash when Shutdown is called before Init
..


server: fix crash when Shutdown is called before Init

This SIGSEGV suggests unconditional access to Messenger::Shutdown is bad.

I don't know why MiniMaster::Start failed in the test but I suspect it's due
to clock unsynchronization, because that would cause ServerBase::Init to
fail before creating the Messenger.

*** SIGSEGV (@0x18) received by PID 24219 (TID 0x7fd0b2486c40) from PID 24; 
stack trace: ***
@ 0x7fd0ad48c330 (unknown) at ??:0
@ 0x7fd0ac46e24f kudu::percpu_rwlock::lock() at ??:0
@ 0x7fd0aab3ee59 kudu::rpc::Messenger::ShutdownInternal() at ??:0
@ 0x7fd0aab3f0bc kudu::rpc::Messenger::Shutdown() at ??:0
@ 0x7fd0aff36bc8 kudu::server::ServerBase::Shutdown() at ??:0
@ 0x7fd0aff33749 kudu::server::ServerBase::~ServerBase() at ??:0
@ 0x7fd0b1451a08 kudu::kserver::KuduServer::~KuduServer() at ??:0
@ 0x7fd0b144f5be kudu::master::Master::~Master() at ??:0
@ 0x7fd0b144f72a kudu::master::Master::~Master() at ??:0
@ 0x7fd0b14687c4 kudu::master::MiniMaster::Start() at ??:0
@   0x4d10a4 kudu::master::SysCatalogTest::SetUp() at 
/home/jenkins-slave/workspace/kudu-master/0/src/kudu/master/sys_catalog-test.cc:?
@ 0x7fd0b112fb3a 
testing::internal::HandleExceptionsInMethodIfSupported<>() at ??:0
@ 0x7fd0b110fb27 testing::Test::Run() at ??:0
@ 0x7fd0b037 testing::TestInfo::Run() at ??:0
@ 0x7fd0bda7 testing::TestCase::Run() at ??:0
@ 0x7fd0b111d897 testing::internal::UnitTestImpl::RunAllTests() at ??:0
@ 0x7fd0b1130a1a 
testing::internal::HandleExceptionsInMethodIfSupported<>() at ??:0
@ 0x7fd0b111d2e2 testing::UnitTest::Run() at ??:0
@ 0x7fd0b208f3dc RUN_ALL_TESTS() at ??:0
@ 0x7fd0b208eb97 main at ??:0
@ 0x7fd0a5733f45 __libc_start_main at ??:0
@   0x4311a7 (unknown) at ??:?
@0x0 (unknown)

Change-Id: I1e36318d93cabd9d98e43951cee2075272f8f799
Reviewed-on: http://gerrit.cloudera.org:8080/7795
Reviewed-by: Todd Lipcon 
Tested-by: Kudu Jenkins
---
M src/kudu/kserver/kserver.cc
M src/kudu/server/server_base.cc
2 files changed, 6 insertions(+), 2 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I1e36318d93cabd9d98e43951cee2075272f8f799
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] log block manager: use unsigned int for next block id

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: log block manager: use unsigned int for next_block_id_
..


Patch Set 1:

(3 comments)

I've been wondering whether this deserves a standalone test. I'm inclined to 
say 'no' since I think the pain is only felt in specific test-only scenarios. 
Namely:
1) You have to be running in a gtest, so either you're using an 
InternalMiniCluster or a lower level unit test.
2) You have to be using the block cache singleton.
3) You have to flush some data.
4) You have to restart the block manager some number of times.

BTW, KUDU-1538 is related and perhaps should be referenced in the commit 
description.

http://gerrit.cloudera.org:8080/#/c/7796/1//COMMIT_MSG
Commit Message:

Line 14: This patch changes the type of next_block_id_ to uint64_t to
Should probably explain why we're switching it to uint64_t instead of switching 
every other site to int64_t (since the convention in Kudu is to use int64_t for 
large integer values).

The reason is that block IDs are defined as uint64_t both on disk (fs.proto) 
and in memory (block_id.h), so it makes more sense to treat them as such in the 
LBM than to convert them properly.


PS1, Line 15: conversation
conversion


PS1, Line 16: in block id reused.
in the reuse of an existing block ID.


-- 
To view, visit http://gerrit.cloudera.org:8080/7796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] Give more context on errors reading cfiles

2017-08-23 Thread Todd Lipcon (Code Review)
Hello David Ribeiro Alves, Andrew Wong, Grant Henke, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7620

to look at the new patch set (#2).

Change subject: Give more context on errors reading cfiles
..

Give more context on errors reading cfiles

Recently, a user reported an error loading a tablet in which the only
reported message was "bad magic". This wasn't useful for pinpointing the
id of the bad block or what might have happened to cause the problem.

This patch adds more context info in such circumstances: we now include
DebugString output for "bad magic" errors as well as the block ID in all
cases.

The unit test is updated to check that block IDs show up in all
Corruption status messages.

Change-Id: I0bda5688a020032c512235ee574cb3e53c7872af
---
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/index_btree.cc
M src/kudu/tablet/deltafile.cc
4 files changed, 52 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/7620/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7620
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0bda5688a020032c512235ee574cb3e53c7872af
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout"

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation 
or timeout"
..


Patch Set 1: Code-Review+2

Going to just +2 this since it's a revert

-- 
To view, visit http://gerrit.cloudera.org:8080/7788
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1430cabfc54e858c1ad073574404821ae71c169f
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout"

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation 
or timeout"
..


Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout"

This reverts commit 351337ee2d92361e2c87699d1c87a49774a3ab4d.

This patch seemed to cause a problem in test cluster that runs YCSB
stress tests. An RPC client and server got "out of sync" somehow on a
connection so that the footer of one call was interpreted as the length
prefix of the next call by the server. It then expected a 40MB+ RPC and
just sat there consuming RPC calls for many hours assuming it was all
part of one large RPC. The other end just kept timing out since it was
sending calls and not getting any responses.

Michael and I looked into it for a while and found one issue but still
couldn't fully explain how this happened, so we'll revert for now.

Change-Id: I1430cabfc54e858c1ad073574404821ae71c169f
Reviewed-on: http://gerrit.cloudera.org:8080/7788
Tested-by: Todd Lipcon 
Reviewed-by: Todd Lipcon 
---
M docs/design-docs/rpc.md
M src/kudu/rpc/blocking_ops.cc
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/constants.cc
M src/kudu/rpc/inbound_call.cc
M src/kudu/rpc/inbound_call.h
M src/kudu/rpc/outbound_call.cc
M src/kudu/rpc/outbound_call.h
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/rpc-test-base.h
M src/kudu/rpc/rpc-test.cc
M src/kudu/rpc/rpc_header.proto
M src/kudu/rpc/rtest.proto
M src/kudu/rpc/serialization.cc
M src/kudu/rpc/serialization.h
M src/kudu/rpc/server_negotiation.cc
M src/kudu/rpc/transfer.cc
M src/kudu/rpc/transfer.h
19 files changed, 157 insertions(+), 530 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7788
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I1430cabfc54e858c1ad073574404821ae71c169f
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout"

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation 
or timeout"
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7788
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1430cabfc54e858c1ad073574404821ae71c169f
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] server: fix crash when Shutdown is called before Init

2017-08-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: server: fix crash when Shutdown is called before Init
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1e36318d93cabd9d98e43951cee2075272f8f799
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 4: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h
File src/kudu/fs/block_manager_util.h:

PS4, Line 83:   // Whether or not the instance is healthy. If the instance file 
lives on a
:   // disk that has failed, this should return false.
:   bool healthy() const {
: return health_status_.ok();
:   }
> we usually don't have this extra getter (i.e. consider just doing pimf.heal
I had the same initial reaction but it seems to make the code that uses it a 
little more reader-friendly.


-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7784/3/src/kudu/fs/fs_manager-test.cc
File src/kudu/fs/fs_manager-test.cc:

Line 279:   FLAGS_env_inject_eio = 0;
> nit: shouldn't need this due to flag saver in KuduTest unless I'm missing s
Unfortunately, KuduTest doesn't restore the flags until _after_ TearDown() 
runs, and env_inject_eio!=0 can cause test directory cleanup to fail.

That said, having to do this for every test is annoying, so maybe we should 
manually destroy the FlagSaver at the beginning of TearDown instead?


-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 3:

(5 comments)

A quick first pass

http://gerrit.cloudera.org:8080/#/c/7784/3//COMMIT_MSG
Commit Message:

PS3, Line 20: mangers'
looks like a typo

Is that supposed to be "directory managers'" or "managers'"  ?


http://gerrit.cloudera.org:8080/#/c/7784/3/src/kudu/fs/fs_manager.cc
File src/kudu/fs/fs_manager.cc:

PS3, Line 180:   set all_roots;
 :   all_roots.insert(wal_fs_root_);
 :   for (const string& data_fs_root : data_fs_roots_) {
 : all_roots.insert(data_fs_root);
 :   }
nit: consider

set all_roots(data_fs_roots_.begin(), data_fs_roots_.end());
all_roots.insert(wal_fs_root_);


PS3, Line 228:   if (!ContainsKey(unique_roots, root.first)) {
 : canonicalized_data_fs_roots_.emplace_back(root);
 : canonicalized_all_fs_roots_.emplace_back(root);
 : InsertOrDie(_roots, root.first);
 :   }
Consider using InsertIfNotPresent() instead to avoid double look-ups:

if (InsertIfNotPresent(_roots, root.first)) {
  canonicalized_data_fs_roots_.emplace_back(root);
  canonicalized_all_fs_roots_.emplace_back(root);
}


PS3, Line 248: string wal_root
Is it necessary to make a copy here?  Why not to use const reference here 
instead?


PS3, Line 251: string meta_root
ditto


-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR](branch-1.2.x) [build-support] IWYU build configuration for Jenkins

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: [build-support] IWYU build configuration for Jenkins
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.2.x
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR](branch-1.3.x) [build-support] IWYU build configuration for Jenkins

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: [build-support] IWYU build configuration for Jenkins
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7779
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.3.x
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR](branch-1.4.x) [build-support] IWYU build configuration for Jenkins

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: [build-support] IWYU build configuration for Jenkins
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7780
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-1.4.x
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 3:

(3 comments)

couple other minor nits, overall lgtm

http://gerrit.cloudera.org:8080/#/c/7784/3/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

PS3, Line 831: dirs_to_update
unused?


http://gerrit.cloudera.org:8080/#/c/7784/3/src/kudu/fs/fs_manager-test.cc
File src/kudu/fs/fs_manager-test.cc:

Line 279:   FLAGS_env_inject_eio = 0;
nit: shouldn't need this due to flag saver in KuduTest unless I'm missing 
something


Line 302:   FLAGS_env_inject_eio = 0;
same


-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] [java] Gradle test task improvements

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged.

Change subject: [java] Gradle test task improvements
..


[java] Gradle test task improvements

Previously some tests would be extremely flaky whenrunning via Gradle. 
Especially secure tests with tokenrenewal. This fixes that issue by forking a 
new jvm foreach test class to isolate the tests.I also add a few system 
properties to match theMaven build and support filtering tests from the root
directory with the --tests argument.

Change-Id: Ifb660b15423b68c299518b7ba54a15135972a007
Reviewed-on: http://gerrit.cloudera.org:8080/7782
Reviewed-by: Adar Dembo 
Tested-by: Adar Dembo 
---
M java/gradle/tests.gradle
M java/pom.xml
2 files changed, 21 insertions(+), 2 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7782
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ifb660b15423b68c299518b7ba54a15135972a007
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] open FS layout in presence of disk failure

2017-08-23 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: open FS layout in presence of disk failure
..


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.cc
File src/kudu/fs/block_manager_util.cc:

PS4, Line 189: them
nit s/them/it


PS4, Line 189: it's
nit its


http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h
File src/kudu/fs/block_manager_util.h:

PS4, Line 56:   // On success, either 'metadata_' is overwritten with the 
contents of the
:   // file, or, in the case of disk failure, returns OK and sets
:   // 'health_status_' to a non-OK value.
:   Status LoadFromDisk();
this is weird. whats the reason for this not to return a non-ok status right 
away?


PS4, Line 83:   // Whether or not the instance is healthy. If the instance file 
lives on a
:   // disk that has failed, this should return false.
:   bool healthy() const {
: return health_status_.ok();
:   }
we usually don't have this extra getter (i.e. consider just doing 
pimf.healt_status().ok())


http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

PS4, Line 794:   if (FindUuidByRoot(root, )) {
 : return FindUuidIndexByUuid(uuid, uuid_idx);
 :   }
 :   return false;
there are no chances of races here, right?


http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

PS4, Line 366: Finds the UUID for the given canonicalized root directory name.
nit, for consistency append the same suffix as the previous new method 
(returning false...)


-- 
To view, visit http://gerrit.cloudera.org:8080/7784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


  1   2   >