[kudu-CR] KUDU-1726: Avoid fsync-per-block in tablet copy
Hello Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7701 to look at the new patch set (#3). Change subject: KUDU-1726: Avoid fsync-per-block in tablet copy .. KUDU-1726: Avoid fsync-per-block in tablet copy This patch incorporates BlockTransaction API into tablet copy, to avoid fsync() on each block during copying. Blocks are synced to disk together once the tablet copy is complete. It also introduces a new block manager metric 'total_disk_sync' to keep track of block data disk synchronization count. I did a manual test to copy tablet with size of ~37GB from one tablet server to another on el6 with ext4. 'kudu remote_replica copy c53ffbc2ede84b6d9af2da607024d131 localhost:3334 localhost:3335' With this change, the operation time dropped down from ~718s to ~523s. Change-Id: I7534699f589c7060ffe32d7ac67546476cf21e76 --- M src/kudu/fs/block_manager_metrics.cc M src/kudu/fs/block_manager_metrics.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/log_block_manager.cc M src/kudu/tserver/tablet_copy_client-test.cc M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h 7 files changed, 72 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/01/7701/3 -- To view, visit http://gerrit.cloudera.org:8080/7701 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7534699f589c7060ffe32d7ac67546476cf21e76 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR] KUDU-1894 fixed deadlock in client.Connection
Alexey Serbin has posted comments on this change. Change subject: KUDU-1894 fixed deadlock in client.Connection .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7765/4/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java: Line 524: sendCallToWire(msg); > Even if Channels.write() is async, I think there is an underlying queue in Another thought is that the sequence anyway could go non-ascending order when the call_id counter overflows. -- To view, visit http://gerrit.cloudera.org:8080/7765 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1a0edc3e3accbcff60f2cde641ee470312bbd27a Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] subprocess: even more robust fix for asynchronous signals
Adar Dembo has submitted this change and it was merged. Change subject: subprocess: even more robust fix for asynchronous signals .. subprocess: even more robust fix for asynchronous signals Another x1000 loop of raft_consensus-itest yielded the following failure in TestElectPendingVoter: /data/1/adar/kudu/src/kudu/integration-tests/raft_consensus-itest.cc:2035: Failure Value of: s.IsTimedOut() Actual: false Expected: true Expected AddServer() to time out. Result: OK The fix from commit 27435da doesn't work well for AddServer because it's not an idempotent operation. Instead, let's take a more robust tack and add some waiting behavior directly into Subprocess, using the /proc//stat process state as a guide. Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10 Reviewed-on: http://gerrit.cloudera.org:8080/7561 Reviewed-by: Todd LipconTested-by: Kudu Jenkins --- M src/kudu/integration-tests/raft_consensus-itest.cc M src/kudu/util/subprocess-test.cc M src/kudu/util/subprocess.cc M src/kudu/util/subprocess.h 4 files changed, 113 insertions(+), 11 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7561 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10 Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1726: Avoid fsync-per-block in tablet copy
Hao Hao has posted comments on this change. Change subject: KUDU-1726: Avoid fsync-per-block in tablet copy .. Patch Set 2: (7 comments) http://gerrit.cloudera.org:8080/#/c/7701/1//COMMIT_MSG Commit Message: PS1, Line 9: into > Nit: into Done PS1, Line 11: sync > synced Done PS1, Line 13: to keep track of block data disk synchronization count. : : I did a m > We talked about this in person a little bit, but could you provide more det Done http://gerrit.cloudera.org:8080/#/c/7701/1/src/kudu/tserver/tablet_copy_client-test.cc File src/kudu/tserver/tablet_copy_client-test.cc: Line 196: > I like this idea Done http://gerrit.cloudera.org:8080/#/c/7701/1/src/kudu/tserver/tablet_copy_client.h File src/kudu/tserver/tablet_copy_client.h: Line 27: #include "kudu/gutil/macros.h" > Why do you need this? Can't you forward declare BlockTransaction? Done PS1, Line 164: ogging > belonging Done Line 177: // Download a single block. > Update this (and maybe the entire comment). Done -- To view, visit http://gerrit.cloudera.org:8080/7701 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7534699f589c7060ffe32d7ac67546476cf21e76 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] KUDU-1726: Avoid fsync-per-block in tablet copy
Hello Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7701 to look at the new patch set (#2). Change subject: KUDU-1726: Avoid fsync-per-block in tablet copy .. KUDU-1726: Avoid fsync-per-block in tablet copy This patch incorporates BlockTransaction API into tablet copy, to avoid fsync() on each block during copying. Blocks are synced to disk together once the tablet copy is complete. It also introduced a new block manager metric 'total_sync_disk' to keep track of block data disk synchronization count. I did a manual test to copy tablet with size of ~37GB from one tablet server to another on el6 with ext4. 'kudu remote_replica copy c53ffbc2ede84b6d9af2da607024d131 localhost:3334 localhost:3335' With this change, the operation time dropped down from ~718s to ~523s. Change-Id: I7534699f589c7060ffe32d7ac67546476cf21e76 --- M src/kudu/fs/block_manager_metrics.cc M src/kudu/fs/block_manager_metrics.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/log_block_manager.cc M src/kudu/tserver/tablet_copy_client-test.cc M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h 7 files changed, 72 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/01/7701/2 -- To view, visit http://gerrit.cloudera.org:8080/7701 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7534699f589c7060ffe32d7ac67546476cf21e76 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR] KUDU-1894 fixed deadlock in client.Connection
Alexey Serbin has posted comments on this change. Change subject: KUDU-1894 fixed deadlock in client.Connection .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7765/4/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java: Line 524: sendCallToWire(msg); > Yes, that can happen now. But I don't see how it could break any assumptio Even if Channels.write() is async, I think there is an underlying queue in Netty. However, I don't see how we are about to see a problem here in case of concurrent calls after this change. -- To view, visit http://gerrit.cloudera.org:8080/7765 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1a0edc3e3accbcff60f2cde641ee470312bbd27a Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-2039 Fix the table count in the /tables page of master webUI
Adar Dembo has submitted this change and it was merged. Change subject: KUDU-2039 Fix the table count in the /tables page of master webUI .. KUDU-2039 Fix the table count in the /tables page of master webUI The table count that was displayed on the webUI does not eliminate the ones that are deleted. Added a check to see if the table's state is running and then updated the count. Change-Id: I03c2fa1e3f26a9927d4b748a49dd5cee22d8f9d5 Reviewed-on: http://gerrit.cloudera.org:8080/7568 Reviewed-by: Adar DemboTested-by: Kudu Jenkins --- M src/kudu/master/master-path-handlers.cc 1 file changed, 3 insertions(+), 2 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7568 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I03c2fa1e3f26a9927d4b748a49dd5cee22d8f9d5 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Abhishek Talluri Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1894 fixed deadlock in client.Connection
Alexey Serbin has posted comments on this change. Change subject: KUDU-1894 fixed deadlock in client.Connection .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/7765/4/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java: PS4, Line 274: while (true) { : List queued = Preconditions.checkNotNull(queuedMessages); : if (queued.isEmpty()) { : break; > even though we'll lose the Preconditions check here, I think this would be Done PS4, Line 290: lock.unlock(); : needs_unlock = false; : // Send out the enqueued messages while not holding the lock. This is to avoid : // deadlock if channelDisconnected/channelClosed event happens and cleanup() is called. : for (final QueuedMessage qm : queued) { : sendCallToWire(qm.message); : } : lock.lock(); : needs_unlock = true; > I think it's slightly more idiomatic to do: I like it better, thanks! Line 304: queuedMessages = null; > can you add an 'assert queuedMessages.empty()' here? Done Line 507: Preconditions.checkState(state == State.READY); > seems more like an assert to me Done Line 524: sendCallToWire(msg); > does this now risk that calls come across the wire in non-ascending callId Yes, that can happen now. But I don't see how it could break any assumptions in our code. Also, since we were not controlling the ordering of the messages at the Netty level (Channels.write() sends the message asynchronously) even before this change, I don't expect anything else to emerge here. -- To view, visit http://gerrit.cloudera.org:8080/7765 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1a0edc3e3accbcff60f2cde641ee470312bbd27a Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1894 fixed deadlock in client.Connection
Alexey Serbin has uploaded a new patch set (#5). Change subject: KUDU-1894 fixed deadlock in client.Connection .. KUDU-1894 fixed deadlock in client.Connection Due to the reverse order of acquiring of the Connection.lock and some lower-level Netty locks, Connection.enqueueMesage() could deadlock if a ChannelDisconnected/ChannelClosed event arrived in the middle. The idea is to not hold the Connection.lock while invoking the Connection.sendCallToWire() method and, overall, avoid doing any calls to Netty while holding that lock. To test the changes, I ran the ITClient test via dist-test apllying Todd's WIP patch on top: https://gerrit.cloudera.org/#/c/7579/ The test passed 3572 out of 3572 times: http://dist-test.cloudera.org/job?job_id=aserbin.1503526685.24101 http://dist-test.cloudera.org/job?job_id=aserbin.1503527340.4787 http://dist-test.cloudera.org/job?job_id=aserbin.1503527848.6921 Change-Id: I1a0edc3e3accbcff60f2cde641ee470312bbd27a --- M java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java 1 file changed, 54 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/65/7765/5 -- To view, visit http://gerrit.cloudera.org:8080/7765 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1a0edc3e3accbcff60f2cde641ee470312bbd27a Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Todd Lipcon
[kudu-CR] disk failure: don't open tablets on failed disks
Mike Percy has posted comments on this change. Change subject: disk failure: don't open tablets on failed disks .. Patch Set 6: I retriggered this one too -- To view, visit http://gerrit.cloudera.org:8080/7766 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] open FS layout in presence of disk failure
Mike Percy has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 6: I retriggered the build -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR](branch-1.3.x) log block manager: use unsigned int for next block id
Adar Dembo has posted comments on this change. Change subject: log block manager: use unsigned int for next_block_id_ .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-1.3.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] KUDU-871. Support tombstoned voting
Todd Lipcon has submitted this change and it was merged. Change subject: KUDU-871. Support tombstoned voting .. KUDU-871. Support tombstoned voting This patch makes it possible for tombstoned tablet replicas to vote in Raft elections. Changes: * Add Stop() method to TabletReplica + Consensus lifecycle. * Includes new STOPPED state. * Tombstoning a replica should call Stop(). * Deleting a replica should call Shutdown(). * Persist ConsensusMetadata before returning from TabletCopyClient::Start() because we need cmeta to Init() RaftConsensus, which happens when registering the replica in TSTabletManager. * TSTabletManager::DeleteTablet() should not consider FAILED == deleted, since we no longer destroy RaftConsensus when tombstoning a replica. * Add positive and negative tests for tombstoned voting. * Add a stress test that induces lots of tombstoned voting while running TabletCopy, TabletBootstrap, and DeleteTablet. * Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned voting changed its assumption that tombstoned tablets would not vote. * Fix several tests that expected tombstoned tablets to be SHUTDOWN when now they are STOPPED. Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570 Reviewed-on: http://gerrit.cloudera.org:8080/6960 Tested-by: Kudu Jenkins Reviewed-by: Alexey SerbinReviewed-by: Todd Lipcon --- M src/kudu/consensus/consensus-test-util.h M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus.h M src/kudu/consensus/raft_consensus_quorum-test.cc M src/kudu/integration-tests/CMakeLists.txt M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/delete_table-itest.cc M src/kudu/integration-tests/external_mini_cluster-itest-base.cc A src/kudu/integration-tests/tombstoned_voting-itest.cc A src/kudu/integration-tests/tombstoned_voting-stress-test.cc M src/kudu/master/sys_catalog.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/tablet_metadata.h M src/kudu/tablet/tablet_replica-test.cc M src/kudu/tablet/tablet_replica.cc M src/kudu/tablet/tablet_replica.h M src/kudu/tools/kudu-admin-test.cc M src/kudu/tools/kudu-ts-cli-test.cc M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h M src/kudu/tserver/tablet_copy_source_session-test.cc M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h M src/kudu/tserver/tserver-path-handlers.cc M src/kudu/util/make_shared.h 27 files changed, 1,439 insertions(+), 296 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved Alexey Serbin: Looks good to me, but someone else must approve Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/6960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570 Gerrit-PatchSet: 14 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Edward Fancher Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR](branch-1.4.x) log block manager: use unsigned int for next block id
Adar Dembo has posted comments on this change. Change subject: log block manager: use unsigned int for next_block_id_ .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7797 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-1.4.x Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] KUDU-871. Support tombstoned voting
Mike Percy has posted comments on this change. Change subject: KUDU-871. Support tombstoned voting .. Patch Set 13: I did an analysis of failed raft_consensus-itest failures before and after this patch. There was no change in the trend, except that RaftConsensusITest.TestAutoCreateReplica stopped being flaky. :) I looked into it, and I think it was just coincidence. That test doesn't even have failure detection enabled, plus it doesn't tombstone any tablets. I think this patch is ready to ship. -- To view, visit http://gerrit.cloudera.org:8080/6960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Edward Fancher Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-871. Support tombstoned voting
Todd Lipcon has posted comments on this change. Change subject: KUDU-871. Support tombstoned voting .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Edward Fancher Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] [java] KUDU-2103 Canonicalize hostnames in client
Todd Lipcon has posted comments on this change. Change subject: [java] KUDU-2103 Canonicalize hostnames in client .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7757/4/java/kudu-client/src/test/java/org/apache/kudu/client/FakeDNS.java File java/kudu-client/src/test/java/org/apache/kudu/client/FakeDNS.java: PS4, Line 56: public synchronized void addForwardResolution(String hostname, InetAddress ip) { : forwardResolutions.put(hostname, ip); : } : : public synchronized void addReverseResolution(InetAddress ip, String hostname) { : reverseResolutions.put(ip, hostname); : } sorry, one more nit I missed: we always define all members up front before any methods, so these below down below (probably below getInstance() for best organization) Did you see my note about checkstyle? I think it would probably flag this. -- To view, visit http://gerrit.cloudera.org:8080/7757 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I77204a185c1ab8e21cc833afb645543cbc0e340f Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Attila BukorGerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] open FS layout in presence of disk failure
Andrew Wong has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 6: Failure seems to be a bunch of clock sync errors in DEBUG mode. -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR] open FS layout in presence of disk failure
Alexey Serbin has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 6: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR] subprocess: even more robust fix for asynchronous signals
Todd Lipcon has posted comments on this change. Change subject: subprocess: even more robust fix for asynchronous signals .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7561 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR](branch-1.2.x) log block manager: use unsigned int for next block id
Todd Lipcon has uploaded a new change for review. http://gerrit.cloudera.org:8080/7799 Change subject: log block manager: use unsigned int for next_block_id_ .. log block manager: use unsigned int for next_block_id_ KUDU-1538 introduced 'next_block_id_' to keep track of unique block ID that should be used for block creation. Currently, it is defined as int64_t. However, it could be updated based on the value of 'max_block_id' which is uint64_t. Since block IDs are defined as uint64_t both on disk (fs.proto) and in memory (block_id.h), it makes more sense to treat 'next_block_id_' as uint64_t rather than to convert it correctly to int64_t everywhere. This patch changes the type of 'next_block_id_' to uint64_t to avoid overflow due to conversion of unsigned int to int, which can result in the reuse of an existing block ID. It does not add a standalone test case because the failure is most likely to occur in specific test-only scenarios. Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Reviewed-on: http://gerrit.cloudera.org:8080/7796 Reviewed-by: Adar DemboReviewed-by: Todd Lipcon Tested-by: Kudu Jenkins (cherry picked from commit b37bde72c47370293827a306532d9be15b1c5f40) --- M src/kudu/fs/file_block_manager.h M src/kudu/fs/log_block_manager.h 2 files changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/7799/1 -- To view, visit http://gerrit.cloudera.org:8080/7799 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-1.2.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Hao Hao
[kudu-CR](branch-1.4.x) log block manager: use unsigned int for next block id
Todd Lipcon has uploaded a new change for review. http://gerrit.cloudera.org:8080/7797 Change subject: log block manager: use unsigned int for next_block_id_ .. log block manager: use unsigned int for next_block_id_ KUDU-1538 introduced 'next_block_id_' to keep track of unique block ID that should be used for block creation. Currently, it is defined as int64_t. However, it could be updated based on the value of 'max_block_id' which is uint64_t. Since block IDs are defined as uint64_t both on disk (fs.proto) and in memory (block_id.h), it makes more sense to treat 'next_block_id_' as uint64_t rather than to convert it correctly to int64_t everywhere. This patch changes the type of 'next_block_id_' to uint64_t to avoid overflow due to conversion of unsigned int to int, which can result in the reuse of an existing block ID. It does not add a standalone test case because the failure is most likely to occur in specific test-only scenarios. Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Reviewed-on: http://gerrit.cloudera.org:8080/7796 Reviewed-by: Adar DemboReviewed-by: Todd Lipcon Tested-by: Kudu Jenkins (cherry picked from commit b37bde72c47370293827a306532d9be15b1c5f40) --- M src/kudu/fs/file_block_manager.h M src/kudu/fs/log_block_manager.h 2 files changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/7797/1 -- To view, visit http://gerrit.cloudera.org:8080/7797 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-1.4.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Hao Hao
[kudu-CR](branch-1.3.x) log block manager: use unsigned int for next block id
Todd Lipcon has uploaded a new change for review. http://gerrit.cloudera.org:8080/7798 Change subject: log block manager: use unsigned int for next_block_id_ .. log block manager: use unsigned int for next_block_id_ KUDU-1538 introduced 'next_block_id_' to keep track of unique block ID that should be used for block creation. Currently, it is defined as int64_t. However, it could be updated based on the value of 'max_block_id' which is uint64_t. Since block IDs are defined as uint64_t both on disk (fs.proto) and in memory (block_id.h), it makes more sense to treat 'next_block_id_' as uint64_t rather than to convert it correctly to int64_t everywhere. This patch changes the type of 'next_block_id_' to uint64_t to avoid overflow due to conversion of unsigned int to int, which can result in the reuse of an existing block ID. It does not add a standalone test case because the failure is most likely to occur in specific test-only scenarios. Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Reviewed-on: http://gerrit.cloudera.org:8080/7796 Reviewed-by: Adar DemboReviewed-by: Todd Lipcon Tested-by: Kudu Jenkins (cherry picked from commit b37bde72c47370293827a306532d9be15b1c5f40) --- M src/kudu/fs/file_block_manager.h M src/kudu/fs/log_block_manager.h 2 files changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/98/7798/1 -- To view, visit http://gerrit.cloudera.org:8080/7798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-1.3.x Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Hao Hao
[kudu-CR] KUDU-1894 fixed deadlock in client.Connection
Todd Lipcon has posted comments on this change. Change subject: KUDU-1894 fixed deadlock in client.Connection .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/7765/4/java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java File java/kudu-client/src/main/java/org/apache/kudu/client/Connection.java: PS4, Line 274: while (true) { : List queued = Preconditions.checkNotNull(queuedMessages); : if (queued.isEmpty()) { : break; even though we'll lose the Preconditions check here, I think this would be easier to follow with just a : while (!queuedMessages.empty()) { ...} in terms of the exception, it'll throw an NPE either way, so I dont think it's much of a loss. PS4, Line 290: lock.unlock(); : needs_unlock = false; : // Send out the enqueued messages while not holding the lock. This is to avoid : // deadlock if channelDisconnected/channelClosed event happens and cleanup() is called. : for (final QueuedMessage qm : queued) { : sendCallToWire(qm.message); : } : lock.lock(); : needs_unlock = true; I think it's slightly more idiomatic to do: lock.unlock(); try { } finally { lock.lock(); } for the 'unlocked' periodic here, even though in some rare circumstance in which there is an exception thrown, it would cause an extra lock/unlock Line 304: queuedMessages = null; can you add an 'assert queuedMessages.empty()' here? Line 507: Preconditions.checkState(state == State.READY); seems more like an assert to me Line 524: sendCallToWire(msg); does this now risk that calls come across the wire in non-ascending callId order? Does that break any assumptions elsewhere in the code? -- To view, visit http://gerrit.cloudera.org:8080/7765 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1a0edc3e3accbcff60f2cde641ee470312bbd27a Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] open FS layout in presence of disk failure
Mike Percy has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 6: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR] disk failure: don't open tablets on failed disks
Mike Percy has posted comments on this change. Change subject: disk failure: don't open tablets on failed disks .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7766 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] log block manager: use unsigned int for next block id
Adar Dembo has submitted this change and it was merged. Change subject: log block manager: use unsigned int for next_block_id_ .. log block manager: use unsigned int for next_block_id_ KUDU-1538 introduced 'next_block_id_' to keep track of unique block ID that should be used for block creation. Currently, it is defined as int64_t. However, it could be updated based on the value of 'max_block_id' which is uint64_t. Since block IDs are defined as uint64_t both on disk (fs.proto) and in memory (block_id.h), it makes more sense to treat 'next_block_id_' as uint64_t rather than to convert it correctly to int64_t everywhere. This patch changes the type of 'next_block_id_' to uint64_t to avoid overflow due to conversion of unsigned int to int, which can result in the reuse of an existing block ID. It does not add a standalone test case because the failure is most likely to occur in specific test-only scenarios. Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Reviewed-on: http://gerrit.cloudera.org:8080/7796 Reviewed-by: Adar DemboReviewed-by: Todd Lipcon Tested-by: Kudu Jenkins --- M src/kudu/fs/file_block_manager.h M src/kudu/fs/log_block_manager.h 2 files changed, 2 insertions(+), 2 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Todd Lipcon: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] subprocess: even more robust fix for asynchronous signals
Adar Dembo has posted comments on this change. Change subject: subprocess: even more robust fix for asynchronous signals .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/7561/6/src/kudu/util/subprocess-test.cc File src/kudu/util/subprocess-test.cc: Line 302: ASSERT_EVENTUALLY([&]{ > why do these need EVENTUALLY? isn't the point of the changes you made to Ki Oops. An earlier iteration of the patch used kill(2) directly, and so ASSERT_EVENTUALLY was relevant. I'll remove it. -- To view, visit http://gerrit.cloudera.org:8080/7561 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] subprocess: even more robust fix for asynchronous signals
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7561 to look at the new patch set (#7). Change subject: subprocess: even more robust fix for asynchronous signals .. subprocess: even more robust fix for asynchronous signals Another x1000 loop of raft_consensus-itest yielded the following failure in TestElectPendingVoter: /data/1/adar/kudu/src/kudu/integration-tests/raft_consensus-itest.cc:2035: Failure Value of: s.IsTimedOut() Actual: false Expected: true Expected AddServer() to time out. Result: OK The fix from commit 27435da doesn't work well for AddServer because it's not an idempotent operation. Instead, let's take a more robust tack and add some waiting behavior directly into Subprocess, using the /proc//stat process state as a guide. Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10 --- M src/kudu/integration-tests/raft_consensus-itest.cc M src/kudu/util/subprocess-test.cc M src/kudu/util/subprocess.cc M src/kudu/util/subprocess.h 4 files changed, 113 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/7561/7 -- To view, visit http://gerrit.cloudera.org:8080/7561 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] subprocess: even more robust fix for asynchronous signals
Todd Lipcon has posted comments on this change. Change subject: subprocess: even more robust fix for asynchronous signals .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/7561/6/src/kudu/util/subprocess-test.cc File src/kudu/util/subprocess-test.cc: Line 302: ASSERT_EVENTUALLY([&]{ why do these need EVENTUALLY? isn't the point of the changes you made to Kill that, as soon as kill returns, you're guaranteed it's in the right state? similarly when Start() returns it should definitely be running -- To view, visit http://gerrit.cloudera.org:8080/7561 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] subprocess: even more robust fix for asynchronous signals
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7561 to look at the new patch set (#6). Change subject: subprocess: even more robust fix for asynchronous signals .. subprocess: even more robust fix for asynchronous signals Another x1000 loop of raft_consensus-itest yielded the following failure in TestElectPendingVoter: /data/1/adar/kudu/src/kudu/integration-tests/raft_consensus-itest.cc:2035: Failure Value of: s.IsTimedOut() Actual: false Expected: true Expected AddServer() to time out. Result: OK The fix from commit 27435da doesn't work well for AddServer because it's not an idempotent operation. Instead, let's take a more robust tack and add some waiting behavior directly into Subprocess, using the /proc//stat process state as a guide. Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10 --- M src/kudu/util/subprocess-test.cc M src/kudu/util/subprocess.cc M src/kudu/util/subprocess.h 3 files changed, 113 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/7561/6 -- To view, visit http://gerrit.cloudera.org:8080/7561 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I99d400e971d6f9b22cc7b4483db94a98ec306e10 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] log block manager: use unsigned int for next block id
Todd Lipcon has posted comments on this change. Change subject: log block manager: use unsigned int for next_block_id_ .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager
Adar Dembo has posted comments on this change. Change subject: KUDU-1943: Add BlockTransaction to Block Manager .. Patch Set 30: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd Gerrit-PatchSet: 30 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR] Give more context on errors reading cfiles
Todd Lipcon has submitted this change and it was merged. Change subject: Give more context on errors reading cfiles .. Give more context on errors reading cfiles Recently, a user reported an error loading a tablet in which the only reported message was "bad magic". This wasn't useful for pinpointing the id of the bad block or what might have happened to cause the problem. This patch adds more context info in such circumstances: we now include DebugString output for "bad magic" errors as well as the block ID in all cases. The unit test is updated to check that block IDs show up in all Corruption status messages. Change-Id: I0bda5688a020032c512235ee574cb3e53c7872af Reviewed-on: http://gerrit.cloudera.org:8080/7620 Reviewed-by: Adar DemboTested-by: Kudu Jenkins --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/index_btree.cc M src/kudu/tablet/deltafile.cc 4 files changed, 52 insertions(+), 19 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7620 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I0bda5688a020032c512235ee574cb3e53c7872af Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager
Hao Hao has posted comments on this change. Change subject: KUDU-1943: Add BlockTransaction to Block Manager .. Patch Set 30: (1 comment) http://gerrit.cloudera.org:8080/#/c/7207/29/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS29, Line 1229: it would > Seems like 'it' accidentally got dropped. This sentence should read as: Oops, my bad...Thanks for catching it. Updated. -- To view, visit http://gerrit.cloudera.org:8080/7207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd Gerrit-PatchSet: 30 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7207 to look at the new patch set (#30). Change subject: KUDU-1943: Add BlockTransaction to Block Manager .. KUDU-1943: Add BlockTransaction to Block Manager This adds a new layer of abstraction 'BlockTransaction' to Block Manager, to accumulate new blocks in a logical operation, e.g. flush/compaction. Avoid using fsync() per block can potentionaly reduce a lot I/O overhead. This patch also add a new API, Finalize(), to Block Manager, so that an LBM container can be reused without flushing in-flight writable blocks to disk. This can bring down the log block container number and disk space consumption for first few flush/compaction operations when using log block manager. I performed manual test which creates a table with two tablets, 'kudu perf loadgen localhost --num-rows-per-thread=2000 --num-threads=10 --keep-auto-table --table_num_buckets=2'. With this patch, 'log_block_manager_containers' dropped from 56 to 7. A design doc can be found here: https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd --- M src/kudu/cfile/bloomfile.cc M src/kudu/cfile/bloomfile.h M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_writer.cc M src/kudu/cfile/cfile_writer.h M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.cc M src/kudu/fs/block_manager.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/tablet/deltafile.cc M src/kudu/tablet/deltafile.h M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/diskrowset.h M src/kudu/tablet/multi_column_writer.cc M src/kudu/tablet/multi_column_writer.h 20 files changed, 599 insertions(+), 434 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/30 -- To view, visit http://gerrit.cloudera.org:8080/7207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd Gerrit-PatchSet: 30 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager
Adar Dembo has posted comments on this change. Change subject: KUDU-1943: Add BlockTransaction to Block Manager .. Patch Set 29: (1 comment) http://gerrit.cloudera.org:8080/#/c/7207/29/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS29, Line 1229: would be Seems like 'it' accidentally got dropped. This sentence should read as: Here is the reasoning as to why it would be safe to do so. -- To view, visit http://gerrit.cloudera.org:8080/7207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd Gerrit-PatchSet: 29 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] open FS layout in presence of disk failure
Andrew Wong has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 6: (6 comments) http://gerrit.cloudera.org:8080/#/c/7784/5/src/kudu/fs/block_manager_util.cc File src/kudu/fs/block_manager_util.cc: PS5, Line 68: > nit: this is not needed, right? Usually those macros are used as Done http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h File src/kudu/fs/block_manager_util.h: PS4, Line 56: // On success, either 'metadata_' is overwritten with the contents of the : // file, or, in the case of disk failure, returns OK and sets : // 'health_status_' to a non-OK value. : Status LoadFromDisk(); > I'm not sure I fully agree. For one the error message is confusing: "LOG(ER Hrm, I'd prefer to keep this internal. If anything of the PIMF fails, it gets set to failed. I'll improve the error message though. PS4, Line 56: // On success, either 'metadata_' is overwritten with the contents of the : // file, or, in the case of disk failure, returns OK and sets : // 'health_status_' to a non-OK value. : Status LoadFromDisk(); > Andrew and I previously discussed this approach. You can see our comments h Thanks for resurfacing this. http://gerrit.cloudera.org:8080/#/c/7784/5/src/kudu/fs/data_dirs-test.cc File src/kudu/fs/data_dirs-test.cc: PS5, Line 244: Status s = dd_manager_->MarkDataDirFailed(kNumDirs - 1); : ASSERT_STR_CONTAINS(s.ToString(), "All data dirs have failed"); > As I see the call to CreateDataDirGroup() is removed. Is that correct? I Ah, good point, I suppose I can add it in again. Done PS5, Line 363: // manager. : KuduTest::SetUp(); : } : > nit: consider making this method returning Status and then using ASSERT_OK( Done http://gerrit.cloudera.org:8080/#/c/7784/5/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: PS5, Line 1315: ASSERT_NE(data_dir, test_dirs[failed_idx]); > paranoid nit: does it make sense to check that the failed dir is exactly te Sure, although there is test coverage for this in data_dirs-test -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager
Hao Hao has posted comments on this change. Change subject: KUDU-1943: Add BlockTransaction to Block Manager .. Patch Set 29: (3 comments) http://gerrit.cloudera.org:8080/#/c/7207/28/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS28, Line 352: possibly sync > possibly synchronizing Done PS28, Line 1227: cou > could Done PS28, Line 1229: saf > would be Done -- To view, visit http://gerrit.cloudera.org:8080/7207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd Gerrit-PatchSet: 29 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] disk failure: don't open tablets on failed disks
Andrew Wong has posted comments on this change. Change subject: disk failure: don't open tablets on failed disks .. Patch Set 5: (6 comments) I'm using a cluster verifier, but it doesn't quite work if you mix around the block manager type. I've deparameterized the test for this reason. http://gerrit.cloudera.org:8080/#/c/7766/4/src/kudu/integration-tests/disk_failure-itest.cc File src/kudu/integration-tests/disk_failure-itest.cc: Line 31: METRIC_DECLARE_gauge_uint64(data_dirs_failed); > not used in an external minicluster test As per our hipchat convo, this is needed to get metric. Line 33: DECLARE_string(block_manager); > same Done PS4, Line 42: > does 30 work? Done Line 55: StartCluster(GetClusterFlags(), {}, 3, 5); > I dont think you need this when you can just use cluster_->tablet_server(i) Done PS4, Line 96: ExternalTabletServer* ts = cluster_->tablet_server(0); : ASSERT_OK(cluster_->WaitForTabletsRunning(ts, kNumTablets, kAgreementTimeout)); : > how about: I'm going to keep this here since it's just a way to ensure we don't restart before things are persisted. I'll use the cluster verifier below. PS4, Line 116: } : > additionally, how about adding: See above comment. -- To view, visit http://gerrit.cloudera.org:8080/7766 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] disk failure: don't open tablets on failed disks
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7766 to look at the new patch set (#6). Change subject: disk failure: don't open tablets on failed disks .. disk failure: don't open tablets on failed disks Currently Kudu servers open the FS layout with failed disks. However, the moment tablets attempt to bootstrap (i.e. open blocks, etc.), they will attempt to read from the failed disk and fail. This can be avoided by checking whether a tablet's disk group contains a failed disk before attempting to read data from the tablet. If so, the tablet should be marked as having an error so it can be reassigned. The default behavior of the 'fs_target_data_dirs_per_tablet' flag is updated to take into account disk state when assigning new directory groups. This allows the tablet to be reassigned to a server without being spread across a failed directory. Testing is done by loading data into a cluster configured to use multiple directories for data blocks, failing a single directory on one of the tablet servers, and ensuring that the tablets with blocks on the failed directory get re-replicated at startup time. The test uses a cluster verifier to verify the healthy end-state of the cluster. Necessary changes have been made to do this on a cluster comprising of multiple data directories. Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1 --- M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/log_block_manager.cc M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/disk_failure-itest.cc M src/kudu/integration-tests/external_mini_cluster_fs_inspector.cc M src/kudu/tserver/ts_tablet_manager.cc 8 files changed, 185 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/7766/6 -- To view, visit http://gerrit.cloudera.org:8080/7766 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] open FS layout in presence of disk failure
Hello Mike Percy, Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7784 to look at the new patch set (#6). Change subject: open FS layout in presence of disk failure .. open FS layout in presence of disk failure Currently, if a Kudu server starts up with a failed disk, the server will crash. There are a number of reasons for this, but the pressing one is that the path instance files may not be readable, meaning the directories' UUIDs may not be available. This patch changes this by introducing an "unhealthy" instance in-memory for each instance that fails to lock, load, canonicalize, etc. Such instances are ignored when it comes to checking the integrity of the FS layout, and are simply marked failed by the directory manager. Testing is done in data_dirs-test, log_block_manager-test, and fs_manager-test to ensure failed directories do not impede the managers' startups. Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 --- M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager_util-test.cc M src/kudu/fs/block_manager_util.cc M src/kudu/fs/block_manager_util.h M src/kudu/fs/data_dirs-test.cc M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/fs_manager-test.cc M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc 13 files changed, 645 insertions(+), 173 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/84/7784/6 -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7207 to look at the new patch set (#29). Change subject: KUDU-1943: Add BlockTransaction to Block Manager .. KUDU-1943: Add BlockTransaction to Block Manager This adds a new layer of abstraction 'BlockTransaction' to Block Manager, to accumulate new blocks in a logical operation, e.g. flush/compaction. Avoid using fsync() per block can potentionaly reduce a lot I/O overhead. This patch also add a new API, Finalize(), to Block Manager, so that an LBM container can be reused without flushing in-flight writable blocks to disk. This can bring down the log block container number and disk space consumption for first few flush/compaction operations when using log block manager. I performed manual test which creates a table with two tablets, 'kudu perf loadgen localhost --num-rows-per-thread=2000 --num-threads=10 --keep-auto-table --table_num_buckets=2'. With this patch, 'log_block_manager_containers' dropped from 56 to 7. A design doc can be found here: https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd --- M src/kudu/cfile/bloomfile.cc M src/kudu/cfile/bloomfile.h M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_writer.cc M src/kudu/cfile/cfile_writer.h M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.cc M src/kudu/fs/block_manager.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/tablet/deltafile.cc M src/kudu/tablet/deltafile.h M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/diskrowset.h M src/kudu/tablet/multi_column_writer.cc M src/kudu/tablet/multi_column_writer.h 20 files changed, 599 insertions(+), 434 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/29 -- To view, visit http://gerrit.cloudera.org:8080/7207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd Gerrit-PatchSet: 29 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] open FS layout in presence of disk failure
David Ribeiro Alves has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h File src/kudu/fs/block_manager_util.h: PS4, Line 56: // On success, either 'metadata_' is overwritten with the contents of the : // file, or, in the case of disk failure, returns OK and sets : // 'health_status_' to a non-OK value. : Status LoadFromDisk(); > Andrew and I previously discussed this approach. You can see our comments h Even if internalizing the failure would be ok (which I still think it isn't, though I don't feel super strong about it), the error message that we're ignoring the failure is still wrong, since we arent' -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] log block manager: use unsigned int for next block id
Adar Dembo has posted comments on this change. Change subject: log block manager: use unsigned int for next_block_id_ .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] log block manager: use unsigned int for next block id
Hao Hao has posted comments on this change. Change subject: log block manager: use unsigned int for next_block_id_ .. Patch Set 3: (7 comments) http://gerrit.cloudera.org:8080/#/c/7796/2//COMMIT_MSG Commit Message: PS2, Line 12: Since block IDs are > "Since block IDs..." Done PS2, Line 13: t > drop this Done PS2, Line 14: her > rather than Done PS2, Line 15: it corre > correctly to int64_t everywhere Done PS2, Line 18: unsigned > typo, should be unsigned Done PS2, Line 20: is > is Done PS2, Line 20: o occ > to occur Done -- To view, visit http://gerrit.cloudera.org:8080/7796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] log block manager: use unsigned int for next block id
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7796 to look at the new patch set (#3). Change subject: log block manager: use unsigned int for next_block_id_ .. log block manager: use unsigned int for next_block_id_ KUDU-1538 introduced 'next_block_id_' to keep track of unique block ID that should be used for block creation. Currently, it is defined as int64_t. However, it could be updated based on the value of 'max_block_id' which is uint64_t. Since block IDs are defined as uint64_t both on disk (fs.proto) and in memory (block_id.h), it makes more sense to treat 'next_block_id_' as uint64_t rather than to convert it correctly to int64_t everywhere. This patch changes the type of 'next_block_id_' to uint64_t to avoid overflow due to conversion of unsigned int to int, which can result in the reuse of an existing block ID. It does not add a standalone test case because the failure is most likely to occur in specific test-only scenarios. Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 --- M src/kudu/fs/file_block_manager.h M src/kudu/fs/log_block_manager.h 2 files changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/96/7796/3 -- To view, visit http://gerrit.cloudera.org:8080/7796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] open FS layout in presence of disk failure
Alexey Serbin has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/7784/5/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: PS5, Line 1315: ASSERT_EQ(1, dd_manager_->GetFailedDataDirs().size()); paranoid nit: does it make sense to check that the failed dir is exactly test_dirs[failed_idx] ? -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] log block manager: use unsigned int for next block id
Alexey Serbin has posted comments on this change. Change subject: log block manager: use unsigned int for next_block_id_ .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/7796/2/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: PS2, Line 411: AtomicInt > I'd prefer if we kept the scope of this patch limited, and we already use u Sure, that makes sense. -- To view, visit http://gerrit.cloudera.org:8080/7796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] add outputmetric collecting for the kudu data writing
caiconghui has posted comments on this change. Change subject: add outputmetric collecting for the kudu data writing .. Patch Set 1: @Kudu Jenkins, ok.I try to find the public api for collecting from spark, but failed. The scope of api for task metric collecting operation is private in spark package, other external data source cannot access the interfaces except they are integrated into the spark package, But now it seems impossible. -- To view, visit http://gerrit.cloudera.org:8080/7754 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic3ca4389090f25834d165d36b83f77f1cfa3f726 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: caiconghuiGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: caiconghui Gerrit-HasComments: No
[kudu-CR] log block manager: use unsigned int for next block id
Adar Dembo has posted comments on this change. Change subject: log block manager: use unsigned int for next_block_id_ .. Patch Set 2: (1 comment) The test failures are unrelated; some test machine clocks are unsynchronized for some reason. http://gerrit.cloudera.org:8080/#/c/7796/2/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: PS2, Line 411: AtomicInt > maybe, to be on the safe side, use std::atomic here to have well- I'd prefer if we kept the scope of this patch limited, and we already use util/atomic throughout the block manager code. -- To view, visit http://gerrit.cloudera.org:8080/7796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] log block manager: use unsigned int for next block id
Alexey Serbin has posted comments on this change. Change subject: log block manager: use unsigned int for next_block_id_ .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/7796/2//COMMIT_MSG Commit Message: PS2, Line 18: unsinged typo, should be unsigned http://gerrit.cloudera.org:8080/#/c/7796/2/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: PS2, Line 411: AtomicInt maybe, to be on the safe side, use std::atomic here to have well-defined behavior in case of overflow? -- To view, visit http://gerrit.cloudera.org:8080/7796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] gradle: target JRE 7 compatibility
Grant Henke has posted comments on this change. Change subject: gradle: target JRE 7 compatibility .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/7785/1/java/gradle/compile.gradle File java/gradle/compile.gradle: Line 23: targetCompatibility = javaSourceCompatibility Since source and target versions will likely always match perhaps we should just change javaSourceCompatibility to javaCompatibility here and in the gradle.properties file. -- To view, visit http://gerrit.cloudera.org:8080/7785 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib836a748f0f000b07460dd3141940655dbfb7a29 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] KUDU-871. Support tombstoned voting
Mike Percy has posted comments on this change. Change subject: KUDU-871. Support tombstoned voting .. Patch Set 13: Well, actually, let me hit raft_consensus-itest a little harder. It's possible that I made it flaky and hadn't previously noticed. -- To view, visit http://gerrit.cloudera.org:8080/6960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Edward Fancher Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-871. Support tombstoned voting
Mike Percy has posted comments on this change. Change subject: KUDU-871. Support tombstoned voting .. Patch Set 13: The new tests still look good. 100% reliable after 100 loops w/ TSAN + 8x stress: tombstoned_voting-itest: http://dist-test.cloudera.org/job?job_id=mpercy.1503540656.2157 tombstoned_voting-stress-test: http://dist-test.cloudera.org/job?job_id=mpercy.1503540557.1392 If other tests get flaky, they'll show up on the flaky test dashboard and I'll fix them. -- To view, visit http://gerrit.cloudera.org:8080/6960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Edward Fancher Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] [cmake] introduce 'pb-gen' and 'krpc-gen' targets
Alexey Serbin has posted comments on this change. Change subject: [cmake] introduce 'pb-gen' and 'krpc-gen' targets .. Patch Set 4: Verified+1 unrelated flake in AdminCliTest.TestMoveTablet -- To view, visit http://gerrit.cloudera.org:8080/ To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I48833098553ef7d3f4fae8b88079cb354b52bea1 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] [cmake] introduce 'pb-gen' and 'krpc-gen' targets
Alexey Serbin has submitted this change and it was merged. Change subject: [cmake] introduce 'pb-gen' and 'krpc-gen' targets .. [cmake] introduce 'pb-gen' and 'krpc-gen' targets Introduced new targets to generate protobuf stubs and KRPC service/proxy files. The 'iwyu' target now depends on the newly introduced 'pb-gen' and 'krpc-gen' targets. That's because IWYU needs to have all protobuf and KRPC headers available when running. Change-Id: I48833098553ef7d3f4fae8b88079cb354b52bea1 Reviewed-on: http://gerrit.cloudera.org:8080/ Reviewed-by: Adar DemboTested-by: Alexey Serbin --- M CMakeLists.txt M cmake_modules/FindKRPC.cmake M cmake_modules/FindProtobuf.cmake 3 files changed, 23 insertions(+), 0 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Alexey Serbin: Verified -- To view, visit http://gerrit.cloudera.org:8080/ To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I48833098553ef7d3f4fae8b88079cb354b52bea1 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Todd Lipcon
[kudu-CR] open FS layout in presence of disk failure
Alexey Serbin has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/7784/5/src/kudu/fs/block_manager_util.cc File src/kudu/fs/block_manager_util.cc: PS5, Line 68: ; nit: this is not needed, right? Usually those macros are used as RETURN_NOT_OK_FAIL_INSTANCE_PREPEND(...); so the semicolon is added at the call site. http://gerrit.cloudera.org:8080/#/c/7784/5/src/kudu/fs/data_dirs-test.cc File src/kudu/fs/data_dirs-test.cc: PS5, Line 244: Status s = dd_manager_->MarkDataDirFailed(kNumDirs - 1); : ASSERT_STR_CONTAINS(s.ToString(), "All data dirs have failed"); As I see the call to CreateDataDirGroup() is removed. Is that correct? I mean, I would expect to see an attempt to do something with the object after marking all directories as failed (that looks like just a preparation for some meaningful action). PS5, Line 363: void OpenDataDirManager() { : ASSERT_OK(DataDirManager::OpenExistingForTests(env_, test_roots_, : DataDirManagerOptions(), _manager_)); : } nit: consider making this method returning Status and then using ASSERT_OK() -- that way it would be more easier to find the place where it fails, if it does so. -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] log block manager: use unsigned int for next block id
Adar Dembo has posted comments on this change. Change subject: log block manager: use unsigned int for next_block_id_ .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/7796/2//COMMIT_MSG Commit Message: PS2, Line 12: Moreover, block IDs "Since block IDs..." PS2, Line 13: , so drop this PS2, Line 14: than rather than PS2, Line 15: properly correctly to int64_t everywhere PS2, Line 20: can is PS2, Line 20: occur to occur -- To view, visit http://gerrit.cloudera.org:8080/7796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] open FS layout in presence of disk failure
Mike Percy has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 5: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager
Adar Dembo has posted comments on this change. Change subject: KUDU-1943: Add BlockTransaction to Block Manager .. Patch Set 28: (3 comments) http://gerrit.cloudera.org:8080/#/c/7207/28/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS28, Line 352: synchronizing possibly synchronizing PS28, Line 1227: can could PS28, Line 1229: is would be -- To view, visit http://gerrit.cloudera.org:8080/7207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd Gerrit-PatchSet: 28 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] KUDU-871. Support tombstoned voting
Mike Percy has posted comments on this change. Change subject: KUDU-871. Support tombstoned voting .. Patch Set 13: > lgtm. Did you do any last test loop of the relevant tests after the > latest round of changes? Thanks for the review. I'll double-check that now and will post results. I did see some unrelated failures on some unrelated tests, but we've have a lot of code go in this week. -- To view, visit http://gerrit.cloudera.org:8080/6960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Edward Fancher Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] disk failure: don't open tablets on failed disks
Mike Percy has posted comments on this change. Change subject: disk failure: don't open tablets on failed disks .. Patch Set 4: (6 comments) http://gerrit.cloudera.org:8080/#/c/7766/4/src/kudu/integration-tests/disk_failure-itest.cc File src/kudu/integration-tests/disk_failure-itest.cc: Line 31: METRIC_DECLARE_gauge_uint64(data_dirs_failed); not used in an external minicluster test Line 33: DECLARE_string(block_manager); same PS4, Line 42: 120 does 30 work? Line 55: void SetupDefaultCluster(vector* ext_tservers) { I dont think you need this when you can just use cluster_->tablet_server(i) to get what you are loading into this vector PS4, Line 96: // Ensure the tablets get to a running state. : ASSERT_OK(cluster_->WaitForTabletsRunning(cluster_->tablet_server(0), : kNumTablets, kAgreementTimeout)); how about: auto& ts0 = ts_map_[cluster_->tablet_server(0)->uuid()]; string tablet_id; ASSERT_EVENTUALLY([&] { vector tablet_ids; ASSERT_OK(ListRunningTabletIds(ts0, kTimeout, _ids)); ASSERT_EQ(1, tablet_ids.size()); tablet_id = tablet_ids[0]; } ASSERT_OK(WaitForServersToAgree(kTimeout, ts_map_, tablet_id, workload.batches_completed())); PS4, Line 116: ASSERT_OK(cluster_->WaitForTabletsRunning(cluster_->tablet_server(0), : kNumTablets, kAgreementTimeout)); additionally, how about adding: ASSERT_OK(WaitForServersToAgree(kTimeout, ts_map_, tablet_id, workload.batches_completed())); -- To view, visit http://gerrit.cloudera.org:8080/7766 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] open FS layout in presence of disk failure
Adar Dembo has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h File src/kudu/fs/block_manager_util.h: PS4, Line 56: // On success, either 'metadata_' is overwritten with the contents of the : // file, or, in the case of disk failure, returns OK and sets : // 'health_status_' to a non-OK value. : Status LoadFromDisk(); > I'm not sure I fully agree. For one the error message is confusing: "LOG(ER Andrew and I previously discussed this approach. You can see our comments here: https://gerrit.cloudera.org/#/c/7270/6/src/kudu/fs/data_dirs.cc@408. FWIW, I find the idea of "internalizing" the disk failure to PIMF to be reasonable. -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] log block manager: use unsigned int for next block id
Hao Hao has posted comments on this change. Change subject: log block manager: use unsigned int for next_block_id_ .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/7796/1//COMMIT_MSG Commit Message: Line 14: it makes more sense to treat 'next_block_id_' as uint64_t than to > Should probably explain why we're switching it to uint64_t instead of switc Done PS1, Line 15: > conversion Done PS1, Line 16: > in the reuse of an existing block ID. Done -- To view, visit http://gerrit.cloudera.org:8080/7796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7207 to look at the new patch set (#28). Change subject: KUDU-1943: Add BlockTransaction to Block Manager .. KUDU-1943: Add BlockTransaction to Block Manager This adds a new layer of abstraction 'BlockTransaction' to Block Manager, to accumulate new blocks in a logical operation, e.g. flush/compaction. Avoid using fsync() per block can potentionaly reduce a lot I/O overhead. This patch also add a new API, Finalize(), to Block Manager, so that an LBM container can be reused without flushing in-flight writable blocks to disk. This can bring down the log block container number and disk space consumption for first few flush/compaction operations when using log block manager. I performed manual test which creates a table with two tablets, 'kudu perf loadgen localhost --num-rows-per-thread=2000 --num-threads=10 --keep-auto-table --table_num_buckets=2'. With this patch, 'log_block_manager_containers' dropped from 56 to 7. A design doc can be found here: https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd --- M src/kudu/cfile/bloomfile.cc M src/kudu/cfile/bloomfile.h M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_writer.cc M src/kudu/cfile/cfile_writer.h M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.cc M src/kudu/fs/block_manager.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/tablet/deltafile.cc M src/kudu/tablet/deltafile.h M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/diskrowset.h M src/kudu/tablet/multi_column_writer.cc M src/kudu/tablet/multi_column_writer.h 20 files changed, 599 insertions(+), 434 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/28 -- To view, visit http://gerrit.cloudera.org:8080/7207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd Gerrit-PatchSet: 28 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] log block manager: use unsigned int for next block id
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7796 to look at the new patch set (#2). Change subject: log block manager: use unsigned int for next_block_id_ .. log block manager: use unsigned int for next_block_id_ KUDU-1538 introduced 'next_block_id_' to keep track of unique block ID that should be used for block creation. Currently, it is defined as int64_t. However, it could be updated based on the value of 'max_block_id' which is uint64_t. Moreover, block IDs are defined as uint64_t both on disk (fs.proto) and in memory (block_id.h), so it makes more sense to treat 'next_block_id_' as uint64_t than to convert it properly. This patch changes the type of 'next_block_id_' to uint64_t to avoid overflow due to conversion of unsinged int to int, which can result in the reuse of an existing block ID. It does not add a standalone test case because the failure can most likely occur in specific test-only scenarios. Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 --- M src/kudu/fs/log_block_manager.h 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/96/7796/2 -- To view, visit http://gerrit.cloudera.org:8080/7796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] open FS layout in presence of disk failure
David Ribeiro Alves has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h File src/kudu/fs/block_manager_util.h: PS4, Line 56: // On success, either 'metadata_' is overwritten with the contents of the : // file, or, in the case of disk failure, returns OK and sets : // 'health_status_' to a non-OK value. : Status LoadFromDisk(); > Disk failure errors here would be ignored anyway. By adding this to the con I'm not sure I fully agree. For one the error message is confusing: "LOG(ERROR) << "Ignoring error with status: " << _s_prepended.ToString(); \" seems like we're ignoring the error when we really aren't. Then I'd expect that upon getting an OK here the metadata had been loaded, when it's not the case. Given there is a single call site I think it would make sense to be explicit. That is to return a non-ok status and to special case the isDiskFailure() status along with logging the error message there. Otherwise you're pushing logic from the call site onto this class which is a bit dirtier. -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR](branch-1.3.x) [build-support] IWYU build configuration for Jenkins
Jean-Daniel Cryans has submitted this change and it was merged. Change subject: [build-support] IWYU build configuration for Jenkins .. [build-support] IWYU build configuration for Jenkins In this branch, the IWYU target is just a no-op. The idea is that we have common configuration matrix for gerrit pre-commit Jenkins jobs, so in particular branches the IWYU is run for real (as of now, at least in the main trunk). Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e Reviewed-on: http://gerrit.cloudera.org:8080/7750 Reviewed-by: Adar DemboTested-by: Kudu Jenkins (cherry picked from commit 8f6812047a2fe50dabfcb26093385b33f9960a7c) Reviewed-on: http://gerrit.cloudera.org:8080/7779 Reviewed-by: Jean-Daniel Cryans --- M build-support/jenkins/build-and-test.sh 1 file changed, 15 insertions(+), 6 deletions(-) Approvals: Jean-Daniel Cryans: Looks good to me, approved Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7779 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-1.3.x Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins
[kudu-CR](branch-1.4.x) [build-support] IWYU build configuration for Jenkins
Jean-Daniel Cryans has submitted this change and it was merged. Change subject: [build-support] IWYU build configuration for Jenkins .. [build-support] IWYU build configuration for Jenkins In this branch, the IWYU target is just a no-op. The idea is that we have common configuration matrix for gerrit pre-commit Jenkins jobs, so in particular branches the IWYU is run for real (as of now, at least in the main trunk). Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e Reviewed-on: http://gerrit.cloudera.org:8080/7750 Reviewed-by: Adar DemboTested-by: Kudu Jenkins (cherry picked from commit 8f6812047a2fe50dabfcb26093385b33f9960a7c) Reviewed-on: http://gerrit.cloudera.org:8080/7780 Reviewed-by: Jean-Daniel Cryans --- M build-support/jenkins/build-and-test.sh 1 file changed, 15 insertions(+), 6 deletions(-) Approvals: Jean-Daniel Cryans: Looks good to me, approved Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7780 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-1.4.x Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins
[kudu-CR](branch-1.3.x) [build-support] IWYU build configuration for Jenkins
Jean-Daniel Cryans has posted comments on this change. Change subject: [build-support] IWYU build configuration for Jenkins .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7779 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-1.3.x Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-1.2.x) [build-support] IWYU build configuration for Jenkins
Jean-Daniel Cryans has posted comments on this change. Change subject: [build-support] IWYU build configuration for Jenkins .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-1.2.x Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-1.4.x) [build-support] IWYU build configuration for Jenkins
Jean-Daniel Cryans has posted comments on this change. Change subject: [build-support] IWYU build configuration for Jenkins .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7780 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-1.4.x Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-1.2.x) [build-support] IWYU build configuration for Jenkins
Jean-Daniel Cryans has submitted this change and it was merged. Change subject: [build-support] IWYU build configuration for Jenkins .. [build-support] IWYU build configuration for Jenkins In this branch, the IWYU target is just a no-op. The idea is that we have common configuration matrix for gerrit pre-commit Jenkins jobs, so in particular branches the IWYU is run for real (as of now, at least in the main trunk). Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e Reviewed-on: http://gerrit.cloudera.org:8080/7750 Reviewed-by: Adar DemboTested-by: Kudu Jenkins (cherry picked from commit 8f6812047a2fe50dabfcb26093385b33f9960a7c) Reviewed-on: http://gerrit.cloudera.org:8080/7778 Reviewed-by: Jean-Daniel Cryans --- M build-support/jenkins/build-and-test.sh 1 file changed, 15 insertions(+), 6 deletions(-) Approvals: Jean-Daniel Cryans: Looks good to me, approved Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: branch-1.2.x Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] disk failure: don't open tablets on failed disks
Andrew Wong has posted comments on this change. Change subject: disk failure: don't open tablets on failed disks .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/7766/4/src/kudu/fs/file_block_manager.cc File src/kudu/fs/file_block_manager.cc: PS4, Line 597: TODO(awong): currently the FsReport only reports on containers for the : // log block manager. Implement some sort of reporting for failed : // directories as well. > mind creating a jira to track this and assigning to yourself? Done http://gerrit.cloudera.org:8080/#/c/7766/4/src/kudu/integration-tests/disk_failure-itest.cc File src/kudu/integration-tests/disk_failure-itest.cc: PS4, Line 104: Substitute("--env_inject_eio_globs=$0", JoinPathSegments(failed_dir, "**")), : "--env_inject_eio=1.0", : "--suicide_on_eio=false", > what if you just append to the current flags and not replace them all? Done -- To view, visit http://gerrit.cloudera.org:8080/7766 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] disk failure: don't open tablets on failed disks
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7766 to look at the new patch set (#5). Change subject: disk failure: don't open tablets on failed disks .. disk failure: don't open tablets on failed disks Currently Kudu servers open the FS layout with failed disks. However, the moment tablets attempt to bootstrap (i.e. open blocks, etc.), they will attempt to read from the failed disk and fail. This can be avoided by checking whether a tablet's disk group contains a failed disk before attempting to read data from the tablet. If so, the tablet should be marked as having an error so it can be reassigned. The default behavior of the 'fs_target_data_dirs_per_tablet' flag is updated to take into account disk state when assigning new directory groups. This allows the tablet to be reassigned to a server without being spread across a failed directory. Testing is done by loading data into a cluster configured to use multiple directories for data blocks, failing a single directory on one of the tablet servers, and ensuring that the tablets with blocks on the failed directory get re-replicated at startup time. Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1 --- M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/log_block_manager.cc M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/disk_failure-itest.cc M src/kudu/tserver/ts_tablet_manager.cc 7 files changed, 201 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/7766/5 -- To view, visit http://gerrit.cloudera.org:8080/7766 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id3fae98355657f6aa4b134c542f92fc07f5c0aa1 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] open FS layout in presence of disk failure
Andrew Wong has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 4: (16 comments) http://gerrit.cloudera.org:8080/#/c/7784/3//COMMIT_MSG Commit Message: PS3, Line 20: mangers' > looks like a typo It is, indeed, "managers'," referring to the startups of the directory manager, log block manager, and fs manager. http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.cc File src/kudu/fs/block_manager_util.cc: PS4, Line 189: it's > nit its Done PS4, Line 189: them > nit s/them/it Done http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h File src/kudu/fs/block_manager_util.h: PS4, Line 56: // On success, either 'metadata_' is overwritten with the contents of the : // file, or, in the case of disk failure, returns OK and sets : // 'health_status_' to a non-OK value. : Status LoadFromDisk(); > this is weird. whats the reason for this not to return a non-ok status righ Disk failure errors here would be ignored anyway. By adding this to the contract, we don't have to sully the call-site with disk-handling code, and we can keep the error handling (i.e. setting health_status_) internal. PS4, Line 83: // Whether or not the instance is healthy. If the instance file lives on a : // disk that has failed, this should return false. : bool healthy() const { : return health_status_.ok(); : } > we usually don't have this extra getter (i.e. consider just doing pimf.heal As per Mike's comment, I think it helps for readability. Also, I get the feeling that we don't use this elsewhere (at least for tablets/replicas) because other components have multiple states. In this case, it's pretty cut-and-dry: healthy, not healthy. PS4, Line 83: // Whether or not the instance is healthy. If the instance file lives on a : // disk that has failed, this should return false. : bool healthy() const { : return health_status_.ok(); : } > I had the same initial reaction but it seems to make the code that uses it Done http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: Line 116: using std::make_pair; > warning: using decl 'make_pair' is unused [misc-unused-using-decls] Done PS4, Line 794: if (FindUuidByRoot(root, )) { : return FindUuidIndexByUuid(uuid, uuid_idx); : } : return false; > there are no chances of races here, right? Nope, the first call is guarded by a scoped lock, as is the second. These values don't change after they're first set in Open(). http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/data_dirs.h File src/kudu/fs/data_dirs.h: PS4, Line 366: Finds the UUID for the given canonicalized root directory name. > nit, for consistency append the same suffix as the previous new method (ret Done http://gerrit.cloudera.org:8080/#/c/7784/3/src/kudu/fs/fs_manager-test.cc File src/kudu/fs/fs_manager-test.cc: Line 279: FLAGS_env_inject_eio = 0; > nit: shouldn't need this due to flag saver in KuduTest unless I'm missing s See Adar's comment. Line 279: FLAGS_env_inject_eio = 0; > Unfortunately, KuduTest doesn't restore the flags until _after_ TearDown() Thanks for the explanation. Line 302: FLAGS_env_inject_eio = 0; > same See above. http://gerrit.cloudera.org:8080/#/c/7784/3/src/kudu/fs/fs_manager.cc File src/kudu/fs/fs_manager.cc: PS3, Line 180: set all_roots; : all_roots.insert(wal_fs_root_); : for (const string& data_fs_root : data_fs_roots_) { : all_roots.insert(data_fs_root); : } > nit: consider Done PS3, Line 228: if (!ContainsKey(unique_roots, root.path)) { : canonicalized_data_fs_roots_.emplace_back(root); : canonicalized_all_fs_roots_.emplace_back(root); : InsertOrDie(_roots, root.path); : } > Consider using InsertIfNotPresent() instead to avoid double look-ups: Done PS3, Line 248: string wal_root > Is it necessary to make a copy here? Why not to use const reference here i Done PS3, Line 251: string meta_root > ditto Done -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR] open FS layout in presence of disk failure
Hello Mike Percy, Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7784 to look at the new patch set (#5). Change subject: open FS layout in presence of disk failure .. open FS layout in presence of disk failure Currently, if a Kudu server starts up with a failed disk, the server will crash. There are a number of reasons for this, but the pressing one is that the path instance files may not be readable, meaning the directories' UUIDs may not be available. This patch changes this by introducing an "unhealthy" instance in-memory for each instance that fails to lock, load, canonicalize, etc. Such instances are ignored when it comes to checking the integrity of the FS layout, and are simply marked failed by the directory manager. Testing is done in data_dirs-test, log_block_manager-test, and fs_manager-test to ensure failed directories do not impede the managers' startups. Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 --- M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager_util-test.cc M src/kudu/fs/block_manager_util.cc M src/kudu/fs/block_manager_util.h M src/kudu/fs/data_dirs-test.cc M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/fs_manager-test.cc M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc 13 files changed, 635 insertions(+), 174 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/84/7784/5 -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-871. Support tombstoned voting
Alexey Serbin has posted comments on this change. Change subject: KUDU-871. Support tombstoned voting .. Patch Set 13: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/6960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Edward Fancher Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-871. Support tombstoned voting
Todd Lipcon has posted comments on this change. Change subject: KUDU-871. Support tombstoned voting .. Patch Set 13: Code-Review+1 lgtm. Did you do any last test loop of the relevant tests after the latest round of changes? -- To view, visit http://gerrit.cloudera.org:8080/6960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Edward Fancher Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] [cmake] introduce 'pb-gen' and 'krpc-gen' targets
Adar Dembo has posted comments on this change. Change subject: [cmake] introduce 'pb-gen' and 'krpc-gen' targets .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/ To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I48833098553ef7d3f4fae8b88079cb354b52bea1 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] open FS layout in presence of disk failure
Adar Dembo has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 4: Code-Review+1 Don't have anything to add to David/Mike's comments. -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR] [cmake] introduce 'pb-gen' and 'krpc-gen' targets
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/ to look at the new patch set (#4). Change subject: [cmake] introduce 'pb-gen' and 'krpc-gen' targets .. [cmake] introduce 'pb-gen' and 'krpc-gen' targets Introduced new targets to generate protobuf stubs and KRPC service/proxy files. The 'iwyu' target now depends on the newly introduced 'pb-gen' and 'krpc-gen' targets. That's because IWYU needs to have all protobuf and KRPC headers available when running. Change-Id: I48833098553ef7d3f4fae8b88079cb354b52bea1 --- M CMakeLists.txt M cmake_modules/FindKRPC.cmake M cmake_modules/FindProtobuf.cmake 3 files changed, 23 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/77//4 -- To view, visit http://gerrit.cloudera.org:8080/ To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I48833098553ef7d3f4fae8b88079cb354b52bea1 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Give more context on errors reading cfiles
Adar Dembo has posted comments on this change. Change subject: Give more context on errors reading cfiles .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7620 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0bda5688a020032c512235ee574cb3e53c7872af Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] server: fix crash when Shutdown is called before Init
Adar Dembo has submitted this change and it was merged. Change subject: server: fix crash when Shutdown is called before Init .. server: fix crash when Shutdown is called before Init This SIGSEGV suggests unconditional access to Messenger::Shutdown is bad. I don't know why MiniMaster::Start failed in the test but I suspect it's due to clock unsynchronization, because that would cause ServerBase::Init to fail before creating the Messenger. *** SIGSEGV (@0x18) received by PID 24219 (TID 0x7fd0b2486c40) from PID 24; stack trace: *** @ 0x7fd0ad48c330 (unknown) at ??:0 @ 0x7fd0ac46e24f kudu::percpu_rwlock::lock() at ??:0 @ 0x7fd0aab3ee59 kudu::rpc::Messenger::ShutdownInternal() at ??:0 @ 0x7fd0aab3f0bc kudu::rpc::Messenger::Shutdown() at ??:0 @ 0x7fd0aff36bc8 kudu::server::ServerBase::Shutdown() at ??:0 @ 0x7fd0aff33749 kudu::server::ServerBase::~ServerBase() at ??:0 @ 0x7fd0b1451a08 kudu::kserver::KuduServer::~KuduServer() at ??:0 @ 0x7fd0b144f5be kudu::master::Master::~Master() at ??:0 @ 0x7fd0b144f72a kudu::master::Master::~Master() at ??:0 @ 0x7fd0b14687c4 kudu::master::MiniMaster::Start() at ??:0 @ 0x4d10a4 kudu::master::SysCatalogTest::SetUp() at /home/jenkins-slave/workspace/kudu-master/0/src/kudu/master/sys_catalog-test.cc:? @ 0x7fd0b112fb3a testing::internal::HandleExceptionsInMethodIfSupported<>() at ??:0 @ 0x7fd0b110fb27 testing::Test::Run() at ??:0 @ 0x7fd0b037 testing::TestInfo::Run() at ??:0 @ 0x7fd0bda7 testing::TestCase::Run() at ??:0 @ 0x7fd0b111d897 testing::internal::UnitTestImpl::RunAllTests() at ??:0 @ 0x7fd0b1130a1a testing::internal::HandleExceptionsInMethodIfSupported<>() at ??:0 @ 0x7fd0b111d2e2 testing::UnitTest::Run() at ??:0 @ 0x7fd0b208f3dc RUN_ALL_TESTS() at ??:0 @ 0x7fd0b208eb97 main at ??:0 @ 0x7fd0a5733f45 __libc_start_main at ??:0 @ 0x4311a7 (unknown) at ??:? @0x0 (unknown) Change-Id: I1e36318d93cabd9d98e43951cee2075272f8f799 Reviewed-on: http://gerrit.cloudera.org:8080/7795 Reviewed-by: Todd LipconTested-by: Kudu Jenkins --- M src/kudu/kserver/kserver.cc M src/kudu/server/server_base.cc 2 files changed, 6 insertions(+), 2 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I1e36318d93cabd9d98e43951cee2075272f8f799 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] log block manager: use unsigned int for next block id
Adar Dembo has posted comments on this change. Change subject: log block manager: use unsigned int for next_block_id_ .. Patch Set 1: (3 comments) I've been wondering whether this deserves a standalone test. I'm inclined to say 'no' since I think the pain is only felt in specific test-only scenarios. Namely: 1) You have to be running in a gtest, so either you're using an InternalMiniCluster or a lower level unit test. 2) You have to be using the block cache singleton. 3) You have to flush some data. 4) You have to restart the block manager some number of times. BTW, KUDU-1538 is related and perhaps should be referenced in the commit description. http://gerrit.cloudera.org:8080/#/c/7796/1//COMMIT_MSG Commit Message: Line 14: This patch changes the type of next_block_id_ to uint64_t to Should probably explain why we're switching it to uint64_t instead of switching every other site to int64_t (since the convention in Kudu is to use int64_t for large integer values). The reason is that block IDs are defined as uint64_t both on disk (fs.proto) and in memory (block_id.h), so it makes more sense to treat them as such in the LBM than to convert them properly. PS1, Line 15: conversation conversion PS1, Line 16: in block id reused. in the reuse of an existing block ID. -- To view, visit http://gerrit.cloudera.org:8080/7796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib315b20719ef529331304df5c56c4242902524d4 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] Give more context on errors reading cfiles
Hello David Ribeiro Alves, Andrew Wong, Grant Henke, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7620 to look at the new patch set (#2). Change subject: Give more context on errors reading cfiles .. Give more context on errors reading cfiles Recently, a user reported an error loading a tablet in which the only reported message was "bad magic". This wasn't useful for pinpointing the id of the bad block or what might have happened to cause the problem. This patch adds more context info in such circumstances: we now include DebugString output for "bad magic" errors as well as the block ID in all cases. The unit test is updated to check that block IDs show up in all Corruption status messages. Change-Id: I0bda5688a020032c512235ee574cb3e53c7872af --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/index_btree.cc M src/kudu/tablet/deltafile.cc 4 files changed, 52 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/7620/2 -- To view, visit http://gerrit.cloudera.org:8080/7620 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0bda5688a020032c512235ee574cb3e53c7872af Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout"
Todd Lipcon has posted comments on this change. Change subject: Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout" .. Patch Set 1: Code-Review+2 Going to just +2 this since it's a revert -- To view, visit http://gerrit.cloudera.org:8080/7788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1430cabfc54e858c1ad073574404821ae71c169f Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout"
Todd Lipcon has submitted this change and it was merged. Change subject: Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout" .. Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout" This reverts commit 351337ee2d92361e2c87699d1c87a49774a3ab4d. This patch seemed to cause a problem in test cluster that runs YCSB stress tests. An RPC client and server got "out of sync" somehow on a connection so that the footer of one call was interpreted as the length prefix of the next call by the server. It then expected a 40MB+ RPC and just sat there consuming RPC calls for many hours assuming it was all part of one large RPC. The other end just kept timing out since it was sending calls and not getting any responses. Michael and I looked into it for a while and found one issue but still couldn't fully explain how this happened, so we'll revert for now. Change-Id: I1430cabfc54e858c1ad073574404821ae71c169f Reviewed-on: http://gerrit.cloudera.org:8080/7788 Tested-by: Todd LipconReviewed-by: Todd Lipcon --- M docs/design-docs/rpc.md M src/kudu/rpc/blocking_ops.cc M src/kudu/rpc/connection.cc M src/kudu/rpc/connection.h M src/kudu/rpc/constants.cc M src/kudu/rpc/inbound_call.cc M src/kudu/rpc/inbound_call.h M src/kudu/rpc/outbound_call.cc M src/kudu/rpc/outbound_call.h M src/kudu/rpc/reactor.cc M src/kudu/rpc/rpc-test-base.h M src/kudu/rpc/rpc-test.cc M src/kudu/rpc/rpc_header.proto M src/kudu/rpc/rtest.proto M src/kudu/rpc/serialization.cc M src/kudu/rpc/serialization.h M src/kudu/rpc/server_negotiation.cc M src/kudu/rpc/transfer.cc M src/kudu/rpc/transfer.h 19 files changed, 157 insertions(+), 530 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/7788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I1430cabfc54e858c1ad073574404821ae71c169f Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout"
Todd Lipcon has posted comments on this change. Change subject: Revert "KUDU-2065, KUDU-2011: Release sidecars on cancellation or timeout" .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1430cabfc54e858c1ad073574404821ae71c169f Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] server: fix crash when Shutdown is called before Init
Todd Lipcon has posted comments on this change. Change subject: server: fix crash when Shutdown is called before Init .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1e36318d93cabd9d98e43951cee2075272f8f799 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] open FS layout in presence of disk failure
Mike Percy has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 4: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h File src/kudu/fs/block_manager_util.h: PS4, Line 83: // Whether or not the instance is healthy. If the instance file lives on a : // disk that has failed, this should return false. : bool healthy() const { : return health_status_.ok(); : } > we usually don't have this extra getter (i.e. consider just doing pimf.heal I had the same initial reaction but it seems to make the code that uses it a little more reader-friendly. -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] open FS layout in presence of disk failure
Adar Dembo has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/7784/3/src/kudu/fs/fs_manager-test.cc File src/kudu/fs/fs_manager-test.cc: Line 279: FLAGS_env_inject_eio = 0; > nit: shouldn't need this due to flag saver in KuduTest unless I'm missing s Unfortunately, KuduTest doesn't restore the flags until _after_ TearDown() runs, and env_inject_eio!=0 can cause test directory cleanup to fail. That said, having to do this for every test is annoying, so maybe we should manually destroy the FlagSaver at the beginning of TearDown instead? -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] open FS layout in presence of disk failure
Alexey Serbin has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 3: (5 comments) A quick first pass http://gerrit.cloudera.org:8080/#/c/7784/3//COMMIT_MSG Commit Message: PS3, Line 20: mangers' looks like a typo Is that supposed to be "directory managers'" or "managers'" ? http://gerrit.cloudera.org:8080/#/c/7784/3/src/kudu/fs/fs_manager.cc File src/kudu/fs/fs_manager.cc: PS3, Line 180: set all_roots; : all_roots.insert(wal_fs_root_); : for (const string& data_fs_root : data_fs_roots_) { : all_roots.insert(data_fs_root); : } nit: consider set all_roots(data_fs_roots_.begin(), data_fs_roots_.end()); all_roots.insert(wal_fs_root_); PS3, Line 228: if (!ContainsKey(unique_roots, root.first)) { : canonicalized_data_fs_roots_.emplace_back(root); : canonicalized_all_fs_roots_.emplace_back(root); : InsertOrDie(_roots, root.first); : } Consider using InsertIfNotPresent() instead to avoid double look-ups: if (InsertIfNotPresent(_roots, root.first)) { canonicalized_data_fs_roots_.emplace_back(root); canonicalized_all_fs_roots_.emplace_back(root); } PS3, Line 248: string wal_root Is it necessary to make a copy here? Why not to use const reference here instead? PS3, Line 251: string meta_root ditto -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR](branch-1.2.x) [build-support] IWYU build configuration for Jenkins
Adar Dembo has posted comments on this change. Change subject: [build-support] IWYU build configuration for Jenkins .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-1.2.x Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-1.3.x) [build-support] IWYU build configuration for Jenkins
Adar Dembo has posted comments on this change. Change subject: [build-support] IWYU build configuration for Jenkins .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7779 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-1.3.x Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR](branch-1.4.x) [build-support] IWYU build configuration for Jenkins
Adar Dembo has posted comments on this change. Change subject: [build-support] IWYU build configuration for Jenkins .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7780 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I45f2e8529891dd7da18c9fa6383e0c41e6b4182e Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: branch-1.4.x Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] open FS layout in presence of disk failure
Mike Percy has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 3: (3 comments) couple other minor nits, overall lgtm http://gerrit.cloudera.org:8080/#/c/7784/3/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: PS3, Line 831: dirs_to_update unused? http://gerrit.cloudera.org:8080/#/c/7784/3/src/kudu/fs/fs_manager-test.cc File src/kudu/fs/fs_manager-test.cc: Line 279: FLAGS_env_inject_eio = 0; nit: shouldn't need this due to flag saver in KuduTest unless I'm missing something Line 302: FLAGS_env_inject_eio = 0; same -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] [java] Gradle test task improvements
Adar Dembo has submitted this change and it was merged. Change subject: [java] Gradle test task improvements .. [java] Gradle test task improvements Previously some tests would be extremely flaky whenrunning via Gradle. Especially secure tests with tokenrenewal. This fixes that issue by forking a new jvm foreach test class to isolate the tests.I also add a few system properties to match theMaven build and support filtering tests from the root directory with the --tests argument. Change-Id: Ifb660b15423b68c299518b7ba54a15135972a007 Reviewed-on: http://gerrit.cloudera.org:8080/7782 Reviewed-by: Adar DemboTested-by: Adar Dembo --- M java/gradle/tests.gradle M java/pom.xml 2 files changed, 21 insertions(+), 2 deletions(-) Approvals: Adar Dembo: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/7782 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ifb660b15423b68c299518b7ba54a15135972a007 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant Henke Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Todd Lipcon
[kudu-CR] open FS layout in presence of disk failure
David Ribeiro Alves has posted comments on this change. Change subject: open FS layout in presence of disk failure .. Patch Set 4: (6 comments) http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.cc File src/kudu/fs/block_manager_util.cc: PS4, Line 189: them nit s/them/it PS4, Line 189: it's nit its http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/block_manager_util.h File src/kudu/fs/block_manager_util.h: PS4, Line 56: // On success, either 'metadata_' is overwritten with the contents of the : // file, or, in the case of disk failure, returns OK and sets : // 'health_status_' to a non-OK value. : Status LoadFromDisk(); this is weird. whats the reason for this not to return a non-ok status right away? PS4, Line 83: // Whether or not the instance is healthy. If the instance file lives on a : // disk that has failed, this should return false. : bool healthy() const { : return health_status_.ok(); : } we usually don't have this extra getter (i.e. consider just doing pimf.healt_status().ok()) http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: PS4, Line 794: if (FindUuidByRoot(root, )) { : return FindUuidIndexByUuid(uuid, uuid_idx); : } : return false; there are no chances of races here, right? http://gerrit.cloudera.org:8080/#/c/7784/4/src/kudu/fs/data_dirs.h File src/kudu/fs/data_dirs.h: PS4, Line 366: Finds the UUID for the given canonicalized root directory name. nit, for consistency append the same suffix as the previous new method (returning false...) -- To view, visit http://gerrit.cloudera.org:8080/7784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib2a1c824526ed52a6b90ddfbc735cecc4c491118 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes