[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-05 Thread Will Berkeley (Code Review)
Will Berkeley has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..

KUDU-2290: Tool to re-create a tablet

This adds a tool `kudu tablet unsafe_replace_tablet` that can be
used to replace a tablet with another having the same partition
information. This tool is useful when a tablet has permanently
lost all replicas because it can repair the tablet's table.
Before, the table would have to be recreated to recover from a
permanently lost tablet.

Due to KUDU-2376, the new test in replace_tablet-itest fails. Since
this tool is meant to fix broken tables, I think it's still valuable
to add this tool in the meantime.

Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Reviewed-on: http://gerrit.cloudera.org:8080/9393
Reviewed-by: Dan Burkert 
Reviewed-by: Adar Dembo 
Tested-by: Will Berkeley 
---
M src/kudu/client/client-internal.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/master-stress-test.cc
A src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/master_service.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tablet.cc
14 files changed, 494 insertions(+), 9 deletions(-)

Approvals:
  Dan Burkert: Looks good to me, approved
  Adar Dembo: Looks good to me, approved
  Will Berkeley: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 12
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-05 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 11: Verified+1

IWYU failures are bogus (they break the build if they are applied).


--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 11
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 05 Apr 2018 21:01:29 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-05 Thread Will Berkeley (Code Review)
Will Berkeley has removed a vote on this change.

Change subject: KUDU-2290: Tool to re-create a tablet
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 11
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-05 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 11: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 11
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 05 Apr 2018 19:06:56 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-05 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9393/10/src/kudu/tools/tool_action_tablet.cc
File src/kudu/tools/tool_action_tablet.cc:

http://gerrit.cloudera.org:8080/#/c/9393/10/src/kudu/tools/tool_action_tablet.cc@549
PS10, Line 549:   cout << "Replaced tablet " << tablet_id
> If it's not too fiddly, it might be nice to print the 'Replaced tablet ${ta
Done



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 10
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 05 Apr 2018 18:59:00 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-05 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 11: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 11
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 05 Apr 2018 18:59:23 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-05 Thread Will Berkeley (Code Review)
Hello Tidy Bot, Dan Burkert, Kudu Jenkins, Andrew Wong, Adar Dembo, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9393

to look at the new patch set (#11).

Change subject: KUDU-2290: Tool to re-create a tablet
..

KUDU-2290: Tool to re-create a tablet

This adds a tool `kudu tablet unsafe_replace_tablet` that can be
used to replace a tablet with another having the same partition
information. This tool is useful when a tablet has permanently
lost all replicas because it can repair the tablet's table.
Before, the table would have to be recreated to recover from a
permanently lost tablet.

Due to KUDU-2376, the new test in replace_tablet-itest fails. Since
this tool is meant to fix broken tables, I think it's still valuable
to add this tool in the meantime.

Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
---
M src/kudu/client/client-internal.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/master-stress-test.cc
A src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/master_service.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tablet.cc
14 files changed, 494 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/9393/11
--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 11
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-05 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 10:

(1 comment)

LGTM except a small nit!

http://gerrit.cloudera.org:8080/#/c/9393/10/src/kudu/tools/tool_action_tablet.cc
File src/kudu/tools/tool_action_tablet.cc:

http://gerrit.cloudera.org:8080/#/c/9393/10/src/kudu/tools/tool_action_tablet.cc@549
PS10, Line 549:   cout << "Replaced tablet " << tablet_id
If it's not too fiddly, it might be nice to print the 'Replaced tablet 
${tablet_id} with tablet " portion to stderr and the new tablet ID to stdout.  
This will make it much easier to use this in a script in the future where the 
new tablet ID is needed.

cerr << "Replaced tablet " << tablet_id << " with tablet ";
cout << resp.replacement_tablet_id() << endl;



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 10
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 05 Apr 2018 17:18:50 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 10: Code-Review+1

I'd like Dan to take a look too.


--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 10
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 03 Apr 2018 21:56:32 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-03 Thread Will Berkeley (Code Review)
Hello Tidy Bot, Dan Burkert, Kudu Jenkins, Andrew Wong, Adar Dembo, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9393

to look at the new patch set (#10).

Change subject: KUDU-2290: Tool to re-create a tablet
..

KUDU-2290: Tool to re-create a tablet

This adds a tool `kudu tablet unsafe_replace_tablet` that can be
used to replace a tablet with another having the same partition
information. This tool is useful when a tablet has permanently
lost all replicas because it can repair the tablet's table.
Before, the table would have to be recreated to recover from a
permanently lost tablet.

Due to KUDU-2376, the new test in replace_tablet-itest fails. Since
this tool is meant to fix broken tables, I think it's still valuable
to add this tool in the meantime.

Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
---
M src/kudu/client/client-internal.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/master-stress-test.cc
A src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/master_service.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tablet.cc
14 files changed, 493 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/9393/10
--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 10
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-03 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9393/9/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9393/9/src/kudu/master/catalog_manager.cc@4364
PS9, Line 4364: new_tablet->metadata().ReadLock();
> Nit: use a new TabletMetadataLock for this.
Done


http://gerrit.cloudera.org:8080/#/c/9393/9/src/kudu/master/catalog_manager.cc@4371
PS9, Line 4371:   l_table.Commit();
> Actually, you don't need a TableMetadataLock at all; I misunderstood the co
Done



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 9
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 03 Apr 2018 21:31:32 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-03 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9393/9/src/kudu/integration-tests/replace_tablet-itest.cc
File src/kudu/integration-tests/replace_tablet-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9393/9/src/kudu/integration-tests/replace_tablet-itest.cc@90
PS9, Line 90: PROCESSOR
Nit: PROCESSORS


http://gerrit.cloudera.org:8080/#/c/9393/9/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9393/9/src/kudu/master/catalog_manager.cc@4364
PS9, Line 4364: new_tablet->metadata().ReadLock();
Nit: use a new TabletMetadataLock for this.


http://gerrit.cloudera.org:8080/#/c/9393/9/src/kudu/master/catalog_manager.cc@4371
PS9, Line 4371:   l_table.Commit();
Actually, you don't need a TableMetadataLock at all; I misunderstood the 
comment at the top of TableInfo::AddRemoveTablets: you need READ-level locks 
(or stronger) on the _tablets_, not the table. And unlike AlterTable, we're not 
making any metadata changes to the TableInfo (e.g. to its schema), so you can 
forgo its lock altogether.



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 9
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 03 Apr 2018 17:40:31 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-03 Thread Will Berkeley (Code Review)
Hello Tidy Bot, Dan Burkert, Kudu Jenkins, Andrew Wong, Adar Dembo, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9393

to look at the new patch set (#9).

Change subject: KUDU-2290: Tool to re-create a tablet
..

KUDU-2290: Tool to re-create a tablet

This adds a tool `kudu tablet unsafe_replace_tablet` that can be
used to replace a tablet with another having the same partition
information. This tool is useful when a tablet has permanently
lost all replicas because it can repair the tablet's table.
Before, the table would have to be recreated to recover from a
permanently lost tablet.

Due to KUDU-2376, the new test in replace_tablet-itest fails. Since
this tool is meant to fix broken tables, I think it's still valuable
to add this tool in the meantime.

Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
---
M src/kudu/client/client-internal.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/master-stress-test.cc
A src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/master_service.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tablet.cc
14 files changed, 496 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/9393/9
--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 9
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-03 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/CMakeLists.txt
File src/kudu/integration-tests/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/CMakeLists.txt@101
PS7, Line 101: ADD_KUDU_TEST(replace_tablet-itest)
> Sorry, I missed that the disabled test was the only test in the file.
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4376
PS7, Line 4376:   LOG(INFO) << "ReplaceTablet: tablet " << old_tablet->id()
  : << " deleted and replaced by tablet " << 
new_tablet->id();
  :   resp->set_replaceme
> 1. l_table doesn't actually need to be in WRITE mode; READ is sufficient. T
I took a look at alter table and mirrored that, since it seemed right to me and 
it handles a similar situation where a range partition is dropped and added in 
one alter.



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 8
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 03 Apr 2018 06:44:57 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-02 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/CMakeLists.txt
File src/kudu/integration-tests/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/CMakeLists.txt@101
PS7, Line 101: ADD_KUDU_TEST(replace_tablet-itest)
> B-but the test always fails and is disabled, because of KUDU-2376. Can I ad
Sorry, I missed that the disabled test was the only test in the file.

Add the TODO near the test itself so it'll be more obvious when it is reenabled.


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4376
PS7, Line 4376:   LOG(INFO) << "ReplaceTablet: tablet " << old_tablet->id()
  : << " deleted and replaced by tablet " << 
new_tablet->id();
  :   resp->set_replaceme
> Does the following order work?
1. l_table doesn't actually need to be in WRITE mode; READ is sufficient. That 
should simplify this a bit by eliminating step #4.
2. We should be able to follow what ProcessPendingAssignments does, right? It 
also replaces tablets. Looks like it first commits the tablets lock, then 
adds/removes the tablets to the table, then finally updates the tablet map.

In general I'm not sure there's a deadlock free way to do this without exposing 
a window where GetTableLocations and  GetTabletLocations aren't totally 
consistent with one another. From the looks of it, ProcessPendingAssignments is 
vulnerable to the window you're describing.  Maybe it's able to get away with 
it because the table is expected to be generally unusable during this time 
(it's just after a CreateTable after all). Though, maybe the same code path is 
followed during an ALTER TABLE ADD PARTITION too?



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 8
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 03 Apr 2018 04:09:11 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-02 Thread Will Berkeley (Code Review)
Hello Tidy Bot, Dan Burkert, Kudu Jenkins, Andrew Wong, Adar Dembo, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9393

to look at the new patch set (#8).

Change subject: KUDU-2290: Tool to re-create a tablet
..

KUDU-2290: Tool to re-create a tablet

This adds a tool `kudu tablet unsafe_replace_tablet` that can be
used to replace a tablet with another having the same partition
information. This tool is useful when a tablet has permanently
lost all replicas because it can repair the tablet's table.
Before, the table would have to be recreated to recover from a
permanently lost tablet.

Due to KUDU-2376, the new test in replace_tablet-itest fails. Since
this tool is meant to fix broken tables, I think it's still valuable
to add this tool in the meantime.

Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
---
M src/kudu/client/client-internal.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/master-stress-test.cc
A src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/master_service.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tablet.cc
14 files changed, 490 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/9393/8
--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 8
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-04-02 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 7:

(21 comments)

Are you comfortable merging this given the risk of KUDU-2376? The same 
segmentation fault will happen, and is captured by the disabled 
replace_tablet-itest, just like the test case I posted in the JIRA for alter 
table.

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/client/client-internal.cc@97
PS7, Line 97: using master::ReplaceTabletRequestPB;
> warning: using decl 'ReplaceTabletRequestPB' is unused [misc-unused-using-d
Ack


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/client/client-internal.cc@98
PS7, Line 98: using master::ReplaceTabletResponsePB;
> warning: using decl 'ReplaceTabletResponsePB' is unused [misc-unused-using-
Ack


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/CMakeLists.txt
File src/kudu/integration-tests/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/CMakeLists.txt@101
PS7, Line 101: ADD_KUDU_TEST(replace_tablet-itest)
> Could you follow the approach Todd took in commit 1c1d3ba to decide what va
B-but the test always fails and is disabled, because of KUDU-2376. Can I add a 
TODO here instead?


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/master-stress-test.cc
File src/kudu/integration-tests/master-stress-test.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/master-stress-test.cc@297
PS7, Line 297: // The tablet is gone because it was already replaced or 
deleted.
> Should we still wait in this case?
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/master-stress-test.cc@468
PS7, Line 468:   LOG(INFO) << "Tables created: " << num_tables_created_.Load();
> What's the distribution of operations before and after your change? Curious
The numbers vary a lot, but it's roughly the same before and after. I see (with 
a debug build) ~70 tables created, ~45 deleted or altered, and about 45 tablets 
replaced. There's 5 master restarts or so.


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/replace_tablet-itest.cc
File src/kudu/integration-tests/replace_tablet-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/replace_tablet-itest.cc@63
PS7, Line 63: CHECK(tablet_id);
> Not needed?
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/replace_tablet-itest.cc@76
PS7, Line 76: CHECK(proxy);
> Not needed?
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4324
PS7, Line 4324: new_tablet = scoped_refptr(new 
TabletInfo(table, GenerateId()));
> How about:
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4346
PS7, Line 4346:   // Ensure we abort the mutations if persisting the changes 
fails.
> Don't the l_table and l_tablets destructors abort automatically?
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4358
PS7, Line 4358:   actions.tablets_to_delete.push_back(old_tablet);
> Hmm, this should be in tablets_to_update, no? We never actually delete entr
Done


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4376
PS7, Line 4376: table->AddRemoveTablets({new_tablet}, {old_tablet});
  : l_tablets.Commit();
  : l_table.Commit();
> Are you sure these three should happen with the lock_ held? They acquire lo
Does the following order work?

1. table->AddRemoveTablets
2. l_tablets.Commit()
l(lock_) {
3. Insert to tablet map
}
4. l_table.Commit()

We need tablet + table locks when adding and removing from the table, and we 
want to commit the tablet state before  publishing the new tablet to the tablet 
map so uninitialized state won't be visible. However, we need to add the tablet 
to the tablet map before committing the changes to the table, since if we 
commit table changes before altering the tablet map there's a window where a 
GetTableLocations would see the new tablet as part of the table but no info 
about the new tablet would be available from GetTabletLocations. However, I'm 
worried about the interval between 2 + 4 where a GetTableLocations would see 
the old tablet as deleted. I wanted to ensure that GetTableLocations sees 
either a non-deleted old tablet or a new tablet.


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4385
PS7, Line 4385: background_tasks_->Wake();
> This is for the newly created tablet, right? Can you update the comment jus
Done



[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-29 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 7:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/CMakeLists.txt
File src/kudu/integration-tests/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/CMakeLists.txt@101
PS7, Line 101: ADD_KUDU_TEST(replace_tablet-itest)
Could you follow the approach Todd took in commit 1c1d3ba to decide what value 
of PROCESSORS to assign to this new itest?


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/master-stress-test.cc
File src/kudu/integration-tests/master-stress-test.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/master-stress-test.cc@297
PS7, Line 297: // The tablet is gone because it was already replaced or 
deleted.
Should we still wait in this case?


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/master-stress-test.cc@468
PS7, Line 468:   LOG(INFO) << "Tables created: " << num_tables_created_.Load();
What's the distribution of operations before and after your change? Curious how 
tablet replacement has affected things.


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/replace_tablet-itest.cc
File src/kudu/integration-tests/replace_tablet-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/replace_tablet-itest.cc@63
PS7, Line 63: CHECK(tablet_id);
Not needed?


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/integration-tests/replace_tablet-itest.cc@76
PS7, Line 76: CHECK(proxy);
Not needed?


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4324
PS7, Line 4324: new_tablet = scoped_refptr(new 
TabletInfo(table, GenerateId()));
How about:

  new_tablet.reset(new TabletInfo(...));


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4346
PS7, Line 4346:   // Ensure we abort the mutations if persisting the changes 
fails.
Don't the l_table and l_tablets destructors abort automatically?


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4358
PS7, Line 4358:   actions.tablets_to_delete.push_back(old_tablet);
Hmm, this should be in tablets_to_update, no? We never actually delete entries 
from the catalog table.


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4376
PS7, Line 4376: table->AddRemoveTablets({new_tablet}, {old_tablet});
  : l_tablets.Commit();
  : l_table.Commit();
Are you sure these three should happen with the lock_ held? They acquire locks 
of their own; Commit() may also sleep waiting for locks, which we shouldn't do 
with a spinlock held.

IIRC other CatalogManager operations only hold lock_ to update global maps, 
then do the other steps without it.


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/master/catalog_manager.cc@4385
PS7, Line 4385: background_tasks_->Wake();
This is for the newly created tablet, right? Can you update the comment just 
above?


http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/tools/tool_action_common.h
File src/kudu/tools/tool_action_common.h:

http://gerrit.cloudera.org:8080/#/c/9393/7/src/kudu/tools/tool_action_common.h@155
PS7, Line 155:LeaderMasterProxy() = default;
Nit: indentation.



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 7
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Fri, 30 Mar 2018 04:37:28 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-29 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 7:

> Build Failed

Unrelated, pre-existing race. See KUDU-2058.


--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 7
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 29 Mar 2018 21:45:28 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-29 Thread Will Berkeley (Code Review)
Hello Tidy Bot, Dan Burkert, Kudu Jenkins, Andrew Wong, Adar Dembo, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9393

to look at the new patch set (#7).

Change subject: KUDU-2290: Tool to re-create a tablet
..

KUDU-2290: Tool to re-create a tablet

This adds a tool `kudu tablet unsafe_replace_tablet` that can be
used to replace a tablet with another having the same partition
information. This tool is useful when a tablet has permanently
lost all replicas because it can repair the tablet's table.
Before, the table would have to be recreated to recover from a
permanently lost tablet.

Due to KUDU-2376, the new test in replace_tablet-itest fails. Since
this tool is meant to fix broken tables, I think it's still valuable
to add this tool in the meantime.

Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
---
M src/kudu/client/client-internal.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/master-stress-test.cc
A src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/master_service.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tablet.cc
14 files changed, 506 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/9393/7
--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 7
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-29 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9393/6/src/kudu/integration-tests/replace_tablet-itest.cc
File src/kudu/integration-tests/replace_tablet-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9393/6/src/kudu/integration-tests/replace_tablet-itest.cc@55
PS6, Line 55: ExternalMiniClusterITestBase(),
> warning: initializer for base class 'kudu::ExternalMiniClusterITestBase' is
Done


http://gerrit.cloudera.org:8080/#/c/9393/6/src/kudu/integration-tests/replace_tablet-itest.cc@121
PS6, Line 121:   while (workload.rows_inserted() < 2 * kNumRows) {
> warning: either cast from 'int' to 'long' is ineffective, or there is loss
Done


http://gerrit.cloudera.org:8080/#/c/9393/6/src/kudu/tools/tool_action_common.cc
File src/kudu/tools/tool_action_common.cc:

http://gerrit.cloudera.org:8080/#/c/9393/6/src/kudu/tools/tool_action_common.cc@554
PS6, Line 554: LeaderMasterProxy::LeaderMasterProxy(const 
client::sp::shared_ptr& client) :
> warning: pass by value and use std::move [modernize-pass-by-value]
Done



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 6
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 29 Mar 2018 16:38:34 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-29 Thread Will Berkeley (Code Review)
Hello Tidy Bot, Dan Burkert, Kudu Jenkins, Andrew Wong, Adar Dembo, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9393

to look at the new patch set (#6).

Change subject: KUDU-2290: Tool to re-create a tablet
..

KUDU-2290: Tool to re-create a tablet

This adds a tool `kudu tablet unsafe_replace_tablet` that can be
used to replace a tablet with another having the same partition
information. This tool is useful when a tablet has permanently
lost all replicas because it can repair the tablet's table.
Before, the table would have to be recreated to recover from a
permanently lost tablet.

Due to KUDU-2376, the new test in replace_tablet-itest fails. Since
this tool is meant to fix broken tables, I think it's still valuable
to add this tool in the meantime.

Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
---
M src/kudu/client/client-internal.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/master-stress-test.cc
A src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/master_service.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tablet.cc
14 files changed, 502 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/9393/6
--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 6
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-13 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/client/meta_cache.cc
File src/kudu/client/meta_cache.cc:

http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/client/meta_cache.cc@912
PS5, Line 912: // Partition should not have changed.
> What is a partition except a pair of bounds, in other words, what does your
I've developed a bad habit of using "partition" and "tablet" interchangeably. 
When a tablet is replaced, its partition (or partition boundaries) stay the 
same, but the tablet ID changes. That's what I'm getting at here; that the ID 
has changed.


http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/master/master.proto@715
PS5, Line 715:
> Keep it optional, please.  We should never introduce new required fields.
Just to close this out, Dan (and Todd) reminded me that for scalars, optional 
provides a reasonable default value, one that would have been a valid value to 
begin with. For example, the default for "bytes" in C++ is the empty string, 
which would have been valid to send had the record been required too. Meaning, 
if the empty string is invalid input, it would need to be sanitized regardless 
of whether the field was optional or required.

The default value for embedded messages is null and thus is an exception to 
this rule (i.e. you can't dereference 'foo' without first validating it via 
'has_foo()'), but those are less prevalent than scalars, so the burden isn't as 
onerous.



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 5
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Wed, 14 Mar 2018 00:06:59 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-13 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 5:

(3 comments)

I have reservations about how this patch introduces a new 'quarantined' table.  
In no other circumstances do we pollute the table namespace with auto-created 
names, and I don't think this is a good reason to start. This will become very 
painful down the line when begin to integrate with thirdparty metadata systems 
like Hive and Sentry which have no such concept as a quarantined table.

I think it would be better to simply swap in the new tablet to the existing 
table, and provide the option to have the master not delete the tablet, if the 
data is still relevant.  The data could then still be accessed and operated on 
via the CLI tools which take the tablet ID directly.  Even then there are still 
authorization issues that will need to be figured out eventually.

http://gerrit.cloudera.org:8080/#/c/9393/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9393/5//COMMIT_MSG@26
PS5, Line 26: eventually this will stop
> Maybe be more specific; this behavior ends when the cached entries' TTL exp
There may be some interesting fallout from this. In particular, I believe this 
is the first time in which the meta-cache TTL is significant for _positive_ 
cache locations .  Right now I think the TTL only has observable effects for 
non-covered ranges.

There's nothing actionable, but it's maybe something to think about and 
document.


http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/client/meta_cache.cc
File src/kudu/client/meta_cache.cc:

http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/client/meta_cache.cc@912
PS5, Line 912: // Partition should not have changed.
> Could you clarify this a bit: the partition's _boundaries_ should have not
What is a partition except a pair of bounds, in other words, what does your 
suggestion clarify?


http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/master/master.proto@715
PS5, Line 715:
> I consider myself to be a protobuf luddite, but I don't understand why 'opt
Keep it optional, please.  We should never introduce new required fields.  This 
is a great example, since you could later go back and add a variant of tablet 
replacement which just deletes the old version, and doesn't move it to a 
quarantined state.  If we lived in a perfect world with perfect foresight we 
could use required, but we don't and we shouldn't.



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 5
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 13 Mar 2018 23:45:30 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-13 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 5:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/9393/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9393/5//COMMIT_MSG@11
PS5, Line 11: is
Nit: omit this


http://gerrit.cloudera.org:8080/#/c/9393/5//COMMIT_MSG@19
PS5, Line 19: 
Given that tablet_id is globally unique and table_name already identifies the 
table for humans, what value does the inclusion of table_id add?


http://gerrit.cloudera.org:8080/#/c/9393/5//COMMIT_MSG@24
PS5, Line 24: its
Nit: it's


http://gerrit.cloudera.org:8080/#/c/9393/5//COMMIT_MSG@26
PS5, Line 26: eventually this will stop
Maybe be more specific; this behavior ends when the cached entries' TTL expires?


http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/client/meta_cache.cc
File src/kudu/client/meta_cache.cc:

http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/client/meta_cache.cc@912
PS5, Line 912: // Partition should not have changed.
Could you clarify this a bit: the partition's _boundaries_ should have not 
changed,


http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/client/meta_cache.cc@919
PS5, Line 919: // In this case, 'tablets_by_key' will be empty but a 
RemoteTablet
 : // will exist in the cache.
I don't understand this case. AFAICT tablet replacement is "atomic", which 
means the client sees one of two states for a replaced tablet:
1. Tablet with ID foo from key x to y.
2. Tablet with ID bar from key x to y.

Since the boundaries of the tablet haven't changed, doesn't this mean 
tablet_lower_bound is a valid way to look up the cached entry regardless of 
which state the client is exposed to?

Looking at the code some more, it seems that following tablet replacement, 
'remote' is going to be null so we'll skip all of this, erase the tablet's 
partition from tablets_by_key (L937), set up a new RemoteTablet for the 
replacement, and insert the tablet's partition again (L950). At no point is the 
new tablet exposed in tablets_by_id_ without any corresponding partitions in 
tablets_by_key. I must be missing something here...


http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/master/catalog_manager.h@581
PS5, Line 581:   Status ReplaceTablet(const std::string& tablet_id, 
master::ReplaceTabletResponsePB* resp);
It'd be nice to have testing of this new method outside of the CLI tests. Both 
in isolation as well as in a "stress" environment (i.e. ongoing concurrent data 
operations to the table and/or replaced tablet). It'd also be nice to test 
concurrent metadata operations: what happens if I delete a table and replace 
one of its tablets at the same time? What happens if I alter it concurrently? 
Etc.


http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/master/catalog_manager.cc@4302
PS5, Line 4302:   // To be safe, we'll take the global catalog manager lock for 
the rest of this method.
While I appreciate the motivation to block out everything else, I don't think 
this is a good idea. First, the global catalog manager lock is a spinlock, 
which means anyone waiting on this lock will be spinning, and they'll be 
waiting for a long time because in ReplaceTablet the lock is held during the 
catalog write, which involves network and disk IO. Second, the catalog manager 
lock is never held while acquiring other catalog manager locks; breaking that 
invariant raises the possibility of deadlocks elsewhere. The only exceptions 
I'm aware of to this rule are during catalog manager initialization, at which 
point RPCs are largely rejected anyway.

Can we treat this like any other DDL method and acquire the right locks at the 
right time? The background task does tablet replacement in 
CatalogManager::ProcessPendingAssignments; can that be used as a template for 
how to do it safely here?


http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/master/catalog_manager.cc@4303
PS5, Line 4303: lock
Style nit: Use l (or l_foo) for lock guards. It especially helps in this case, 
since lock and lock_ are so similar.


http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/master/master.proto@707
PS5, Line 707: recovered
Nit: "manually recovered", to make it clear that any actual recovery is out of 
scope of Kudu's normal operations?


http://gerrit.cloudera.org:8080/#/c/9393/5/src/kudu/master/master.proto@715
PS5, Line 715:
I consider myself to be a protobuf luddite, but I don't understand why 
'optional' is appropriate for the below fields. There's the 

[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-13 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9393/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9393/4//COMMIT_MSG@23
PS4, Line 23: Note that, if the
: tablet recovers quickly enough after being replaced,
> This procedure would amount to scanning the new table and dumping the rows
Did you mean to comment on the previous sentence? If so, the answer is yup.


http://gerrit.cloudera.org:8080/#/c/9393/4/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/9393/4/src/kudu/tools/kudu-tool-test.cc@2405
PS4, Line 2405: replaced_tablet_id
> nit: could we use "old"/"new" instead of "replaced" here? I'm conflicted be
Done


http://gerrit.cloudera.org:8080/#/c/9393/4/src/kudu/tools/kudu-tool-test.cc@2445
PS4, Line 2445: replica
> nit: tablet
Done


http://gerrit.cloudera.org:8080/#/c/9393/4/src/kudu/tools/kudu-tool-test.cc@2445
PS4, Line 2445: plus one new one.
> This is referring to the new tablet count, not the number of replicas, righ
Done



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 13 Mar 2018 06:23:22 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-13 Thread Will Berkeley (Code Review)
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9393

to look at the new patch set (#5).

Change subject: KUDU-2290: Tool to re-create a tablet
..

KUDU-2290: Tool to re-create a tablet

This adds a tool `kudu tablet unsafe_replace_tablet` that can be
used to replace a tablet with another having the same partition
information. This tool is useful when a tablet is has permanently
lost all replicas because it can repair the tablet's table.
Before, the table would have to be recreated to recover from a
permanently lost tablet.

The original tablet is moved to a new, specially-named table,
named like

replaced___

so that bad replicas are "quarantined" so they can be investigated,
and if a replica of the original tablet can be recovered, its data
can be scanned and recovered relatively easily. Note that, if the
tablet recovers quickly enough after being replaced, its possible
for clients with cache entries from before the replace to read
the rows from the replaced tablet, but eventually this will stop.

Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
---
M src/kudu/client/client-internal.cc
M src/kudu/client/meta_cache.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/master_service.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_tablet.cc
11 files changed, 366 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/9393/5
--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 5
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-12 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9393/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9393/4//COMMIT_MSG@23
PS4, Line 23: Note that, if the
: tablet recovers quickly enough after being replaced,
This procedure would amount to scanning the new table and dumping the rows into 
the old one, right?


http://gerrit.cloudera.org:8080/#/c/9393/4/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/9393/4/src/kudu/tools/kudu-tool-test.cc@2405
PS4, Line 2405: replaced_tablet_id
nit: could we use "old"/"new" instead of "replaced" here? I'm conflicted 
because I think it's a clearer than "replaced"/"replacement"/"to-replace", even 
if it's not as accurate.


http://gerrit.cloudera.org:8080/#/c/9393/4/src/kudu/tools/kudu-tool-test.cc@2445
PS4, Line 2445: replica
nit: tablet


http://gerrit.cloudera.org:8080/#/c/9393/4/src/kudu/tools/kudu-tool-test.cc@2445
PS4, Line 2445: plus one new one.
This is referring to the new tablet count, not the number of replicas, right?



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Mon, 12 Mar 2018 19:11:46 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-12 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 4:

> Build Failed
 >
 > http://jenkins.kudu.apache.org/job/kudu-gerrit/12378/ : FAILURE

Another instance of https://issues.apache.org/jira/browse/KUDU-1736


--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Mon, 12 Mar 2018 19:05:05 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-12 Thread Will Berkeley (Code Review)
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9393

to look at the new patch set (#4).

Change subject: KUDU-2290: Tool to re-create a tablet
..

KUDU-2290: Tool to re-create a tablet

This adds a tool `kudu tablet unsafe_replace_tablet` that can be
used to replace a tablet with another having the same partition
information. This tool is useful when a tablet is has permanently
lost all replicas because it can repair the tablet's table.
Before, the table would have to be recreated to recover from a
permanently lost tablet.

The original tablet is moved to a new, specially-named table,
named like

replaced___

so that bad replicas are "quarantined" so they can be investigated,
and if a replica of the original tablet can be recovered, its data
can be scanned and recovered relatively easily. Note that, if the
tablet recovers quickly enough after being replaced, its possible
for clients with cache entries from before the replace to read
the rows from the replaced tablet, but eventually this will stop.

Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
---
M src/kudu/client/client-internal.cc
M src/kudu/client/meta_cache.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/master_service.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_tablet.cc
11 files changed, 367 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/9393/4
--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-11 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 2:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/9393/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9393/2//COMMIT_MSG@9
PS2, Line 9: replace_tablet
> nit: how about unsafe_replace_tablet, given we're guaranteed "data loss"
Done


http://gerrit.cloudera.org:8080/#/c/9393/2//COMMIT_MSG@13
PS2, Line 13: Before, the table would
: have to be recreated to recover from a permanently lost tablet
> Maybe we should add a check somewhere that we don't actually have a live ta
I looked into this a bit and the master doesn't have that much information 
about how healthy a tablet is, and isn't the authority on it, especially if the 
tablet is unavailable but not unrecoverable. So I think having the scary 
"unsafe" in front of the name suffices here; adding a --force flag would mean 
in practice we'd always have to specify --force and it wouldn't be meaningful.


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/client/client-internal.cc@97
PS2, Line 97: using master::ReplaceTabletRequestPB;
> warning: using decl 'ReplaceTabletRequestPB' is unused [misc-unused-using-d
Done


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/client/client-internal.cc@98
PS2, Line 98: using master::ReplaceTabletResponsePB;
> warning: using decl 'ReplaceTabletResponsePB' is unused [misc-unused-using-
Done


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/catalog_manager.h@577
PS2, Line 577: , causing all of its data
 :   // to be permanently lost.
> Would it be feasible to turn the replaced tablet into a new single-tablet t
Done, but it involved making an exception in the MetaCache lookup code because 
having a tablet switch tables violated some assumptions.


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/catalog_manager.cc@4249
PS2, Line 4249:   
replaced_tablet->mutable_metadata()->mutable_dirty()->set_state(SysTabletsEntryPB::DELETED,
> if we make it DELETED it will also get removed off of the tablet servers. w
Done


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/master.proto@712
PS2, Line 712: there an e
> nit: there is an
Done


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/master.proto@792
PS2, Line 792: option (kudu.rpc.authz_method) = "AuthorizeService";
> we don't want AuthorizeSuperUser here?
Done


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/master_service.cc
File src/kudu/master/master_service.cc:

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/master_service.cc@517
PS2, Line 517:   LOG(INFO) << "ReplaceTablet: received request to replace 
tablet " << req->tablet_id();
> worth logging the requestor info too
Done


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/tools/kudu-tool-test.cc@2371
PS2, Line 2371:   const int kNumTservers = 3;
  :   const int kNumTablets = 3;
> Maybe test with multiple masters too? This would still work, right?
It's a bit of work to change ToolTest to support that and there's nothing 
really different about multimaster, so I hope you don't mind if I pass on it. 
(Also my manual testing was with multimaster).


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/tools/kudu-tool-test.cc@2392
PS2, Line 2392:   // Replace the tablet. This isn't the use case for the tool, 
but it should work.
> Should it? Maybe we should safeguard against users shooting themselves in t
The master can think a tablet is healthy when it's busted, as we've seen, and 
it's not the authority on the health of a tablet, so I think we ought to trust 
the users to know what they are doing or be scared of the "unsafe" at the 
beginning of the action name.


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/tools/kudu-tool-test.cc@2490
PS2, Line 2490: workload.StopAndJoin();
> Another thought: maybe use the cluster verifier after this to make sure thi
Done



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 

[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-03-11 Thread Will Berkeley (Code Review)
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9393

to look at the new patch set (#3).

Change subject: KUDU-2290: Tool to re-create a tablet
..

KUDU-2290: Tool to re-create a tablet

This adds a tool `kudu tablet unsafe_replace_tablet` that can be
used to replace a tablet with another having the same partition
information. This tool is useful when a tablet is has permanently
lost all replicas because it can repair the tablet's table.
Before, the table would have to be recreated to recover from a
permanently lost tablet.

The original tablet is moved to a new, specially-named table,
named like

replaced___

so that bad replicas are "quarantined" so they can be investigated,
and if a replica of the original tablet can be recovered, its data
can be scanned and recovered relatively easily. Note that, if the
tablet recovers quickly enough after being replaced, its possible
for clients with cache entries from before the replace to read
the rows from the replaced tablet, but eventually this will stop.

Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
---
M src/kudu/client/client-internal.cc
M src/kudu/client/meta_cache.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/master_service.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_tablet.cc
11 files changed, 367 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/9393/3
--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-02-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/catalog_manager.h@577
PS2, Line 577: , causing all of its data
 :   // to be permanently lost.
Would it be feasible to turn the replaced tablet into a new single-tablet 
table, so that if you eventually were able to repair it, you could at least get 
the data back out of it?


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/catalog_manager.cc@4249
PS2, Line 4249:   
replaced_tablet->mutable_metadata()->mutable_dirty()->set_state(SysTabletsEntryPB::DELETED,
if we make it DELETED it will also get removed off of the tablet servers. we 
might want to keep it there for some investigation or attempts to recover the 
data. Is there a way we can accomplish that?


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/master.proto@792
PS2, Line 792: option (kudu.rpc.authz_method) = "AuthorizeService";
we don't want AuthorizeSuperUser here?


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/master_service.cc
File src/kudu/master/master_service.cc:

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/master_service.cc@517
PS2, Line 517:   LOG(INFO) << "ReplaceTablet: received request to replace 
tablet " << req->tablet_id();
worth logging the requestor info too



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Fri, 23 Feb 2018 23:19:10 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-02-23 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/tools/kudu-tool-test.cc@2490
PS2, Line 2490: workload.StopAndJoin();
Another thought: maybe use the cluster verifier after this to make sure things 
look spick and span. I'd imagine ksck after this point would be immediately 
healthy, right?



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Fri, 23 Feb 2018 17:43:29 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-02-22 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 2:

(5 comments)

Mostly some nits and high-level points

http://gerrit.cloudera.org:8080/#/c/9393/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9393/2//COMMIT_MSG@9
PS2, Line 9: replace_tablet
nit: how about unsafe_replace_tablet, given we're guaranteed "data loss"


http://gerrit.cloudera.org:8080/#/c/9393/2//COMMIT_MSG@13
PS2, Line 13: Before, the table would
: have to be recreated to recover from a permanently lost tablet
Maybe we should add a check somewhere that we don't actually have a live tablet 
somewhere. And if we do (or think we do), block the run with a `--force` flag 
or somesuch.


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/master/master.proto@712
PS2, Line 712: there an e
nit: there is an


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/tools/kudu-tool-test.cc@2371
PS2, Line 2371:   const int kNumTservers = 3;
  :   const int kNumTablets = 3;
Maybe test with multiple masters too? This would still work, right?


http://gerrit.cloudera.org:8080/#/c/9393/2/src/kudu/tools/kudu-tool-test.cc@2392
PS2, Line 2392:   // Replace the tablet. This isn't the use case for the tool, 
but it should work.
Should it? Maybe we should safeguard against users shooting themselves in the 
foot.



--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Fri, 23 Feb 2018 00:25:31 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-02-22 Thread Will Berkeley (Code Review)
Hello Tidy Bot, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9393

to look at the new patch set (#2).

Change subject: KUDU-2290: Tool to re-create a tablet
..

KUDU-2290: Tool to re-create a tablet

This adds a tool `kudu tablet replace_tablet` that can be used
to replace a tablet with another having the same partition
information. Any data in the replaced tablet is lost. This tool
is useful when a tablet is has permanently lost all replicas
because it can repair the tablet's table. Before, the table would
have to be recreated to recover from a permanently lost tablet.

Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
---
M src/kudu/client/client-internal.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/master_service.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_tablet.cc
10 files changed, 318 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/9393/2
--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-02-22 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9393 )

Change subject: KUDU-2290: Tool to re-create a tablet
..


Patch Set 1:

I think tidy bot is wrong about the using decls.


--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 22 Feb 2018 19:32:31 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2290: Tool to re-create a tablet

2018-02-22 Thread Will Berkeley (Code Review)
Will Berkeley has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9393


Change subject: KUDU-2290: Tool to re-create a tablet
..

KUDU-2290: Tool to re-create a tablet

This adds a tool `kudu tablet replace_tablet` that can be used
to replace a tablet with another having the same partition
information. Any data in the replaced tablet is lost. This tool
is useful when a tablet is has permanently lost all replicas
because it can repair the tablet's table. Before, the table would
have to be recreated to recover from a permanently lost tablet.

Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
---
M src/kudu/client/client-internal.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/master_service.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_tablet.cc
10 files changed, 312 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/9393/1
--
To view, visit http://gerrit.cloudera.org:8080/9393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifbefbde68e3ca724f04efe0426a3906e5c33ed3c
Gerrit-Change-Number: 9393
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley