[jira] [Commented] (KUDU-1618) Add local_replica tool to delete a replica
[ https://issues.apache.org/jira/browse/KUDU-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15623066#comment-15623066 ]

Dinesh Bhat commented on KUDU-1618:
---

Cool, thanks Todd for these ideas, I will spin a patch for that independent of this JIRA.

> Add local_replica tool to delete a replica
> --
>
> Key: KUDU-1618
> URL: https://issues.apache.org/jira/browse/KUDU-1618
> Project: Kudu
> Issue Type: Improvement
> Components: ops-tooling
> Affects Versions: 1.0.0
> Reporter: Todd Lipcon
> Assignee: Dinesh Bhat
>
> Occasionally we've hit cases where a tablet is corrupt in such a way that the
> tserver fails to start or crashes soon after starting. Typically we'd prefer
> the tablet just get marked FAILED but in the worst case it causes the whole
> tserver to fail.
> For these cases we should add a 'local_replica' subtool to fully remove a
> local tablet. Related, it might be useful to have a 'local_replica archive'
> which would create a tarball from the data in this tablet for later
> examination by developers.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KUDU-1618) Add local_replica tool to delete a replica
[ https://issues.apache.org/jira/browse/KUDU-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15606498#comment-15606498 ]

Todd Lipcon commented on KUDU-1618:
---

If I recall correctly, ksck does call ListTablets on each of the tablet servers, in which case it could notice tablets that are on tservers that "shouldn't be".
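The check suggested in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not how ksck is implemented: the function name `find_extra_replicas` and both input dictionaries are hypothetical stand-ins for what each tserver reports and for the committed raft config of each tablet. The sample IDs are the ones that appear in the logs earlier in this JIRA, and the sketch assumes that only live (non-tombstoned) replicas are included in the reported sets.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical inputs, not a real Kudu API:
#   reported: tserver uuid -> tablet ids that the tserver says it hosts
#   configs:  tablet id    -> tserver uuids in the committed raft config


def find_extra_replicas(
    reported: Dict[str, Set[str]],
    configs: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Return (tserver uuid, tablet id) pairs where a tserver hosts a
    replica of a tablet whose raft config does not list that tserver."""
    extras = []
    for ts_uuid in sorted(reported):
        for tablet_id in sorted(reported[ts_uuid]):
            if ts_uuid not in configs.get(tablet_id, set()):
                extras.append((ts_uuid, tablet_id))
    return extras


# Sample data built from the IDs in the logs on this JIRA.
TABLET = "048c7d202da3469eb1b1973df9510007"
SAMPLE_CONFIGS = {
    TABLET: {
        "9acfc108d9b446c1be783b6d6e7b49ef",
        "b11d2af1457b4542808407b4d4d1bd29",
        "19acc272821d425582d3dfb9ed2ab7cd",
    }
}
SAMPLE_REPORTED = {
    "19acc272821d425582d3dfb9ed2ab7cd": {TABLET},
    "bb2517bc5f2b4980bb07c06019b5a8e9": {TABLET},  # copied-in replica
}

if __name__ == "__main__":
    for ts_uuid, tablet_id in find_extra_replicas(SAMPLE_REPORTED, SAMPLE_CONFIGS):
        print(f"extra replica of {tablet_id} on {ts_uuid}")
```

With the sample data, the only pair flagged is the replica on bb2517bc5f2b4980bb07c06019b5a8e9, which matches the situation described in this JIRA.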
[jira] [Commented] (KUDU-1618) Add local_replica tool to delete a replica
[ https://issues.apache.org/jira/browse/KUDU-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15606212#comment-15606212 ]

Dinesh Bhat commented on KUDU-1618:
---

Thanks [~tlipcon], I agree with your points above that this is not a bug. I confirmed that ksck currently doesn't know about the spurious replica that results from the tool's action. I wonder whether it's even possible to show this info via ksck, because I guess these reports are not sent to the master?
[jira] [Commented] (KUDU-1618) Add local_replica tool to delete a replica
[ https://issues.apache.org/jira/browse/KUDU-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15577292#comment-15577292 ]

Todd Lipcon commented on KUDU-1618:
---

bq. Todd Lipcon thanks for the quick reply. By 'shouldn't have a replica' in the above comment, you meant that the current tablet server, where we are trying to bring up the replica, is no longer part of the raft config for that tablet, right? It has other tservers as replicas at this point. That makes sense. I believe the tserver keeps trying until a future change_config brings this tserver back in as a replica for that tablet.

Right, when you copied the replica, the copy included the new configuration, and this tserver is not a part of that configuration. So, it knows that it shouldn't try to get the other nodes to vote for it.

It would be reasonable to say that we should detect this scenario and mark the tablet as 'failed', but it's actually somewhat useful occasionally -- e.g. I've used this before to copy a tablet from a running cluster onto my laptop so I could then use tools like 'dump_tablet' against it locally. Given that I don't think you can get into this state without using explicit tablet copy repair tools, I don't think it should really be considered a bug.

bq. One follow-up question: what state should the replica be in after step 6? I see it in RUNNING state, which was slightly confusing, because this replica isn't an active replica at this point.

The tablet's state is referring more to the data layer. It's up and running, it has replayed its log, it has valid data, etc. So it's RUNNING even though it's not actually an active part of any raft configuration.

If you run ksck on this cluster, does ksck report the "extra" replica anywhere? That might be a useful thing to do so we can detect if this ever happens in real life.
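To make the distinction in this comment concrete, here is a minimal sketch of how membership in the raft config decides whether a replica may start an election, independently of the tablet's data state. It is a simplified model, not Kudu's implementation: the `Peer` class and the helper names are hypothetical, while the VOTER and NON_PARTICIPANT labels and the UUIDs are taken from the log output on this JIRA.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Peer:
    """One entry of a raft config (hypothetical simplified model)."""
    permanent_uuid: str
    member_type: str  # e.g. "VOTER"


def raft_role(local_uuid: str, peers: List[Peer]) -> str:
    """Return the local node's member type in the config, or
    NON_PARTICIPANT when the node is not listed at all."""
    for peer in peers:
        if peer.permanent_uuid == local_uuid:
            return peer.member_type
    return "NON_PARTICIPANT"


def may_start_election(local_uuid: str, peers: List[Peer]) -> bool:
    """Only a voter in the committed config should ask for votes."""
    return raft_role(local_uuid, peers) == "VOTER"


# The committed config from the logs on this JIRA (opid_index 5).
COMMITTED_PEERS = [
    Peer("9acfc108d9b446c1be783b6d6e7b49ef", "VOTER"),
    Peer("b11d2af1457b4542808407b4d4d1bd29", "VOTER"),
    Peer("19acc272821d425582d3dfb9ed2ab7cd", "VOTER"),
]

if __name__ == "__main__":
    copied_to = "bb2517bc5f2b4980bb07c06019b5a8e9"
    # The data state (RUNNING) is tracked separately from this role.
    print(raft_role(copied_to, COMMITTED_PEERS))
    print(may_start_election(copied_to, COMMITTED_PEERS))
```

In this model the copied replica can hold valid, replayed data (RUNNING) while its raft role is still NON_PARTICIPANT, because the two properties are derived from different inputs.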
[jira] [Commented] (KUDU-1618) Add local_replica tool to delete a replica
[ https://issues.apache.org/jira/browse/KUDU-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15576389#comment-15576389 ]

Dinesh Bhat commented on KUDU-1618:
---

[~tlipcon] thanks for the quick reply. By 'shouldn't have a replica' in the above comment, you meant that the current tablet server, where we are trying to bring up the replica, is no longer part of the raft config for that tablet, right? It has other tservers as replicas at this point. That makes sense. I believe the tserver keeps trying until a future change_config brings this tserver back in as a replica for that tablet.

One follow-up question: what state should the replica be in after step 6? I see it in RUNNING state, which was slightly confusing, because this replica isn't an active replica at this point.
[jira] [Commented] (KUDU-1618) Add local_replica tool to delete a replica
[ https://issues.apache.org/jira/browse/KUDU-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15576300#comment-15576300 ]

Todd Lipcon commented on KUDU-1618:
---

This seems like expected behavior to me. You created a replica on a node that was removed from the raft config, so when it starts up, it's confused because the metadata says it shouldn't have a replica.
[jira] [Commented] (KUDU-1618) Add local_replica tool to delete a replica
[ https://issues.apache.org/jira/browse/KUDU-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15576179#comment-15576179 ]

Dinesh Bhat commented on KUDU-1618:
---

Although the issue above is not exactly related to the 'local_replica delete' tool we are writing, it was observed while testing the usability of this tool, hence I am using this existing JIRA to investigate these observations.
[jira] [Commented] (KUDU-1618) Add local_replica tool to delete a replica
[ https://issues.apache.org/jira/browse/KUDU-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15576176#comment-15576176 ]

Dinesh Bhat commented on KUDU-1618:
---

I was trying to repro an issue where I was not able to do a remote tablet copy onto a local_replica if the tablet was DELETE_TOMBSTONED (but has its metadata file present). However, along with the issue reproduction, I saw one state of the replica which was confusing. Here are the steps I executed:

1. Bring up a cluster with 1 master and 3 tablet servers hosting 3 tablets; each tablet had 3 replicas.
2. There was a standby server which was added later.
3. KILL one tserver; after 5 mins all the replicas on that tserver fail over to the new standby.
4. Use 'local_replica copy_from_remote' to copy one tablet replica before bringing up the killed tserver; the command fails:
{noformat}
I1013 16:43:41.523896 30948 tablet_copy_service.cc:124] Beginning new tablet copy session on tablet 048c7d202da3469eb1b1973df9510007 from peer bb2517bc5f2b4980bb07c06019b5a8e9 at {real_user=dinesh, eff_user=} at 127.61.33.8:40240: session id = bb2517bc5f2b4980bb07c06019b5a8e9-048c7d202da3469eb1b1973df9510007
I1013 16:43:41.524291 30948 tablet_copy_session.cc:142] T 048c7d202da3469eb1b1973df9510007 P 19acc272821d425582d3dfb9ed2ab7cd: Tablet Copy: opened 0 blocks and 1 log segments
Already present: Tablet already exists: 048c7d202da3469eb1b1973df9510007
{noformat}
5. Remove the metadata file and WAL log for that tablet, and copy_from_remote succeeds at this point (expected).
6. Bring up the killed tserver; now all replicas on it are tombstoned except the one tablet for which we did a copy_from_remote in step 5. The master, which was incessantly trying to tombstone the evicted replicas on the tserver that was down earlier, logs some interesting messages:
{noformat}
[dinesh@ve0518 debug]$ I1013 16:55:54.551717 26141 catalog_manager.cc:2591] Sending DeleteTablet(TABLET_DATA_TOMBSTONED) for tablet 048c7d202da3469eb1b1973df9510007 on bb2517bc5f2b4980bb07c06019b5a8e9 (127.95.58.1:40867) (TS bb2517bc5f2b4980bb07c06019b5a8e9 not found in new config with opid_index 4)
W1013 16:55:54.552803 26141 catalog_manager.cc:2552] TS bb2517bc5f2b4980bb07c06019b5a8e9 (127.95.58.1:40867): delete failed for tablet 048c7d202da3469eb1b1973df9510007 due to a CAS failure. No further retry: Illegal state: Request specified cas_config_opid_index_less_or_equal of -1 but the committed config has opid_index of 5
I1013 16:55:54.884133 26141 catalog_manager.cc:2591] Sending DeleteTablet(TABLET_DATA_TOMBSTONED) for tablet e9481b695d34483488af07dfb94a8557 on bb2517bc5f2b4980bb07c06019b5a8e9 (127.95.58.1:40867) (TS bb2517bc5f2b4980bb07c06019b5a8e9 not found in new config with opid_index 3)
I1013 16:55:54.885964 26141 catalog_manager.cc:2567] TS bb2517bc5f2b4980bb07c06019b5a8e9 (127.95.58.1:40867): tablet e9481b695d34483488af07dfb94a8557 (table test-table [id=ca8f507e47684ddfa147e2cd232ed773]) successfully deleted
I1013 16:55:54.915202 26141 catalog_manager.cc:2591] Sending DeleteTablet(TABLET_DATA_TOMBSTONED) for tablet e3ff6a1529cf46c5b9787fe322a749e6 on bb2517bc5f2b4980bb07c06019b5a8e9 (127.95.58.1:40867) (TS bb2517bc5f2b4980bb07c06019b5a8e9 not found in new config with opid_index 3)
I1013 16:55:54.916774 26141 catalog_manager.cc:2567] TS bb2517bc5f2b4980bb07c06019b5a8e9 (127.95.58.1:40867): tablet e3ff6a1529cf46c5b9787fe322a749e6 (table test-table [id=ca8f507e47684ddfa147e2cd232ed773]) successfully deleted
{noformat}
7. The restarted tserver now continuously spews log messages like this:
{noformat}
[dinesh@ve0518 debug]$ W1013 16:55:36.608486 6519 raft_consensus.cc:461] T 048c7d202da3469eb1b1973df9510007 P bb2517bc5f2b4980bb07c06019b5a8e9 [term 5 NON_PARTICIPANT]: Failed to trigger leader election: Illegal state: Not starting election: Node is currently a non-participant in the raft config: opid_index: 5 OBSOLETE_local: false peers { permanent_uuid: "9acfc108d9b446c1be783b6d6e7b49ef" member_type: VOTER last_known_addr { host: "127.95.58.0" port: 33932 } } peers { permanent_uuid: "b11d2af1457b4542808407b4d4d1bd29" member_type: VOTER last_known_addr { host: "127.95.58.2" port: 40670 } } peers { permanent_uuid: "19acc272821d425582d3dfb9ed2ab7cd" member_type: VOTER last_known_addr { host: "127.95.58.0" port: 63532 } }
{noformat}
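The CAS failure in the master log can be read as a compare-and-set guard on the replica's committed config index. The sketch below is a guess at that guard based only on the field name and the error text in the log; it is not Kudu's code, and the function name `check_delete_cas` and its return shape are hypothetical.

```python
from typing import Optional, Tuple


def check_delete_cas(
    cas_config_opid_index_less_or_equal: Optional[int],
    committed_opid_index: int,
) -> Tuple[bool, str]:
    """Model of the guard implied by the log message: the delete goes
    ahead only if the replica's committed config index is less than or
    equal to the index named in the request. None means no CAS check."""
    cas = cas_config_opid_index_less_or_equal
    if cas is None:
        return True, ""
    if committed_opid_index > cas:
        return False, (
            "Illegal state: Request specified "
            f"cas_config_opid_index_less_or_equal of {cas} but the "
            f"committed config has opid_index of {committed_opid_index}"
        )
    return True, ""


if __name__ == "__main__":
    # Values from the failed delete of tablet 048c7d20... in the log.
    ok, message = check_delete_cas(-1, 5)
    print(ok, message)
```

Under this reading, the replica brought in with copy_from_remote carries a committed config with opid_index 5, which is greater than the value in the request, so the delete is rejected and the replica survives while the other two tablets are tombstoned.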