Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/11218 )
Change subject: KUDU-2538: [docs] Document how to manually recover from Cfile corruption
......................................................................


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11218/3/docs/troubleshooting.adoc
File docs/troubleshooting.adoc:

http://gerrit.cloudera.org:8080/#/c/11218/3/docs/troubleshooting.adoc@640
PS3, Line 640: Until link:https://issues.apache.org/jira/browse/KUDU-2469[KUDU-2469] is
> Do we want to go into so many details as to list JIRAs?
I want to list JIRAs so users can see that we are working on fixes and which versions include them once they are complete. We can update this document once all the fixes are in.


http://gerrit.cloudera.org:8080/#/c/11218/3/docs/troubleshooting.adoc@645
PS3, Line 645: link:https://kudu.apache.org/docs/command_line_tools_reference.html#cluster-ksck[ksck tool].
> These links should be relative:
Done


http://gerrit.cloudera.org:8080/#/c/11218/3/docs/troubleshooting.adoc@655
PS3, Line 655: `link:https://kudu.apache.org/docs/command_line_tools_reference.html#remote_replica-delete[remote_replica delete tool]`.
> Same here + the backticks should wrap only `remote_replica delete`:
Done


http://gerrit.cloudera.org:8080/#/c/11218/3/docs/troubleshooting.adoc@668
PS3, Line 668: `link:https://kudu.apache.org/docs/command_line_tools_reference.html#tablet-unsafe_replace_tablet[unsafe_replace_tablet tool]`.
> Same here
Done


http://gerrit.cloudera.org:8080/#/c/11218/3/docs/troubleshooting.adoc@671
PS3, Line 671: sudo -u kudu kudu tablet unsafe_replace_tablet <master_addresses> <tablet_id>
> As this tool will only be available in 1.8.0 and above, maybe we
> should put a note about it and suggest upgrading in case the users
> run into this corruption on an older release with the only
> workaround being deleting the whole range partition (or table in
> case there's no range partitioning).
I don't think suggesting an upgrade will help, given that users won't come to the troubleshooting page unless they have already hit the issue. This is the only solution that is straightforward and reliable. All other solutions are case by case, and I didn't want to write a long list of conditional options.

> Alternatively, we could also describe how to remove the affected
> rowset manually from the tablet metadata, but if we do that, we
> should add a huge warning about it being unsafe.
That is very low level and error prone. I don't think a list of steps that long and complicated belongs in the docs. Hopefully users will ask on the mailing list or Slack if they are in a situation that this documentation doesn't cover or fix.


-- 
To view, visit http://gerrit.cloudera.org:8080/11218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieefd472bef104921de7cab442fd49ab32c0fe81b
Gerrit-Change-Number: 11218
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Wed, 15 Aug 2018 14:13:49 +0000
Gerrit-HasComments: Yes
