helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/13820 )

Change subject: [docs] update the upgrade documentation
......................................................................

Patch Set 3:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/13820/3/docs/installation.adoc
File docs/installation.adoc:

http://gerrit.cloudera.org:8080/#/c/13820/3/docs/installation.adoc@a635
PS3, Line 635:
> I like the idea of documenting how to do a rolling upgrade because we know
We have been using this rolling upgrade approach for almost 2 years and I'm sure others are also using it too.
I want to hear what other people think about upgrade tests :)

http://gerrit.cloudera.org:8080/#/c/13820/3/docs/installation.adoc@635
PS3, Line 635: WARNING: The following upgrade process is only relevant when building from source code.
> Why is this the case?
Done

http://gerrit.cloudera.org:8080/#/c/13820/3/docs/installation.adoc@638
PS3, Line 638: - Copy the `kudu-tserver`, `kudu-master` and `kudu` binaries from your build directory.
> Maybe say something like "Replace the `kudu-server`..."
Done

http://gerrit.cloudera.org:8080/#/c/13820/3/docs/installation.adoc@638
PS3, Line 638: - Copy the `kudu-tserver`, `kudu-master` and `kudu` binaries from your build directory.
> Maybe "Place the new `kudu-tserver`, `kudu-master`, and `kudu` binaries int
Done

http://gerrit.cloudera.org:8080/#/c/13820/3/docs/installation.adoc@640
PS3, Line 640: - Set the unavailable time for every tablet server to a large value (2 hours or more) by gflag
> The 2 hours or more feels a bit arbitrary. Maybe say something like 2x your
Done

http://gerrit.cloudera.org:8080/#/c/13820/3/docs/installation.adoc@647
PS3, Line 647: - Restart a tablet server and wait until it is online, then reset the gflag above to be 7200 again.
> I think this should be broken down further:
Done

http://gerrit.cloudera.org:8080/#/c/13820/3/docs/installation.adoc@647
PS3, Line 647: wait until it is online
> Maybe note what a user should a user look at to know it's online?
Done

http://gerrit.cloudera.org:8080/#/c/13820/3/docs/installation.adoc@651
PS3, Line 651: Make sure the restarted tablet server is already online before resetting the gflag.
> This is good information to have, but I'd be concerned that it would confus
Done

http://gerrit.cloudera.org:8080/#/c/13820/3/docs/installation.adoc@652
PS3, Line 652: restarted tablet servers
> "Replicas hosted on restarted tablet servers"
Done

http://gerrit.cloudera.org:8080/#/c/13820/3/docs/installation.adoc@652
PS3, Line 652: restarted tablet servers
> This seems like a race against time given the default is 5 minutes.
In our production cluster, it takes more than half an hour to restart a tserver on average. So, it's important to raise the gflag.

--
To view, visit http://gerrit.cloudera.org:8080/13820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b3e5c549dc05c3388c0b0dd628d205a356da344
Gerrit-Change-Number: 13820
Gerrit-PatchSet: 3
Gerrit-Owner: helifu <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Priyanka Chheda <[email protected]>
Gerrit-Reviewer: helifu <[email protected]>
Gerrit-Comment-Date: Mon, 15 Jul 2019 11:37:16 +0000
Gerrit-HasComments: Yes
