ddanielr commented on PR #5438: URL: https://github.com/apache/accumulo/pull/5438#issuecomment-2769844738
> Can we get an updated description of this PR? It links to another PR that is much bigger, but I haven't reviewed yet. Based on previous conversations, I _think_ this is supposed to just kill locks in ZK, and then add a flag that the AbstractServer base class looks for, in order to prevent starting up anything if the flag is seen.

Yep, it will also check for running Fate transactions and fail with a warning message for users to go clean them up in preparation for an upgrade.

> However, I have a couple of concerns/questions:
>
> 1. What if a user starts up 2.1.3 services? 2.1.x should be forwards/backwards compatible, so this should be fine. Now, we cannot trust that this flag in ZK actually prevented any services from holding locks.

2.1.3 services wouldn't respect the flag so they would operate as normal. You would only see effects from this in 2.1.4 & up versions.

> 2. What if the user runs this to upgrade to 2.1.5? Is there enough information in the node in the new node in ZK to allow that as an upgrade path? What about if they jump straight to 2.1.6? Those should have the same data versions as 2.1.4, but will they be prevented from starting up?

The command description states that it should _"prepare Accumulo for an upgrade to the next non-bugfix release"_. So the expectation is that a version upgrade to 2.1.5 or 2.1.6 shouldn't require the command to be run. If it were to be run, then the server logs will state what node needs to be removed:

```
log.info("Instance {} prepared for upgrade. Server processes will not start while"
    + " in this state. To undo this state and abort upgrade preparations delete"
    + " the zookeeper node: {}", iid.canonical(), zUpgradepath);
```

However, manually removing things from ZK without controls isn't an ideal operation for users. ZooPropEditor was created to cleanly handle modifying Zookeeper properties that cause startup failures.
We could add a cli option flag to ZooPropEditor for removing the upgrade node and even reference it in the log statement from the AbstractServer code to avoid bugfix version headaches.
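
To make the idea concrete, here is a rough sketch of the two pieces discussed above: a startup check that refuses to run while the upgrade node exists, and a controlled removal step of the kind a ZooPropEditor option could wrap. This is not the PR's code. `NodeStore`, `InMemoryStore`, `UPGRADE_NODE`, and both method names are hypothetical stand-ins so the example runs without ZooKeeper; the real node path and client calls would come from the PR itself.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class UpgradeGuardSketch {

  // Hypothetical placeholder path; the real node name is defined by the PR.
  static final String UPGRADE_NODE = "/accumulo/INSTANCE_ID/upgrade-prepared";

  /** Minimal stand-in for a ZooKeeper client (assumption, not a real API). */
  interface NodeStore {
    boolean exists(String path);

    void delete(String path);
  }

  /** In-memory NodeStore so the sketch runs without a ZooKeeper ensemble. */
  static final class InMemoryStore implements NodeStore {
    private final Set<String> nodes = ConcurrentHashMap.newKeySet();

    void create(String path) {
      nodes.add(path);
    }

    @Override
    public boolean exists(String path) {
      return nodes.contains(path);
    }

    @Override
    public void delete(String path) {
      nodes.remove(path);
    }
  }

  /** Startup check: throw if the instance is flagged as prepared for upgrade. */
  static void checkNotPreparedForUpgrade(NodeStore store) {
    if (store.exists(UPGRADE_NODE)) {
      throw new IllegalStateException("Instance prepared for upgrade; server will not start"
          + " while node exists: " + UPGRADE_NODE);
    }
  }

  /** Controlled removal of the flag; returns true only if a node was removed. */
  static boolean abortUpgradePreparation(NodeStore store) {
    if (!store.exists(UPGRADE_NODE)) {
      return false;
    }
    store.delete(UPGRADE_NODE);
    return true;
  }

  public static void main(String[] args) {
    InMemoryStore store = new InMemoryStore();
    store.create(UPGRADE_NODE);
    try {
      checkNotPreparedForUpgrade(store);
    } catch (IllegalStateException e) {
      System.out.println("Startup blocked: " + e.getMessage());
    }
    System.out.println("Flag removed: " + abortUpgradePreparation(store));
    checkNotPreparedForUpgrade(store); // no exception once the flag is gone
    System.out.println("Startup allowed");
  }
}
```

With the removal behind a tool, the log message could name the command rather than a raw ZooKeeper path, so users on a bugfix release have a supported way to clear the flag.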