dlmarion commented on issue #5041: URL: https://github.com/apache/accumulo/issues/5041#issuecomment-2473535848
@keith-turner and I discussed this. The idea behind this issue is to use ZooKeeper to record the intended state of the system. If we know the intended state via paths in ZooKeeper, then we can identify which servers are down by looking for paths that don't have an associated lock. For users of `accumulo-cluster`, ZooKeeper would be cleaned up automatically on a `stop` operation, because `accumulo-cluster` calls ZooZap.

My idea here was that a utility could be created and run periodically by the user to clean up ZooKeeper paths when their intended deployment layout has changed. @keith-turner suggested that it might be too easy for ZooKeeper to get polluted with old paths if, for example, the user doesn't run the utility, or if the user is using an orchestration system like Kubernetes that stops and starts pods as needed, and those pods end up with different hostnames.

@keith-turner suggested allowing the user to specify how many processes, by resource group and type, they intend to run; we can then compare the actual running count against the intended deployment. We might be able to convey this information in a property value (JSON?).
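To make the count-based comparison concrete, here is a rough sketch in Python rather than Accumulo code. The JSON layout (resource group, then server type, then count), the `find_shortfalls` helper, and the sample group and type names are all hypothetical; nothing here reflects an existing Accumulo property or API.

```python
import json
from collections import Counter

# Hypothetical property value: intended process counts keyed by
# resource group, then by server type. This format is an assumption.
INTENDED_JSON = """
{
  "default": {"tserver": 3, "compactor": 2},
  "batch": {"sserver": 1}
}
"""


def find_shortfalls(intended_json, running):
    """Return {(group, type): missing_count} for under-provisioned entries.

    `running` is an iterable of (resource_group, server_type) pairs, one per
    live process (for example, one per ZooKeeper path that still has a lock).
    """
    intended = json.loads(intended_json)
    actual = Counter(running)
    shortfalls = {}
    for group, types in intended.items():
        for server_type, want in types.items():
            have = actual[(group, server_type)]  # Counter yields 0 if absent
            if have < want:
                shortfalls[(group, server_type)] = want - have
    return shortfalls


# Hypothetical snapshot of what is actually running.
running = [
    ("default", "tserver"), ("default", "tserver"),
    ("default", "compactor"), ("default", "compactor"),
]
print(find_shortfalls(INTENDED_JSON, running))
```

With this sample input the sketch reports one missing `tserver` in `default` and one missing `sserver` in `batch`. Because it compares counts rather than hostnames, a pod that restarts under a new hostname would not register as a missing server, which seems to be the point of the suggestion.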
