zymap commented on PR #17228: URL: https://github.com/apache/pulsar/pull/17228#issuecomment-1237630258
This is another issue I want to mention. I can send it to the mailing list if you prefer.

I took a deep look yesterday. What this PR is trying to resolve is keeping the ledgers map consistent between memory and the ZooKeeper server when offloading fails.

In the Pulsar metadata handler, we retry the operation when ZooKeeper throws a connection loss exception. But the operation may still fail after the retry. For example, we update the ledgers map in memory only after successfully updating the LedgerInfo in ZooKeeper. If the ZooKeeper update executes successfully on the server but the client sees a connection loss, we retry on that exception. The first write has already bumped the znode version, so the retry, which still carries the old expected version, may hand the callback a BadVersion exception. At that point the in-memory ledgers list differs from what is stored on the ZooKeeper server, and that may cause other issues on the broker.

I'm not sure if I'm missing something, but it looks like there are many places in our code where we do not consider this situation.
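To make the sequence concrete, here is a minimal, self-contained sketch of the scenario. It does not use the real ZooKeeper client or Pulsar's metadata store: `VersionedStore`, `ConnectionLossException`, `BadVersionException`, `updateWithRetry`, and the `ledgers=[...]` strings are all hypothetical stand-ins invented for illustration. The toy store applies the first write but reports a connection loss, and the retry with the same expected version then fails.

```java
public class LostAckDemo {
    /** Hypothetical stand-ins for ZooKeeper's connection-loss and bad-version errors. */
    static class ConnectionLossException extends Exception {}
    static class BadVersionException extends Exception {}

    /** Toy versioned store: a write is applied only if expectedVersion matches. */
    static class VersionedStore {
        String value = "ledgers=[1,2]";
        int version = 0;
        boolean dropNextAck = false; // simulate "applied on server, ack lost on client"

        void compareAndSet(String newValue, int expectedVersion)
                throws ConnectionLossException, BadVersionException {
            if (expectedVersion != version) {
                throw new BadVersionException();
            }
            value = newValue;
            version++;
            if (dropNextAck) {
                dropNextAck = false;
                // The write above was applied, but the client never hears about it.
                throw new ConnectionLossException();
            }
        }
    }

    /** Retries once on connection loss, reusing the same expected version. */
    static String updateWithRetry(VersionedStore store, String newValue, int expectedVersion) {
        try {
            store.compareAndSet(newValue, expectedVersion);
            return "OK";
        } catch (ConnectionLossException e) {
            try {
                store.compareAndSet(newValue, expectedVersion);
                return "OK";
            } catch (BadVersionException e2) {
                return "BAD_VERSION";
            } catch (ConnectionLossException e2) {
                return "CONNECTION_LOSS";
            }
        } catch (BadVersionException e) {
            return "BAD_VERSION";
        }
    }

    /** Runs the scenario and returns {callback result, in-memory value, stored value}. */
    static String[] runScenario() {
        VersionedStore store = new VersionedStore();
        store.dropNextAck = true;
        String inMemory = store.value; // in-memory copy of the ledgers map
        String updated = "ledgers=[1,2,3]";

        String result = updateWithRetry(store, updated, 0);
        if (result.equals("OK")) {
            inMemory = updated; // memory is updated only after a successful store update
        }
        return new String[] {result, inMemory, store.value};
    }

    public static void main(String[] args) {
        String[] r = runScenario();
        System.out.println("callback result: " + r[0]);
        System.out.println("in memory:       " + r[1]);
        System.out.println("in store:        " + r[2]);
    }
}
```

In this sketch the callback ends up with `BAD_VERSION` even though the store already holds the new value, so the in-memory copy stays stale. Whether the real code paths behave exactly like this depends on how each caller handles the BadVersion result, which I have not checked everywhere.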
