turcsanyip commented on PR #8013: URL: https://github.com/apache/nifi/pull/8013#issuecomment-1890147926
@exceptionfactory I pushed the state clean-up. While testing it, I unfortunately found a more general issue.

If you run ConsumeAzureEventHub in a cluster, the processor instances use the shared cluster state provider (e.g. ZooKeeper) to maintain the state together, handle partition balancing, etc. This works properly as long as all nodes use the shared cluster state. But if you disconnect a node from the cluster (in the UI, or I guess the same happens when there is a network issue between the nodes), the disconnected node becomes a standalone node. It no longer uses the shared state and instead starts maintaining its own state from scratch in the local state provider.

Because the disconnected node does not know about the other consumer instances, it believes it is the only consumer in the group and can own all the partitions. As a result, both the rest of the cluster and the disconnected node try to own the same partitions. The checkpoints are then maintained in two places (cluster state vs. local state), which leads to message duplication.

It would be better if the processor on the disconnected node were stopped and did not create its local state at all. Do you think that is possible? I'm afraid this is just how the framework works, though.
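To make the ownership conflict in this comment concrete, here is a minimal Java sketch. It is only a toy model under stated assumptions: `SplitStateSketch`, `OwnershipStore`, `claimUnowned`, and the node and partition names are all hypothetical, and none of them are NiFi or Azure SDK classes. Each store stands in for one state provider, and a consumer may claim any partition that has no owner recorded in the store it can see.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model only: none of these types are NiFi or Azure SDK classes.
public class SplitStateSketch {

    /** Stand-in for a state provider: maps partition id to owning consumer. */
    static final class OwnershipStore {
        private final Map<String, String> ownerByPartition = new HashMap<>();

        /** Claims every partition that has no owner recorded in THIS store. */
        Set<String> claimUnowned(String consumer, List<String> partitions) {
            Set<String> claimed = new TreeSet<>();
            for (String partition : partitions) {
                // putIfAbsent returns null only when no owner was recorded yet.
                if (ownerByPartition.putIfAbsent(partition, consumer) == null) {
                    claimed.add(partition);
                }
            }
            return claimed;
        }
    }

    /** Partitions claimed on both sides, i.e. the ones that get consumed twice. */
    static Set<String> overlap(Set<String> a, Set<String> b) {
        Set<String> both = new TreeSet<>(a);
        both.retainAll(b);
        return both;
    }

    public static void main(String[] args) {
        List<String> partitions = List.of("0", "1", "2", "3");

        // The connected part of the cluster records ownership in the shared store.
        OwnershipStore clusterState = new OwnershipStore();
        Set<String> clusterOwned = clusterState.claimUnowned("node-1", partitions);

        // The disconnected node starts from an empty local store, so every
        // partition looks unowned to it.
        OwnershipStore localState = new OwnershipStore();
        Set<String> disconnectedOwned = localState.claimUnowned("node-2", partitions);

        System.out.println("claimed by both: " + overlap(clusterOwned, disconnectedOwned));
    }
}
```

In this model the two sides overlap on every partition, because the second store never sees the first store's claims. Had `node-2` claimed against `clusterState` instead, it would have received an empty set.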
