[
https://issues.apache.org/jira/browse/CAMEL-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Claus Ibsen updated CAMEL-15903:
--------------------------------
Fix Version/s: 3.16.0
(was: 3.15.0)
> Master component does not retry endpoint startup on failure
> -----------------------------------------------------------
>
> Key: CAMEL-15903
> URL: https://issues.apache.org/jira/browse/CAMEL-15903
> Project: Camel
> Issue Type: Bug
> Components: camel-master
> Reporter: EDGAR CHERNICK
> Priority: Minor
> Fix For: 3.16.0
>
>
> The cluster view implementations have a listener attribute where the master
> component hooks itself to receive leadership change events.
> When the app instance becomes leader, the cluster view marks that instance
> as leader and then fires the leadership changed event. This triggers the
> master component's event handler, which starts the delegated consumer and
> endpoint.
> The issue happens when the delegated consumer or endpoint fails to start. The
> exception thrown by them propagates up the stack, but it does not affect the
> leadership: once the app instance becomes leader, it stays leader even if
> the delegated components fail to start.
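
The failure mode described above can be reproduced with a minimal sketch. The classes below (SimpleClusterView, FailingConsumer) are hypothetical stand-ins written for illustration and are not Camel classes; the sketch only assumes a view that records leadership before notifying its listeners.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-ins for illustration; these are not Camel classes.
class StickyLeadershipSketch {

    /** Minimal cluster view: sets the leader flag, then notifies listeners. */
    static class SimpleClusterView {
        private final List<Consumer<Boolean>> listeners = new ArrayList<>();
        private boolean leader;

        void addListener(Consumer<Boolean> listener) { listeners.add(listener); }

        boolean isLeader() { return leader; }

        void becomeLeader() {
            leader = true;                      // leadership is recorded first
            for (Consumer<Boolean> listener : listeners) {
                listener.accept(true);          // a listener failure propagates from here
            }
        }
    }

    /** Minimal delegate whose start() always fails. */
    static class FailingConsumer {
        boolean started;

        void start() { throw new IllegalStateException("delegate failed to start"); }
    }

    /** Returns { leader flag, consumer started } after a failed startup. */
    static boolean[] run() {
        SimpleClusterView view = new SimpleClusterView();
        FailingConsumer consumer = new FailingConsumer();
        view.addListener(isLeader -> { if (isLeader) consumer.start(); });
        try {
            view.becomeLeader();
        } catch (IllegalStateException e) {
            // The exception travels up the stack, but nothing resets the leader flag.
        }
        return new boolean[] { view.isLeader(), consumer.started };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("leader=" + r[0] + ", started=" + r[1]);
    }
}
```

In this sketch the instance ends up as leader with a consumer that never started, which is the state the report describes.
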
> Both KubernetesClusterView and FileLockClusterView have this issue.
> KubernetesClusterView uses KubernetesLeadershipController to run the
> leadership check at an interval. When it acquires the leadership, it updates
> the configmap with that info and calls the TimedLeaderNotifier
> refreshLeadership method to check whether the leadership has changed. The
> issue here is that it marks itself as leader before firing the leadership
> changed event. Another issue is that the event is fired in a separate
> thread, so when the start of the delegated components fails, the exception
> "dies" together with the thread. When the next scheduled leadership check
> runs, the app instance is already the leader, so it does not fire the
> leadership changed event and the delegated component never starts.
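
A rough sketch of how such an event can be lost is shown below. The Notifier class is a hypothetical simplification, not the actual TimedLeaderNotifier code. It assumes the event is handed to an executor and that the known state is updated before the event runs.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simplification for illustration; not the Camel implementation.
class LostEventSketch {

    static class Notifier {
        private final ExecutorService executor = Executors.newSingleThreadExecutor();
        private final Runnable onLeadershipGained;
        private boolean knownLeader;

        Notifier(Runnable onLeadershipGained) {
            this.onLeadershipGained = onLeadershipGained;
        }

        /** Called on every scheduled check with the current leadership state. */
        void refreshLeadership(boolean leaderNow) {
            if (leaderNow && !knownLeader) {
                knownLeader = true;                   // state is recorded before the event runs
                executor.submit(onLeadershipGained);  // a failure ends up in a Future nobody reads
            }
            // When leaderNow equals knownLeader nothing changed, so no event is fired.
        }

        void shutdown() throws InterruptedException {
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    /** Runs the given number of leadership checks and counts startup attempts. */
    static int countStartAttempts(int checks) throws InterruptedException {
        AtomicInteger attempts = new AtomicInteger();
        Notifier notifier = new Notifier(() -> {
            attempts.incrementAndGet();
            throw new IllegalStateException("delegate failed to start");
        });
        for (int i = 0; i < checks; i++) {
            notifier.refreshLeadership(true);         // the instance holds leadership on every check
        }
        notifier.shutdown();
        return attempts.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("start attempts: " + countStartAttempts(3));
    }
}
```

However many checks run, the sketch attempts the startup only once, because later checks see no change in leadership.
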
> FileLockClusterView has a similar issue: it acquires the file lock prior to
> firing the event, and even if the event processing fails it does not roll
> back the leader selection.
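
One possible shape for such a rollback, as a sketch only: release the lock again when the event handler throws, so that a later check (or another instance) can retry. The tryLead helper below is hypothetical and is not taken from FileLockClusterView. It uses only the standard java.nio FileChannel and FileLock API.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not the Camel implementation.
class FileLockRollbackSketch {

    /**
     * Tries to take the lock and run onLeader. If onLeader throws, the lock is
     * released again. Returns the held lock, or null when leadership was not kept.
     */
    static FileLock tryLead(FileChannel channel, Runnable onLeader) throws IOException {
        FileLock lock = channel.tryLock();
        if (lock == null) {
            return null;                  // another process holds the lock
        }
        try {
            onLeader.run();               // e.g. start the delegated consumer
            return lock;
        } catch (RuntimeException e) {
            lock.release();               // roll back the leader selection
            return null;
        }
    }

    /** Returns { first attempt rolled back, second attempt kept the lock }. */
    static boolean[] run() throws IOException {
        Path path = Files.createTempFile("leader", ".lock");
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.WRITE)) {
            FileLock first = tryLead(channel, () -> {
                throw new IllegalStateException("delegate failed to start");
            });
            FileLock second = tryLead(channel, () -> { });
            boolean[] result = { first == null, second != null && second.isValid() };
            if (second != null) {
                second.release();
            }
            return result;
        } finally {
            Files.deleteIfExists(path);
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] r = run();
        System.out.println("rolledBack=" + r[0] + ", retrySucceeded=" + r[1]);
    }
}
```

Whether releasing the lock is the right recovery for the master component is a design question for the fix. The sketch only shows that the lock can be reacquired once the failed attempt has released it.
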
> Other cluster view implementations might have the same issue.