EDGAR CHERNICK created CAMEL-15903:
--------------------------------------
Summary: Master component does not retry endpoint startup on failure
Key: CAMEL-15903
URL: https://issues.apache.org/jira/browse/CAMEL-15903
Project: Camel
Issue Type: Bug
Reporter: EDGAR CHERNICK
The cluster view implementations have a listener attribute that the master
component uses to receive leadership change events.
When the app instance becomes leader, the cluster view marks that instance
as leader and then fires the leadership changed event. This invokes the
master component's event handler, which starts the delegated consumer and
endpoint.
The issue happens when the delegated consumer or endpoint fails to start. The
exception thrown by them propagates up the stack; however, it does
not affect the leadership, i.e., once the app instance becomes leader it
stays leader even if the delegated components fail to start.
Both KubernetesClusterView and FileLockClusterView have this issue.
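The sequence described above can be reproduced with a small, self-contained sketch. LeaderFirstViewSketch and its methods are hypothetical stand-ins, not Camel's actual cluster view API; the only assumption taken from this report is the ordering "mark as leader, then notify the listener".

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical stand-in for a cluster view; this is NOT Camel's API.
class LeaderFirstViewSketch {
    private boolean leader;
    // Plays the role of the master component's event handler.
    private final Consumer<Boolean> listener;

    LeaderFirstViewSketch(Consumer<Boolean> listener) {
        this.listener = listener;
    }

    boolean isLeader() {
        return leader;
    }

    // Stands in for the periodic leadership check.
    void checkLeadership() {
        if (leader) {
            return; // already leader: no event is fired again
        }
        leader = true;          // state is flipped first...
        listener.accept(true);  // ...then the listener runs and may throw
    }

    // Runs two checks against a listener that always fails.
    // Returns the number of start attempts, or -1 if leadership was lost.
    static int demo() {
        AtomicInteger startAttempts = new AtomicInteger();
        LeaderFirstViewSketch view = new LeaderFirstViewSketch(isLeader -> {
            startAttempts.incrementAndGet();
            throw new IllegalStateException("delegated endpoint failed to start");
        });
        for (int i = 0; i < 2; i++) {
            try {
                view.checkLeadership();
            } catch (IllegalStateException e) {
                // The exception reaches the caller, but the leader flag stays set.
            }
        }
        return view.isLeader() ? startAttempts.get() : -1;
    }
}
```

In this sketch the second check returns early because the instance is already marked as leader, so the failing listener is invoked only once and the start is never retried.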
KubernetesClusterView uses KubernetesLeadershipController to run the leadership
check at an interval. When it acquires the leadership it updates the configmap
with that info and calls the TimedLeaderNotifier refreshLeadership method to
check whether the leadership has changed. The first issue here is that it marks
itself as leader before firing the leadership changed event. The second issue
is that the event is fired in a separate thread, so when the delegated
components fail to start, the exception "dies" together with that thread. When
the next scheduled leadership check runs, the app instance is already the
leader, so it does not fire the leadership changed event again and the
delegated component never starts.
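One possible way to keep the failure from being lost on the notifier thread is to wait on the Future returned by the executor and treat an ExecutionException as "do not keep the leadership". The sketch below uses only java.util.concurrent; AsyncNotifySketch and notifyAndWait are hypothetical names, and this is not how TimedLeaderNotifier is implemented.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; it does not reflect TimedLeaderNotifier's real code.
class AsyncNotifySketch {

    // Runs the leadership handler on the notifier thread and waits for it.
    // Returns true if leadership should be kept, false if it should be rolled back.
    static boolean notifyAndWait(ExecutorService notifier, Runnable handler) {
        Future<?> outcome = notifier.submit(handler);
        try {
            outcome.get(); // rethrows a handler failure as ExecutionException
            return true;
        } catch (ExecutionException e) {
            return false;  // handler failed: caller can give up leadership and retry
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // First handler fails, second succeeds.
    static boolean[] demo() {
        ExecutorService notifier = Executors.newSingleThreadExecutor();
        try {
            boolean keptAfterFailure = notifyAndWait(notifier, () -> {
                throw new IllegalStateException("delegated consumer failed to start");
            });
            boolean keptAfterSuccess = notifyAndWait(notifier, () -> { });
            return new boolean[] { keptAfterFailure, keptAfterSuccess };
        } finally {
            notifier.shutdown();
        }
    }
}
```

Blocking on the Future trades some latency in the leadership check for visibility of the failure; an asynchronous callback that resets the leader state would be another option.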
FileLockClusterView has a similar issue: it acquires the file lock before
firing the event, and even if the event processing fails it does not roll back
the leader selection.
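A rollback for the file lock case could look roughly like the sketch below: release the lock when the leadership handler throws, so that a later check (or another instance) can try again. FileLockRollbackSketch and tryBecomeLeader are hypothetical names used for illustration only; the code relies solely on java.nio and is not taken from FileLockClusterView.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; it does not reflect FileLockClusterView's real code.
class FileLockRollbackSketch {

    // Returns true only if the lock was acquired AND the handler succeeded.
    static boolean tryBecomeLeader(FileChannel channel, Runnable onLeader) throws IOException {
        FileLock lock = channel.tryLock();
        if (lock == null) {
            return false; // lock is held by another process
        }
        try {
            onLeader.run(); // stands in for starting the delegated consumer
            return true;    // lock is intentionally kept while this instance is leader
        } catch (RuntimeException e) {
            lock.release(); // roll back the leader selection
            return false;
        }
    }

    // First attempt fails in the handler, second attempt succeeds.
    static boolean[] demo() {
        try {
            Path file = Files.createTempFile("leader", ".lock");
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
                boolean first = tryBecomeLeader(channel, () -> {
                    throw new IllegalStateException("delegated consumer failed to start");
                });
                boolean second = tryBecomeLeader(channel, () -> { });
                return new boolean[] { first, second };
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the lock is released after the failed attempt, the second attempt can acquire it again and retry the start instead of remaining leader with nothing running.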
Other cluster view implementations might have the same issue.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)