[
https://issues.apache.org/jira/browse/FALCON-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15114754#comment-15114754
]
Pallavi Rao commented on FALCON-1602:
-------------------------------------
My initial thought was that the JobCompletionService should use a combination
of notification and polling. If no notification arrives for "some time",
poll Oozie to see whether the job has completed. "Some time" can be a function
of the duration of previous instance runs, or of the SLA when one exists.
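
To make the idea concrete, here is a rough sketch of how the "some time"
threshold and the decision to fall back to polling might be computed. The
class CompletionPollPolicy, its methods, the slack factor and the default
wait are all hypothetical names and parameters chosen for illustration; none
of them exist in Falcon. Preferring the SLA over the run history when both
are available is also just one possible choice. The actual Oozie status
lookup is left out.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper for illustration only; not part of Falcon's API. */
final class CompletionPollPolicy {
    // Assumed tuning knobs: how much longer than a typical run to wait,
    // and what to use when there is neither an SLA nor any run history.
    private final double slackFactor;
    private final Duration defaultWait;

    CompletionPollPolicy(double slackFactor, Duration defaultWait) {
        this.slackFactor = slackFactor;
        this.defaultWait = defaultWait;
    }

    /**
     * Computes "some time": the SLA when one exists, otherwise the average
     * duration of previous instance runs scaled by the slack factor.
     */
    Duration quietPeriod(List<Duration> previousRuns, Optional<Duration> sla) {
        if (sla.isPresent()) {
            return sla.get();
        }
        if (previousRuns.isEmpty()) {
            return defaultWait;
        }
        long totalMillis = 0;
        for (Duration run : previousRuns) {
            totalMillis += run.toMillis();
        }
        long averageMillis = totalMillis / previousRuns.size();
        return Duration.ofMillis((long) (averageMillis * slackFactor));
    }

    /**
     * Returns true when no notification has arrived and the quiet period
     * since the instance started has elapsed, so the caller should poll.
     */
    boolean shouldPoll(Instant startedAt, Instant now,
                       boolean notificationReceived, Duration quietPeriod) {
        if (notificationReceived) {
            return false;
        }
        return !now.isBefore(startedAt.plus(quietPeriod));
    }
}
```

A periodic task in the JobCompletionService could call shouldPoll for each
running instance and query Oozie only for those that return true, so the
normal notification path stays cheap while ActiveMQ is healthy.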
> Recoverability of Falcon Processes when ActiveMQ down for sometime
> -------------------------------------------------------------------
>
> Key: FALCON-1602
> URL: https://issues.apache.org/jira/browse/FALCON-1602
> Project: Falcon
> Issue Type: Task
> Reporter: pavan kumar kolamuri
>
> With the Falcon Native Scheduler, ActiveMQ is used for job completion
> notifications from Oozie. When ActiveMQ is down for some time and Oozie fails
> to deliver workflow completion notifications for process instances even
> after retries, those instances are never marked as completed in the Falcon
> state store. As a result, no new instances of those processes are launched,
> because the old ones are assumed to still be running. There should be some
> way to recover in these cases.