[
https://issues.apache.org/jira/browse/UIMA-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421561#comment-15421561
]
Lou DeGenaro commented on UIMA-5057:
------------------------------------
Whenever the Orchestrator performs a publication, it calls
org.apache.uima.ducc.orchestrator.jd.scheduler.JdScheduler.handle(IDuccWorkMap dwm).
That method considers expanding or contracting the JD share pool, but it
fails to consider that an already allocated JD node may have gone down. We add
a new call to monitor() to cover this possibility.
Note that we rely on the database, as updated by the Resource Manager, to
determine whether a node is down.
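
For illustration only, the idea can be sketched roughly as below. This is not
the committed change: JdPoolSketch, NodeStatusSource, assign(), resizePool()
and canceledJobs() are hypothetical names, the node status lookup merely
stands in for the database the Resource Manager updates, and running
monitor() before the pool resize is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these types are DUCC's real classes.
class JdPoolSketch {

    // Stand-in for the node state that the Resource Manager records in the database.
    interface NodeStatusSource {
        boolean isDown(String nodeName);
    }

    private final NodeStatusSource nodeStatus;
    // JD node name -> ids of the jobs whose JD runs on that node.
    private final Map<String, List<String>> jobsByJdNode = new HashMap<>();
    private final List<String> canceledJobs = new ArrayList<>();

    JdPoolSketch(NodeStatusSource nodeStatus) {
        this.nodeStatus = nodeStatus;
    }

    // Record that the JD of the given job was placed on the given node.
    void assign(String nodeName, String jobId) {
        jobsByJdNode.computeIfAbsent(nodeName, k -> new ArrayList<>()).add(jobId);
    }

    // Invoked on every publication, mirroring the role described for handle().
    void handle() {
        monitor();     // new step: notice allocated JD nodes that have gone down
        resizePool();  // existing step: expand or contract the JD share pool
    }

    // Drop every down JD node from the pool and mark its jobs for cancellation.
    void monitor() {
        Set<String> downNodes = new HashSet<>();
        for (String nodeName : jobsByJdNode.keySet()) {
            if (nodeStatus.isDown(nodeName)) {
                downNodes.add(nodeName);
            }
        }
        // Remove in a second pass to avoid modifying the map while iterating it.
        for (String nodeName : downNodes) {
            canceledJobs.addAll(jobsByJdNode.remove(nodeName));
        }
    }

    // Placeholder: the real expand/contract logic is outside this sketch.
    void resizePool() {
    }

    List<String> canceledJobs() {
        return canceledJobs;
    }

    Set<String> jdNodes() {
        return jobsByJdNode.keySet();
    }
}
```

Collecting the down nodes first and removing them afterwards keeps the sketch
free of concurrent-modification problems; whether a replacement JD node is
then allocated would be left to the resize step.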
> DUCC Orchestrator (OR) handle down JD node
> ------------------------------------------
>
> Key: UIMA-5057
> URL: https://issues.apache.org/jira/browse/UIMA-5057
> Project: UIMA
> Issue Type: Bug
> Components: DUCC
> Reporter: Lou DeGenaro
> Assignee: Lou DeGenaro
> Fix For: 2.2.0-Ducc
>
>
> If a node that goes down happens to be a JD node then:
> 1. cancel any running jobs whose JD was assigned there, and
> 2. allocate a new JD node, if needed and possible
> Currently, when the Agent hosting the JD of a running Job is killed, the Job
> hangs because the Agent is not there to carry out the clean-up procedures
> directed by the Orchestrator.