[ 
https://issues.apache.org/jira/browse/UIMA-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421561#comment-15421561
 ] 

Lou DeGenaro commented on UIMA-5057:
------------------------------------

Whenever the Orchestrator publishes, it calls 
org.apache.uima.ducc.orchestrator.jd.scheduler.JdScheduler.handle(IDuccWorkMap 
dwm).  There we consider expanding or contracting the JD share pool, but we 
fail to consider that an already allocated JD node may have gone down.  We 
add a new call to monitor() to handle this possibility.

Note that we rely on the database, as updated by the Resource Manager, to 
determine whether a node is down.
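
To illustrate the idea, here is a minimal sketch of what such a check could 
look like.  It is not the actual DUCC code: the class JdNodeMonitorSketch, 
the Job type, and the allocate()/allocated() helpers are hypothetical 
stand-ins, the set of down nodes is passed in directly rather than read from 
the database, and placing monitor() after the expand/contract step is an 
assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified model; not the real JdScheduler.
public class JdNodeMonitorSketch {

    /** Hypothetical stand-in for a running job and the node hosting its JD. */
    static final class Job {
        final String id;
        final String jdNode;
        boolean cancelled = false;

        Job(String id, String jdNode) {
            this.id = id;
            this.jdNode = jdNode;
        }
    }

    // Nodes currently allocated to the JD share pool (sorted for stable output).
    private final Set<String> allocatedJdNodes = new TreeSet<>();

    void allocate(String node) {
        allocatedJdNodes.add(node);
    }

    Set<String> allocated() {
        return new TreeSet<>(allocatedJdNodes);
    }

    /** Invoked on each publication; returns the ids of jobs cancelled. */
    List<String> handle(Collection<Job> jobs, Set<String> downNodes) {
        // Expansion/contraction of the JD share pool is omitted here.
        // Assumption: the down-node check runs as part of the same pass.
        return monitor(jobs, downNodes);
    }

    /** Cancel jobs whose JD was on a down node and release that node. */
    List<String> monitor(Collection<Job> jobs, Set<String> downNodes) {
        List<String> cancelled = new ArrayList<>();
        Iterator<String> it = allocatedJdNodes.iterator();
        while (it.hasNext()) {
            String node = it.next();
            if (!downNodes.contains(node)) {
                continue; // node is still up, nothing to do
            }
            for (Job job : jobs) {
                if (node.equals(job.jdNode) && !job.cancelled) {
                    job.cancelled = true;
                    cancelled.add(job.id);
                }
            }
            // Drop the down node from the pool; a replacement would be
            // allocated by the (omitted) expansion logic if needed.
            it.remove();
        }
        return cancelled;
    }

    public static void main(String[] args) {
        JdNodeMonitorSketch scheduler = new JdNodeMonitorSketch();
        scheduler.allocate("node-a");
        scheduler.allocate("node-b");
        List<Job> jobs = Arrays.asList(
                new Job("job-1", "node-a"),
                new Job("job-2", "node-b"));
        List<String> cancelled =
                scheduler.handle(jobs, Collections.singleton("node-a"));
        System.out.println(cancelled + " " + scheduler.allocated());
    }
}
```

In this sketch the job on the down node is cancelled and the node is removed 
from the pool, while the job on the healthy node is left alone.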

> DUCC Orchestrator (OR) handle down JD node
> ------------------------------------------
>
>                 Key: UIMA-5057
>                 URL: https://issues.apache.org/jira/browse/UIMA-5057
>             Project: UIMA
>          Issue Type: Bug
>          Components: DUCC
>            Reporter: Lou DeGenaro
>            Assignee: Lou DeGenaro
>             Fix For: 2.2.0-Ducc
>
>
> If a node that goes down happens to be a JD node then:
> 1. cancel any running jobs whose JD was assigned there, and
> 2. allocate a new JD node, if needed and possible
> Currently, when the Agent hosting the JD of a running Job is killed, the 
> Job hangs because the Agent is not there to carry out the 
> Orchestrator-directed clean-up procedures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
