[
https://issues.apache.org/jira/browse/UIMA-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584313#comment-13584313
]
Lou DeGenaro commented on UIMA-2689:
------------------------------------
Also, change the Reason shown on the Jobs page to (green) MonitorTimeoutNominal
or (red) MonitorTimeoutWarning instead of WaitTimeout.
> DUCC webserver (WS) auto-cancel feature should not be enabled for "shadows"
> ---------------------------------------------------------------------------
>
> Key: UIMA-2689
> URL: https://issues.apache.org/jira/browse/UIMA-2689
> Project: UIMA
> Issue Type: Bug
> Reporter: Lou DeGenaro
> Assignee: Lou DeGenaro
> Priority: Minor
>
> The auto-cancel feature is enabled when the --cancel_on_interrupt flag is
> specified at submit time. When this flag is specified, the WS expects to
> receive regular pings from the submitter, which are automatically supplied by
> the associated monitor. If pings are absent for too long (configured in
> ducc.properties via ducc.ws.job.automatic.cancel.minutes), the WS proceeds to
> cancel the job.
> It is possible to run multiple WS instances (aka "shadows"). Since only one
> is pinged, the other(s) will detect a ping timeout and incorrectly cancel the
> job.
> The solution is to have only the "primary" WS act as the monitor+canceler.
> The primary is determined by matching the WS host:port specified in
> ducc.properties against the actual host:port of the running WS. Only if these
> match will the auto-cancel code be enabled in the WS.
> To bypass this problem until the fix is delivered, run only a single WS.
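
The check described above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the actual DUCC code: the class
AutoCancelSketch and the methods isPrimary, pingExpired and shouldCancel are
hypothetical names, the host comparison is assumed to be a case-insensitive
string match, and the host names and port in main are made up. The timeout
argument stands in for the value of ducc.ws.job.automatic.cancel.minutes.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Locale;

// Hypothetical sketch; names are illustrative and not taken from DUCC.
final class AutoCancelSketch {

    // True only when the configured WS host:port equals this instance's
    // actual host:port. Assumption: hosts compare as case-insensitive strings.
    static boolean isPrimary(String configuredHost, int configuredPort,
                             String actualHost, int actualPort) {
        if (configuredHost == null || actualHost == null) {
            return false;
        }
        return configuredPort == actualPort
                && normalize(configuredHost).equals(normalize(actualHost));
    }

    private static String normalize(String host) {
        return host.trim().toLowerCase(Locale.ROOT);
    }

    // True when the most recent ping is older than the allowed interval.
    static boolean pingExpired(Instant lastPing, Instant now, long timeoutMinutes) {
        Duration silence = Duration.between(lastPing, now);
        return silence.compareTo(Duration.ofMinutes(timeoutMinutes)) > 0;
    }

    // A job is cancelled only by the primary WS, only when it was submitted
    // with cancel-on-interrupt, and only after the ping timeout has elapsed.
    static boolean shouldCancel(boolean primary, boolean cancelOnInterrupt,
                                Instant lastPing, Instant now, long timeoutMinutes) {
        return primary && cancelOnInterrupt
                && pingExpired(lastPing, now, timeoutMinutes);
    }

    public static void main(String[] args) {
        Instant lastPing = Instant.parse("2013-02-22T12:00:00Z");
        Instant now = lastPing.plus(Duration.ofMinutes(10));

        // Made-up host names and port, for illustration only.
        boolean primary = isPrimary("ws-host", 42133, "ws-host", 42133);
        boolean shadow = isPrimary("ws-host", 42133, "shadow-host", 42133);

        System.out.println("primary cancels: "
                + shouldCancel(primary, true, lastPing, now, 5));
        System.out.println("shadow cancels:  "
                + shouldCancel(shadow, true, lastPing, now, 5));
    }
}
```

With this guard a shadow never cancels a job, whatever its own view of the
pings, because isPrimary is false for it. A real deployment might also need
to reconcile short host names with fully qualified ones, which this sketch
does not attempt.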