[
https://issues.apache.org/jira/browse/QUARKS-8?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193998#comment-15193998
]
ASF GitHub Bot commented on QUARKS-8:
-------------------------------------
Github user vdogaru commented on a diff in the pull request:
https://github.com/apache/incubator-quarks/pull/9#discussion_r56062002
--- Diff: runtime/etiao/src/main/java/quarks/runtime/etiao/EtiaoJob.java ---
@@ -51,6 +53,10 @@ public EtiaoJob(DirectGraph graph, String topologyName,
ServiceContainer contain
ControlService cs = container.getService(ControlService.class);
if (cs != null)
cs.registerControl(JobMXBean.TYPE, getId(), getName(),
JobMXBean.class, new EtiaoJobBean(this));
+
+ this.jobs = (JobRegistry) container.getService(JobRegistryService.class);
--- End diff --
Thanks for catching this. My initial thinking was that JobRegistryService
exposes restricted functionality to application code (query jobs, subscribe to
job changes, and remove jobs), while **adding** jobs to the registry is the
responsibility of the runtime, which calls the service implementation directly.
A job is registered in the CONSTRUCTED state when a DirectTopology is created,
and it changes to RUNNING once the topology is successfully submitted. Allowing
application code to register jobs might clash with the current implementation
if an application adds a job that the runtime has already registered.
The JobRegistryService javadocs currently mention "registration" of jobs as
well, but the interface has no add() method, which is confusing. Which of the
following is preferable?
1. Restrict JobRegistryService to job query / removal (I'll update the
javadocs accordingly).
2. Add registration to JobRegistryService.
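
A minimal sketch of what option 1 might look like, assuming hypothetical names (`JobRegistryView`, `RuntimeJobRegistry`, `JobState`, `addJob`, `markRunning`) that are not taken from the Quarks code base. Application code would only see the restricted interface, while the runtime holds the implementation and is the only caller of the add/update methods. Subscription to job changes is left out to keep the example short.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// All names below are illustrative assumptions, not the actual Quarks API.
enum JobState { CONSTRUCTED, RUNNING }

/** Restricted view handed to application code: query and removal only. */
interface JobRegistryView {
    Set<String> getJobIds();
    JobState getState(String jobId);   // null if the job is not registered
    boolean removeJob(String jobId);   // true if a job was removed
}

/** Runtime-side implementation; only the runtime calls addJob/markRunning. */
final class RuntimeJobRegistry implements JobRegistryView {
    private final Map<String, JobState> jobs = new ConcurrentHashMap<>();

    /** Registers a job as CONSTRUCTED; returns false if the id already exists. */
    boolean addJob(String jobId) {
        return jobs.putIfAbsent(jobId, JobState.CONSTRUCTED) == null;
    }

    /** Moves a job from CONSTRUCTED to RUNNING; returns false otherwise. */
    boolean markRunning(String jobId) {
        return jobs.replace(jobId, JobState.CONSTRUCTED, JobState.RUNNING);
    }

    @Override
    public Set<String> getJobIds() {
        return Collections.unmodifiableSet(jobs.keySet());
    }

    @Override
    public JobState getState(String jobId) {
        return jobs.get(jobId);
    }

    @Override
    public boolean removeJob(String jobId) {
        return jobs.remove(jobId) != null;
    }
}
```

Because `addJob` rejects an id that is already present, a duplicate registration is detected instead of silently overwriting the state set by the runtime, which is the clash mentioned above.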
> Restart topology on uncaught exception
> --------------------------------------
>
> Key: QUARKS-8
> URL: https://issues.apache.org/jira/browse/QUARKS-8
> Project: Quarks
> Issue Type: New Feature
> Components: Runtime
> Reporter: Victor Dogaru
> Assignee: Victor Dogaru
> Labels: failure-recovery
>
> If a Quarks thread abruptly terminates due to an uncaught exception, the
> runtime shuts down.
> A mechanism is needed to prevent the Quarks application from becoming
> unavailable:
> * Have a monitor service which resubmits the topology in case it shuts down.
> * Restart the failed oplet. In the ETIAO runtime, a source and all downstream
> oplet invocations (up to an Isolate) will execute in the same thread, so the
> runtime needs to restart the failed path.