[ https://issues.apache.org/jira/browse/QUARKS-8?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193855#comment-15193855 ]
ASF GitHub Bot commented on QUARKS-8:
-------------------------------------
GitHub user vdogaru opened a pull request:
https://github.com/apache/incubator-quarks/pull/9
QUARKS-8 [WIP] Restart topology on uncaught exception
[Job registry and sample monitoring](https://github.com/apache/incubator-quarks/commit/6587c5c4c66104d386585349d1e74439c8ca95cd)
contains:
* JobRegistryService, which allows clients to register jobs, access
registered jobs, and get notified when jobs are added, removed, or updated.
Job event handlers are BiConsumer implementations that take an event type and
the related Job instance as arguments.
* JobRegistry, the JobRegistryService implementation used by the ETIAO runtime.
* Sample code demonstrating a system monitoring application that polls
the registry for the current state of registered jobs.
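
To make the listener contract concrete, here is a minimal, self-contained sketch of a registry that notifies BiConsumer handlers with an event type and the affected job. Every name in it (`JobRegistrySketch`, `Registry`, `EventType`, the simplified `Job` class, and the method names) is an assumption made for illustration and is not taken from the commit; the actual interfaces in the pull request may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

// Hypothetical sketch: names and signatures are illustrative, not the PR's API.
class JobRegistrySketch {
    // Assumed event types, mirroring "added / removed / updated".
    enum EventType { ADD, REMOVE, UPDATE }

    // Simplified stand-in for a runtime Job: an id plus a mutable state string.
    static final class Job {
        final String id;
        volatile String state;
        Job(String id, String state) { this.id = id; this.state = state; }
    }

    // Registry that stores jobs by id and notifies listeners on every change.
    static final class Registry {
        private final Map<String, Job> jobs = new ConcurrentHashMap<>();
        private final List<BiConsumer<EventType, Job>> listeners =
                new CopyOnWriteArrayList<>();

        void addListener(BiConsumer<EventType, Job> listener) {
            listeners.add(listener);
        }

        void addJob(Job job) {
            jobs.put(job.id, job);
            notifyListeners(EventType.ADD, job);
        }

        boolean removeJob(String id) {
            Job job = jobs.remove(id);
            if (job == null) return false;
            notifyListeners(EventType.REMOVE, job);
            return true;
        }

        boolean updateJob(String id, String newState) {
            Job job = jobs.get(id);
            if (job == null) return false;
            job.state = newState;
            notifyListeners(EventType.UPDATE, job);
            return true;
        }

        // Polling-style access: returns null when the id is not registered.
        Job getJob(String id) { return jobs.get(id); }

        private void notifyListeners(EventType type, Job job) {
            for (BiConsumer<EventType, Job> listener : listeners) {
                listener.accept(type, job);
            }
        }
    }

    // Registers one listener, then adds, updates, and removes a job.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Registry registry = new Registry();
        registry.addListener((type, job) ->
                log.add(type + " " + job.id + " " + job.state));
        registry.addJob(new Job("job-1", "RUNNING"));
        registry.updateJob("job-1", "CLOSED");
        registry.removeJob("job-1");
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this sketch a monitoring application can either poll with `getJob` or react to events through `addListener`; a handler that sees an `UPDATE` event for a closed job would be a natural place to hook in a restart.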
Does it sound like I am on the right track here? I'd appreciate some early
feedback.
The following are still to be done:
* Sample system monitoring application using events rather than polling
* System application that restarts closed jobs
* Tests
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vdogaru/incubator-quarks QUARKS-8-vdogaru
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-quarks/pull/9.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9
----
commit 6587c5c4c66104d386585349d1e74439c8ca95cd
Author: Victor Dogaru <[email protected]>
Date: 2016-03-14T18:11:12Z
Job registry and sample monitoring app
----
> Restart topology on uncaught exception
> --------------------------------------
>
> Key: QUARKS-8
> URL: https://issues.apache.org/jira/browse/QUARKS-8
> Project: Quarks
> Issue Type: New Feature
> Components: Runtime
> Reporter: Victor Dogaru
> Assignee: Victor Dogaru
> Labels: failure-recovery
>
> If a Quarks thread abruptly terminates due to an uncaught exception, the
> runtime shuts down.
> A mechanism is needed to prevent the Quarks application from becoming
> unavailable:
> * Have a monitor service that resubmits the topology if it shuts down.
> * Restart the failed oplet. In the ETIAO runtime, a source and all downstream
> oplet invocations (up to an Isolate) execute in the same thread, so the
> runtime needs to restart the whole failed path.