[ 
https://issues.apache.org/jira/browse/QUARKS-8?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193855#comment-15193855
 ] 

ASF GitHub Bot commented on QUARKS-8:
-------------------------------------

GitHub user vdogaru opened a pull request:

    https://github.com/apache/incubator-quarks/pull/9

    QUARKS-8 [WIP] Restart topology on uncaught exception

    [Job registry and sample monitoring](https://github.com/apache/incubator-quarks/commit/6587c5c4c66104d386585349d1e74439c8ca95cd)
    contains:
     * JobRegistryService, which allows clients to register jobs, access
       registered jobs, and get notified when jobs are added, removed, or
       updated. Job event handlers are BiConsumer implementations that take
       an event type and the related Job instance as arguments.
     * JobRegistry, the JobRegistryService implementation used by the ETIAO
       runtime.
     * Sample code demonstrating a system monitoring application that polls
       the registry for the current state of registered jobs.
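
The registry contract described above can be illustrated with a minimal, self-contained sketch. It is not the code from the commit: `SketchJobRegistry`, `SketchJob`, the `EventType` values, and all method names are hypothetical stand-ins chosen for illustration. It only assumes what the description states, namely that listeners are BiConsumer instances receiving an event type and the related job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

/** Hypothetical registry sketch; names are assumptions, not the Quarks API. */
class SketchJobRegistry {
    /** Event kinds matching "added / removed / updated" in the description. */
    enum EventType { ADD, REMOVE, UPDATE }

    /** Minimal stand-in for a job: an id plus a mutable state string. */
    static final class SketchJob {
        final String id;
        volatile String state;
        SketchJob(String id, String state) { this.id = id; this.state = state; }
    }

    private final Map<String, SketchJob> jobs = new ConcurrentHashMap<>();
    private final List<BiConsumer<EventType, SketchJob>> listeners =
            new CopyOnWriteArrayList<>();

    /** Registers a handler that is called for every registry event. */
    void addListener(BiConsumer<EventType, SketchJob> listener) {
        listeners.add(listener);
    }

    /** Registers a job and notifies listeners with ADD. */
    void addJob(SketchJob job) {
        jobs.put(job.id, job);
        fire(EventType.ADD, job);
    }

    /** Changes a registered job's state and notifies listeners with UPDATE. */
    boolean updateJob(String id, String newState) {
        SketchJob job = jobs.get(id);
        if (job == null) return false;
        job.state = newState;
        fire(EventType.UPDATE, job);
        return true;
    }

    /** Removes a job and notifies listeners with REMOVE. */
    boolean removeJob(String id) {
        SketchJob job = jobs.remove(id);
        if (job == null) return false;
        fire(EventType.REMOVE, job);
        return true;
    }

    /** Copy of the registered jobs, for a monitor that polls instead of listening. */
    List<SketchJob> snapshot() {
        return new ArrayList<>(jobs.values());
    }

    private void fire(EventType type, SketchJob job) {
        for (BiConsumer<EventType, SketchJob> l : listeners) {
            l.accept(type, job);
        }
    }
}
```

A polling monitor would call `snapshot()` on a timer and inspect each job's state, while an event-driven monitor would pass a lambda such as `(type, job) -> ...` to `addListener`.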
    
    Does it sound like I am on the right track here? I'd appreciate some early
    feedback.
    
    The following are still to be done:
     * Sample system monitoring application using events rather than polling.
     * System application that restarts closed jobs.
     * Tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vdogaru/incubator-quarks QUARKS-8-vdogaru

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-quarks/pull/9.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9
    
----
commit 6587c5c4c66104d386585349d1e74439c8ca95cd
Author: Victor Dogaru <[email protected]>
Date:   2016-03-14T18:11:12Z

    Job registry and sample monitoring app

----


> Restart topology on uncaught exception
> --------------------------------------
>
>                 Key: QUARKS-8
>                 URL: https://issues.apache.org/jira/browse/QUARKS-8
>             Project: Quarks
>          Issue Type: New Feature
>          Components: Runtime
>            Reporter: Victor Dogaru
>            Assignee: Victor Dogaru
>              Labels: failure-recovery
>
> If a Quarks thread abruptly terminates due to an uncaught exception, the
> runtime shuts down.
> A mechanism is needed to prevent the Quarks application from becoming
> unavailable:
> * Have a monitor service which resubmits the topology in case it shuts down.
> * Restart the failed oplet. In the ETIAO runtime, a source and all downstream
>   oplet invocations (up to an Isolate) execute in the same thread, so the
>   runtime needs to restart the failed path.
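
The first option in the issue, a monitor that resubmits work after an uncaught exception, can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `RestartingSupervisor` and its methods are hypothetical names, the "topology" is modeled as a bare `Runnable`, and the restart limit is an added safeguard that the issue does not mention. It relies only on the standard `Thread.setUncaughtExceptionHandler` hook.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical supervisor sketch; not part of the Quarks runtime. */
class RestartingSupervisor {
    private final Runnable topology;      // stand-in for a submitted topology
    private final int maxRestarts;        // assumed safeguard against restart loops
    private final AtomicInteger restarts = new AtomicInteger();
    private final CountDownLatch done = new CountDownLatch(1);

    RestartingSupervisor(Runnable topology, int maxRestarts) {
        this.topology = topology;
        this.maxRestarts = maxRestarts;
    }

    /** Runs the topology on a new thread; resubmits it if that thread dies. */
    void submit() {
        Thread t = new Thread(() -> {
            topology.run();
            done.countDown();             // reached only on normal completion
        }, "topology-run-" + restarts.get());
        // The handler runs in the dying thread after an uncaught exception.
        t.setUncaughtExceptionHandler((thread, error) -> {
            if (restarts.get() < maxRestarts) {
                restarts.incrementAndGet();
                submit();                 // resubmit on a fresh thread
            } else {
                done.countDown();         // give up after too many failures
            }
        });
        t.start();
    }

    /** Waits until the topology completes or the supervisor gives up. */
    boolean awaitDone(long timeoutMillis) {
        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Number of resubmissions performed so far. */
    int restartCount() {
        return restarts.get();
    }
}
```

This covers whole-topology resubmission only. Restarting just the failed path, as the second option describes, would need cooperation from the runtime that schedules the source and its downstream oplets.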



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
