johnyangk opened a new pull request #71: [NEMO-55] Handle NCS 
Master-to-Executor RPC failures
URL: https://github.com/apache/incubator-nemo/pull/71
 
 
   JIRA: [NEMO-55: Handle NCS Master-to-Executor RPC 
failures](https://issues.apache.org/jira/projects/NEMO/issues/NEMO-55)
   
   **Major changes:**
   - Ignores NCS RPC failures, on the assumption that executor failures are 
handled by the FailedEvaluator event
   - Introduces the concept of 'poisoned' resources for integration tests
   - Improves the scheduling logic in the master and the exception-handling 
logic in the data plane so that the added integration test passes
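
A minimal sketch of the "ignore the RPC failure" idea from the first bullet, not the code in this PR. `BestEffortSender`, its `transport` callback, and the `ignoredFailures` list are hypothetical names used only for illustration; the assumption is that a separate failure event (FailedEvaluator in the PR) is what actually triggers recovery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: a master-side sender that tolerates RPC failures. */
final class BestEffortSender {
  // Stand-in for the real RPC layer; assumed to throw when the executor is unreachable.
  private final Consumer<String> transport;
  // Executors whose send failed; kept only so the behavior can be observed.
  private final List<String> ignoredFailures = new ArrayList<>();

  BestEffortSender(final Consumer<String> transport) {
    this.transport = transport;
  }

  /**
   * Sends a control message to an executor.
   * @return true if the send succeeded, false if the failure was ignored.
   */
  boolean send(final String executorId, final String message) {
    try {
      transport.accept(executorId + ":" + message);
      return true;
    } catch (final RuntimeException e) {
      // Deliberately not rethrown: recovery is assumed to be driven by a
      // separate executor-failure event, not by this send path.
      ignoredFailures.add(executorId);
      return false;
    }
  }

  List<String> getIgnoredFailures() {
    return ignoredFailures;
  }
}
```

The design choice illustrated here is that the send path never fails the job on its own; it records the failure and leaves recovery to a single place.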
   
   **Minor changes to note:**
   - Reorders some methods to group similar methods together
   - Prettier logs and more helpful comments
   
   **Tests for the changes:**
   - AlternatingLeastSquareITCase#testPadoWithPoison: Fails the TRANSIENT 
resource every 1-3 seconds. On my Mac, the resource fails and is reacquired 
about 3 to 6 times before the job completes and the test passes.
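
One way to picture the "every 1-3 seconds" poison interval is the sketch below. It is an assumption about how such a delay could be drawn, not the test's actual implementation; `PoisonSchedule` and `nextFailureDelayMs` are hypothetical names.

```java
import java.util.Random;

/** Hypothetical sketch: picks the delay before a poisoned resource fails next. */
final class PoisonSchedule {
  private static final long MIN_DELAY_MS = 1000; // 1 second
  private static final long MAX_DELAY_MS = 3000; // 3 seconds
  private final Random random;

  PoisonSchedule(final long seed) {
    // A fixed seed keeps a test run reproducible.
    this.random = new Random(seed);
  }

  /** Returns a delay in milliseconds, uniformly chosen in [1000, 3000]. */
  long nextFailureDelayMs() {
    final int span = (int) (MAX_DELAY_MS - MIN_DELAY_MS + 1);
    return MIN_DELAY_MS + random.nextInt(span);
  }
}
```

A test harness could call `nextFailureDelayMs()` after each reacquisition to decide when to fail the resource again.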
   
   **Other comments:**
   - https://issues.apache.org/jira/browse/NEMO-140 is filed for more general 
handling of RPCs
   - Will file issues soon for refactoring the data plane and making it 
easier to see how exceptions are handled
   
   resolves 
[NEMO-55](https://issues.apache.org/jira/projects/NEMO/issues/NEMO-55)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
