[ https://issues.apache.org/jira/browse/HAMA-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219126#comment-13219126 ]
Suraj Menon commented on HAMA-498:
----------------------------------
Oops! I got caught disabling the RAT check in my pom.xml. The same goes for my
indifference to warnings. Sorry :)
There is a reason why I had the Future.get in the last test case and not in the
first three. I felt that the whole point of the implementation was that a
minimum number of pings should arrive if the task has run for a particular
period of time. The test case could fail when the task takes too long to
start. If that happens repeatedly, then I think the test cases should fail and
we should reconsider the leeway given to each task to start. For the last test
case, I had to ensure that I got the first ping, then kill the RPC connection,
and then wait for the process to die to get its exit status. I had a
deterministic sequence of events to wait for.
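To make the sequence for the last test case concrete, here is a rough sketch of
the idea, not the code in the patch. FakeParent, task() and runSequence() are
hypothetical names made up for illustration, and a plain in-process flag stands
in for the real RPC connection and child process.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class PingSequenceSketch {

    /** Hypothetical stand-in for the parent end of the RPC connection. */
    static class FakeParent {
        final AtomicInteger pings = new AtomicInteger();
        final CountDownLatch firstPing = new CountDownLatch(1);
        final AtomicBoolean open = new AtomicBoolean(true);

        /** Returns false once the connection has been "killed". */
        boolean ping() {
            if (!open.get()) {
                return false;
            }
            pings.incrementAndGet();
            firstPing.countDown();
            return true;
        }
    }

    /** Hypothetical task: pings every intervalMs and exits with status 1 when a ping fails. */
    static Callable<Integer> task(FakeParent parent, long intervalMs) {
        return () -> {
            while (parent.ping()) {
                Thread.sleep(intervalMs);
            }
            return 1;
        };
    }

    /** Waits for the first ping, closes the connection, then waits for the exit status. */
    public static int runSequence() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            FakeParent parent = new FakeParent();
            Future<Integer> exit = pool.submit(task(parent, 10));

            // Step 1: block until the first ping has arrived.
            if (!parent.firstPing.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("task never pinged");
            }
            // Step 2: simulate killing the RPC connection.
            parent.open.set(false);
            // Step 3: Future.get waits for the task to die and yields its exit status.
            return exit.get(5, TimeUnit.SECONDS);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("exit status: " + runSequence());
    }
}
```

Each step waits on an event rather than on a fixed sleep, so a slow task start
only delays the test and does not change its outcome.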
I had discussed the enigma of port 40000 with you. :) I thought I had gotten
over it. I am not running BSPMaster for any of these test cases. This should be
happening only for the last test case, where the server closes the connection
before the proxy. I shall check and find a fix.
I ran into some issues when I used the LocalBSPRunner.LocalSyncClient class.
I shall look into it.
> BSPTask should periodically ping its parent.
> --------------------------------------------
>
> Key: HAMA-498
> URL: https://issues.apache.org/jira/browse/HAMA-498
> Project: Hama
> Issue Type: Sub-task
> Components: bsp
> Affects Versions: 0.4.0
> Reporter: Edward J. Yoon
> Assignee: Suraj Menon
> Labels: newbie
> Fix For: 0.5.0
>
> Attachments: HAMA-498.patch
>
>
> As described in http://wiki.apache.org/hama/GroomServerFaultTolerance
> BSPTask should periodically ping its parent 'GroomServer' for their health
> status.
> 1. If Tasks are unable to ping their parent 'GroomServer', they should
> kill themselves.
> 2. And, if GroomServer does not receive a ping from a child, GroomServer
> should check whether that child is running.
> You don't need to implement recovery logic in this issue.
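
Point 2 of the description above could be sketched roughly as follows. This is
only an illustration under assumptions, not Hama's GroomServer code: the class
PingMonitorSketch, its methods, and the timeout value are all hypothetical, and
timestamps are passed in explicitly so the logic can be checked without real
clocks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical parent-side bookkeeping of the last ping seen per child task. */
public class PingMonitorSketch {
    private final Map<String, Long> lastPing = new ConcurrentHashMap<>();
    private final long timeoutMs;

    public PingMonitorSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Called whenever a child task pings; nowMs is the caller's clock reading. */
    public void recordPing(String taskId, long nowMs) {
        lastPing.put(taskId, nowMs);
    }

    /** Returns, in sorted order, the children whose last ping is older than the timeout. */
    public List<String> overdue(long nowMs) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastPing.entrySet()) {
            if (nowMs - e.getValue() > timeoutMs) {
                result.add(e.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        PingMonitorSketch monitor = new PingMonitorSketch(1000);
        monitor.recordPing("task_a", 0);
        monitor.recordPing("task_b", 900);
        // These are the children the parent would check for liveness.
        System.out.println(monitor.overdue(1500));
    }
}
```

A child that shows up in overdue() is only a candidate for a liveness check;
what to do once it is found dead is the recovery logic that this issue leaves
out.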
--
This message is automatically generated by JIRA.