[ 
https://issues.apache.org/jira/browse/HAMA-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212568#comment-13212568
 ] 

Suraj Menon commented on HAMA-498:
----------------------------------

I have two questions:
1. What is a reasonable ping period here? I am currently setting it to
1 sec. Is that too high or too low?
2. BSPPeerChild waits for the completion of the task. Would we get rid of
that once we have this feature? If not, how does pinging help? Say the main
logic of BSPTask (or BSPTaskRunner) hangs while the pinging thread in BSPTask
stays active: the GroomServer would still receive pings from a task that is
effectively dead. The current code excerpt looks like this:

private static class PingGroomServer implements Runnable{
 
    private BSPPeerProtocol pingRPC;
    private TaskAttemptID taskId;
 
    public PingGroomServer(BSPPeerProtocol umbilical, TaskAttemptID id){
      pingRPC = umbilical;
      taskId = id;
    }
 
    @Override
    public void run() {
 
      try {
        // getTime() yields a readable Date; Calendar.toString() dumps every field.
        LOG.debug("Pinging at time " + Calendar.getInstance().getTime());
        pingRPC.ping(taskId);
      } catch (IOException e) {
        LOG.error(
            new StringBuilder("IOException pinging GroomServer from task - ")
            .append(taskId), e);
        //System.exit(1);
      }
      catch (Exception e){
        LOG.error(
            new StringBuilder("Exception pinging GroomServer from task - ")
            .append(taskId), e);
        //System.exit(1);
      }
 
    }
  }
...// body of BSPTask ..
 
this.pingService = Executors.newScheduledThreadPool(1);
 
 
private void startPingingGroom(BSPJob job, BSPPeerProtocol umbilical){
    LOG.debug("Scheduling ping service");
    // Ping twice per configured period.
    long pingPeriod = job.getConf().getLong(Constants.GROOM_PING_PERIOD,
        Constants.DEFAULT_GROOM_PING_PERIOD)/2;
    LOG.debug("Scheduling with fixed delay for bsp task " + taskId);
    try{
      if(pingPeriod > 0){
        pingService.scheduleWithFixedDelay(    
            new PingGroomServer(umbilical, taskId),
            0, pingPeriod,TimeUnit.MILLISECONDS);
      }
    }
    catch(Exception e){
      LOG.error("Error scheduling ping service", e);
    }
   
    LOG.debug("Scheduled ping service");
  }
 
  private void stopPingingGroom(){
    if(pingService != null){
      pingService.shutdownNow();
    }
  }
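
One possible way to address question 2, sketched under assumptions: the
pinger only pings while the main logic has reported progress recently, so a
hung task goes silent and the parent can notice. It also gives up after a
number of consecutive failed pings, where the excerpt has System.exit(1)
commented out. ProgressAwarePinger, Pinger, reportProgress and onGiveUp are
hypothetical names for illustration and are not part of Hama; Pinger stands
in for the ping call on BSPPeerProtocol.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical sketch: a ping task that goes silent when the main logic stalls. */
final class ProgressAwarePinger implements Runnable {

  /** Hypothetical stand-in for the ping RPC to the parent. */
  interface Pinger {
    void ping() throws IOException;
  }

  private final Pinger pinger;
  private final LongSupplier clock;      // injected so the logic is testable
  private final long maxIdleMillis;      // longest allowed gap between progress reports
  private final int maxFailures;         // consecutive failed pings before giving up
  private final Runnable onGiveUp;       // e.g. a callback that shuts the task down
  private final AtomicLong lastProgress;
  private final AtomicInteger failures = new AtomicInteger();

  ProgressAwarePinger(Pinger pinger, LongSupplier clock, long maxIdleMillis,
      int maxFailures, Runnable onGiveUp) {
    this.pinger = pinger;
    this.clock = clock;
    this.maxIdleMillis = maxIdleMillis;
    this.maxFailures = maxFailures;
    this.onGiveUp = onGiveUp;
    this.lastProgress = new AtomicLong(clock.getAsLong());
  }

  /** Called by the main task logic whenever it completes a unit of work. */
  void reportProgress() {
    lastProgress.set(clock.getAsLong());
  }

  @Override
  public void run() {
    // Main logic looks hung: skip the ping so the parent sees the silence.
    if (clock.getAsLong() - lastProgress.get() > maxIdleMillis) {
      return;
    }
    try {
      pinger.ping();
      failures.set(0);
    } catch (Exception e) {
      // Catch everything: an uncaught exception would cancel a scheduled task.
      if (failures.incrementAndGet() == maxFailures) {
        onGiveUp.run();
      }
    }
  }
}
```

An instance could be scheduled the same way as PingGroomServer above, with
pingService.scheduleWithFixedDelay(pinger, 0, pingPeriod,
TimeUnit.MILLISECONDS) and System::currentTimeMillis as the clock, while the
task's main loop calls reportProgress() after each unit of work.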

                
> BSPTask should periodically ping its parent.
> --------------------------------------------
>
>                 Key: HAMA-498
>                 URL: https://issues.apache.org/jira/browse/HAMA-498
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp
>    Affects Versions: 0.4.0
>            Reporter: Edward J. Yoon
>              Labels: newbie
>             Fix For: 0.5.0
>
>
> As described in http://wiki.apache.org/hama/GroomServerFaultTolerance
> BSPTask should periodically ping its parent 'GroomServer' to report its
> health status.
> 1. If a task is unable to ping its parent 'GroomServer', it should kill
> itself.
> 2. If the GroomServer does not receive a ping from a child, the GroomServer
> should check whether that child is still running.
> You don't need to implement recovery logic in this issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
