HOD refactoring to ease integration with scheduler/resource managers other than 
torque
--------------------------------------------------------------------------------------

                 Key: HADOOP-5441
                 URL: https://issues.apache.org/jira/browse/HADOOP-5441
             Project: Hadoop Core
          Issue Type: Improvement
          Components: contrib/hod
         Environment: All
            Reporter: Nate Woody


Situation: HOD currently uses the pbsdsh command (a distributed shell that 
starts remote processes via Torque's TM interface) to start processes on all 
nodes in the job.  This call is provided as part of a torqueInterface class 
that is meant to abstract interactions with the Torque resource manager (RM).  
However, other RMs typically do not provide this functionality; remote start 
is instead usually performed by a distributed command available on the HPC 
system, such as mpiexec, ssh, or site-specific scripts.  Because pbsdsh is 
specific to Torque, writing HOD interfaces to other RMs is somewhat difficult: 
the implementer is forced to choose the remote start method on a per-RM basis, 
even though that choice depends on the site rather than on the RM.

Proposal: Refactor the torqueInterface and nodePool classes so that the choice 
of remote start method is available as a configuration option in hodrc.  This 
involves fairly simple changes: removing the pbsdsh command from the Scheduler 
class and adding a configuration step that starts the appropriate remote start 
wrapper.  The selection of the nodePool class will be altered to allow dynamic 
loading of classes, so that new interfaces people choose to write will not 
require altering HOD code.  Provide remote start classes for pbsdsh, mpiexec, 
and ssh, as well as for custom scripts (sites often provide mpiexec wrappers 
that ensure proper selection of network interfaces, etc.).  Provide interface 
classes for SGE and Moab, as well as an updated Torque class.
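
A minimal sketch of how the proposal could look, to make the idea concrete.  
Everything here is hypothetical and not taken from the HOD code base: the 
class names (RemoteStart, PbsdshStart, SshStart), the helper functions, and 
the idea that hodrc would supply either a short built-in name or a dotted 
class path.  It is written for current Python rather than the Python version 
HOD targets.  Each wrapper only builds the command lines; it does not run 
them.

```python
import importlib
import shlex


class RemoteStart:
    """Hypothetical base class: builds command lines that start a process on the job's nodes."""

    def build(self, command, nodes):
        raise NotImplementedError


class PbsdshStart(RemoteStart):
    def build(self, command, nodes):
        # pbsdsh takes the program and its arguments and runs them on the
        # nodes allocated to the Torque job, so one invocation is enough.
        return [["pbsdsh"] + shlex.split(command)]


class SshStart(RemoteStart):
    def build(self, command, nodes):
        # One ssh invocation per node; the remote shell parses the command string.
        return [["ssh", node, command] for node in nodes]


# Short names a site could put in its configuration (assumed option values).
BUILTIN = {"pbsdsh": PbsdshStart, "ssh": SshStart}


def load_class(dotted_name):
    """Load a class from a dotted path such as 'mysite.hod.MpiexecStart'."""
    module_name, _, class_name = dotted_name.rpartition(".")
    if not module_name:
        raise ValueError("expected 'module.ClassName', got %r" % dotted_name)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def get_remote_start(name):
    """Return a wrapper instance for a built-in name or a dotted class path."""
    cls = BUILTIN.get(name)
    if cls is None:
        cls = load_class(name)
    return cls()
```

With this shape, a site that needs a custom mpiexec wrapper would write its 
own RemoteStart subclass in its own module and name it in the configuration, 
without touching HOD itself.  The same load_class approach could select the 
nodePool class for a given RM.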
