Re: GridEngine module for Hadoop on Demand

Hemanth Yamijala Thu, 07 May 2009 21:52:21 -0700

Daniel Templeton wrote:

Hi,
I have a functioning module for Grid Engine for HoD, but some parts of it are currently hard-coded to my workstation. In cleaning up those elements, I need some advice. Hopefully this is the right forum.
So, in the hodlib/NodePools/torque.py file, there's a runWorkers() method. In that method, it makes a single call to pbsdsh to start the NameNode, DataNodes, JobTracker, and TaskTracker. I know nada about Torque, so please tell me if I'm interpreting this correctly. It would appear that the pbsdsh somehow reads out of the environment how many hodring processes it should start up and executes them remotely, and each hodring then figures out what service it should run.

Roughly right. In Torque, when a set of nodes is assigned to a job, the first node in that list is special (it's called mother superior - MS); the other nodes are called sisters. The job that's submitted to Torque is a HOD process called 'ringmaster'. The ringmaster starts on the MS and invokes runWorkers, which executes pbsdsh. AFAIK, pbsdsh reads the environment and gets a 'nodes' file that Torque writes out. This file contains all the nodes allocated for the job (the sisters and the MS). It executes the command passed to pbsdsh - another HOD process, called hodring - on all of these nodes. The hodring processes work with the ringmaster to decide which service to run. In a sense the ringmaster coordinates which service to start where, and informs each hodring to start that service.
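
To make the node layout concrete, here is a minimal sketch of splitting a Torque-style nodes file into the mother superior and the sisters. It assumes the common layout of one hostname per line, repeated once per allocated processor, with the MS listed first. The function name and the sample hostnames are made up for illustration and are not part of HOD.

```python
def split_nodes(nodefile_text):
    """Return (mother_superior, sisters) from the text of a nodes file.

    Assumption: one hostname per line, possibly repeated per processor,
    with the first entry being the mother superior (MS).
    """
    unique_hosts = []
    for line in nodefile_text.splitlines():
        host = line.strip()
        # Keep the first occurrence of each host, preserving order.
        if host and host not in unique_hosts:
            unique_hosts.append(host)
    if not unique_hosts:
        raise ValueError("nodes file is empty")
    return unique_hosts[0], unique_hosts[1:]


# Hypothetical allocation: three hosts, two of them with two processors.
sample = "node01\nnode01\nnode02\nnode03\nnode03\n"
ms, sisters = split_nodes(sample)
print(ms)
print(sisters)
```

The ringmaster would run on the host returned first, and a hodring would be launched on every host in the file, the MS included.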

In Grid Engine, the rough equivalent of pbsdsh is qrsh. (I think.) With qrsh, the master assigns the HoD job a set of nodes, and I then have to step through that set of nodes and qrsh to each one to start the hodring services. As far as I can tell, the total number of hodring services I need to start is 1 for the NameNode + 1 for the JobTracker + n for the DataNodes + m for the TaskTrackers.
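
The "step through the nodes and qrsh to each one" idea could look roughly like the sketch below. It assumes the job runs inside a Grid Engine parallel environment, that the host list comes from a hostfile whose lines start with a hostname followed by a slot count, and that `qrsh -inherit <host> <command>` is used to start the remote task. The hodring path and its options are placeholders, not the real HOD command line, and one hodring per host is an assumption. The sketch only builds the commands; it does not run them.

```python
def parse_pe_hostfile(text):
    """Parse hostfile text into a list of (hostname, slots) pairs.

    Assumption: each line looks like 'hostname slots queue range'.
    """
    hosts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            hosts.append((fields[0], int(fields[1])))
    return hosts


def qrsh_commands(hosts, hodring_argv):
    """Build one qrsh argv per host (one hodring per host is assumed)."""
    return [["qrsh", "-inherit", host] + list(hodring_argv)
            for host, _slots in hosts]


# Hypothetical hostfile contents and hodring command line.
hostfile = ("node01 2 all.q@node01 UNDEFINED\n"
            "node02 2 all.q@node02 UNDEFINED\n")
hodring = ["/path/to/hodring", "--placeholder-option"]

commands = qrsh_commands(parse_pe_hostfile(hostfile), hodring)
for argv in commands:
    print(" ".join(argv))
```

Each argv list could then be handed to something like `subprocess.Popen`, so that all hodrings start in parallel rather than one after another.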

HOD has a facility to use a HDFS service that's started outside of HOD. In that mode, it does not start NameNode or DataNodes. Also, the number of DataNodes always equals the number of TaskTrackers (if HDFS services are started with HOD).

The thing that I'm not grokking is how the hodrings know what services to start, and how I should be parceling them out across the nodes of the cluster.

This is decided by the ringmaster process. The logic is independent of the resource manager in use, so you need not worry about it when porting to a new resource manager.

Should I be making sure I have two hodrings per node, one for the DataNode and one for the TaskTracker?

No, a single hodring gets to start both the daemons.
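
As a rough illustration of the rule described in this reply (not the actual ringmaster logic), the daemons that a worker hodring ends up starting could be modelled like this. The function name and the `external_hdfs` flag are hypothetical.

```python
def worker_daemons(external_hdfs):
    """Daemons one worker hodring starts, per the description above.

    Assumption: with an externally started HDFS, HOD starts no DataNodes;
    otherwise a single hodring starts both a DataNode and a TaskTracker.
    """
    if external_hdfs:
        return ["TaskTracker"]
    return ["DataNode", "TaskTracker"]


print(worker_daemons(external_hdfs=False))
print(worker_daemons(external_hdfs=True))
```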

If I were to go start a dozen hodrings, one on each of a dozen machines, would they work out among themselves how many should be DataNodes and how many should be TaskTrackers? One more thing. If the above is on the mark, that means you're consuming a queue slot for each DataNode unless you use an external hdfs service. That seems like a waste of cluster resources since slots tend to correspond more to compute resources than I/O. I have to wonder if it wouldn't be more efficient from a cluster perspective to have each hodring start a DataNode and a TaskTracker. It would slightly oversubscribe that job slot, but that may be better than grossly undersubscribing two.

Explained above.

Thanks
Hemanth
