Hello all: I'm the administrator of an existing, shared HPC cluster. The majority of this cluster's users run MPI jobs and do not wish to switch to hadoop or other systems; however, one of my users wishes to run hadoop jobs on this cluster.
In order to accommodate this variety of users, our cluster operates under the principle that ALL nodes are scheduled via the resource manager (torque+maui). No compute node may be used without being requested through the scheduler, and that applies to anyone wishing to run hadoop processes on this cluster as well. To that end, I found HOD, which appears to have been designed to do just what I need: request nodes via the scheduler, set up a hadoop "cluster" on those nodes, and then exit cleanly when done, returning the nodes to the available pool. Unfortunately, I've been having problems getting it to work. For testing purposes, I'm setting everything up in my home directory; once I have a better grasp of what's required to make it work, I'll set it up for the other users.

The issue I'm presently running into involves how resources are requested from the resource manager. My cluster consists of 24 8-core nodes and 6 16-core nodes. Normally, when an MPI user requests 8 nodes, they're very likely to get one physical node with all 8 cores allocated to that user/job. There are additional constraints a user can attach when requesting resources. For example, they can request nodes=8:ppn=8, which in essence says, "give me 8 nodes, each with 8 cores on that node." Nodes can also be called out by name.

Presently, hod simply takes the -n specified on the command line (number of nodes) and requests that many from torque. So if I request fewer than 8 nodes, all of them end up on the same physical node, and the process of setting up a hadoop cluster blows up ("file already exists" errors). If I request more than 8 nodes, then I end up with 2 physical nodes, and the startup appears to succeed (with mapreduce and hdfs running on separate physical nodes). Unfortunately, hod is still trying to start that number of hadoop processes. When I then go to use that hadoop cluster, I can't communicate with it (java exceptions when trying to connect to the IP:port of the ringmaster). I logged into that node and checked with netstat, and sure enough, nothing was listening on that port. I also checked the port that the startup process printed for the http status page, and nothing was listening on that port either. I'm guessing that the multiple hadoop processes on the same machine are still stepping on each other somehow.

So I've been trying to trick/force hod into requesting resources from the scheduler in such a way that each hadoop node lands on a separate physical node. Unfortunately, I've been unable to do so, as hod appears to overwrite any attempt to override the "nodes" resource specification. For example, I've tried:

    hod allocate -d /home/kusznir/hadoop-test-1 -n 4 --resource_manager.options='l:nodes=compute-0-15+compute-0-16+compute-0-17+compute-0-18'
    hod allocate -d /home/kusznir/hadoop-test-1 -n 4 --resource_manager.options='l:nodes=32'
    hod allocate -d /home/kusznir/hadoop-test-1 -n 4 --resource_manager.options='l:nodes=4:ppn=8'

All of these fail (and hod refuses to even accept the last one). So, how exactly does one make hod work on a cluster with multi-core nodes?

--Jim
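P.S. To make the goal concrete: if I were submitting this by hand instead of through hod, I'd write something like the following (the script name is just an illustration):

    # request 4 whole physical nodes, 8 cores on each
    qsub -l nodes=4:ppn=8 test-job.sh

    # inside the job, $PBS_NODEFILE lists each host once per core,
    # so the unique host list is what I'd want hod to hand to hadoop:
    sort -u $PBS_NODEFILE

That de-duplicated host list, one hadoop daemon per physical node, is the behavior I'm trying to get out of hod's allocation step.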
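P.P.S. One hack I've been considering but have NOT tested: pointing hod at a fake torque install whose qsub is a wrapper that forces the ppn constraint before calling the real qsub. Something like the sketch below, assuming hod locates qsub via the resource_manager.batch-home setting, that it passes "-l" and "nodes=N" as separate arguments, and that the real qsub lives in /usr/local/bin (all of which I'd need to verify on my system):

    #!/bin/sh
    # Hypothetical qsub wrapper: rewrite any "nodes=N" argument that
    # hod generates into "nodes=N:ppn=8", then hand off to the real qsub.
    args=""
    for a in "$@"; do
        case "$a" in
            nodes=*) a="${a}:ppn=8" ;;   # force whole-node allocation
        esac
        args="$args '$a'"
    done
    eval exec /usr/local/bin/qsub $args

I don't know whether the rest of hod's allocation logic would survive getting back more processors than it asked for, though, so I'd much rather learn the supported way to do this.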