We have an existing HPC cluster that runs Torque + Moab.

There have been requests for a persistent HDFS on a subset of the nodes (~20 of the 800).

We can use Moab to force HOD jobs to run only on those 20 nodes. That way the Hadoop map-reduce workers should land on nodes that also hold HDFS data locally.

The question, then: when work is given to an HOD instance, will Hadoop move the computation to the nodes closest to the data on HDFS?

I know Hadoop does this when it runs the whole cluster itself, but what does HOD do with an external HDFS, even when the HOD nodes overlap some (maybe not all) of the HDFS nodes?
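
To make the overlap case concrete, here is a toy sketch in Python of what I mean. It is not Hadoop's scheduler, only a model of the question: given which hosts hold each block's replicas and which hosts an HOD allocation received, what fraction of blocks could be read locally at all? The host names, block IDs, and the function local_fraction are made up for illustration.

```python
def local_fraction(block_replicas, worker_hosts):
    """Fraction of blocks that have at least one replica on a worker host.

    block_replicas: dict mapping a block ID to the list of hosts holding it.
    worker_hosts:   hosts that the HOD allocation placed workers on.
    """
    workers = set(worker_hosts)
    if not block_replicas:
        return 0.0
    # A block counts as "potentially local" if any replica host is a worker.
    local = sum(1 for hosts in block_replicas.values() if workers & set(hosts))
    return local / len(block_replicas)


# Hypothetical layout: 4 blocks, 3 replicas each, on made-up host names.
block_replicas = {
    "blk_1": ["node01", "node02", "node03"],
    "blk_2": ["node02", "node04", "node05"],
    "blk_3": ["node04", "node05", "node06"],
    "blk_4": ["node01", "node05", "node06"],
}

# Hypothetical HOD allocation that overlaps only part of the HDFS nodes.
hod_workers = ["node01", "node02", "node03"]

print(local_fraction(block_replicas, hod_workers))  # 0.75
```

In this made-up layout, blk_3 has no replica on any allocated worker, so it could only be read over the network. What I want to know is whether HOD-provisioned workers take advantage of the local replicas for the other blocks.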

Rack locality won't matter for now; the 20 nodes will all be on the same blade.

Our goal is to keep running our normal HPC workload while providing an HDFS that sticks around and delivers decent performance relative to normal Hadoop clusters.



Brock Palen
bro...@mlds-networks.com
www.mlds-networks.com
MLDS Owner Senior Tech.

