Craig,
While HOD does not do this automatically, please note that since you are bringing up a Map/Reduce cluster on the allocated nodes, you can pass Map/Reduce parameters with which to bring up the cluster at allocation time. The relevant option is --gridservice-mapred.server-params (or -M in shorthand). Please refer to http://hadoop.apache.org/core/docs/r0.19.0/hod_user_guide.html#Options+for+Configuring+Hadoop for details.
I was aware of this, but the issue is that unless you obtain dedicated nodes (as above), this option is not suitable, as it isn't set on a per-node basis. I think it would be /fairly/ straightforward to add to HOD, as I detailed in my initial email, so that it "does the correct thing" out of the box.

True, I did assume you obtained dedicated nodes. It has been considerably simpler to operate HOD in this manner, and if I understand correctly, it would help to meet your requirement as well.
According to hadoop-default.xml, the number of maps is "Typically set to a prime several times greater than number of available hosts." Say that we relax this recommendation to read "Typically set to a NUMBER several times greater than number of available hosts"; it should then be straightforward for HOD to set it automatically?


Actually, AFAIK, the number of maps for a job is determined more or less exclusively by the M/R framework based on the number of input splits. I've seen messages on this list before about how the documentation for this configuration item is misleading. So whatever value is specified might not actually make a difference at all.
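
To make the point about splits concrete, here is a rough Python sketch of how I understand the file-based input formats to size their splits. Treat it as an approximation rather than the actual Hadoop code: the function names, the 64 MB default block size and the 1.1 slack factor are assumptions made for illustration, and other input formats may behave differently.

```python
# Rough model of split counting for file-based input (an assumption-laden
# sketch, not the real Hadoop implementation).

SPLIT_SLOP = 1.1  # assumed slack: the last split may be up to 10% oversized


def compute_split_size(goal_size, min_size, block_size):
    # The requested map count only enters through goal_size; the block
    # size caps the split size from above, min_size from below.
    return max(min_size, min(goal_size, block_size))


def count_splits(file_sizes, requested_maps,
                 block_size=64 * 1024 * 1024, min_size=1):
    """Return the number of splits (hence map tasks) for the given files."""
    total = sum(file_sizes)
    goal_size = total // max(requested_maps, 1)
    split_size = compute_split_size(goal_size, min_size, block_size)

    splits = 0
    for size in file_sizes:
        remaining = size
        # Carve full-sized splits while more than SPLIT_SLOP splits remain.
        while remaining / split_size > SPLIT_SLOP:
            splits += 1
            remaining -= split_size
        # Whatever is left over becomes one final split.
        if remaining > 0:
            splits += 1
    return splits


if __name__ == "__main__":
    one_gib = [1 << 30]
    for requested in (2, 7, 100):
        print(requested, count_splits(one_gib, requested))
```

Under these assumptions, a single 1 GB file with 64 MB blocks yields 16 splits whether 2 or 7 maps are requested; only a request larger than the block count (100, say) changes the result. So the configured value acts as a hint at best, and a prime has no special effect.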

Thanks
Hemanth
