Bryan A. Pendleton wrote:
I would still like to see some of these site preferences be more
dynamic. For instance, I will soon be using both single-CPU and
dual-CPU machines, with varying amounts of RAM. I'd happily have an
extra job or two scheduled on the dual-CPU machines, to keep them
utilized and take better advantage of the RAM (which mostly serves as
disk cache under my current loads). But there's no way to set a
different tasks.maximum for each node (or a concept of "class of
node") at this point.
Sure there is: a separate config file per node.
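For illustration, here is a minimal sketch of how a per-node override
would surface through the normal config lookup, assuming the node's own
hadoop-site.xml is on its classpath; the property name and the default
of 2 are assumptions, so check hadoop-default.xml for your version:

import org.apache.hadoop.conf.Configuration;

public class NodeTaskLimit {
  public static void main(String[] args) {
    // new Configuration() picks up hadoop-default.xml plus whatever
    // hadoop-site.xml sits on this node's classpath, so a per-node
    // override of the task limit needs no code changes at all.
    Configuration conf = new Configuration();
    // Property name is illustrative; 2 is an assumed default.
    int maxTasks = conf.getInt("mapred.tasktracker.tasks.maximum", 2);
    System.out.println("tasks.maximum on this node: " + maxTasks);
  }
}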
If you'd like to make this automatic, that would be great. We'd need
portable Java code to detect the amount of memory and the number of
CPUs. Perhaps this could be done by running some shell commands and
parsing their output, relying on cygwin for Windows support?
Owen's recent benchmark posting showed that machines with a 5x
performance variation were used effectively during the map phase, but
that slow machines still hurt reduce performance. He's filed a bug and
will likely fix it (if past experience is any guide):
http://issues.apache.org/jira/browse/HADOOP-253
Adapting to variability of resources is still a big problem across
Hadoop. Performance still drops off very rapidly in many cases if you
have a weak node: there's no speculative reduce execution, there are
bugs in speculative map execution, and filled-up disk space is handled
badly during DFS writes as well as MapOutputFile writes. In fact,
anything that calls "getLocalPath" gets spread uniformly across the
available drives with no "full" check, so filling up any one drive in
the entire cluster can cause all kinds of things to fail.
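One cheap mitigation would be to have the local-path selection skip
directories whose volume looks nearly full. A minimal sketch of such a
check, shelling out to "df -k": the 1 GB floor and the class and method
names are made up, and this is not what getLocalPath does today:

import java.io.BufferedReader;
import java.io.File;
import java.io.InputStreamReader;

public class LocalDirChooser {
  private static final long MIN_FREE_KB = 1024L * 1024L; // assumed 1 GB floor

  // Free space in kB on the volume holding 'dir', parsed from "df -k";
  // returns -1 if df cannot be run or its output cannot be parsed.
  static long freeKb(File dir) {
    try {
      Process p = Runtime.getRuntime().exec(
          new String[] { "df", "-k", dir.getAbsolutePath() });
      BufferedReader in =
          new BufferedReader(new InputStreamReader(p.getInputStream()));
      in.readLine();                       // skip the header line
      String[] f = in.readLine().trim().split("\\s+");
      return Long.parseLong(f[3]);         // the "Available" column
    } catch (Exception e) {
      return -1;
    }
  }

  // Round-robin over the configured local dirs, skipping any that look full.
  static File chooseDir(File[] localDirs, int hint) {
    for (int i = 0; i < localDirs.length; i++) {
      File candidate = localDirs[(hint + i) % localDirs.length];
      if (freeKb(candidate) > MIN_FREE_KB) {
        return candidate;
      }
    }
    return null; // every drive is nearly full: fail explicitly, not mysteriously
  }
}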
Sounds like a good list of things to work on. Want to take on solving
any of these? They won't fix themselves...
Doug