On 2/12/10 1:42 PM, "himanshu chandola" <[email protected]> wrote:
> So my question is whether it's possible for hadoop to select or for us to be
> able to notify hadoop of the nodes which have larger disk space so that it
> doesn't waste time on nodes with low disk space.
The only way I know of is for you to build a custom scheduler that takes
free disk space into consideration.
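
To make the idea concrete, here is a rough sketch of the selection logic such a scheduler would need. This is plain Python, not Hadoop's scheduler API; the `Node` class, the `eligible_nodes` function, the node names, and the sizes are all made up for illustration. A real scheduler would have to get the free-space figures from the cluster itself.

```python
from dataclasses import dataclass

GB = 1024 ** 3


@dataclass
class Node:
    """Hypothetical record for one worker node (not a Hadoop class)."""
    name: str
    free_bytes: int


def eligible_nodes(nodes, required_bytes):
    """Return the nodes with at least required_bytes free, most free space first."""
    fit = [n for n in nodes if n.free_bytes >= required_bytes]
    return sorted(fit, key=lambda n: n.free_bytes, reverse=True)


# Made-up cluster: one small node and two big ones.
cluster = [
    Node("small-1", 20 * GB),
    Node("big-1", 500 * GB),
    Node("big-2", 300 * GB),
]

chosen = eligible_nodes(cluster, 100 * GB)
print([n.name for n in chosen])  # ['big-1', 'big-2']
```

The point is only that the scheduler skips nodes below some space threshold before handing out tasks. How the threshold is chosen per job is left open here.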
Another possibility is to run two JobTrackers, one managing the big nodes and
the other the small nodes. Then submit each job to the appropriate JobTracker.
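
If you go the two-JobTracker route, the routing decision can live in whatever script submits your jobs. A minimal sketch, assuming you can estimate a job's disk needs up front; the addresses, the `TRACKERS` table, and `pick_tracker` are hypothetical names of mine, not anything Hadoop provides:

```python
GB = 1024 ** 3

# Hypothetical JobTracker addresses, one per group of nodes.
TRACKERS = {
    "big": "jt-big.example.com:9001",
    "small": "jt-small.example.com:9001",
}


def pick_tracker(estimated_disk_bytes, threshold_bytes=50 * GB):
    """Send disk-hungry jobs to the big-node JobTracker, everything else to the small one."""
    if estimated_disk_bytes > threshold_bytes:
        return TRACKERS["big"]
    return TRACKERS["small"]


print(pick_tracker(200 * GB))  # jt-big.example.com:9001
print(pick_tracker(5 * GB))    # jt-small.example.com:9001
```

The returned address is what you would then use as the job's `mapred.job.tracker` setting at submit time. The 50 GB cutoff is arbitrary and would need tuning for your cluster.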