On Dec 19, 2010, at 10:23 AM, Jane Chen wrote:
> Suppose that the output is written to a database, that only runs on certain
> nodes. It will be desirable to schedule the reducer tasks to run on the
> nodes local or close to the database nodes.

a) That's a side effect, which is pretty much "against the rules". Very little support is provided for such things.
b) At a minimum, you'll need to write your own scheduler.
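
To give a rough idea of the placement decision such a scheduler would have to make, here is a toy sketch in Python. It is not Hadoop code and uses no Hadoop API: the names `distance`, `should_assign_reduce`, `db_hosts` and `rack_of` are all made up for illustration, and it assumes you already know which hosts run the database and which rack each host is in.

```python
# Toy model only: none of these names come from Hadoop.

def distance(host, db_hosts, rack_of):
    """0 = host runs the database, 1 = shares a rack with a database host, 2 = elsewhere."""
    if host in db_hosts:
        return 0
    # Racks that contain at least one database host.
    db_racks = {rack_of[h] for h in db_hosts if h in rack_of}
    if rack_of.get(host) in db_racks:
        return 1
    return 2


def should_assign_reduce(host, free_hosts, db_hosts, rack_of):
    """Hand a reduce task to `host` only if no free host is closer to the database."""
    candidates = set(free_hosts) | {host}
    best = min(distance(h, db_hosts, rack_of) for h in candidates)
    return distance(host, db_hosts, rack_of) == best
```

A real scheduler would also have to decide how long to hold a reduce task back while waiting for a closer node to free up; the sketch ignores that entirely.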