On Tue, Feb 28, 2012 at 5:15 PM, Geoffry Roberts <geoffry.robe...@gmail.com> wrote:
> If I create an executable jar file that contains all dependencies required
> by the MR job do all said dependencies get distributed to all nodes?

You can make a single jar and that will be distributed to all of the
machines that run the task, but it is better in most cases to use the
distributed cache.

See http://hadoop.apache.org/common/docs/r1.0.0/mapred_tutorial.html#DistributedCache

> If I specify but one reducer, which node in the cluster will the reducer
> run on?

The scheduling is done by the JobTracker and it isn't possible to control
the location of the reducers.

-- Owen