Hi, I have some questions related to basic functionality in Hadoop.
1. When a Mapper processes its intermediate output, how does it know how many partitions to create (that is, how many reducers there will be) and how much data goes into each reducer's partition?

2. When the JobTracker assigns a task to a reducer, it also specifies the locations of the intermediate output that the reducer should retrieve, right? But how does the reducer know which portion of the intermediate output to retrieve from each remote location?

Could somebody help me with these questions and point me to the Java code that does this? I am running Hadoop 1.0.3.

Thanks,
Robert