Thanks, Chris. This cleared a lot of things up for me.

Martin

On Thu, Sep 20, 2012 at 1:21 AM, Chris Douglas <[email protected]> wrote:
> On Tue, Sep 18, 2012 at 7:02 AM, Martin Dobmeier <[email protected]> wrote:
> > Ah, alright. But why is Hadoop telling me that there are 117 segments given
> > that only 96 reducers have been configured?
> > (btw, I'm using Hadoop 1.0.0)
>
> There were 117 spills, so the merger starts with 117 files, does an
> intermediate merge of 54 segments (#reducers = 96 times), then a final
> merge of 64 segments (96 times). All of those layers produce log
> statements.
>
> > So the merger is called "number of reducers" times because it combines the
> > data for a particular reducer which is spread over all spill files, right?
>
> Yup, you have it. -C
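
The segment counts in the quoted reply (117 spills, an intermediate merge of 54 segments, a final merge of 64) can be reproduced with a little arithmetic. The sketch below is only a rough model, not Hadoop code. It assumes a merge factor of 64 (the thread never states io.sort.factor; 64 is inferred from the size of the final merge) and a first-pass sizing rule modeled on how I understand the Hadoop 1.x merger to behave: the first pass merges just enough segments that every later pass is a full one. The function names are hypothetical.

```python
def first_pass_size(num_segments, factor):
    """Segments to merge in the first pass so that later passes are full.

    Assumption: mirrors the first-pass rule of the Hadoop 1.x merger as
    understood here; this is not taken from the thread itself.
    """
    if num_segments <= factor:
        return num_segments
    mod = (num_segments - 1) % (factor - 1)
    return factor if mod == 0 else mod + 1


def merge_passes(num_segments, factor):
    """Return the number of segments merged in each pass, final merge last."""
    if factor < 2:
        raise ValueError("merge factor must be at least 2")
    passes = []
    remaining = num_segments
    first = True
    while remaining > factor:
        size = first_pass_size(remaining, factor) if first else factor
        passes.append(size)
        # The merged segments are replaced by a single new segment.
        remaining = remaining - size + 1
        first = False
    passes.append(remaining)
    return passes


# 117 spill files, assumed merge factor of 64
print(merge_passes(117, 64))  # [54, 64]
```

Merging 54 of the 117 segments into one leaves 117 - 54 + 1 = 64 segments, which is exactly one full final merge. Since this happens once per partition, each of those two log lines shows up 96 times.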
