And keep in mind that one split is not necessarily one file. That depends on
the InputFormat. For example, MultiFileInputFormat packs multiple files into
a single split.
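
To make that concrete, here is a rough sketch in plain Java. It does not use
the Hadoop API at all: the class SplitPacking, the method pack() and the
64-byte limit are made up for illustration, and the greedy packing is only
an assumption about how an InputFormat might group files, not what any real
one does. The point is only that the number of map tasks follows the number
of splits, not the number of files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: plain Java, not the Hadoop API. All names are hypothetical.
class SplitPacking {

    // Greedily pack file sizes (in bytes) into splits of at most maxSplitBytes.
    // Each inner list is one split. A file larger than the limit still gets
    // a split of its own rather than being dropped.
    static List<List<Long>> pack(List<Long> fileSizes, long maxSplitBytes) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : fileSizes) {
            // Close the current split if adding this file would exceed the limit.
            if (!current.isEmpty() && currentBytes + size > maxSplitBytes) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        List<Long> files = Arrays.asList(10L, 20L, 30L, 40L, 50L);
        List<List<Long>> splits = pack(files, 64L);
        // One map task per split, however many files each split holds.
        System.out.println(files.size() + " files, " + splits.size() + " splits (map tasks)");
        for (List<Long> split : splits) {
            System.out.println(split);
        }
    }
}
```

With the sizes above, the five files end up in three splits, so three map
tasks would run, each still working on exactly one split.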


On Thu, Sep 23, 2010 at 3:16 PM, Greg Roelofs <[email protected]> wrote:

> > Can a map task work on more than one input split?
>
> As far as I can tell from reading the code, no (at least, not yet).  Code
> such as createCache() in JobInProgress implicitly assumes a one-to-one
> mapping between maps[] and splits[].
>
> MR-1220 (small-jobs "combo task" optimization) will change that in some
> sense, but fundamentally, the correspondence between maps and splits is
> pretty well baked in, I believe.  (In fact, I'm pretty sure splits are
> created based on some goal for the number of maps--i.e., maps and splits
> are one-to-one almost by definition.)
>
> I might be wrong about all this, of course. :-)
>
> Greg
>
