How then can I produce an output/file per mapper, not per map task?

Thank you,
Maha

On Feb 20, 2011, at 10:22 PM, Ted Dunning wrote:

> This is the most important thing that you have said. The map function
> is called once per unit of input, but the mapper object persists for
> many units of input.
>
> You have a little bit of control over how many mapper objects there
> are, how many machines they are created on, and how many pieces your
> input is broken into. That control is limited, however, unless you
> build your own input format. The standard input formats are optimized
> for very large inputs and may not give you the flexibility that you
> want for your experiments. That is unfortunate for the purpose of
> learning about Hadoop, but Hadoop is designed mostly for dealing with
> very large data and isn't usually designed to be easy to understand.
> Where easy coincides with powerful, then easy is good, but powerful
> isn't always easy.
>
> On Sunday, February 20, 2011, maha <[email protected]> wrote:
>> So first question: is there a difference between Mappers and maps?
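
The distinction drawn above, between the map function (called once per unit of input) and the mapper object (which persists across many calls), can be sketched in plain Java. This is a toy model under stated assumptions, not Hadoop code: CountingMapper, runSplit, and run are hypothetical names, and the "output" is an in-memory list standing in for a file. In Hadoop's newer org.apache.hadoop.mapreduce.Mapper API the corresponding hooks are setup, map, and cleanup; in the older mapred API they are configure, map, and close.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MapperLifecycleSketch {

    /** Hypothetical stand-in for a mapper object; NOT Hadoop's Mapper class. */
    static class CountingMapper {
        private final String name;
        private final List<String> output; // stands in for an output file
        private int records;

        CountingMapper(String name, List<String> output) {
            this.name = name;
            this.output = output;
        }

        /** Called once per mapper object, before any input is seen. */
        void setup() {
            records = 0;
        }

        /** Called once per unit of input; state survives between calls. */
        void map(String record) {
            records++;
        }

        /** Called once per mapper object, after its last unit of input. */
        void cleanup() {
            output.add(name + ": " + records + " records");
        }
    }

    /** Drives one mapper object over one piece of the input. */
    static void runSplit(CountingMapper mapper, List<String> split) {
        mapper.setup();
        for (String record : split) {
            mapper.map(record);
        }
        mapper.cleanup();
    }

    /** Creates one mapper object per piece of input and collects the output. */
    static List<String> run(List<List<String>> splits) {
        List<String> output = new ArrayList<>();
        int index = 0;
        for (List<String> split : splits) {
            runSplit(new CountingMapper("mapper-" + index, output), split);
            index++;
        }
        return output;
    }

    public static void main(String[] args) {
        List<List<String>> splits = Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("d", "e"));
        for (String line : run(splits)) {
            System.out.println(line);
        }
    }
}
```

Because the line is emitted in cleanup rather than in map, this model produces one output per mapper object, however many times map was called on it. How many mapper objects exist is decided by how the input is broken into pieces, which is the part the input format controls.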
