Ondřej,

While NLineInputFormat will indeed give you N lines per task, it does not guarantee that the N map tasks it produces for a file will all be sent to different nodes. Which do you need exactly: simply N lines per map task, or the N map tasks spread more widely across nodes?
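
To make the first half of that concrete, here is a minimal plain-Java sketch of the arithmetic: one split (and so one map task) per N lines, with the last split possibly shorter. It does not use Hadoop at all; the class and method names (NLineSplitSketch, countSplits, split) are made up for illustration, and it says nothing about which node each task lands on, since that is decided by the scheduler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: mimics "N lines per split" grouping,
// not the actual NLineInputFormat implementation.
final class NLineSplitSketch {

    // Number of splits (and so map tasks) for a file with totalLines lines
    // when each split holds at most linesPerSplit lines: ceil(total / N).
    static int countSplits(int totalLines, int linesPerSplit) {
        if (totalLines < 0 || linesPerSplit <= 0) {
            throw new IllegalArgumentException("need totalLines >= 0 and linesPerSplit > 0");
        }
        return (totalLines + linesPerSplit - 1) / linesPerSplit;
    }

    // Groups the lines into consecutive chunks of linesPerSplit lines;
    // the final chunk holds whatever is left over.
    static List<List<String>> split(List<String> lines, int linesPerSplit) {
        if (linesPerSplit <= 0) {
            throw new IllegalArgumentException("linesPerSplit must be > 0");
        }
        List<List<String>> splits = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += linesPerSplit) {
            int end = Math.min(start + linesPerSplit, lines.size());
            splits.add(new ArrayList<>(lines.subList(start, end)));
        }
        return splits;
    }

    public static void main(String[] args) {
        // The case from the thread: a 100-line file, 10 lines per map task.
        List<String> lines = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            lines.add("record-" + i);
        }
        List<List<String>> splits = split(lines, 10);
        System.out.println(splits.size() + " splits, first has "
                + splits.get(0).size() + " lines");
    }
}
```

In a real job, the equivalent setting would be something like job.setInputFormatClass(NLineInputFormat.class) together with NLineInputFormat.setNumLinesPerSplit(job, 10), assuming a release that ships the new-API class in org.apache.hadoop.mapreduce.lib.input; with the old mapred API the property is mapred.line.input.format.linespermap. Either way you get 10 map tasks for 100 lines, but where they run is up to the scheduler.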

On Sat, Jun 16, 2012 at 3:01 PM, Ondřej Klimpera <klimp...@fit.cvut.cz> wrote:
> I tried this approach, but the job is not distributed among 10 mapper nodes.
> Seems Hadoop ignores this property :(
>
> My first thought is that the small file size is the problem and Hadoop
> doesn't bother to split it properly.
>
> Thanks for any ideas.
>
> On 06/16/2012 11:27 AM, Bejoy KS wrote:
>>
>> Hi Ondrej
>>
>> You can use NLineInputFormat with n set to 10.
>>
>> ------Original Message------
>> From: Ondřej Klimpera
>> To: common-user@hadoop.apache.org
>> ReplyTo: common-user@hadoop.apache.org
>> Subject: Setting number of mappers according to number of TextInput lines
>> Sent: Jun 16, 2012 14:31
>>
>> Hello,
>>
>> I have a very small input size (kB), but processing it to produce some output
>> takes several minutes. Is there a way to say: the file has 100 lines, I
>> need 10 mappers, where each mapper node has to process 10 lines of the input
>> file?
>>
>> Thanks for advice.
>> Ondrej Klimpera
>>
>> Regards
>> Bejoy KS
>>
>> Sent from handheld, please excuse typos.
>>
>

--
Harsh J