On 10/17/07 10:37 AM, "Lance Amundsen" <[EMAIL PROTECTED]> wrote:

> 1 file per map, 1 record per file, isSplitable(true or false): yields 1
> record per mapper

Yes.

> 1 file total, n records, isSplitable(true): Yields variable n records per
> variable m mappers

Yes.

> 1 file total, n records, isSplitable(false): Yields n records into 1
> mapper

Yes.

> What I am immediately looking for is a way to do:
>
> 1 file total, n records, isSplitable(true): Yields 1 record into n mappers
>
> But ultimately need to fully control the file/record distributions.

Why in the world do you need this level of control? Isn't that the point of
frameworks like Hadoop? (to avoid the need for this)
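
For readers following the thread, the four cases quoted above can be modeled with a small toy sketch. This is not Hadoop code and uses no Hadoop API: `plan_splits` and `records_per_split` are hypothetical names invented for illustration, and the sketch assumes splits are cut on record counts, whereas Hadoop actually cuts splits on byte ranges and lets the record reader find record boundaries.

```python
from typing import List


def plan_splits(files: List[List[str]], splitable: bool,
                records_per_split: int) -> List[List[str]]:
    """Toy model: return the list of records each mapper would receive.

    files             -- each inner list holds the records of one input file
    splitable         -- stands in for what isSplitable() returns
    records_per_split -- hypothetical knob; real splits are sized in bytes
    """
    if records_per_split < 1:
        raise ValueError("records_per_split must be at least 1")

    mappers: List[List[str]] = []
    for records in files:
        if not splitable:
            # An unsplittable file goes to exactly one mapper, whole.
            mappers.append(list(records))
            continue
        # A splittable file is cut into consecutive chunks, one per mapper.
        for start in range(0, len(records), records_per_split):
            mappers.append(records[start:start + records_per_split])
    return mappers


if __name__ == "__main__":
    one_file = [["r0", "r1", "r2", "r3", "r4", "r5"]]

    # 1 file per map, 1 record per file: one record per mapper either way.
    print(plan_splits([["a"], ["b"], ["c"]], True, 4))

    # 1 file, n records, splittable: a variable number of records per mapper.
    print(plan_splits(one_file, True, 4))

    # 1 file, n records, not splittable: all n records into one mapper.
    print(plan_splits(one_file, False, 4))

    # The case asked for: 1 file, n records, one record into each of n mappers.
    print(plan_splits(one_file, True, 1))
```

In this model the requested behavior is just the splittable case with a split size of one record, which suggests the control has to live wherever splits are computed rather than in isSplitable() alone.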