Re: Question about input file breakdown

Ming Yang Mon, 15 Oct 2007 10:09:53 -0700

Thank you, guys. The information is very helpful!

Ming


2007/10/15, Rick Cox <[EMAIL PROTECTED]>:
> You can also gzip each input file. Hadoop will not split a compressed
> input file (but will automatically decompress it before feeding it to
> your mapper).
>
> rick
>
> On 10/15/07, Ted Dunning <[EMAIL PROTECTED]> wrote:
> >
> >
> > Use a list of file names as your map input.  Then your mapper can read a
> > line and use it to open and read the named file for processing.
> >
> > This is similar to the problem of web-crawling where the input is a list of
> > URLs.
> >
> > On 10/15/07 6:57 AM, "Ming Yang" <[EMAIL PROTECTED]> wrote:
> >
> > > I was writing a test mapreduce program and noticed that the
> > > input file was always broken down into separate lines and fed
> > > to the mapper. However, in my case I need to process the whole
> > > file in the mapper since there are dependencies between
> > > lines in the input file. Is there any way I can achieve this --
> > > process the whole input file, either text or binary, in the mapper?
> >
> >
>
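
A minimal sketch of Ted's suggestion, written as a Hadoop Streaming-style
mapper in Python. That choice is an assumption; the thread does not say which
API is in use. Each input line is taken to be one file name, and the mapper
opens that file and processes it in one piece. The function names and the
line-counting logic are hypothetical placeholders, and the sketch reads from
the local filesystem; on a real cluster the files would normally live in HDFS
and need a Hadoop filesystem client instead of open().

```python
import sys


def process_whole_file(path):
    """Hypothetical per-file logic: count the lines in the file.

    Replace this with whatever needs to see every line at once.
    """
    with open(path, "rb") as handle:
        data = handle.read()  # the whole file, text or binary
    return str(len(data.splitlines()))


def map_file_names(lines, process=process_whole_file):
    """Treat each input line as a file name and process that file whole."""
    for line in lines:
        path = line.strip()
        if not path:
            continue  # skip blank lines in the file list
        yield path, process(path)


if __name__ == "__main__":
    # Streaming convention: one tab-separated key/value pair per output line.
    for path, result in map_file_names(sys.stdin):
        print(f"{path}\t{result}")
```

The map input is then a small text file with one file name per line, so
splitting that list across mappers never splits the files themselves.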

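Rick's gzip suggestion can be prepared with a few lines of Python. This is
only a sketch; the helper names are hypothetical. Because a gzip stream cannot
be split, a single map task reads the file from first line to last. With the
default text input the mapper still receives it one line at a time, so any
dependency between lines has to be tracked in the mapper's own state.

```python
import gzip
import shutil


def gzip_input(src_path, dst_path):
    """Compress one input file so it is handed to a single map task."""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def read_gzipped_lines(path):
    """Read the compressed file back in order, as one mapper would see it."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]
```

read_gzipped_lines is only there to check locally that the line order
survives compression; Hadoop does the decompression itself.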