Yes, found the problem. It was something dumb: I was not setting the output
to SequenceFileOutputFormat. Now things work.

Now that things work I've noticed the output of a MR using
SequenceFileOutputFormat is not compressed, but when I create a
SequenceFile.Writer it is by default compressed.

How do I set the MR output to be compressed in the JobConf? I can set
compression for the Map output but not for the MR output.

Thxs.

Alejandro


On 2/3/07, Bryan A. P. Pendleton <[EMAIL PROTECTED]> wrote:

For that to work, the output of the previous job will have to be set to
SequenceFileOutputFormat.

Note that, unless there are no tab characters in the keys of the output from
the first job, there's no way to read the existing output accurately back in.
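
To illustrate the tab problem, here is a minimal sketch in plain Java (no
Hadoop classes involved). The class and method names are made up for this
example, and the reader simply assumes that the first tab on a line separates
the key from the value:

```java
class TabAmbiguityDemo {
    // Lays out a record the way the flat text output looks: key, one tab, value.
    static String writeLine(String key, String value) {
        return key + "\t" + value;
    }

    // Hypothetical naive reader: treats the FIRST tab as the key/value separator.
    static String[] parseLine(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, tab), line.substring(tab + 1) };
    }

    public static void main(String[] args) {
        // A key without tabs survives the round trip.
        String[] clean = parseLine(writeLine("user42", "some value"));
        System.out.println("clean key intact:  " + clean[0].equals("user42"));

        // A key containing a tab is cut at that tab, and the remainder of the
        // key ends up at the front of the value.
        String[] broken = parseLine(writeLine("user\t42", "some value"));
        System.out.println("tabbed key intact: " + broken[0].equals("user\t42"));
    }
}
```

Once the line is written, nothing in it says which tab was the separator, so
no reader can recover the original key in the second case.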

On 2/2/07, Dennis Kubes <[EMAIL PROTECTED]> wrote:
>
> You need to set the input format of the second job.  It defaults to
> TextInputFormat which is why you are seeing it become text.  Use a line
> like below in the second job.
>
> secondjob.setInputFormat(SequenceFileInputFormat.class);
> secondjob.setInputKeyClass(Text.class);
> secondjob.setInputValueClass(Text.class);
>
> Dennis Kubes
>
> Alejandro Abdelnur wrote:
> > I may be missing something silly here,
> >
> > I have a MR that generates an output type (Text,Text)
> >
> > When another MR consumes that output, it is read as a plain text file,
> > so the input is (LongWritable, Text), with the long key being the line
> > number and the text value being the key+value separated by a tab. My
> > second MR blows up, as it was expecting (Text,Text), and on top of that
> > the key is wrong.
> >
> > Doing a cat of the file, I see it has become a flat file with lines
> > having "key \t value".
> >
> > How can I force the output of the first MR to remain a sequence file of
> > (Text, Text)?
> >
> > Thxs.
> >
> > A
> >
>



--
Bryan A. P. Pendleton
Ph: (877) geek-1-bp
