Yes, found the problem. It was something dumb: I was not setting the output format to SequenceFileOutputFormat. Now things work.

Now that things work, I've noticed that the output of an MR using SequenceFileOutputFormat is not compressed, but when I create a SequenceFile.Writer it is compressed by default. How do I set the MR output to be compressed in the JobConf? I can set compression for the map output but not for the MR output.

Thxs.

Alejandro

On 2/3/07, Bryan A. P. Pendleton <[EMAIL PROTECTED]> wrote:
For that to work, the output of the previous job will have to be set to
SequenceFileOutputFormat. Note that, unless there are no tab characters in the
keys of the output from the first job, there's no way to read the existing
output accurately back in.

On 2/2/07, Dennis Kubes <[EMAIL PROTECTED]> wrote:
>
> You need to set the input format of the second job. It defaults to
> TextInputFormat, which is why you are seeing it become text. Use lines
> like the ones below in the second job.
>
> secondjob.setInputFormat(SequenceFileInputFormat.class);
> secondjob.setInputKeyClass(Text.class);
> secondjob.setInputValueClass(Text.class);
>
> Dennis Kubes
>
> Alejandro Abdelnur wrote:
> > I may be missing something silly here.
> >
> > I have an MR that generates an output of type (Text, Text).
> >
> > When another MR consumes that output, it becomes a plain text file, so
> > the input is (LongWritable, Text), with the long key being the line
> > number and the text value being the key+value separated by a tab. My
> > second MR blows up, as it was expecting (Text, Text), and besides, the
> > key is wrong.
> >
> > Doing a cat of the file, I see it has become a flat file with lines
> > having "key \t value".
> >
> > How can I force the output of the first MR to remain a sequence file of
> > (Text, Text)?
> >
> > Thxs.
> >
> > A
>

--
Bryan A. P. Pendleton
Ph: (877) geek-1-bp
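
To illustrate the point about tab characters in keys, here is a minimal sketch in plain Java (standard library only, no Hadoop classes). It assumes the text output is written as key, one tab, then value, as described in the thread. The class and method names (TabAmbiguityDemo, toTextLine, splitAtFirstTab) are hypothetical and exist only for this illustration.

```java
public class TabAmbiguityDemo {

    // Assumption: a text output line is the key, one tab, then the value.
    public static String toTextLine(String key, String value) {
        return key + "\t" + value;
    }

    // A reader that does not know the original key can only guess where the
    // key ends; this sketch guesses the first tab.
    public static String[] splitAtFirstTab(String line) {
        int i = line.indexOf('\t');
        if (i < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, i), line.substring(i + 1) };
    }

    public static void main(String[] args) {
        // Two different (key, value) pairs...
        String line1 = toTextLine("a\tb", "c");
        String line2 = toTextLine("a", "b\tc");

        // ...produce the same text line, so the original pair is lost.
        System.out.println("same line: " + line1.equals(line2));

        // Splitting at the first tab recovers "a", not the original key "a\tb".
        String[] parts = splitAtFirstTab(line1);
        System.out.println("key recovered: " + parts[0].equals("a\tb"));
    }
}
```

Since both pairs serialize to the same line, no splitting rule can tell them apart, which is why rerunning the first job with a sequence file output is the safer route when keys may contain tabs.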
