Why do you need to see the intermediate data as text? What are the types of your key and values?
-Joey

On Sep 5, 2011 6:54 PM, "ilyal levin" <nipponil...@gmail.com> wrote:
> o.k , so now i'm using SequenceFileInputFormat and SequenceFileOutputFormat
> and it works fine but the output of the reducer is
> now a binary file (not txt) so i can't understand the data. how can i solve
> this? i need the data (in txt form ) of the Intermediate stages in the
> chain.
>
> Thanks
>
> On Tue, Sep 6, 2011 at 1:33 AM, ilyal levin <nipponil...@gmail.com> wrote:
>
>> Thanks for the help.
>>
>> On Mon, Sep 5, 2011 at 10:50 PM, Roger Chen <rogc...@ucdavis.edu> wrote:
>>
>>> The binary file will allow you to pass the output from the first reducer
>>> to the second mapper. For example, if you outputed Text, IntWritable from
>>> the first one in SequenceFileOutputFormat, then you are able to retrieve
>>> Text, IntWritable input at the head of the second mapper. The idea of
>>> chaining is that you know what kind of output the first reducer is going to
>>> give already, and that you want to perform some secondary operation on it.
>>>
>>> One last thing on chaining jobs: it's often worth looking to see if you
>>> can consolidate all of your separate map and reduce tasks into a single
>>> map/reduce operation. There are many situations where it is more intuitive
>>> to write a number of map/reduce operations and chain them together, but more
>>> efficient to have just a single operation.
>>>
>>> On Mon, Sep 5, 2011 at 12:21 PM, ilyal levin <nipponil...@gmail.com> wrote:
>>>
>>>> Thanks for the reply.
>>>> I tried it but it creates a binary file which i can not understand (i
>>>> need the result of the first job).
>>>> The other thing is how can i use this file in the next chained mapper?
>>>> i.e how can i retrieve the keys and the values in the map function?
>>>>
>>>> Ilyal
>>>>
>>>> On Mon, Sep 5, 2011 at 7:41 PM, Joey Echeverria <j...@cloudera.com> wrote:
>>>>
>>>>> Have you tried SequenceFileOutputFormat and SequenceFileInputFormat?
>>>>>
>>>>> -Joey
>>>>>
>>>>> On Mon, Sep 5, 2011 at 11:49 AM, ilyal levin <nipponil...@gmail.com> wrote:
>>>>> > Hi
>>>>> > I'm trying to write a chained mapreduce program. i'm doing so with a simple
>>>>> > loop where in each iteration i
>>>>> > create a job ,execute it and every time the current job's output is the next
>>>>> > job's input.
>>>>> > how can i configure the outputFormat of the current job and the inputFormat
>>>>> > of the next job so that
>>>>> > i will not use the TextInputFormat (TextOutputFormat), because if i do use
>>>>> > it, i need to parse the input file in the Map function?
>>>>> > i.e if possible i want the next job to "consider" the input file as
>>>>> > <key,value> and not plain Text.
>>>>> > Thanks a lot.
>>>>>
>>>>> --
>>>>> Joseph Echeverria
>>>>> Cloudera, Inc.
>>>>> 443.305.9434
>>>
>>> --
>>> Roger Chen
>>> UC Davis Genome Center
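
To illustrate the point made in the thread: a binary, typed record file lets the next job read keys and values back with their types instead of parsing lines, and the same file can still be rendered as text when someone needs to look at it. For real SequenceFiles, `hadoop fs -text <path>` prints the records as text. The Java below is only a sketch and does not use the Hadoop API: it is a plain `java.io` stand-in for the idea, and the names `TypedRecordSketch`, `writeRecords`, `readRecords`, and `dumpAsText` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain java.io stand-in for a typed binary key/value file.
// This is NOT the Hadoop SequenceFile format or API; all names are hypothetical.
class TypedRecordSketch {

    // "First job" side: write (String key, int value) pairs in binary form.
    static byte[] writeRecords(Map<String, Integer> records) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(records.size());          // record count header
            for (Map.Entry<String, Integer> e : records.entrySet()) {
                out.writeUTF(e.getKey());          // typed key
                out.writeInt(e.getValue());        // typed value
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    // "Second job" side: read keys and values back with their types,
    // with no text parsing in the consumer.
    static Map<String, Integer> readRecords(byte[] data) {
        Map<String, Integer> records = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                int value = in.readInt();
                records.put(key, value);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return records;
    }

    // Debugging aid: render the binary records as tab-separated text lines.
    static String dumpAsText(byte[] data) {
        StringBuilder text = new StringBuilder();
        for (Map.Entry<String, Integer> e : readRecords(data).entrySet()) {
            text.append(e.getKey()).append('\t').append(e.getValue()).append('\n');
        }
        return text.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("apple", 3);
        counts.put("pear", 7);
        byte[] intermediate = writeRecords(counts);   // unreadable as-is
        System.out.print(dumpAsText(intermediate));   // readable for inspection
    }
}
```

The design choice mirrors the advice above: keep the intermediate data binary and typed for the chain itself, and convert to text only as a separate inspection step rather than switching the whole chain back to text formats.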