It's an example MapReduce application in Phoenix, written in C. I don't know whether there is a popular Hadoop version of it, so I ported it into a Hadoop-style application.
FYI. Src attached.

On 7/9/08, heyongqiang <[EMAIL PROTECTED]> wrote:
>
> Where can I find the Reverse-Index application?
>
> heyongqiang
> 2008-07-09
>
> From: Shengkai Zhu
> Sent: 2008-07-09 09:06:38
> To: [email protected]
> Cc:
> Subject: Re: modified word count example
>
> Another MapReduce application, Reverse-Index, behaves similarly to what
> you describe. You can refer to that.
>
> On 7/9/08, heyongqiang <[EMAIL PROTECTED]> wrote:
> >
> > InputFormat's method
> >   RecordReader<K, V> getRecordReader(InputSplit split, JobConf job,
> >       Reporter reporter) throws IOException;
> > returns a RecordReader.
> > You can implement your own InputFormat and RecordReader:
> > 1) the RecordReader remembers the FileSplit (a subclass of InputSplit)
> >    in a field of its class
> > 2) the RecordReader's createValue() method always returns the
> >    FileSplit's file field.
> >
> > Hope this helps.
> >
> > heyongqiang
> > 2008-07-09
> >
> > From: Sandy
> > Sent: 2008-07-09 01:45:15
> > To: [email protected]
> > Cc:
> > Subject: modified word count example
> >
> > Hi,
> >
> > Let's say I want to run a map reduce job on a series of text files
> > (let's say x.txt, y.txt and z.txt).
> >
> > Given the following mapper function in Python (from WordCount.py):
> >
> > class WordCountMap(Mapper, MapReduceBase):
> >     one = IntWritable(1) # removed
> >     def map(self, key, value, output, reporter):
> >         for w in value.toString().split():
> >             output.collect(Text(w), self.one) #how can I modify this line?
> >
> > Instead of creating pairs of each word found and the numeral one, as
> > the example does, is there a function I can invoke to store the name
> > of the file the word came from instead?
> >
> > Thus, I'd have pairs like <"water", "x.txt">, <"hadoop", "y.txt">,
> > <"hadoop", "z.txt">, etc.
> >
> > I took a look at the javadoc, but I'm not sure I've checked in the
> > right places. Could someone point me in the right direction?
> >
> > Thanks!
> >
> > -SM
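
For reference, the <word, file name> pairs asked about in the quoted thread can be sketched in plain Python. This is not the attached source and it does not use the Hadoop API. The function names (map_words, reduce_files, reverse_index) and the file contents are made up for illustration. In a real job the file name has to come from the framework, for example through a custom RecordReader as suggested in the thread.

```python
from collections import defaultdict


def map_words(filename, text):
    """Map step: emit one (word, filename) pair per word occurrence."""
    for word in text.split():
        yield word, filename


def reduce_files(pairs):
    """Reduce step: group the pairs by word and list each file once, sorted."""
    index = defaultdict(set)
    for word, filename in pairs:
        index[word].add(filename)
    return {word: sorted(files) for word, files in index.items()}


def reverse_index(documents):
    """Run the map step over every (filename, text) entry, then reduce."""
    pairs = []
    for filename, text in documents.items():
        pairs.extend(map_words(filename, text))
    return reduce_files(pairs)


if __name__ == "__main__":
    # Assumed file contents, chosen to match the pairs in the question.
    docs = {"x.txt": "water", "y.txt": "hadoop", "z.txt": "hadoop"}
    for word, files in sorted(reverse_index(docs).items()):
        print(word, files)
```

The map step mirrors the loop in WordCountMap, except that it emits the file name in place of the constant one. The reduce step then collects file names per word instead of summing counts.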
rindex.tar.gz
Description: GNU Zip compressed data
