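Yes. Use the mapper's configure method, which is called when a map task starts on a new input file (the framework exposes the current file's path in the job configuration as map.input.file). Save the file name in a field of the mapper and use it in map.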
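Roughly, the idea looks like the sketch below. To keep it runnable without Hadoop on the classpath, it uses a plain Map as a stand-in for JobConf, and FileTaggingMapper is a hypothetical class, not the real Mapper interface. The only Hadoop-specific detail assumed is the map.input.file property from the old org.apache.hadoop.mapred API. In a real job you would read the same property inside configure(JobConf).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plain-Java stand-in for a Hadoop mapper (NOT the real Mapper interface).
class FileTaggingMapper {
    // Property assumed to hold the path of the file the current map task is reading.
    static final String INPUT_FILE_KEY = "map.input.file";

    // Remembered once in configure, then reused for every record.
    private String inputFile = "unknown";
    // Stand-in for the OutputCollector: collected records as plain strings.
    private final List<String> output = new ArrayList<>();

    // Stand-in for configure(JobConf): runs before any record is mapped.
    void configure(Map<String, String> conf) {
        String path = conf.get(INPUT_FILE_KEY);
        if (path != null) {
            // Keep only the last path component, e.g. "weblog-A.log".
            inputFile = path.substring(path.lastIndexOf('/') + 1);
        }
    }

    // Stand-in for map(): the key is still the byte offset, the value is the line.
    void map(long offset, String line) {
        // Tag each line with the file it came from instead of guessing from its layout.
        output.add(inputFile + "\t" + line);
    }

    List<String> getOutput() {
        return output;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(INPUT_FILE_KEY, "hdfs://namenode/logs/weblog-A.log"); // made-up path
        FileTaggingMapper mapper = new FileTaggingMapper();
        mapper.configure(conf);
        mapper.map(0L, "GET /index.html 200");
        System.out.println(mapper.getOutput().get(0));
    }
}
```

With the file name in a field, the map code can branch on it (Log File A vs. Log File B) rather than on the structure of each line.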
The other alternative is to derive a new InputFormat that remembers the input file name.

On 3/4/08 5:38 PM, "Tarandeep Singh" <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I need to identify from which file, a key came from, in the map phase.
> Is it possible ?
>
> What I have is multiple types of log files in one directory that I
> need to process for my application. Right now, I am relying on the
> structure of the log files (e.g if a line starts with "weblog", the
> line came from Log File A or if the number of tab-separated fields in
> the line is N, then it is Log File B)
>
> Is there a better way to do this ?
>
> Is there a way that the Hadoop framework passes me as a key the path
> of the file (right now it is the offset in the file, I guess) ?
>
> One more related question - can I set 2 directories as input to my map
> reduce program ? This is just to avoid copying files from one log
> directory to another.
>
> thanks,
> Taran
