pig-user  

How to access filenames after loading a directory to an Alias [pig scripting]

Latha
Sun, 05 Oct 2008 10:36:12 -0700

Greetings!
When I load a directory (from HDFS) into an alias and dump it, the lines of
all the files in that directory appear one after another.
However, I cannot figure out how to access the filenames from the alias. I
tried to work it out from script1-hadoop.pig, but still could not find a way.

A = load 'inputDir' using PigStorage();
dump A;
Output:
------------------------------------------------
(line 1 from inputDir/insideDir/file1.txt)
(line 2 from inputDir/insideDir/file1.txt)
...
(line 1 from inputDir/insideDir/innermost/fileone.txt)
...
etc.
------------------------------------------------

I am interested in per-file results, where the filename is retained and the
lines are grouped by the file they came from, for example:

filename1
( line1 )
( line2 )

filename2
(line 1)
(line 2)
etc.,
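
To make the grouping concrete, here is a rough sketch in plain Python (not
Pig) of the behaviour I am after. It assumes a local copy of the directory
rather than HDFS, and the names lines_by_file and print_filewise are made up
for illustration only; they are not part of Pig or Hadoop.

```python
import os


def lines_by_file(root):
    """Map each file under root (searched recursively) to its list of lines."""
    result = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as handle:
                # Keep the filename as the key so it is not lost after loading.
                result[path] = [line.rstrip("\n") for line in handle]
    return result


def print_filewise(root):
    """Print each filename followed by its lines, one file at a time."""
    for path, lines in lines_by_file(root).items():
        print(path)
        for line in lines:
            print("(%s)" % line)
        print()
```

Calling print_filewise("inputDir") on a local copy would print each path
followed by its own lines, which is the shape of result I would like to get
from the alias.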

Is there any way to access the filenames from an alias into which a directory
has been loaded? The requirement is to iterate through all the files and,
within each file, process every line. Please point me to the right approach.

Regards,
Srilatha