You can use MultiStorage from Piggybank, like so: https://cwiki.apache.org/confluence/display/PIG/FAQ#FAQ-Q%3AIloaddatafrom
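
For illustration, here is a rough sketch of how the load statement might look once a loader appends the source path to each tuple. LogLoaderWithPath is a hypothetical name, not something shipped with Pig or Piggybank; it stands in for a variant of your LogLoader whose getNext() appends the path of the current input split. The extra filename field is likewise an assumption.

```pig
-- Hypothetical: LogLoaderWithPath is not a real Pig or Piggybank class. It
-- stands in for a custom loader that appends the input file path as the
-- last field of every tuple.
logs = LOAD 's3://bucket/directory/*' USING LogLoaderWithPath() AS (
    remoteAddr, remoteLogname, user, time:chararray, method, uri:chararray,
    proto, status, bytes, referer, userAgent, filename:chararray);

-- With the path in the relation, records can be grouped by source file.
by_file = GROUP logs BY filename;
counts = FOREACH by_file GENERATE group AS filename, COUNT(logs) AS n;
```
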
Just beware of this bug if you are using pig-0.9.1 or pig 0.8:
https://issues.apache.org/jira/browse/PIG-2462

Yulia

On 2/2/12 8:11 PM, "Ranjan Bagchi" <[email protected]> wrote:

>Hi,
>
>I've a bunch of [for example] apache logfiles that I'm searching through.
> I can process them with:
>
>logs = load 's3://bucket/directory/*' USING LogLoader as (remoteAddr,
>remoteLogname, user, time :chararray, method, uri :chararray, proto,
>status, bytes, referer, userAgent);
>
>Is there any way of getting the name of the file from which logs was
>pulled added to the relation?
>
>Thanks,
>
>Ranjan
