You can use MultiStorage from Piggybank, like so:
https://cwiki.apache.org/confluence/display/PIG/FAQ#FAQ-Q%3AIloaddatafrom


Just beware of this bug if you are using Pig 0.9.1 or Pig 0.8:
https://issues.apache.org/jira/browse/PIG-2462

Yulia

On 2/2/12 8:11 PM, "Ranjan Bagchi" <[email protected]> wrote:

>Hi,
>
>I've a bunch of [for example] apache logfiles that I'm searching through.
> I can process them with:
>
>logs = load 's3://bucket/directory/*' USING LogLoader as (remoteAddr,
>remoteLogname, user, time :chararray, method, uri :chararray, proto,
>status, bytes, referer, userAgent);
>
>Is there any way of getting the name of the file from which logs was
>pulled added to the relation?
>
>Thanks,
>
>Ranjan
