Dirk, if you look at the code for PigStorage, you'll see some code in there 
that looks at file names and chooses the right input format to use based on 
that. You should just add the same check to RegExLoader.
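
To make the idea concrete, here is a rough sketch of that kind of file-name check. It is not copied from the Pig source: the class CompressionByFileName, the Kind enum, and kindFor() are made-up names, and the extensions listed are only an assumption about what you would want to recognise. In a real loader the same decision would presumably live in the method that returns the input format, and would return an actual Hadoop InputFormat rather than a label.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Pig or piggybank.
final class CompressionByFileName {

    /** Made-up labels for the kinds of input a loader might be pointed at. */
    enum Kind { GZIP, BZIP2, PLAIN }

    /** Picks a Kind purely from the extension of the load location. */
    static Kind kindFor(String location) {
        // Compare case-insensitively so "ACCESS.LOG.GZ" is treated like "access.log.gz".
        String name = location.toLowerCase(Locale.ROOT);
        if (name.endsWith(".gz")) {
            return Kind.GZIP;
        }
        if (name.endsWith(".bz2") || name.endsWith(".bz")) {
            return Kind.BZIP2;
        }
        // Anything else is assumed to be uncompressed text.
        return Kind.PLAIN;
    }

    public static void main(String[] args) {
        String[] locations = {
            "logs/access.log.gz", "logs/access.log.bz2", "logs/access.log"
        };
        for (String location : locations) {
            System.out.println(location + " -> " + kindFor(location));
        }
    }
}
```

The loader would call something like kindFor() with the location it was given and branch on the result when deciding which input format to hand back.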

On Jun 17, 2011, at 1:44 AM, "[email protected]" <[email protected]> wrote:

> Hello Pig mailing list,
> 
> I have around 10 TB of Apache log files (1 TB as .gz compressed files)
> and analyze these files with Pig.
> Obviously Apache log files compress pretty well with gzip, so
> it would be great if Pig would accept the log files in compressed
> form.
> 
> Is this possible with the CombinedLogLoader from contrib/piggybank or
> is there any other way to do this? It is pretty easy with the normal
> TextLoader. It automatically detects if the file is a .gz file.
> 
> If there is no way, would the RegExLoader be the correct class to extend?
> 
> Regards
> Dirk
