Hi,

Several things you can try:

1) Try using com.twitter.elephantbird.pig.load.LzoPigStorage() and print out
a few lines, just to make sure you can read clear text from the LZO files.
2) You can use this in combination with Pig's built-in function
REGEX_EXTRACT(String expression, String regex, int matchIndex).
3) Have you tried LzoRegexLoader(String pattern)?
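
Points 1 and 2 could look roughly like the Pig script below. This is only a
sketch: the jar path, the input path, the alias names and the regex are
placeholders I made up, so adjust them to your setup. It also assumes the
hadoop-lzo codec is already installed on the cluster and that each record's
first field holds the text you want to match.

```pig
-- Hypothetical jar location; point this at your own elephant-bird build.
REGISTER /path/to/elephant-bird.jar;

-- Step 1: load the LZO files and print a few records to confirm
-- that they come back as readable text rather than binary.
raw = LOAD '/path/to/lzo/files'
      USING com.twitter.elephantbird.pig.load.LzoPigStorage();
first_rows = LIMIT raw 5;
DUMP first_rows;

-- Step 2: once the text looks right, pull out a capture group.
-- The pattern here is only an example: it takes everything before the
-- first space. matchIndex 1 selects the first capture group.
extracted = FOREACH raw GENERATE
    REGEX_EXTRACT((chararray)$0, '^([^ ]+) (.*)$', 1) AS first_token;
first_tokens = LIMIT extracted 5;
DUMP first_tokens;
```

If step 1 already prints binary, the problem is in how the files are being
decompressed, not in the regex, and step 2 will not help until that is fixed.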

Cheers,
 Gerrit



On Mon, Mar 21, 2011 at 9:11 PM, Saptarshi Guha <[email protected]> wrote:

> Hello,
>
> I have some LZO files, which I
>
> a) indexed via DistributedLzoIndexer to create index files
> b) did not index, so just some LZO files in a directory.
>
> Using both approaches, I tried creating a subclass of LzoBaseRegexLoader
> that returns a pattern.
> Sadly, not a single line matched. This is not a problem with the regex
> (I checked that it works with other strings).
> I modified LzoBaseRegexLoader.java to print the incoming strings, and
> I'm getting binary data, e.g.
>
> http://pastebin.com/wAveGzDy
>
> I'm using Pig 0.8 and ElephantBird checked out from
> https://github.com/gerritjvv/elephant-bird
>
> Any suggestions?
>
> Saptarshi
>
