I ended up fixing this issue. I did change it to a bag afterwards, but the main 
problem was that regexextractall was returning everything as a string (via 
group), which meant that max, avg, etc. could not be matched to a function 
for a bag of tuples of doubles. 

I ended up writing a new UDF for extractall that returns types based on whether 
\d or \w was used in the regexp. Flattening that to specific types didn't work. 
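
To illustrate the idea only (this is not the actual UDF, which has not been 
posted yet), here is a minimal sketch in plain Java with no Pig dependencies. 
The class name TypedExtractAll and its methods are hypothetical. It assumes 
the regexp uses only plain, non-nested capturing groups, and it treats a group 
as numeric when its body contains \d and no \w. A real Pig UDF would extend 
EvalFunc and return a DataBag of Tuples rather than nested lists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: extract every match and type each group from its pattern.
class TypedExtractAll {

    // For each capturing group, decide whether its matches should be doubles.
    // Assumption: groups are flat "(...)" with no nesting and no "(?:...)".
    static List<Boolean> numericGroups(String regex) {
        List<Boolean> numeric = new ArrayList<>();
        int i = 0;
        while (i < regex.length()) {
            char c = regex.charAt(i);
            if (c == '\\') {          // skip an escaped character outside a group
                i += 2;
                continue;
            }
            if (c == '(') {
                int close = i + 1;
                while (close < regex.length() && regex.charAt(close) != ')') {
                    if (regex.charAt(close) == '\\') {
                        close++;      // skip the escaped character inside the group
                    }
                    close++;
                }
                String body = regex.substring(i + 1, Math.min(close, regex.length()));
                // Heuristic: \d without \w means "numeric group".
                numeric.add(body.contains("\\d") && !body.contains("\\w"));
                i = close + 1;
                continue;
            }
            i++;
        }
        return numeric;
    }

    // Return one "tuple" (list of fields) per match; numeric groups become Double.
    static List<List<Object>> extractAll(String input, String regex) {
        List<Boolean> numeric = numericGroups(regex);
        Matcher m = Pattern.compile(regex).matcher(input);
        List<List<Object>> bag = new ArrayList<>();
        while (m.find()) {
            List<Object> tuple = new ArrayList<>();
            for (int g = 1; g <= m.groupCount(); g++) {
                String text = m.group(g);
                boolean isNumeric = g <= numeric.size() && numeric.get(g - 1);
                if (text != null && isNumeric) {
                    tuple.add(Double.valueOf(text));
                } else {
                    tuple.add(text);  // keep as string (or null if the group did not match)
                }
            }
            bag.add(tuple);
        }
        return bag;
    }

    public static void main(String[] args) {
        // Prints [[cpu, 12.5], [mem, 40.0]]
        System.out.println(extractAll("cpu=12.5 mem=40", "(\\w+)=(\\d+\\.?\\d*)"));
    }
}
```

With the numeric fields already typed as doubles, aggregate functions such as 
max and avg have something they can match against, which is the point of the 
approach described above.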

That solved the issue. I would appreciate feedback on the UDF and the approach; 
I will post it on pastebin early next week. If there's a better way, please 
let me know. 

This whole solution came about because I wanted to avoid creating a new UDF 
for each log line type I need to parse.

Many thanks,
Jon

On 24 Jun 2011, at 23:45, Dmitriy Ryaboy <[email protected]> wrote:

> <mime-attachment.txt>