It looks like no one is going to migrate my UDFs for me, but some help would
be nice. Several things have changed, and I am wondering whether there is a
migration guide out there, or whether I could get some personal help. I did
find this page: http://wiki.apache.org/pig/PigTypesDesign

Here's a specific example. In the current RegExLoader, I do this:
    ArrayList<Datum> list = new ArrayList<Datum>();
    for (int i = 1; i <= matcher.groupCount(); i++) {
        list.add(new DataAtom(matcher.group(i)));
    }
    return new Tuple(list);
DataAtom and Datum no longer seem to exist, and I am not sure what to replace
them with.
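
To make the question concrete, here is a minimal self-contained sketch of what
the loader does, with the Pig classes taken out. It assumes, from my reading of
the types design page and not from the new code itself, that tuple fields are
now plain Java objects such as String, so the capture groups go into a
List<Object>. The RegexGroups class and its extract method are made up for
illustration and are not part of Pig. How that list becomes a Tuple in the new
API is exactly the part I am unsure about.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig: collects regex capture groups
// as plain Java objects instead of wrapping each one in a DataAtom.
class RegexGroups {

    // Returns the capture groups of the first match in the line,
    // or an empty list when the pattern does not match.
    static List<Object> extract(Pattern pattern, String line) {
        List<Object> fields = new ArrayList<Object>();
        Matcher matcher = pattern.matcher(line);
        if (matcher.find()) {
            for (int i = 1; i <= matcher.groupCount(); i++) {
                // Assumption: a plain String stands in for the old DataAtom.
                fields.add(matcher.group(i));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(\\S+) (\\S+)");
        System.out.println(extract(p, "GET /index.html"));
    }
}
```

The only piece left open is the last line of the old code, the equivalent of
"return new Tuple(list)".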
Though I am biased :), I think my UDFs are rather helpful. They include:
1. load files based on a regex -
https://issues.apache.org/jira/browse/PIG-472
2. load apache common logs -
https://issues.apache.org/jira/browse/PIG-473
3. load files based on regex from pig latin -
https://issues.apache.org/jira/browse/PIG-474
4. pull dates from apache logs -
https://issues.apache.org/jira/browse/PIG-476,
https://issues.apache.org/jira/browse/PIG-503
5. extract search engine from a referer -
https://issues.apache.org/jira/browse/PIG-486
6. extract host from a url -
https://issues.apache.org/jira/browse/PIG-487
7. extract search terms from a referer -
https://issues.apache.org/jira/browse/PIG-488
8. load combined logs -
https://issues.apache.org/jira/browse/PIG-509
I am guessing others may find these useful, so any help would be appreciated.
With a little help getting started, I could likely do much of the work
myself.
Thoughts?
Thanks,
Earl
http://blog.spack.net
http://holaservers.com