Hi Adam,

> Anyone have a working UDF jar for GeoIP lookups using MaxMind's data?
> I saw one being discussed a few months ago, but haven't seen it in any
> contrib branches.

The original discussion about a Hive GeoIP UDF is here:
http://markmail.org/message/acqj3nal4opbpcmw#query:+page:1+mid:4w5ly57x6zcysol6+state:results

It looks like the idea of including it in contrib was nixed because of
licensing issues, but I bet Ed would be willing to share his work with you.

There's also a Cloudera blog post from a while back about analyzing GeoIP
data using Pig:
http://www.cloudera.com/blog/2009/06/analyzing-apache-logs-with-pig/

Although it would be less efficient than a UDF, I think you can probably
call the Perl script from that post from a Hive TRANSFORM query without
making any changes. See
http://wiki.apache.org/hadoop/Hive/LanguageManual/Transform

Thanks.

Carl
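
P.S. In case it helps, here is a rough sketch of the shape a TRANSFORM script takes. Hive streams each input row to the script's stdin as one tab-separated line and reads tab-separated lines back from its stdout. This is Python rather than Perl, and it is not the script from the blog post. The names `geo_stub.py`, `lookup_country`, `STUB_TABLE`, and the `access_logs` table with its `ip` column are all made up for illustration, and the lookup is a placeholder where a real MaxMind lookup would go.

```python
import sys

# Placeholder data only: a real script would query a MaxMind database here.
# "XX" is a dummy value, not a real country code for this address.
STUB_TABLE = {"192.0.2.1": "XX"}


def lookup_country(ip):
    """Hypothetical stand-in for a real GeoIP lookup."""
    return STUB_TABLE.get(ip, "UNKNOWN")


def transform(lines):
    """Append a country column to each tab-separated input row.

    Assumes the IP address is the first column of every row.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        yield "\t".join(fields + [lookup_country(fields[0])])


def main():
    # Hive reads the script's stdout, one tab-separated row per line.
    for row in transform(sys.stdin):
        print(row)


if __name__ == "__main__":
    main()

# Assumed usage from Hive (table and column names are hypothetical):
#   ADD FILE geo_stub.py;
#   SELECT TRANSFORM (ip) USING 'python geo_stub.py' AS (ip, country)
#   FROM access_logs;
```

The per-row work lives in `transform` so it can be tried on a few sample lines before running it through Hive.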
