Hi, I currently run a MapReduce job to rewrite a tab-delimited file, and then I use Hive for everything after that stage.
Am I correct in thinking that I can build a jar containing my own method and then call it from Hive SQL? Would the syntax be something like:

    hive> ADD JAR /tmp/parse.jar;
    hive> INSERT OVERWRITE TABLE target
          SELECT s.id, s.canonical, parsedName
          FROM source s
          MAP s.canonical USING 'parse' AS parsedName;

and would 'parse' be an MR job? If so, what are the expected input and output formats for 'parse', please? Or is it perhaps a class implementing an interface, with Hive taking care of the rest?

Thanks for any pointers,
Tim
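P.S. In case it helps to show what I have in mind, here is a rough sketch of the "class implementing an interface" route as I imagine it. The package, class, and function names (com.example.hive.ParseName, parse) are just placeholders I made up, and I don't know yet whether extending UDF is actually the right approach, so please correct me if not:

    package com.example.hive;

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // My guess at a simple Hive UDF: Hive would call evaluate() once per
    // row, passing in s.canonical and getting back the parsed value.
    public class ParseName extends UDF {
        public Text evaluate(Text canonical) {
            if (canonical == null) {
                return null;
            }
            // Placeholder: my real name-parsing logic would go here.
            return new Text(canonical.toString().trim());
        }
    }

and then presumably something like this instead of the MAP ... USING syntax above:

    hive> ADD JAR /tmp/parse.jar;
    hive> CREATE TEMPORARY FUNCTION parse AS 'com.example.hive.ParseName';
    hive> INSERT OVERWRITE TABLE target
          SELECT s.id, s.canonical, parse(s.canonical) FROM source s;

Is that roughly how it works, or am I off track?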
