Hi,

I currently run a MapReduce job to rewrite a tab-delimited file, and then I
use Hive for everything after that stage.

Am I correct in thinking that I can create a JAR with my own method, which
can then be called from SQL?

Would the syntax be:

  hive> ADD JAR /tmp/parse.jar;
  hive> INSERT OVERWRITE TABLE target
        SELECT s.id, s.canonical, parsedName
        FROM source s
        MAP s.canonical USING 'parse' AS parsedName;

and would 'parse' then be a MapReduce job?  If so, what input and output
formats should 'parse' expect and produce?  Or is it a class implementing an
interface, with Hive taking care of the rest?  I have sketched below what I
am imagining for each case, in case that makes the question clearer.
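
For the MapReduce/streaming case, this is roughly what I picture 'parse'
being: a stand-alone program that reads tab-delimited rows on stdin and
writes tab-delimited rows back on stdout.  The parsing below is just a
placeholder, and I am guessing at the I/O contract:

  import java.io.BufferedReader;
  import java.io.InputStreamReader;

  // Guess at a streaming-style 'parse': one tab-delimited row in on stdin,
  // one tab-delimited row out on stdout.
  public class Parse {
      public static void main(String[] args) throws Exception {
          BufferedReader in =
              new BufferedReader(new InputStreamReader(System.in));
          String line;
          while ((line = in.readLine()) != null) {
              // Placeholder parsing: just trim the canonical name.
              System.out.println(line.trim());
          }
      }
  }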
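
For the interface case, is it something along these lines?  I am only
guessing at the base class, method name and registration syntax here:

  import org.apache.hadoop.hive.ql.exec.UDF;
  import org.apache.hadoop.io.Text;

  // Guess at a Hive UDF: one column value in, one value out per call.
  public final class ParseName extends UDF {
      public Text evaluate(Text canonical) {
          if (canonical == null) {
              return null;
          }
          // Placeholder parsing: just trim the canonical name.
          return new Text(canonical.toString().trim());
      }
  }

registered with something like CREATE TEMPORARY FUNCTION parse AS
'ParseName' after the ADD JAR, and then called as parse(s.canonical) in the
SELECT?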

Thanks for any pointers,
Tim
