One nice way to do this is to use a specialized SerDe, possibly something
like the JsonSerde.

A simpler scenario, where you have to load a multi-delimiter CSV file, can
be addressed using the RegexSerde, which maps each matching regex group to
a column.
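
As a rough sketch of that group-to-column mapping (the table name, columns,
location and sample line format below are made up for illustration, and I am
assuming the built-in org.apache.hadoop.hive.serde2.RegexSerDe is on your
classpath), a file with lines like "1,alice;42" could be declared as:

```sql
-- Hypothetical table: each capture group in input.regex feeds one column,
-- in order. Columns are kept as STRING to stay on the safe side.
CREATE EXTERNAL TABLE multi_delim_csv (
  id     STRING,  -- group 1: everything up to the first comma
  name   STRING,  -- group 2: everything up to the semicolon
  amount STRING   -- group 3: the rest of the line
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^,]*),([^;]*);(.*)"
)
STORED AS TEXTFILE
LOCATION '/tmp/multi_delim_csv';  -- hypothetical path
```

The regex uses character classes rather than escaped delimiters, which
avoids having to double up backslashes inside the quoted property value.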

In your case, the JsonSerde could be used to match your JSON fields and map
them to columns.
For a simple example of how to use the RegexSerde to do the same for CSV
files, check out:

https://github.com/jayunit100/bigpetstore/blob/master/src/main/java/org/bigtop/bigpetstore/etl/HiveETL.java

I *think* JsonSerde comes with HCatalog now, so if you don't have it on your
classpath, you can easily add that jar into hadoop/lib and then all your
mappers will be able to see your JSON data as tabular Hive data.
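
Something along these lines is what I have in mind. Treat it as a sketch:
the jar path, table name, columns and location are placeholders, and the
SerDe class name depends on your version (newer Hive/HCatalog releases use
org.apache.hive.hcatalog.data.JsonSerDe, older HCatalog releases used
org.apache.hcatalog.data.JsonSerDe).

```sql
-- Only needed if the jar is not already on the classpath (placeholder path).
ADD JAR /path/to/hive-hcatalog-core.jar;

-- Hypothetical table: column names are matched against the JSON keys,
-- and the data is expected to hold one JSON object per line.
CREATE EXTERNAL TABLE events_json (
  id     STRING,
  name   STRING,
  amount DOUBLE
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/tmp/events_json';  -- placeholder path

-- After that, the JSON can be queried like any other Hive table.
SELECT name, amount FROM events_json LIMIT 10;
```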
