Hi,
I have log files with a dozen different entry types, and I would like to
load them into several different relations.
I couldn't figure out how to attach a schema to each individual tuple, so
for now the UDF loads every entry into a tuple holding a type id and a map
of values, and the Pig script then splits by type and builds the final
tuples. Is there a better or more efficient way to do this?
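Roughly, the current setup looks like the sketch below. The loader name
(MyLogLoader), the input path, the type ids, and the map keys are all
placeholders made up for illustration, not the real ones:

```
-- Hypothetical loader: emits (type_id, vals) for every log entry.
raw = LOAD 'logs' USING MyLogLoader() AS (type_id:int, vals:map[]);

-- Route entries by type id (1 and 2 are made-up ids).
SPLIT raw INTO logins IF type_id == 1, errors IF type_id == 2;

-- Build the "final" tuples in Pig by pulling fields out of the map.
login_recs = FOREACH logins GENERATE (chararray)vals#'user' AS user,
                                     (chararray)vals#'ip' AS ip;
error_recs = FOREACH errors GENERATE (int)vals#'code' AS code,
                                     (chararray)vals#'msg' AS msg;
```

The per-type FOREACH statements are the part that duplicates knowledge the
UDF already has about each entry type.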
I would like to avoid having loading logic in both the UDF and the Pig
script. Ideally the UDF would generate all the "final" tuples, and the Pig
script would only need a split.
Thanks,
Marko