It would be a nice addition to the conversion tools. A first pass of converting Avro schemas to ORC would be pretty easy with:
boolean -> boolean
int -> int
long -> long
float -> float
double -> double
bytes -> binary
string -> string
enum -> string
fixed -> binary
map<X> -> map<string,X>
array<X> -> array<X>
record<X,Y,Z> -> struct<X,Y,Z>
union<X,Y,Z> -> union<X,Y,Z>

with special handling for union<null,X> -> X

In terms of the conversion, you would just need to extend ConvertTool to create RecordReaders for Avro. There are already examples of JSON and CSV.

.. Owen

On Mon, Dec 4, 2017 at 11:31 PM, Oleg Ruchovets <[email protected]> wrote:

> Hello.
> I wonder if there is a utility to convert AVRO to ORC, similar to JSON
> to ORC?
>
> Background of what I am doing:
> I am reading SQL data using NIFI. NIFI returns data in AVRO format. I
> want to store this data on s3 in ORC format and use it for a hive
> external table. For that, I need to convert AVRO to ORC and derive the
> hive schema. NIFI has an AVRO to ORC component, but it supports an older
> version of HIVE and ORC.
>
> So the question is how to convert AVRO to ORC and derive the hive
> schema. I really like the utility that you guys built for JSON. It has
> both conversion to ORC and HIVE schema extraction. What is the way to
> achieve the same in the case of the AVRO format?
>
> Thanks
> Oleg.
>
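
As a rough illustration of the type mapping in the reply above, here is a minimal sketch that turns an Avro schema (which is plain JSON) into an ORC schema string. It is not part of ConvertTool or any ORC API; the function name avro_to_orc is hypothetical. It assumes ORC's schema string syntax spells the long category as "bigint" and unions as "uniontype". It also makes two choices the reply does not spell out: null branches are dropped from every union, not only union<null,X>, and named-type references and logical types are not handled.

```python
import json

# Avro primitive name -> ORC schema-string name, following the mapping above.
# Assumption: ORC's string form writes the long category as "bigint".
PRIMITIVES = {
    "boolean": "boolean",
    "int": "int",
    "long": "bigint",
    "float": "float",
    "double": "double",
    "bytes": "binary",
    "string": "string",
}


def avro_to_orc(schema):
    """Return an ORC type string for a parsed Avro schema (str, list, or dict)."""
    if isinstance(schema, str):
        if schema in PRIMITIVES:
            return PRIMITIVES[schema]
        # Bare "null" and references to named types are not handled here.
        raise ValueError(f"unsupported or unresolved Avro type: {schema!r}")
    if isinstance(schema, list):
        # Avro writes unions as JSON arrays; drop "null" branches because
        # ORC columns are nullable anyway (assumption for unions of 3+ types).
        branches = [s for s in schema if s != "null"]
        if not branches:
            raise ValueError("a union containing only null has no ORC equivalent")
        if len(branches) == 1:
            return avro_to_orc(branches[0])  # union<null,X> -> X
        return "uniontype<" + ",".join(avro_to_orc(b) for b in branches) + ">"
    kind = schema["type"]
    if kind == "enum":
        return "string"
    if kind == "fixed":
        return "binary"
    if kind == "map":
        # Avro map keys are always strings.
        return "map<string," + avro_to_orc(schema["values"]) + ">"
    if kind == "array":
        return "array<" + avro_to_orc(schema["items"]) + ">"
    if kind == "record":
        fields = ",".join(
            f"{f['name']}:{avro_to_orc(f['type'])}" for f in schema["fields"]
        )
        return "struct<" + fields + ">"
    # Wrapped forms such as {"type": "string"} or {"type": [...]}.
    return avro_to_orc(kind)


example = json.loads("""
{"type": "record", "name": "User", "fields": [
  {"name": "id", "type": "long"},
  {"name": "email", "type": ["null", "string"]},
  {"name": "tags", "type": {"type": "array", "items": "string"}}
]}
""")
print(avro_to_orc(example))  # struct<id:bigint,email:string,tags:array<string>>
```

This only covers the schema side. Reading the Avro records themselves would still need a RecordReader along the lines of the existing JSON and CSV ones.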
