I wrote my own Avro -> ORC converter. It was a pain, and the code is on the
rough side (not something I would want to share without cleaning it up first).
I would be glad to contribute the code if enough folks would like this
kind of option.
Thoughts?


    On Friday, December 8, 2017, 12:43:57 AM MST, Oleg Ruchovets 
<[email protected]> wrote:  
 
 Hello Owen. That is interesting. In your experience, will it support Hive
external / managed tables? My idea was to prepare the ORC data (without Hive)
and afterwards register it as an external Hive table. The motivation is to
avoid Hive schema maintenance.
Thanks,
Oleg.
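
A minimal sketch of the "register it as an external table" step, assuming the
ORC files already sit at a known location and the column types are already
known. The helper name external_table_ddl, the table name sales.users and the
s3a://example-bucket/users/ path are hypothetical, chosen only for
illustration; the function just builds the DDL text and does not talk to Hive.

```python
def external_table_ddl(table, columns, location):
    """Build a Hive CREATE EXTERNAL TABLE statement over existing ORC files.

    columns is a list of (name, hive_type) pairs, e.g. ("id", "bigint").
    """
    # Backtick-quote column names so reserved words do not break the DDL.
    cols = ",\n".join(f"  `{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"{cols}\n"
        f")\n"
        f"STORED AS ORC\n"
        f"LOCATION '{location}'"
    )


# Hypothetical table, columns and location, for illustration only.
ddl = external_table_ddl(
    "sales.users",
    [("id", "bigint"), ("email", "string")],
    "s3a://example-bucket/users/",
)
print(ddl)
```

The statement still records the column list in the metastore, so this only
moves schema derivation out of Hive; it does not remove the table definition.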
On Thu, Dec 7, 2017 at 2:55 AM, Owen O'Malley <[email protected]> wrote:

It would be a nice addition to the conversion tools. A first pass of converting 
Avro schemas to ORC would be pretty easy with:
boolean -> boolean
int -> int
long -> long
float -> float
double -> double
bytes -> binary
string -> string
enum -> string
fixed -> binary
map<X> -> map<string,X>
array<X> -> array<X>
record<X,Y,Z> -> struct<X,Y,Z>
union<X,Y,Z> -> union<X,Y,Z>
with special handling for union<null,X> -> X
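
The mapping above can be sketched as a small recursive function. This is only
an illustration, not the ConvertTool code: it works on an Avro schema already
parsed from JSON and emits an ORC/Hive type string, without using the Avro or
ORC libraries. The function name avro_to_orc and the sample schema are
hypothetical. In type strings ORC's long is spelled bigint and a union is
spelled uniontype. Two further assumptions: null is dropped from larger unions
too (ORC columns are nullable anyway), and logical types are mapped by their
underlying type. References to previously named types are not resolved.

```python
import json

# Avro primitive -> ORC type-string name, following the list above.
PRIMITIVES = {
    "boolean": "boolean",
    "int": "int",
    "long": "bigint",   # ORC's long is written "bigint" in type strings
    "float": "float",
    "double": "double",
    "bytes": "binary",
    "string": "string",
}


def avro_to_orc(schema):
    """Return an ORC type string for a parsed Avro schema (str, list or dict)."""
    if isinstance(schema, str):
        if schema in PRIMITIVES:
            return PRIMITIVES[schema]
        raise ValueError(f"unsupported Avro type: {schema!r}")
    if isinstance(schema, list):
        # Avro union: drop "null" (assumption: applies to any union size).
        branches = [s for s in schema if s != "null"]
        if not branches:
            raise ValueError("union has no non-null branch")
        if len(branches) == 1:
            return avro_to_orc(branches[0])  # union<null,X> -> X
        return "uniontype<" + ",".join(avro_to_orc(b) for b in branches) + ">"
    kind = schema["type"]
    if kind == "record":
        fields = ",".join(
            f"{f['name']}:{avro_to_orc(f['type'])}" for f in schema["fields"]
        )
        return f"struct<{fields}>"
    if kind == "array":
        return f"array<{avro_to_orc(schema['items'])}>"
    if kind == "map":
        return f"map<string,{avro_to_orc(schema['values'])}>"  # Avro keys are strings
    if kind == "enum":
        return "string"
    if kind == "fixed":
        return "binary"
    # Forms such as {"type": "long", "logicalType": ...}: use the underlying type.
    return avro_to_orc(kind)


# Hypothetical sample schema, for illustration only.
schema = json.loads("""
{"type": "record", "name": "User", "fields": [
  {"name": "id", "type": "long"},
  {"name": "email", "type": ["null", "string"]},
  {"name": "tags", "type": {"type": "array", "items": "string"}},
  {"name": "attrs", "type": {"type": "map", "values": "double"}}
]}
""")
print(avro_to_orc(schema))
# struct<id:bigint,email:string,tags:array<string>,attrs:map<string,double>>
```
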
In terms of the conversion, you would just need to extend ConvertTool to create 
RecordReaders for Avro. There are already examples of JSON and CSV.
.. Owen

On Mon, Dec 4, 2017 at 11:31 PM, Oleg Ruchovets <[email protected]> wrote:

Hello. I wonder if there is a utility to convert Avro to ORC, similar to the
JSON-to-ORC one?
Background on what I am doing: I am reading SQL data using NiFi. NiFi returns
data in Avro format. I want to store this data on S3 in ORC format and use it
for a Hive external table. For that, I need to convert Avro to ORC and derive
the Hive schema. NiFi has an Avro-to-ORC component, but it supports an older
version of Hive and ORC.
So the question is how to convert Avro to ORC and derive the Hive schema. I
really like the utility that you guys built for JSON; it has both conversion to
ORC and Hive schema extraction. What is the way to achieve the same for the
Avro format?
Thanks,
Oleg.
