Benjamin Garrett created NIFI-4647:
--------------------------------------

             Summary: ConvertAvroToORC fails if the avro schema uses a union of a null, string, and long
                 Key: NIFI-4647
                 URL: https://issues.apache.org/jira/browse/NIFI-4647
             Project: Apache NiFi
          Issue Type: Bug
          Components: Core Framework
    Affects Versions: 1.4.0
            Reporter: Benjamin Garrett


If an Avro schema defines a field as a union of null, string, and long, like this:

{"name":"myStringOrLong","type":["null","string","long"],"default":null}

then ConvertAvroToORC fails with the following error:

Object Type for class org.apache.avro.util.Utf8 not in Union declaration

This is because Avro represents 'string' values with its internal Utf8 class
(instead of java.lang.String) as a performance optimization, as discussed here:
http://apache-avro.679487.n3.nabble.com/why-Utf8-vs-String-td3247788.html

NiFiOrcUtils, however, does not expect the Utf8 class for 'string' types, so
the runtime class of the value does not match any member of the union.

A simple workaround is to modify NiFiOrcUtils lines 72-73 to be something like 
this:

                // Treat Avro's Utf8 as String when resolving the union member type
                Class<?> clazzToCompareTo = o.getClass();
                if (o instanceof org.apache.avro.util.Utf8) {
                    clazzToCompareTo = String.class;
                }
                TypeInfo objectTypeInfo = TypeInfoUtils.getTypeInfoFromObjectInspector(
                        ObjectInspectorFactory.getReflectionObjectInspector(
                                clazzToCompareTo,
                                ObjectInspectorFactory.ObjectInspectorOptions.JAVA));

This solution works for me.
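
To illustrate the idea in isolation, here is a minimal, self-contained sketch
that runs without Avro or Hive on the classpath. It is not the NiFiOrcUtils
code: FakeUtf8 is a hypothetical stand-in for org.apache.avro.util.Utf8, and
UNION_MEMBERS, naiveTag, and normalizedTag are names made up for this example.
It only shows why a lookup by runtime class misses the 'string' member of the
union, and how mapping the Utf8-like class to String first avoids that.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for org.apache.avro.util.Utf8 (assumption for this sketch).
final class FakeUtf8 implements CharSequence {
    private final byte[] bytes;

    FakeUtf8(String s) {
        this.bytes = s.getBytes(StandardCharsets.UTF_8);
    }

    @Override public int length() { return toString().length(); }
    @Override public char charAt(int index) { return toString().charAt(index); }
    @Override public CharSequence subSequence(int start, int end) {
        return toString().subSequence(start, end);
    }
    @Override public String toString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

final class Utf8UnionSketch {
    // Hypothetical: Java classes for the non-null union members ["string", "long"].
    static final List<Class<?>> UNION_MEMBERS = Arrays.asList(String.class, Long.class);

    // Looks up the union member by the value's runtime class.
    // Returns -1 when the class is not one of the declared members.
    static int naiveTag(Object o) {
        return UNION_MEMBERS.indexOf(o.getClass());
    }

    // Same lookup, but the Utf8-like class is mapped to String first.
    static int normalizedTag(Object o) {
        Class<?> clazzToCompareTo = (o instanceof FakeUtf8) ? String.class : o.getClass();
        return UNION_MEMBERS.indexOf(clazzToCompareTo);
    }

    public static void main(String[] args) {
        Object value = new FakeUtf8("hello");
        System.out.println(naiveTag(value));      // -1: runtime class is not in the union
        System.out.println(normalizedTag(value)); // 0: matched as 'string'
        System.out.println(normalizedTag(42L));   // 1: matched as 'long'
    }
}
```

In the real processor the comparison goes through Hive TypeInfo objects rather
than a list of classes, so this is only an analogy for the proposed change.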



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
