[ https://issues.apache.org/jira/browse/NIFI-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Savitsky updated NIFI-5735:
--------------------------------
    Attachment: NIFI-5735.patch

> Record-oriented processors/services do not properly support Avro Unions
> -----------------------------------------------------------------------
>
>                 Key: NIFI-5735
>                 URL: https://issues.apache.org/jira/browse/NIFI-5735
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework, Extensions
>    Affects Versions: 1.7.1
>            Reporter: Daniel Solow
>            Priority: Major
>              Labels: AVRO, avro
>         Attachments: 0001-NIFI-5735-added-preliminary-support-for-union-resolu.patch, NIFI-5735.patch
>
>
> The [Avro spec|https://avro.apache.org/docs/1.8.2/spec.html#Unions] states:
> {quote}Unions may not contain more than one schema with the same type,
> *except for the named types* record, fixed and enum. For example, unions
> containing two array types or two map types are not permitted, but two types
> with different names are permitted. (Names permit efficient resolution when
> reading and writing unions.)
> {quote}
> However, record-oriented processors/services in NiFi do not support multiple
> named types per union. This is a problem, for example, with the following
> schema:
> {code:javascript}
> {
>     "type": "record",
>     "name": "root",
>     "fields": [
>         {
>             "name": "children",
>             "type": {
>                 "type": "array",
>                 "items": [
>                     {
>                         "type": "record",
>                         "name": "left",
>                         "fields": [
>                             {
>                                 "name": "f1",
>                                 "type": "string"
>                             }
>                         ]
>                     },
>                     {
>                         "type": "record",
>                         "name": "right",
>                         "fields": [
>                             {
>                                 "name": "f2",
>                                 "type": "int"
>                             }
>                         ]
>                     }
>                 ]
>             }
>         }
>     ]
> }
> {code}
> This schema contains a field named "children", which is an array whose items
> are a union. The union contains two possible record types. Currently the NiFi
> Avro utilities fail to process records of this schema whose "children" arrays
> contain both "left" and "right" record types.
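
As a rough illustration of the schema above (not NiFi's code and not the attached patch), the following Python sketch shows why a union of two named records cannot be told apart by type alone, and how a value could be matched to a branch instead. The helper resolve_union_branch and its match-on-exact-field-names rule are assumptions made for this example; they ignore defaults, aliases and field types.

```python
import json

# Schema copied from the report: "children" is an array whose items are a
# union of two named record types, "left" and "right".
SCHEMA = json.loads("""
{
  "type": "record",
  "name": "root",
  "fields": [
    {
      "name": "children",
      "type": {
        "type": "array",
        "items": [
          {"type": "record", "name": "left",
           "fields": [{"name": "f1", "type": "string"}]},
          {"type": "record", "name": "right",
           "fields": [{"name": "f2", "type": "int"}]}
        ]
      }
    }
  ]
}
""")

# In Avro's JSON schema form, a union is written as a list of branches.
UNION = SCHEMA["fields"][0]["type"]["items"]

# Keying branches by type alone collapses both records into one entry ...
by_type = {branch["type"]: branch["name"] for branch in UNION}
# ... while keying by name keeps them apart.
by_name = {branch["name"]: branch for branch in UNION}


def resolve_union_branch(union, value):
    """Hypothetical helper: pick the record branch whose field names equal the value's keys."""
    if isinstance(value, dict):
        for branch in union:
            # Skip anything that is not an inline record definition.
            if not (isinstance(branch, dict) and branch.get("type") == "record"):
                continue
            if {f["name"] for f in branch["fields"]} == set(value):
                return branch
    raise ValueError(f"no union branch matches {value!r}")


# A "children" array that mixes both record types, as in the failing case.
children = [{"f1": "a"}, {"f2": 1}]
resolved = [resolve_union_branch(UNION, child)["name"] for child in children]

print(by_type)   # {'record': 'right'}
print(resolved)  # ['left', 'right']
```

Matching on field names is only one possible strategy; a real implementation would more likely try each candidate branch in turn and keep the first one the value validates against.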
> I've traced this bug to the [AvroTypeUtil
> class|https://github.com/apache/nifi/blob/rel/nifi-1.7.1/nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-avro-record-utils/src/main/java/org/apache/nifi/avro/AvroTypeUtil.java].
> Specifically, there are bugs in the convertUnionFieldValue method and in the
> buildAvroSchema method. Both methods assume that an Avro union can contain
> at most one branch of each type. As stated in the spec, this is true for
> primitive types and unnamed complex types, but not for named types.
> There may be related bugs elsewhere, but I haven't been able to locate them
> yet.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)