[
https://issues.apache.org/jira/browse/NIFI-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pierre Villard updated NIFI-5735:
---------------------------------
Resolution: Feedback Received
Status: Resolved (was: Patch Available)
Apache NiFi 1.x is no longer maintained and no new release is planned on the
1.x release line. Marking as resolved as part of a cleanup operation. Please
open a new issue with an updated description if this is still relevant for
NiFi 2.x.
> Record-oriented processors/services do not properly support Avro Unions
> -----------------------------------------------------------------------
>
> Key: NIFI-5735
> URL: https://issues.apache.org/jira/browse/NIFI-5735
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework, Extensions
> Affects Versions: 1.7.1
> Reporter: Daniel Solow
> Priority: Major
> Labels: AVRO, avro
> Attachments:
> 0001-NIFI-5735-added-preliminary-support-for-union-resolu.patch,
> NIFI-5735.patch
>
>
> The [Avro spec|https://avro.apache.org/docs/1.8.2/spec.html#Unions] states:
> {quote}Unions may not contain more than one schema with the same type,
> *except for the named types* record, fixed and enum. For example, unions
> containing two array types or two map types are not permitted, but two types
> with different names are permitted. (Names permit efficient resolution when
> reading and writing unions.)
> {quote}
> However, record-oriented processors/services in NiFi do not support multiple
> named types per union. This is a problem, for example, with the following
> schema:
> {code:javascript}
> {
>   "type": "record",
>   "name": "root",
>   "fields": [
>     {
>       "name": "children",
>       "type": {
>         "type": "array",
>         "items": [
>           {
>             "type": "record",
>             "name": "left",
>             "fields": [
>               {
>                 "name": "f1",
>                 "type": "string"
>               }
>             ]
>           },
>           {
>             "type": "record",
>             "name": "right",
>             "fields": [
>               {
>                 "name": "f2",
>                 "type": "int"
>               }
>             ]
>           }
>         ]
>       }
>     }
>   ]
> }
> {code}
> This schema contains a field named "children", which is an array whose items
> are a union. The union contains two possible record types. Currently the
> NiFi Avro utilities fail to process records of this schema when the
> "children" array contains both "left" and "right" record types.
> I've traced this bug to the [AvroTypeUtil
> class|https://github.com/apache/nifi/blob/rel/nifi-1.7.1/nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-avro-record-utils/src/main/java/org/apache/nifi/avro/AvroTypeUtil.java].
> Specifically, there are bugs in the convertUnionFieldValue method and in the
> buildAvroSchema method. Both of these methods assume that an Avro union can
> contain at most one branch of each type. As stated in the spec, this is true
> for primitive types and non-named complex types, but not for the named types
> (record, fixed and enum).
> There may be related bugs elsewhere, but I haven't been able to locate them
> yet.
>
>
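The union resolution described in the quoted issue can be illustrated with a minimal sketch. This is not NiFi's code and not the content of the attached patches; it is a plain-Python illustration that assumes records are represented as dicts and that a record branch is chosen by comparing the datum's keys and value types with each branch's fields. The names `matches_record` and `resolve_union_branch` are hypothetical, and only the `string` and `int` field types from the example schema are handled.

```python
# Hypothetical sketch: pick the union branch for each array element instead of
# assuming a union holds at most one record type.

# The "items" union from the example schema: two differently named records.
UNION = [
    {"type": "record", "name": "left",
     "fields": [{"name": "f1", "type": "string"}]},
    {"type": "record", "name": "right",
     "fields": [{"name": "f2", "type": "int"}]},
]

# Only the primitive types used by the example schema are covered here.
PRIMITIVE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "int": lambda v: isinstance(v, int) and not isinstance(v, bool),
}


def matches_record(schema, datum):
    """Return True if the dict `datum` has exactly the fields of `schema`
    and every value passes the check for its declared primitive type."""
    if not isinstance(datum, dict):
        return False
    fields = schema["fields"]
    if set(datum) != {f["name"] for f in fields}:
        return False
    return all(PRIMITIVE_CHECKS[f["type"]](datum[f["name"]]) for f in fields)


def resolve_union_branch(union, datum):
    """Return the name of the first record branch that fits `datum`.

    Every record branch is tried, so a union may contain several records
    as long as their names (and, in this sketch, their fields) differ.
    """
    for branch in union:
        if (isinstance(branch, dict)
                and branch.get("type") == "record"
                and matches_record(branch, datum)):
            return branch["name"]
    raise ValueError(f"no union branch matches {datum!r}")


# A "children" array that mixes both record types.
children = [{"f1": "a"}, {"f2": 1}]
resolved = [resolve_union_branch(UNION, child) for child in children]
```

Matching on field names is only one possible strategy. It cannot tell apart two named records that have identical fields, so an actual fix would probably need to carry the record name alongside the value.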