[ 
https://issues.apache.org/jira/browse/NIFI-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325186#comment-17325186
 ] 

ASF subversion and git services commented on NIFI-8365:
-------------------------------------------------------

Commit a50957161cef12a63a1ff76bcaf718ecab2e71b5 in nifi's branch 
refs/heads/main from Tamas Palfy
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=a509571 ]

NIFI-8365 Fix JSON AbstractJsonRowRecordReader to handle deep CHOICE-typed 
records properly: the existing logic selects the first compatible schema, 
which can be missing fields that the real value has. Search for a stricter 
match first and fall back to the existing logic only if none is found.
- AbstractJsonRowRecordReader - Handle (that is, log a warning instead of 
failing completely) a multi-array CHOICE type when the data has extra fields 
(not defined by the schema) and the correct type cannot be determined.
- AvroTypeUtil - Allow multiple different record types in avro union type. 
Minor refactors. Added documentation for EqualsWrapper.
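
The selection order described in the commit message (stricter match first, then fall back to the first compatible schema) can be sketched roughly as follows. This is a simplified Python illustration, not the NiFi Java code: `is_compatible`, `is_strict_match`, `choose_schema` and the dict-based schemas are hypothetical names, and the compatibility rule (fields missing on either side are tolerated) is an assumption made for the example.

```python
from typing import Any, Dict, List, Optional

Schema = Dict[str, type]  # hypothetical stand-in for a record schema


def is_compatible(value: Dict[str, Any], schema: Schema) -> bool:
    # Lenient check (assumed rule): fields present on both sides must have
    # the declared type; fields missing on either side are tolerated.
    return all(type(value[name]) is ftype
               for name, ftype in schema.items() if name in value)


def is_strict_match(value: Dict[str, Any], schema: Schema) -> bool:
    # Stricter check: the value and the schema have exactly the same field
    # names, and every field has the declared type.
    return value.keys() == schema.keys() and is_compatible(value, schema)


def choose_schema(value: Dict[str, Any],
                  candidates: List[Schema]) -> Optional[Schema]:
    # First pass: prefer a schema that matches the value exactly.
    for schema in candidates:
        if is_strict_match(value, schema):
            return schema
    # Fallback: the earlier behaviour, first compatible schema wins.
    for schema in candidates:
        if is_compatible(value, schema):
            return schema
    return None


INT_BOOL: Schema = {"integer": int, "boolean": bool}
INT_STR: Schema = {"integer": int, "string": str}

chosen = choose_schema({"integer": 2, "string": "stringValue2"},
                       [INT_BOOL, INT_STR])
print(chosen is INT_STR)
```

With only the fallback pass, the same value would be matched to `INT_BOOL`, because that schema comes first and the lenient check ignores the extra "string" field.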


> JSON record reader mishandles deep CHOICE types
> -----------------------------------------------
>
>                 Key: NIFI-8365
>                 URL: https://issues.apache.org/jira/browse/NIFI-8365
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Tamas Palfy
>            Priority: Major
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When trying to find the correct schema for a given record, the 
> AbstractJsonRowRecordReader may come up with the wrong one.
> For example, suppose the following record:
> {code:json}
> {
>   "dataCollection":[
>     {
>       "record": {
>         "integer": 1,
>         "boolean": true
>       }
>     },
>     {
>       "record": {
>         "integer": 2,
>         "string": "stringValue2"
>       }
>     }
>   ]
> }
> {code}
> Even if the schema is correctly set (which may not be the case, as schema 
> inference itself has a similar issue),
> the second record
> {code:json}
>     {
>       "record": {
>         "integer": 2,
>         "string": "stringValue2"
>       }
>     }
> {code}
> will be assigned the schema of the first (["integer" : "INT", "boolean" : 
> "BOOLEAN"] instead of ["integer" : "INT", "string" : "STRING"]).
> This will cause the fields that are not present in the schema (in this case 
> "string") to be omitted when the record is written out.
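
The behaviour reported above can be reproduced with a small model. This is a Python sketch under stated assumptions, not the NiFi implementation: `first_compatible` and `write_record` are hypothetical helpers, schemas are plain dicts, and the lenient rule (extra or missing fields do not disqualify a schema) is assumed for illustration.

```python
from typing import Any, Dict, List, Optional

Schema = Dict[str, type]  # hypothetical stand-in for a record schema

INT_BOOL: Schema = {"integer": int, "boolean": bool}
INT_STR: Schema = {"integer": int, "string": str}


def first_compatible(value: Dict[str, Any],
                     candidates: List[Schema]) -> Optional[Schema]:
    # Return the first schema whose declared fields, where present in the
    # value, have the declared type. Extra fields in the value are ignored.
    for schema in candidates:
        if all(type(value[name]) is ftype
               for name, ftype in schema.items() if name in value):
            return schema
    return None


def write_record(value: Dict[str, Any], schema: Schema) -> Dict[str, Any]:
    # Writing only emits the fields the chosen schema knows about.
    return {name: value[name] for name in schema if name in value}


records = [
    {"integer": 1, "boolean": True},
    {"integer": 2, "string": "stringValue2"},
]
candidates = [INT_BOOL, INT_STR]

written = [write_record(r, first_compatible(r, candidates)) for r in records]
print(written)  # [{'integer': 1, 'boolean': True}, {'integer': 2}]
```

The second record is matched to the first schema because its only shared field, "integer", has the right type, so "string" is lost on output.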



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
