Matt Burgess created NIFI-10956:
-----------------------------------
Summary: Schema Inference returns incorrect datatype for records where some arrays are empty
Key: NIFI-10956
URL: https://issues.apache.org/jira/browse/NIFI-10956
Project: Apache NiFi
Issue Type: Bug
Reporter: Matt Burgess
If a FlowFile contains an array field whose value is an empty array in at least
one record and a non-empty array (for example, an array of records) in at least
one other record, the inference logic returns a choice between array<string>
and array<record>. The array<string> option can then be used for the array
elements even when they are records.
For text-based writers such as JsonRecordSetWriter, this results in a string
representation of the record, something like "MapRecord[{a=1,b=2}]", instead of
an actual record object. The cause is that empty arrays default to
array<string> even when they are part of a choice that also includes non-empty
arrays. Instead, the inference logic should determine whether any of the
possible choice datatypes come from empty arrays and remove them from the list
of possible choices (unless an empty array is the only choice, in which case it
should default to array<string> as it does now).
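
The proposed rule can be sketched as follows. This is only an illustration of
the pruning logic described above, not NiFi code: ArrayType and
resolve_array_choices are hypothetical names, and it assumes the inference step
can mark an array whose element type is unknown (because it was empty) with
element=None instead of immediately defaulting it to string.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ArrayType:
    """Hypothetical stand-in for an inferred array datatype.

    element is None when the observed array was empty, so no element
    type could be inferred from it.
    """
    element: Optional[str]


def resolve_array_choices(candidates: List[ArrayType]) -> List[ArrayType]:
    """Drop empty-array candidates when a non-empty candidate exists."""
    if not candidates:
        return []
    # Keep only candidates with a known element type, de-duplicated
    # while preserving the order in which they were first seen.
    known = list(dict.fromkeys(c for c in candidates if c.element is not None))
    if known:
        return known
    # Every observed array was empty: fall back to array<string>,
    # matching the current default behavior.
    return [ArrayType("string")]


# One record had an empty array, another had an array of records.
mixed = resolve_array_choices([ArrayType(None), ArrayType("record")])

# Every record had an empty array.
only_empty = resolve_array_choices([ArrayType(None), ArrayType(None)])
```

With this rule, the mixed case resolves to array<record> alone, so a writer
would no longer have array<string> available as a choice for record elements,
while the all-empty case keeps today's array<string> default.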