[
https://issues.apache.org/jira/browse/NIFI-14331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17932772#comment-17932772
]
Daniel Stieglitz commented on NIFI-14331:
-----------------------------------------
It looks like the main issue is in the method convertJsonNodeToRecord of
JsonTreeRowRecordReader, which has the signature
{code:java}
private Record convertJsonNodeToRecord(final JsonNode jsonNode, final RecordSchema schema,
        final String fieldNamePrefix, final boolean coerceTypes, final boolean dropUnknown)
        throws IOException, MalformedRecordException{code}
In that method, the variable jsonNodeForSerialization (of type JsonNode) is not
updated when recursively descending into the JSON, whereas the record data
stored in the variable values (of type Map<String, Object>) has the correctly
filtered data. Hence the record returned at the end of the method
{code:java}
final Supplier<String> supplier = jsonNodeForSerialization::toString;
return new MapRecord(schema, values, SerializedForm.of(supplier,
"application/json"), false, dropUnknown);{code}
has the incorrect serialized form of the JSON.
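
To make the mismatch concrete, here is a minimal sketch that uses plain java.util collections instead of the NiFi and Jackson types. The class DropUnknownSketch, its methods, and the schema field names "first" and "title" are hypothetical and only for illustration; this is not the NiFi implementation. It shows that recursive filtering builds a cleaned copy while the original input still holds the nested unknown key, so anything serialized from the original (as jsonNodeForSerialization appears to be) would still contain it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DropUnknownSketch {

    // Hypothetical stand-in for a record schema: each key is a known field name.
    // The value is a nested schema (for objects or arrays of objects) or null for scalars.
    static Map<String, Object> dropUnknown(Map<String, Object> data, Map<String, Object> schema) {
        Map<String, Object> filtered = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : data.entrySet()) {
            if (!schema.containsKey(entry.getKey())) {
                continue; // unknown field at this level: drop it
            }
            filtered.put(entry.getKey(), filterValue(entry.getValue(), schema.get(entry.getKey())));
        }
        return filtered;
    }

    @SuppressWarnings("unchecked")
    static Object filterValue(Object value, Object childSchema) {
        if (!(childSchema instanceof Map)) {
            return value; // scalar field: nothing to filter
        }
        Map<String, Object> nested = (Map<String, Object>) childSchema;
        if (value instanceof Map) {
            return dropUnknown((Map<String, Object>) value, nested); // nested object
        }
        if (value instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object element : (List<Object>) value) {
                out.add(filterValue(element, childSchema)); // each object in the array
            }
            return out;
        }
        return value;
    }

    public static void main(String[] args) {
        // Schema: "name" is an object, "jobs" is an array of objects.
        Map<String, Object> nameSchema = new LinkedHashMap<>();
        nameSchema.put("first", null);
        Map<String, Object> jobSchema = new LinkedHashMap<>();
        jobSchema.put("title", null);
        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("name", nameSchema);
        schema.put("jobs", jobSchema);

        // Input with unknown keys at the top level and inside nested objects.
        Map<String, Object> name = new LinkedHashMap<>();
        name.put("first", "John");
        name.put("undefinedKeyInObject", "x");
        Map<String, Object> job = new LinkedHashMap<>();
        job.put("title", "Engineer");
        job.put("undefinedKeyInObject", "x");
        List<Object> jobs = new ArrayList<>();
        jobs.add(job);
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("name", name);
        input.put("jobs", jobs);
        input.put("undefinedKey", "y");

        System.out.println("filtered: " + dropUnknown(input, schema));
        System.out.println("original: " + input); // still contains every unknown key
    }
}
```

In this sketch the filtered copy plays the role of values and the untouched input plays the role of the node used for serialization; a fix along these lines would have to serialize from the filtered structure, or update the serialization node at each level of the recursion.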
> Unknown embedded fields not dropped by JSON Writer as expected by specified
> schema
> ----------------------------------------------------------------------------------
>
> Key: NIFI-14331
> URL: https://issues.apache.org/jira/browse/NIFI-14331
> Project: Apache NiFi
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Daniel Stieglitz
> Assignee: Daniel Stieglitz
> Priority: Major
> Attachments: convertRecordResults.json, person.avsc,
> person_dropfield.json
>
>
> NIFI-13843 was intended to eliminate any fields found in the JSON which are
> not defined in a specified Avro schema. While that fix seems to have solved
> the issue for top-level fields, it did not solve it for an undefined key
> within a defined object, or for an undefined key in a defined object inside
> an array. Attached are the person.avsc Avro schema and
> person_dropfield.json, which includes undefined top-level fields: a single
> key-value pair ("undefinedKey"), an array ("undefinedScalarArray"), an object
> ("undefinedObject") and an object array ("undefinedObjectArray"). It also
> includes an undefined field ("undefinedKeyInObject") inside the defined
> top-level "name" object and an undefined field ("undefinedKeyInObject") in a
> "job" object found in the "jobs" array. The result of calling ConvertRecord
> can be seen in the attached convertRecordResults.json. Note that the fields
> "undefinedKey", "undefinedScalarArray", "undefinedObject" and
> "undefinedObjectArray" all get dropped, while the "undefinedKeyInObject"
> fields still exist in the "name" object and in the "job" object inside the
> "jobs" array.
> Currently this behavior is seen in both ConvertRecord and MergeRecord when
> both are configured with a JsonTreeReader and a JsonRecordSetWriter.
> It is interesting to note that in NiFi 1.28.1 this behavior is seen only for
> MergeRecord, while ConvertRecord drops all unknown fields.