[
https://issues.apache.org/jira/browse/NIFI-14496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Handermann updated NIFI-14496:
------------------------------------
Fix Version/s: 2.4.0
Resolution: Fixed
Status: Resolved (was: Patch Available)
> ConvertRecord processor cannot convert Avro bytes typed field to string
> properly
> --------------------------------------------------------------------------------
>
> Key: NIFI-14496
> URL: https://issues.apache.org/jira/browse/NIFI-14496
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 2.0.0, 2.1.0, 2.2.0, 2.3.0
> Reporter: Yuanhao Zhu
> Assignee: Pierre Villard
> Priority: Blocker
> Fix For: 2.4.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> When using the ConvertRecord processor in 2.x, we found that it cannot
> convert an Avro bytes field into a string properly.
> The setup is as follows: ConvertRecord uses an Avro reader that takes the
> schema embedded in the Avro file. The record writer is a
> JsonRecordSetWriter that uses a custom schema (copied from the Avro file's
> schema, except that the "Body" field is declared as string; in the Avro
> file's embedded schema, "Body" is declared as bytes).
>
> In 1.x, the "Body" field is converted into a string containing JSON
> objects, which we then pass to EvaluateJsonPath for further extraction. In
> 2.x, however, the value of the "Body" field is always something like
> "[Ljava.lang.Object;@279aa943", which is the default toString() output of
> an Object array.
>
> After some investigation in the NiFi repository, I think the reason is that
> in the 1.x DataTypeUtils conversion, the toString method also handles the
> case where the incoming value is an array of objects,
> [https://github.com/apache/nifi/blob/883338fe28883733417d10f6ffa9319e75f5ea06/nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/util/DataTypeUtils.java#L975]
>
> where it converts each of the objects into a string. In 2.x, where the
> conversion has moved to ObjectStringFieldConverter.java,
> [https://github.com/apache/nifi/blob/0fde8be07270e41433d07fa1e3f940b1a08674d9/nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/field/ObjectStringFieldConverter.java#L102]
> this scenario is not covered; instead, the default toString method of the
> incoming object is invoked, which explains why we see
> "[Ljava.lang.Object;@279aa943" in 2.x.
> I am not sure why the Avro reader reads the byte array in as an Object
> array, though.
> Would you mind taking a look at it? Thanks!
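
For illustration only, here is a minimal sketch of the kind of handling the description refers to. It is not the NiFi code from either linked file and not the fix applied in 2.4.0. The class name BytesToStringSketch and the method toStringValue are hypothetical, and the sketch assumes the Object array holds one boxed byte per element, as would be expected for an Avro bytes field.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the NiFi implementation.
class BytesToStringSketch {

    // Converts a value to a String, decoding byte arrays (primitive or boxed)
    // with the given charset instead of falling back to Object.toString().
    static String toStringValue(Object value, Charset charset) {
        if (value == null) {
            return null;
        }
        if (value instanceof byte[]) {
            return new String((byte[]) value, charset);
        }
        if (value instanceof Object[]) {
            Object[] elements = (Object[]) value;
            byte[] decoded = new byte[elements.length];
            for (int i = 0; i < elements.length; i++) {
                // Assumption: every element is a Number holding a single byte.
                decoded[i] = ((Number) elements[i]).byteValue();
            }
            return new String(decoded, charset);
        }
        // Anything else keeps its own toString() representation.
        return value.toString();
    }

    public static void main(String[] args) {
        byte[] raw = "{\"id\":1}".getBytes(StandardCharsets.UTF_8);
        Object[] boxed = new Object[raw.length];
        for (int i = 0; i < raw.length; i++) {
            boxed[i] = raw[i]; // autoboxed to Byte
        }
        // Default array toString: the type descriptor plus an identity hash.
        System.out.println(boxed.toString());
        // Decoded content of the boxed bytes.
        System.out.println(toStringValue(boxed, StandardCharsets.UTF_8));
    }
}
```

Running main prints the "[Ljava.lang.Object;@..." form first (the hash varies between runs) and then the decoded JSON text, which mirrors the difference between the 2.x and 1.x results described above.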
--
This message was sent by Atlassian Jira
(v8.20.10#820010)