[ 
https://issues.apache.org/jira/browse/NIFI-14496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Handermann updated NIFI-14496:
------------------------------------
    Fix Version/s: 2.4.0
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

> ConvertRecord processor cannot convert Avro bytes typed field to string 
> properly
> --------------------------------------------------------------------------------
>
>                 Key: NIFI-14496
>                 URL: https://issues.apache.org/jira/browse/NIFI-14496
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 2.0.0, 2.1.0, 2.2.0, 2.3.0
>            Reporter: Yuanhao Zhu
>            Assignee: Pierre Villard
>            Priority: Blocker
>             Fix For: 2.4.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> When using the ConvertRecord processor in 2.x, we found that it cannot 
> convert an Avro bytes field into a string properly.
> The setup is as follows: ConvertRecord uses an Avro reader that uses the 
> schema embedded in the Avro file. The record writer is a 
> JsonRecordSetWriter that uses a custom schema, copied from the Avro file's 
> schema except that the "Body" field is declared as string (in the Avro 
> file's embedded schema, the "Body" field is declared as bytes).
>  
> In 1.x the "Body" field was converted into a string containing JSON 
> objects, which we then passed to EvaluateJsonPath for further extraction. 
> In 2.x, however, the "Body" field always comes out as something like 
> "[Ljava.lang.Object;@279aa943", which is the value returned by toString 
> on an Object array.
>  
> After some investigation in the NiFi repository, I think the reason is that 
> in 1.x the toString method in DataTypeUtils also handles the case where the 
> incoming value is an array of objects:
> [https://github.com/apache/nifi/blob/883338fe28883733417d10f6ffa9319e75f5ea06/nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/util/DataTypeUtils.java#L975]
>  
> There it converts each of the objects into a string. In 2.x, the 
> conversion has moved to ObjectStringFieldConverter.java: 
> [https://github.com/apache/nifi/blob/0fde8be07270e41433d07fa1e3f940b1a08674d9/nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/field/ObjectStringFieldConverter.java#L102]
> This case is not covered there, so the default toString method of the 
> incoming object is invoked instead, which explains why we see 
> "[Ljava.lang.Object;@279aa943" in 2.x.
> I am not sure why the Avro reader reads the byte array in as an Object 
> array, though. 
> Would you mind taking a look? Thanks!
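
The symptom described above can be reproduced outside NiFi. The following is a minimal sketch, not the NiFi implementation: the class name BytesFieldToStringSketch and the helpers toStringValue and box are hypothetical, and UTF-8 is an assumed charset. It shows why calling toString on an Object array yields "[Ljava.lang.Object;@..." and how an Object array of boxed bytes could instead be decoded into the original text.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BytesFieldToStringSketch {

    // Hypothetical helper (not NiFi code): converts a value to a String,
    // treating an Object[] whose elements are all Byte as raw bytes.
    static String toStringValue(Object value) {
        if (value instanceof byte[]) {
            return new String((byte[]) value, StandardCharsets.UTF_8);
        }
        if (value instanceof Object[]) {
            Object[] array = (Object[]) value;
            byte[] bytes = new byte[array.length];
            for (int i = 0; i < array.length; i++) {
                if (!(array[i] instanceof Byte)) {
                    // Not a boxed byte array: fall back to an element-wise rendering.
                    return Arrays.toString(array);
                }
                bytes[i] = (Byte) array[i]; // auto-unboxing Byte -> byte
            }
            return new String(bytes, StandardCharsets.UTF_8);
        }
        // Default path: for arrays this would be the identity-style toString.
        return String.valueOf(value);
    }

    // Hypothetical helper: boxes a byte[] into an Object[], mimicking the
    // shape in which the reporter says the bytes field arrives.
    static Object[] box(byte[] raw) {
        Object[] boxed = new Object[raw.length];
        for (int i = 0; i < raw.length; i++) {
            boxed[i] = raw[i]; // auto-boxing byte -> Byte
        }
        return boxed;
    }

    public static void main(String[] args) {
        Object[] body = box("{\"id\":1}".getBytes(StandardCharsets.UTF_8));

        // Default Object#toString on an array: type tag plus identity hash code.
        System.out.println(body.toString());

        // Decoding the boxed bytes recovers the JSON text.
        System.out.println(toStringValue(body));
    }
}
```

The first line printed starts with "[Ljava.lang.Object;@" followed by a hash code that varies between runs; the second line is the decoded JSON text. Whether the actual fix in 2.4.0 takes this shape is not stated in this message.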



