[ https://issues.apache.org/jira/browse/NIFI-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

André Lison updated NIFI-9357:
------------------------------
    Description: 
Hello,

given a message schema that defines a field as either string, double or null:

    {
      ...
      "fields": [
        {"name": "Value", "type": ["null", "double", "string"]}
      ]
    }

This schema is used by a JSONTreeRecordReader to parse the following message:

    [{
      "Value": "12345"
    }]

Unfortunately, DataTypeUtils detects that the string contains a number and 
converts it to a double. This behaviour is undesired. In my case, it causes a 
double to be written to an Influx field. However, the next message may contain 
a string that cannot be parsed as a number, so its data type remains "string". 
Trying to write a string to a shard that already holds a double for that field 
results in an error.

The desired behaviour is to preserve the string when the schema allows a 
string, and likewise for other data types.

I tracked it down to the method 
org.apache.nifi.serialization.record.util.DataTypeUtils#findMostSuitableTypeByStringValue.

This method sorts the possible data types from the schema by their enum order 
and returns the first match. In my case above, this is always double.
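
To illustrate the selection logic described above, here is a minimal, 
self-contained sketch. It is not the NiFi code: SketchType, its declaration 
order (DOUBLE before STRING), firstMatchByEnumOrder and preferString are all 
hypothetical names chosen for this example, and the "null" choice is left out 
for brevity. It only shows why sorting by enum order picks double for 
"12345", and what preserving the string could look like.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class ChoiceResolutionSketch {

    // Hypothetical stand-in for a field-type enum. The declaration order
    // (DOUBLE before STRING) is an assumption made for this illustration.
    enum SketchType {
        DOUBLE, STRING;

        // True if the given non-null string can be read as this type.
        boolean accepts(String value) {
            if (this == STRING) {
                return true; // every string is a valid string
            }
            try {
                Double.parseDouble(value);
                return true;
            } catch (NumberFormatException e) {
                return false;
            }
        }
    }

    // Behaviour as described in the report: sort the schema's choices by
    // enum order and return the first type that accepts the value.
    static Optional<SketchType> firstMatchByEnumOrder(String value, List<SketchType> choices) {
        return choices.stream()
                .sorted(Comparator.comparingInt(SketchType::ordinal))
                .filter(type -> type.accepts(value))
                .findFirst();
    }

    // One possible way to get the desired behaviour: if the schema allows
    // a string, keep the string instead of converting it.
    static Optional<SketchType> preferString(String value, List<SketchType> choices) {
        if (choices.contains(SketchType.STRING)) {
            return Optional.of(SketchType.STRING);
        }
        return firstMatchByEnumOrder(value, choices);
    }

    public static void main(String[] args) {
        List<SketchType> choices = List.of(SketchType.DOUBLE, SketchType.STRING);
        System.out.println(firstMatchByEnumOrder("12345", choices).get()); // DOUBLE
        System.out.println(firstMatchByEnumOrder("abc", choices).get());   // STRING
        System.out.println(preferString("12345", choices).get());          // STRING
    }
}
```

With the first strategy the resolved type depends on the content of each 
message ("12345" becomes a double, "abc" stays a string), which matches the 
field type conflict described above. With the second, the type is the same 
for every message as long as the schema allows a string.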

Best,
 André

 


> DataTypeUtils: Wrong conversion of String to Double in case of choice in 
> schema
> -------------------------------------------------------------------------------
>
>                 Key: NIFI-9357
>                 URL: https://issues.apache.org/jira/browse/NIFI-9357
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.13.2
>         Environment: Linux & Windows
>            Reporter: André Lison
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
