[
https://issues.apache.org/jira/browse/NIFI-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
André Lison updated NIFI-9357:
------------------------------
Description:
Hello,
given a message schema that defines a field as either string, double or null:
{
  ...
  "fields": [
    {
      "name": "Value",
      "type": [
        "null",
        "double",
        "string"]
    }
  ]
}
This schema is used by a JSONTreeRecordReader to parse the following message:
[{
  "Value": "12345"
}]
Unfortunately, DataTypeUtils detects that the string contains a number and
converts it to a double. This behaviour is undesired. In my case, it causes a
double to be written to an InfluxDB field. However, the next message may
contain a string that cannot be parsed as a number, so its data type stays
"string". Trying to write a string to a shard that is already populated with
a double results in an error.
The desired behaviour is to preserve the string when the schema allows a
string, and likewise for other data types.
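To illustrate the desired behaviour, here is a minimal, self-contained sketch. It is not NiFi code: the class PreserveTypeSketch, the enum SketchType and the method chooseType are hypothetical names made up for this example. The idea is to keep the type the value already has in the JSON whenever the choice allows it, and to convert only as a fallback.

```java
import java.util.Arrays;
import java.util.List;

class PreserveTypeSketch {

    // Hypothetical stand-in for the types allowed by the schema's choice.
    enum SketchType { DOUBLE, STRING }

    // Picks a type for a value as it arrived from the JSON parser.
    // A JSON string stays STRING if the choice allows it; conversion to
    // DOUBLE is only attempted when STRING is not among the allowed types.
    static SketchType chooseType(Object value, List<SketchType> allowed) {
        if (value instanceof String && allowed.contains(SketchType.STRING)) {
            return SketchType.STRING;
        }
        if (value instanceof Number && allowed.contains(SketchType.DOUBLE)) {
            return SketchType.DOUBLE;
        }
        if (value instanceof String && allowed.contains(SketchType.DOUBLE)) {
            try {
                Double.parseDouble((String) value);
                return SketchType.DOUBLE;
            } catch (NumberFormatException e) {
                // Not numeric: fall through to the error below.
            }
        }
        throw new IllegalArgumentException("No allowed type fits value: " + value);
    }

    public static void main(String[] args) {
        List<SketchType> choice = Arrays.asList(SketchType.DOUBLE, SketchType.STRING);
        System.out.println(chooseType("12345", choice)); // prints STRING
        System.out.println(chooseType(12345.0, choice)); // prints DOUBLE
    }
}
```

With this ordering, "12345" and a later non-numeric string would both be written as strings, so the target field would keep one data type.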
I tracked it down to the
org.apache.nifi.serialization.record.util.DataTypeUtils#findMostSuitableTypeByStringValue
method.
This method sorts the possible data types from the schema by their enum type
and returns the first match. In my case above, this is always double.
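The effect described above can be reproduced with a small stand-alone sketch. This is a simplified model and not the actual NiFi source: FirstMatchSketch, SketchType, isCompatible and firstMatch are hypothetical names, and the enum order (DOUBLE before STRING) is an assumption chosen to mirror what I observed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class FirstMatchSketch {

    // Hypothetical stand-in for the field types; DOUBLE is declared
    // before STRING on purpose (assumed ordering).
    enum SketchType { DOUBLE, STRING }

    // True if the raw string value could be read as the given type.
    static boolean isCompatible(String value, SketchType type) {
        switch (type) {
            case DOUBLE:
                try {
                    Double.parseDouble(value);
                    return true;
                } catch (NumberFormatException e) {
                    return false;
                }
            case STRING:
                return true;
            default:
                return false;
        }
    }

    // Sorts the candidates by enum order and returns the first compatible
    // one, regardless of the order in which the schema listed them.
    static Optional<SketchType> firstMatch(String value, List<SketchType> candidates) {
        List<SketchType> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingInt(SketchType::ordinal));
        return sorted.stream().filter(t -> isCompatible(value, t)).findFirst();
    }

    public static void main(String[] args) {
        List<SketchType> choice = Arrays.asList(SketchType.STRING, SketchType.DOUBLE);
        System.out.println(firstMatch("12345", choice)); // prints Optional[DOUBLE]
        System.out.println(firstMatch("abc", choice));   // prints Optional[STRING]
    }
}
```

Even with STRING listed first in the choice, the numeric-looking value resolves to DOUBLE, while a non-numeric value resolves to STRING. That is the type flip between consecutive messages that causes the write error.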
Best,
André
> DataTypeUtils: Wrong conversion of String to Double in case of choice in
> schema
> -------------------------------------------------------------------------------
>
> Key: NIFI-9357
> URL: https://issues.apache.org/jira/browse/NIFI-9357
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.13.2
> Environment: Linux & Windows
> Reporter: André Lison
> Priority: Major
>
> Hello,
> given a message schema that defines a field as either string, double or
> null:
> {
>   ...
>   "fields": [
>     {
>       "name": "Value",
>       "type": [
>         "null",
>         "double",
>         "string"]
>     }
>   ]
> }
> This schema is used by a JSONTreeRecordReader to parse the following message:
> [{
>   "Value": "12345"
> }]
> Unfortunately, DataTypeUtils detects that the string contains a number and
> converts it to a double. This behaviour is undesired. In my case, it causes
> a double to be written to an InfluxDB field. However, the next message may
> contain a string that cannot be parsed as a number, so its data type stays
> "string". Trying to write a string to a shard that is already populated with
> a double results in an error.
> The desired behaviour is to preserve the string when the schema allows a
> string, and likewise for other data types.
> I tracked it down to the
> org.apache.nifi.serialization.record.util.DataTypeUtils#findMostSuitableTypeByStringValue
> method.
> This method sorts the possible data types from the schema by their enum type
> and returns the first match. In my case above, this is always double.
> Best,
> André
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)