[jira] [Updated] (NIFI-10508) Improve Record type inference between integer and floating-point fields

Mark Payne (Jira) Thu, 15 Sep 2022 10:12:07 -0700


     [ 
https://issues.apache.org/jira/browse/NIFI-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mark Payne updated NIFI-10508:
------------------------------
    Description: 
When we configure a Record Reader to infer the type of an object, and it finds 
two values, one of which is an integer (or short, long, etc.) and the other is 
a floating-point number (float, double, etc.), the inferred type becomes a 
CHOICE between an int and a double, for example.

But this is really not a great inference. It is common to see JSON or CSV data, 
for example, where numbers are truncated if they have no decimals (e.g., 5.0 
becomes 5). It doesn't really make sense to then write this as a union between 
an int and a double, as the real type of the field is a double.

We should improve the inference logic to allow floats and doubles to 
encapsulate byte/short/int/long values.

  was:
When we configure a Record Reader to infer the type of an object, and it finds 
two values, one of which is an integer (or short, long, etc.) and the other is 
a floating point number (float, double, etc) the inferred type becomes a CHOICE 
between an int and a double, for example.

But this is really not a great inference. It is VERY common to see JSON or CSV 
data, for example, where numbers are truncated if they have no decimals (i.e., 
5.0 becomes 5). It doesn't really make sense to then write this as a union 
between an int and a double, as the real type of the field is a double.

We should improve the inference logic to allow float and doubles to encapsulate 
byte/short/int/long values.


> Improve Record type inference between integer and floating-point fields
> -----------------------------------------------------------------------
>
>                 Key: NIFI-10508
>                 URL: https://issues.apache.org/jira/browse/NIFI-10508
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>
> When we configure a Record Reader to infer the type of an object, and it 
> finds two values, one of which is an integer (or short, long, etc.) and the 
> other is a floating-point number (float, double, etc.), the inferred type 
> becomes a CHOICE between an int and a double, for example.
>
> But this is really not a great inference. It is common to see JSON or CSV 
> data, for example, where numbers are truncated if they have no decimals 
> (e.g., 5.0 becomes 5). It doesn't really make sense to then write this as a 
> union between an int and a double, as the real type of the field is a double.
>
> We should improve the inference logic to allow floats and doubles to 
> encapsulate byte/short/int/long values.
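
The proposed behavior can be illustrated with a minimal, standalone sketch. It
does not use NiFi's actual record or schema-inference classes: the class
NumericInferenceSketch, the enum NumType, and the methods inferOne, widen, and
inferField are all hypothetical names chosen for this example. The idea is
simply that when a field has both integral and floating-point values, the
inferred type widens to the floating-point type instead of becoming a CHOICE.

```java
import java.util.List;

// Hypothetical sketch; these names are NOT part of the NiFi API.
class NumericInferenceSketch {

    // Ordered from narrowest to widest, so a later constant can
    // "encapsulate" any earlier one for the purposes of this sketch.
    enum NumType { BYTE, SHORT, INT, LONG, FLOAT, DOUBLE }

    // Infer a type for one textual value (assumption: input is numeric text).
    static NumType inferOne(String raw) {
        try {
            long v = Long.parseLong(raw);
            return (v >= Integer.MIN_VALUE && v <= Integer.MAX_VALUE)
                    ? NumType.INT
                    : NumType.LONG;
        } catch (NumberFormatException e) {
            // Not an integer; throws NumberFormatException if not numeric at all.
            Double.parseDouble(raw);
            return NumType.DOUBLE;
        }
    }

    // Merge two inferred types by keeping the wider one instead of
    // producing a CHOICE(int, double).
    static NumType widen(NumType a, NumType b) {
        return a.ordinal() >= b.ordinal() ? a : b;
    }

    // Infer a single type for all values observed in one field.
    static NumType inferField(List<String> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("no values to infer from");
        }
        NumType result = inferOne(values.get(0));
        for (String value : values.subList(1, values.size())) {
            result = widen(result, inferOne(value));
        }
        return result;
    }

    public static void main(String[] args) {
        // "5" is 5.0 written without decimals; the field is still a double.
        System.out.println(inferField(List.of("5", "5.5", "7"))); // prints DOUBLE
    }
}
```

One caveat with this simple ordering: widening a long to a float or double can
lose precision for very large values, so a real implementation may want to
treat that case more carefully.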



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
