[ https://issues.apache.org/jira/browse/NIFI-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Payne updated NIFI-10508:
------------------------------
Description:
When we configure a Record Reader to infer the type of an object, and it finds
two values, one of which is an integer (or short, long, etc.) and the other is
a floating-point number (float, double, etc.), the inferred type becomes a
CHOICE between an int and a double, for example.
But this is really not a great inference. It is common to see JSON or CSV data,
for example, where numbers are written without a decimal part when they have no
fractional value (e.g., 5.0 becomes 5). It doesn't really make sense to then
write this as a union between an int and a double, as the real type of the
field is a double.
We should improve the inference logic to allow floats and doubles to
encapsulate byte/short/int/long values.
was:
When we configure a Record Reader to infer the type of an object, and it finds
two values, one of which is an integer (or short, long, etc.) and the other is
a floating-point number (float, double, etc.), the inferred type becomes a
CHOICE between an int and a double, for example.
But this is really not a great inference. It is VERY common to see JSON or CSV
data, for example, where numbers are written without a decimal part when they
have no fractional value (e.g., 5.0 becomes 5). It doesn't really make sense to
then write this as a union between an int and a double, as the real type of the
field is a double.
We should improve the inference logic to allow floats and doubles to
encapsulate byte/short/int/long values.
> Improve Record type inference between integer and floating-point fields
> -----------------------------------------------------------------------
>
> Key: NIFI-10508
> URL: https://issues.apache.org/jira/browse/NIFI-10508
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
>
> When we configure a Record Reader to infer the type of an object, and it
> finds two values, one of which is an integer (or short, long, etc.) and the
> other is a floating-point number (float, double, etc.), the inferred type
> becomes a CHOICE between an int and a double, for example.
> But this is really not a great inference. It is common to see JSON or CSV
> data, for example, where numbers are written without a decimal part when they
> have no fractional value (e.g., 5.0 becomes 5). It doesn't really make sense
> to then write this as a union between an int and a double, as the real type
> of the field is a double.
> We should improve the inference logic to allow floats and doubles to
> encapsulate byte/short/int/long values.
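
For illustration only, the proposed rule might be sketched roughly as below. This
is not NiFi's actual inference code: the class NumericWideningSketch, the
FieldType enum, and the widen method are hypothetical names made up for this
example. The sketch assumes that when one observed type is integral and the
other is floating-point, the floating-point type wins; every other mismatch is
left to whatever the existing logic does (reported here as an empty Optional,
standing in for a CHOICE).

```java
import java.util.EnumSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; names and structure are assumptions, not NiFi's API.
final class NumericWideningSketch {

    // Simplified stand-in for the record field types discussed in the issue.
    enum FieldType { BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, STRING }

    private static final Set<FieldType> INTEGRAL =
            EnumSet.of(FieldType.BYTE, FieldType.SHORT, FieldType.INT, FieldType.LONG);

    private static final Set<FieldType> FLOATING =
            EnumSet.of(FieldType.FLOAT, FieldType.DOUBLE);

    /**
     * Returns the single type that can represent both observed types, or an
     * empty Optional when this sketch has no widening rule for the pair
     * (the caller would then fall back to a CHOICE of the two types).
     */
    static Optional<FieldType> widen(FieldType first, FieldType second) {
        if (first == second) {
            return Optional.of(first);
        }
        // Integral + floating-point: the floating-point type encapsulates the other.
        if (INTEGRAL.contains(first) && FLOATING.contains(second)) {
            return Optional.of(second);
        }
        if (FLOATING.contains(first) && INTEGRAL.contains(second)) {
            return Optional.of(first);
        }
        // Any other combination is outside the scope of this sketch.
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(widen(FieldType.INT, FieldType.DOUBLE));    // Optional[DOUBLE]
        System.out.println(widen(FieldType.FLOAT, FieldType.LONG));    // Optional[FLOAT]
        System.out.println(widen(FieldType.INT, FieldType.STRING));    // Optional.empty
    }
}
```

With a rule like this, a field that holds 5 in one record and 5.5 in another
would be inferred as a double rather than as a union of int and double. Note
that very large long values cannot all be represented exactly as a float or
double, so a real implementation may need to weigh that trade-off.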