pcgrenier commented on a change in pull request #4282:
URL: https://github.com/apache/nifi/pull/4282#discussion_r427542274



##########
File path: 
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/queryrecord/FlowFileTable.java
##########
@@ -223,12 +225,69 @@ private RelDataType getRelDataType(final DataType 
fieldType, final JavaTypeFacto
             case BIGINT:
                 return typeFactory.createJavaType(BigInteger.class);
             case CHOICE:
+                final ChoiceDataType choiceDataType = (ChoiceDataType) 
fieldType;
+                DataType widestDataType = 
choiceDataType.getPossibleSubTypes().get(0);
+                for (final DataType possibleType : 
choiceDataType.getPossibleSubTypes()) {
+                    if (possibleType == widestDataType) {
+                        continue;
+                    }
+                    if 
(possibleType.getFieldType().isWiderThan(widestDataType.getFieldType())) {
+                        widestDataType = possibleType;
+                        continue;
+                    }
+                    if 
(widestDataType.getFieldType().isWiderThan(possibleType.getFieldType())) {
+                        continue;
+                    }
+
+                    // Neither is wider than the other.
+                    widestDataType = null;
+                    break;
+                }
+
+                // If one of the CHOICE data types is the widest, use it.
+                if (widestDataType != null) {
+                    return getRelDataType(widestDataType, typeFactory);
+                }
+
+                // None of the data types is strictly the widest. Check if all 
data types are numeric.
+                // This would happen, for instance, if the data type is a 
choice between float and integer.
+                // If that is the case, we can use a String type for the table 
schema because all values will fit
+                // into a String. This will still allow for casting, etc. if 
the query requires it.
+                boolean allNumeric = true;
+                for (final DataType possibleType : 
choiceDataType.getPossibleSubTypes()) {
+                    if (!isNumeric(possibleType)) {
+                        allNumeric = false;
+                        break;
+                    }
+                }
+
+                if (allNumeric) {

Review comment:
       While I do agree, but given the example if you have split the records to 
sort out all the "number" values, the schema and the data could disagree. So 
logically you could know the data is all numbers but the schema disagrees. So 
if we are hacking to a string anyway, I don't think it would hurt to just allow 
it to pass on to let Calcite try and process it. This is just allowing a "best 
case" situation where Calcite isn't passed something it has no idea what to do 
with. So I do see a benefit in allowing the string to pass through, since if it 
is not a number it will still fail at query time but will at least attempt it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to