kgyrtkirk commented on code in PR #15962:
URL: https://github.com/apache/druid/pull/15962#discussion_r1531846727


##########
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidSqlValidator.java:
##########
@@ -226,6 +235,20 @@ public void validateInsert(final SqlInsert insert)
     // Determine the output (target) schema.
     final RelDataType targetType = validateTargetType(scope, insertNs, insert, sourceType, tableMetadata);
 
+    // WITH node type is computed to be the type of the body recursively in
+    // org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery}. If this computed type
+    // is different than the type validated and stored for the node in memory a nasty relational
+    // algebra error will occur in org.apache.calcite.sql2rel.SqlToRelConverter.checkConvertedType.
+    // During the validateTargetType call above, the WITH body node validated type may be updated
+    // with any coercions applied. We update the validated node type of the
+    // WITH node here so that they are consistent.
+    if (source instanceof SqlWith) {
+      final RelDataType withBodyType = getValidatedNodeTypeIfKnown(((SqlWith) source).body);

Review Comment:
   I think we can get around this and the `WITH` issue with the following approach:
   * override `inferUnknownTypes`
   * `node` will have the full expression including the alias -> access by fieldName will be ok
   * instead of passing an `unknownType` - I've passed some type with artificially created fields referencing the druid table (this could probably be done differently)
   * this way `validateTargetType` could become kinda part of the validation
   
   however...it has led to consistency errors as well -> we are reading a select which produces a `BIGINT` as `DOUBLE` and similar...which is clearly invalid.
   
   my [branch is here](https://github.com/kgyrtkirk/druid/tree/catalog-review-15962)
   
   I think the right way to fix this will be to add proper rewrites at the right point in the query and let the system do its job - I think that won't even be a hack, as opposed to multiple long comments about doing something which is clearly questionable.
   
   I think to do that, an approach could be something like:
   * override `performUnconditionalRewrites` to be able to process `INSERT` nodes as well
   * identify columns by name from `SqlInsert#source`
   * rewrite all columns which are of interest with a dummy typed function like `druidDouble` or similar
   * use the method's validation logic to ensure that the conversion will happen correctly
   * the output type of the function will be the supported type for that column - so that won't cause any problems
   * leave the function there - however, as it was just a placeholder, it can be removed during native translation without any risks
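   To make the placeholder idea above concrete, here is a minimal toy sketch (plain Java, not the Druid/Calcite API - `druidDouble`, `Column`, and `Call` are invented names): a column of interest is wrapped in a marker call during an unconditional-rewrite-style pass, the marker carries the intended type through validation, and it can be stripped again afterwards without changing the expression:

   ```java
   import java.util.List;

   public class PlaceholderRewrite {
     // A tiny stand-in for a SQL expression tree.
     interface Node {}
     record Column(String name) implements Node {}
     record Call(String function, Node operand) implements Node {}

     // Rewrite pass: wrap every column whose name is in `targets`
     // with the placeholder marker function.
     static Node wrap(Node node, List<String> targets, String marker) {
       if (node instanceof Column c && targets.contains(c.name())) {
         return new Call(marker, c);
       }
       if (node instanceof Call call) {
         return new Call(call.function(), wrap(call.operand(), targets, marker));
       }
       return node;
     }

     // Later pass: remove the marker again. Safe, since the marker
     // was only a placeholder and never changed semantics.
     static Node strip(Node node, String marker) {
       if (node instanceof Call call) {
         Node inner = strip(call.operand(), marker);
         return call.function().equals(marker) ? inner : new Call(call.function(), inner);
       }
       return node;
     }

     public static void main(String[] args) {
       Node source = new Call("upper", new Column("delta"));
       Node rewritten = wrap(source, List.of("delta"), "druidDouble");
       // The marker sits around the column; stripping restores the original tree.
       System.out.println(rewritten);
       System.out.println(strip(rewritten, "druidDouble").equals(source));
     }
   }
   ```

   The point of the sketch is the round trip: because the marker is inert, removing it during native translation cannot change what the query computes.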
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

