[ 
https://issues.apache.org/jira/browse/CALCITE-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-7607:
------------------------------------
    Labels: pull-request-available  (was: )

> DML type coercion inserts implicit narrowing casts for target-column 
> assignments
> --------------------------------------------------------------------------------
>
>                 Key: CALCITE-7607
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7607
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: zzwqqq
>            Assignee: zzwqqq
>            Priority: Major
>              Labels: pull-request-available
>
> For {{{}INSERT{}}}, {{{}UPDATE{}}}, and {{{}MERGE{}}}, Calcite may coerce a 
> source expression to the target column type if the two types are 
> cast-compatible. This happens even when the SQL does not contain an explicit 
> {{{}CAST{}}}.
> That can change DML assignment semantics. A target table may need to reject 
> the value, or may need to apply its own checked assignment rules. Calcite 
> should not turn a bare source expression into an ordinary cast unless the 
> assignment is safe to rewrite.
> This includes length, precision, and range narrowing. For example, inserting 
> {{VARCHAR(64)}} into {{{}VARCHAR(16){}}}, {{DECIMAL(18, 4)}} into 
> {{{}DECIMAL(10, 2){}}}, or {{BIGINT}} into {{INTEGER}} should not be hidden 
> by an implicit cast in the DML plan. Several databases reject those 
> assignments when the actual value does not fit the target column.
> Tests in postgresql, mysql and oracle:
> [https://onecompiler.com/postgresql/44sf3dz68]
> [https://onecompiler.com/mysql/44sf3efz4]
> [https://onecompiler.com/oracle/44sf3ewmh]
> h3. Steps to Reproduce
> Add this case in {{{}TypeCoercionConverterTest{}}}:
> {code:java}
> @Test void testInsertVarcharNarrowingKeepsSourceType() {
> final String sql = "insert into t1 select "
> + "CAST('AVeryLongLongStringValue' AS VARCHAR(64)), "
> + "t2_smallint, t2_int, t2_bigint,\n"
> + "t2_real, t2_double, t2_decimal, t2_timestamp, "
> + "t2_date, t2_binary, t2_boolean from t2";
> sql(sql).ok();
> }{code}
> The plan is
> {code:java}
> LogicalTableModify(table=[[CATALOG, SALES, T1]], operation=[INSERT], 
> flattened=[false])
>   LogicalProject(t1_varchar20=['AVeryLongLongStringV':VARCHAR(20)], 
> t1_smallint=[$1], t1_int=[$2], t1_bigint=[$3], t1_real=[$4], t1_double=[$5], 
> t1_decimal=[$6], t1_timestamp=[$7], t1_date=[$8], t1_binary=[$9], 
> t1_boolean=[$10])
>     LogicalTableScan(table=[[CATALOG, SALES, T2]])
>  {code}
> The source expression is {{{}CAST('AVeryLongLongStringValue' AS 
> VARCHAR(64)){}}}, but the plan has already truncated it to 
> {{{}'AVeryLongLongStringV':VARCHAR(20){}}}. This bypasses target-column 
> assignment checks.
> Desired behavior: the plan keeps the source type:
> {code:java}
> LogicalTableModify(table=[[CATALOG, SALES, T1]], operation=[INSERT], 
> flattened=[false])
>   LogicalProject(t1_varchar20=['AVeryLongLongStringValue':VARCHAR(64)], 
> t1_smallint=[$1], t1_int=[$2], t1_bigint=[$3], t1_real=[$4], t1_double=[$5], 
> t1_decimal=[$6], t1_timestamp=[$7], t1_date=[$8], t1_binary=[$9], 
> t1_boolean=[$10])
>     LogicalTableScan(table=[[CATALOG, SALES, T2]])
>  {code}
> h3. Possible Fix
> One possible approach is to keep store assignment separate from ordinary cast 
> coercion:
>  * In validation, use assignment compatibility for bare DML sources, and 
> leave ordinary cast compatibility to user-written {{CAST}} expressions.
>  * In query-source coercion, rewrite only assignments that are known to be 
> safe to rewrite.
>  * Preserve source/target assignment metadata through {{TableModify}} so 
> later planning and execution stages can still see the original source type.
>  * For Calcite's enumerable execution path, reject kept-source values that do 
> not fit the target column.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to