gengliangwang opened a new pull request #25239: [SPARK-28495][SQL] 
AssignableCast: A new type coercion following store assignment rules of ANSI SQL
URL: https://github.com/apache/spark/pull/25239
 
 
   ## What changes were proposed in this pull request?
   
   In Spark version 2.4 and earlier, when inserting into a table, Spark 
casts the data types of the input query to the data types of the target table 
by coercion. This can be confusing, e.g. when a user mistakenly writes string 
values to an int column.
   
   In data source V2, by default, only upcasting is allowed when inserting data 
into a table. E.g. int -> long and int -> string are allowed, while decimal -> 
double and long -> int are not. The UpCast rules were originally 
created for Dataset type coercion. They are quite strict and differ from the 
behavior of all existing popular DBMSs. This is a breaking change, and it 
could hurt some Spark users after the 3.0 release.
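   
   To make the upcast-only rule concrete, here is a minimal sketch in Python. It is a toy model, not Spark's implementation: the function name `can_up_cast`, the lowercase type names, and the numeric widening order are all assumptions made for illustration, and only the cases mentioned above are covered.

```python
# Toy model of the upcast-only rule described above; NOT Spark's implementation.
# The type names and the widening order are assumptions made for illustration.
NUMERIC_WIDENING_ORDER = ["byte", "short", "int", "long", "float", "double"]


def can_up_cast(source: str, target: str) -> bool:
    """Return True if `source` may be written to `target` under the toy rule."""
    if source == target:
        return True
    # Per the description, values such as int may be written to a string column.
    # This sketch assumes every toy type can be rendered as a string.
    if target == "string":
        return True
    # Widening within the assumed numeric order is allowed; narrowing is not.
    if source in NUMERIC_WIDENING_ORDER and target in NUMERIC_WIDENING_ORDER:
        return (NUMERIC_WIDENING_ORDER.index(source)
                < NUMERIC_WIDENING_ORDER.index(target))
    # Everything else (including decimal -> double) is rejected in this sketch.
    return False


# The four examples from the paragraph above.
for src, dst in [("int", "long"), ("int", "string"),
                 ("decimal", "double"), ("long", "int")]:
    print(f"{src} -> {dst}: {can_up_cast(src, dst)}")
```

   The two rejected pairs are ordinary assignments in most databases, which is the strictness the paragraph above objects to.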
   
   This PR proposes following the store assignment rules (section 
9.2) of ANSI SQL. There are two significant differences from UpCast:
   1. Any numeric type can be assigned to another numeric type.
   2. TimestampType can be assigned to DateType.
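   
   The two differences can be sketched as a small Python check. This is an illustrative model only, not the code in this PR: `can_store_assign` and the lowercase type names are hypothetical, the second rule is read as "a timestamp value may be written to a date column", and the rules shared with UpCast and the rest of section 9.2 are deliberately left out.

```python
# Toy model of the two store-assignment differences listed above; NOT Spark's code.
# Type names are illustrative assumptions; only the two listed rules are modeled.
NUMERIC_TYPES = {"byte", "short", "int", "long", "float", "double", "decimal"}


def can_store_assign(source: str, target: str) -> bool:
    """Return True if the toy rules allow writing `source` into `target`."""
    if source == target:
        return True
    # Difference 1: any numeric type can be assigned to another numeric type.
    if source in NUMERIC_TYPES and target in NUMERIC_TYPES:
        return True
    # Difference 2 (as read here): a timestamp can be assigned to a date column.
    if (source, target) == ("timestamp", "date"):
        return True
    # Rules shared with UpCast and the rest of the standard are omitted here.
    return False


# Pairs that the description says upcasting rejects, plus the new date rule.
for src, dst in [("long", "int"), ("decimal", "double"),
                 ("timestamp", "date"), ("string", "int")]:
    print(f"{src} -> {dst}: {can_store_assign(src, dst)}")
```

   In this sketch string -> int is still rejected, which is the kind of mistake described at the top of this description.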
   
   The new behavior is consistent with PostgreSQL. It is easier to explain and 
more acceptable than using UpCast.
   
   ## How was this patch tested?
   
   Unit tests.
   

