maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r501390470



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,51 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+#### Type Coercion in operations between different types 
+
+The following is the hierarchy of data type compatibility and the possible 
implicit conversions that can be made. In an operation involving different and 
compatible data types, these will be promoted to the lowest common top type to 
perform the operation.
+
+For example, if you have an add operation between an integer and a float, the 
integer will be treated as a float, the least common compatible type, resulting 
the operation in a float.
+
+|Data type|Hierarchy compatible types|
+|---------|--------------------------|
+|ByteType |ByteType, ShortType, IntegerType, LongType, FloatType, DoubleType|
+|ShortType |ShortType, IntegerType, LongType, FloatType, DoubleType|
+|IntegerType |IntegerType, LongType, FloatType, DoubleType|
+|LongType |LongType, FloatType, DoubleType|
+|FloatType |FloatType, DoubleType|
+|DoubleType |DoubleType|
+|StringType |DoubleType (in numeric operations), StringType |
+|BinaryType |BinaryType|
+|BooleanType |BooleanType|
+|TimestampType |TimestampType, DateType|
+|DateType |DateType|
+
+The case of DecimalType, is treated differently, for example, there is no 
common type for double and decimal because double's range is larger than 
decimal, and yet decimal is more precise than double, but in an operation, we 
would cast the decimal into double.
+
+#### Explicit casting and store assignment casting
+
+When you are using explicit casting by CAST or doing INSERT INTO operations 
that need to cast types to different store types, the following matrix shows if 
the conversion is allowed
+
+|         |ByteType  |ShortType |IntegerType |LongType |FloatType |DoubleType |StringType |BinaryType |BooleanType |TimestampType |DateType|
+|---------|----------|----------|------------|---------|----------|-----------|-----------|-----------|------------|--------------|--------|
+|ByteType |--        |X         |X           |X        |X         |X          |X          |X          |X           |X             |        |
+|ShortType|*         |--        |X           |X        |X         |X          |X          |X          |X           |X             |        |
+|IntegerType|*       |*         |--          |X        |X         |X          |X          |X          |X           |X             |        |
+|LongType |*         |*         |*           |--       |X         |X          |X          |X          |X           |X             |        |
+|FloatType |*        |*         |*           |*        |--        |X          |X          |           |X           |X             |        |
+|DoubleType |*       |*         |*           |*        |*         |--         |X          |           |X           |X             |        |
+|StringType |*       |*         |*           |*        |*         |*          |--         |X          |X           |X             |X       |
+|BinaryType |        |          |            |         |          |           |           |--         |            |              |        |
+|BooleanType |X      |X         |X           |X        |X         |X          |X          |           |--          |              |        |
+|TimestampType |*    |*         |*           |X        |X         |X          |X          |           |            |--            |X       |
+|DateType |*         |*         |X           |X        |X         |X          |X          |           |            |X             |--      |
+
+X: Conversion allowed (cast ByteType in ShortType)  
+*: An overflow can occur, check ANSI compliance for the result in this case 
(cast ShortType in ByteType)

Review comment:
       The overflow behaviour differs between ANSI and non-ANSI modes. Could 
you explain it here and add a link to the ANSI page? 
https://github.com/apache/spark/blob/master/docs/sql-ref-ansi-compliance.md#type-conversion
 We might need a subsection for the explanation.
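
To make the requested explanation concrete, here is a minimal plain-Python
sketch of the two overflow behaviours for a LongType to IntegerType cast. It is
not Spark code: the function name `cast_long_to_int` and the `ansi` flag are
hypothetical, and the modelled behaviour (wrap-around in non-ANSI mode, an error
in ANSI mode) is an assumption based on my reading of the linked ANSI page.

```python
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def cast_long_to_int(value: int, ansi: bool) -> int:
    """Model a LongType -> IntegerType cast (hypothetical helper, not a Spark API)."""
    if INT_MIN <= value <= INT_MAX:
        # The value fits in 32 bits, so both modes return it unchanged.
        return value
    if ansi:
        # Assumed ANSI-mode behaviour: overflow is reported as an error.
        raise ArithmeticError(f"Casting {value} to int causes overflow")
    # Assumed non-ANSI behaviour: keep the low 32 bits (two's complement wrap-around).
    return (value - INT_MIN) % 2**32 + INT_MIN
```

In this model, `cast_long_to_int(2147483648, ansi=False)` wraps to
`-2147483648`, while the same call with `ansi=True` raises an error.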

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,51 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility

Review comment:
       How about restructuring this section along the lines of the Oracle doc, 
like this? 
https://docs.oracle.com/cd/B28359_01/server.111/b28286/sql_elements002.htm#SQLRF51043
   ```
   #### Type Conversion
   <What does "Type Conversion" means? How does Spark handle type conversion? 
brabrabra...>
   
   #### Type Coercion
   <Coercion Matrix> 
   
   ##### Type Coercion Examples
   <examples...>
   
   #### Explicit Casting and Store Assignment Casting
   <Casting Matrix> 
   
   ##### Type Casting Examples
   <examples...>
   ```
   cc: @gatorsmile @HyukjinKwon @huaxingao 

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,51 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility

Review comment:
       `Data type compatibility` => `Type Conversion`?

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,51 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+#### Type Coercion in operations between different types 
+
+The following is the hierarchy of data type compatibility and the possible 
implicit conversions that can be made. In an operation involving different and 
compatible data types, these will be promoted to the lowest common top type to 
perform the operation.
+
+For example, if you have an add operation between an integer and a float, the 
integer will be treated as a float, the least common compatible type, resulting 
the operation in a float.
+
+|Data type|Hierarchy compatible types|

Review comment:
       Could we use a matrix form for type coercion, too?
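
As a rough illustration of what a matrix form would contain, the sketch below
derives a coercion matrix from the numeric rows of the hierarchy table in the
diff. It is plain Python, not Spark code. The names `NUMERIC_CHAIN`,
`common_numeric_type`, and `coercion_matrix` are hypothetical, and it covers
only the ByteType to DoubleType chain, leaving out StringType, DecimalType, and
the date/time types.

```python
# Numeric promotion chain, narrowest to widest, copied from the hierarchy
# table in the diff (ByteType ... DoubleType rows only).
NUMERIC_CHAIN = [
    "ByteType", "ShortType", "IntegerType",
    "LongType", "FloatType", "DoubleType",
]


def common_numeric_type(left, right):
    """Return the wider of two numeric types, or None if either is outside the chain."""
    if left not in NUMERIC_CHAIN or right not in NUMERIC_CHAIN:
        return None
    widest = max(NUMERIC_CHAIN.index(left), NUMERIC_CHAIN.index(right))
    return NUMERIC_CHAIN[widest]


def coercion_matrix():
    """Build a (left, right) -> common type mapping for every pair in the chain."""
    return {
        (a, b): common_numeric_type(a, b)
        for a in NUMERIC_CHAIN
        for b in NUMERIC_CHAIN
    }
```

For the integer-plus-float example in the diff,
`common_numeric_type("IntegerType", "FloatType")` gives `"FloatType"`.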






