maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r502785507



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is 
the transformation of some data types into others in order to solve the 
expressions. 
+Spark supports both implicit conversions by type coercion and explicit 
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+

Review comment:
       Could you describe what type coercion is at the beginning?

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is 
the transformation of some data types into others in order to solve the 
expressions. 
+Spark supports both implicit conversions by type coercion and explicit 
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly 
converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType 
|DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType 
|DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType 
|DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType 
|DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType 
|DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        
|DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|-- 
        |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  
|DoubleType|DoubleType|DoubleType |--         |

Review comment:
       This matrix describes the type coercion rules specific to binary arithmetic
operations, right? Could you also add a matrix for type coercion in more general
cases, like `Table 2-10 Implicit Type Conversion Matrix` in the Oracle
doc? You could refer to the rules in `TypeCoercion.scala`
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala#L983-L1064
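
As a rough illustration of the numeric part of the quoted matrix (not the actual `TypeCoercion.scala` code), the non-decimal numeric cells are consistent with picking whichever operand type comes later in a fixed widening order. The sketch below assumes that ordering. `NUMERIC_PRECEDENCE` and `wider_numeric_type` are hypothetical names used only for illustration, and StringType, DecimalType, and the date/time types are deliberately left out.

```python
# Assumed widening order for the non-decimal numeric types, narrowest first.
# This mirrors the numeric rows of the matrix quoted above; it is a sketch,
# not Spark's implementation.
NUMERIC_PRECEDENCE = [
    "ByteType",
    "ShortType",
    "IntegerType",
    "LongType",
    "FloatType",
    "DoubleType",
]


def wider_numeric_type(left: str, right: str) -> str:
    """Return the operand type that sits later in the widening order."""
    left_rank = NUMERIC_PRECEDENCE.index(left)
    right_rank = NUMERIC_PRECEDENCE.index(right)
    return NUMERIC_PRECEDENCE[max(left_rank, right_rank)]


if __name__ == "__main__":
    # Each pair corresponds to one off-diagonal cell of the quoted matrix.
    for pair in [("ByteType", "ShortType"), ("LongType", "FloatType")]:
        print(pair, "->", wider_numeric_type(*pair))
```

A more general matrix would need extra cases on top of this ordering, since a single linear order cannot express entries such as StringType combined with a numeric type.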

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is 
the transformation of some data types into others in order to solve the 
expressions. 

Review comment:
       `in order to solve the expressions.` -> `in order to resolve type 
mismatches.`?

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is 
the transformation of some data types into others in order to solve the 
expressions. 
+Spark supports both implicit conversions by type coercion and explicit 
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly 
converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType 
|DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType 
|DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType 
|DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType 
|DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType 
|DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        
|DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|-- 
        |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  
|DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no 
common type for double and decimal because double's range is larger than 
decimal, and yet decimal is more precise than double, but in an expresion, we 
would cast the decimal into double.
+
+Time expresions:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+|       col_name|data_type|comment|
++---------------+---------+-------+
+|     dateColumn|     date|   null|
+|timestampColumn|timestamp|   null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+
+```
+
+#### Explicit casting and store assignment casting
+
+When you are using explicit casting by CAST or doing INSERT INTO operations 
that need to cast types to different store types, the following matrix shows if 
the conversion is allowed
+
+|             |ByteType  |ShortType |IntegerType |LongType |FloatType 
|DoubleType |StringType |BinaryType |BooleanType |TimestampType |DateType|
+|-------------|----------|----------|------------|---------|----------|-----------|-----------|-----------|------------|--------------|--------|
+|**ByteType** |--        |X         |X           |X        |X         |X       
   |X          |X          |X           |X             |        |
+|**ShortType**|*         |--        |X           |X        |X         |X       
   |X          |X          |X           |X             |        |
+|**IntegerType**|*       |*         |--          |X        |X         |X       
   |X          |X          |X           |X             |        |
+|**LongType** |*         |*         |*           |--       |X         |X       
   |X          |X          |X           |X             |        |
+|**FloatType** |*        |*         |*           |*        |--        |X       
   |X          |           |X           |X             |        |
+|**DoubleType** |*       |*         |*           |*        |*         |--      
   |X          |           |X           |X             |        |
+|**StringType** |*       |*         |*           |*        |*         |*       
   |--         |X          |X           |X             |X       |
+|**BinaryType** |        |          |            |         |          |        
   |           |--         |            |              |        |
+|**BooleanType** |X      |X         |X           |X        |X         |X       
   |X          |           |--          |              |        |
+|**TimestampType** |*    |*         |*           |X        |X         |X       
   |X          |           |            |--            |X       |
+|**DateType** |*         |*         |X           |X        |X         |X       
   |X          |           |            |X             |--      |
+
+X: Conversion allowed (cast ByteType in ShortType)  
+*: An overflow can occur (cast ShortType in ByteType)
+
+If an overflow occurs and ANSI compliance is activated (spark.sql.ansi.enabled 
is set to true for casting or spark.sql.storeAssignmentPolicy=ANSI for store 
assignment casting) an exception will be thrown. 

Review comment:
       spark.sql.storeAssignmentPolicy=ANSI -> 
\`spark.sql.storeAssignmentPolicy=ANSI\`

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is 
the transformation of some data types into others in order to solve the 
expressions. 

Review comment:
       `... can contain different data types, type conversion ...` -> `... can 
contain different data types and type conversion ...`?

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion

Review comment:
       `conversion` -> `Conversion`

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is 
the transformation of some data types into others in order to solve the 
expressions. 
+Spark supports both implicit conversions by type coercion and explicit 
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly 
converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType 
|DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType 
|DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType 
|DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType 
|DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType 
|DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        
|DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|-- 
        |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  
|DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no 
common type for double and decimal because double's range is larger than 
decimal, and yet decimal is more precise than double, but in an expresion, we 
would cast the decimal into double.
+
+Time expresions:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+|       col_name|data_type|comment|
++---------------+---------+-------+
+|     dateColumn|     date|   null|
+|timestampColumn|timestamp|   null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+
+```
+
+#### Explicit casting and store assignment casting
+
+When you are using explicit casting by CAST or doing INSERT INTO operations 
that need to cast types to different store types, the following matrix shows if 
the conversion is allowed
+
+|             |ByteType  |ShortType |IntegerType |LongType |FloatType 
|DoubleType |StringType |BinaryType |BooleanType |TimestampType |DateType|
+|-------------|----------|----------|------------|---------|----------|-----------|-----------|-----------|------------|--------------|--------|
+|**ByteType** |--        |X         |X           |X        |X         |X       
   |X          |X          |X           |X             |        |
+|**ShortType**|*         |--        |X           |X        |X         |X       
   |X          |X          |X           |X             |        |
+|**IntegerType**|*       |*         |--          |X        |X         |X       
   |X          |X          |X           |X             |        |
+|**LongType** |*         |*         |*           |--       |X         |X       
   |X          |X          |X           |X             |        |
+|**FloatType** |*        |*         |*           |*        |--        |X       
   |X          |           |X           |X             |        |
+|**DoubleType** |*       |*         |*           |*        |*         |--      
   |X          |           |X           |X             |        |
+|**StringType** |*       |*         |*           |*        |*         |*       
   |--         |X          |X           |X             |X       |
+|**BinaryType** |        |          |            |         |          |        
   |           |--         |            |              |        |
+|**BooleanType** |X      |X         |X           |X        |X         |X       
   |X          |           |--          |              |        |
+|**TimestampType** |*    |*         |*           |X        |X         |X       
   |X          |           |            |--            |X       |
+|**DateType** |*         |*         |X           |X        |X         |X       
   |X          |           |            |X             |--      |
+
+X: Conversion allowed (cast ByteType in ShortType)  
+*: An overflow can occur (cast ShortType in ByteType)
+
+If an overflow occurs and ANSI compliance is activated (spark.sql.ansi.enabled 
is set to true for casting or spark.sql.storeAssignmentPolicy=ANSI for store 
assignment casting) an exception will be thrown. 

Review comment:
       spark.sql.ansi.enabled -> \`spark.sql.ansi.enabled\`

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is 
the transformation of some data types into others in order to solve the 
expressions. 
+Spark supports both implicit conversions by type coercion and explicit 
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly 
converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType 
|DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType 
|DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType 
|DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType 
|DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType 
|DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        
|DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|-- 
        |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  
|DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no 
common type for double and decimal because double's range is larger than 
decimal, and yet decimal is more precise than double, but in an expresion, we 
would cast the decimal into double.

Review comment:
       Could you describe more about decimal type coercion by referring to 
`DecimalPrecision`?  
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala#L29-L63
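
For context, the doc comment in `DecimalPrecision` lists the result precision and scale per operation. Two of those rules (addition and multiplication) are sketched below. This is a simplified illustration, assuming inputs `DECIMAL(p1, s1)` and `DECIMAL(p2, s2)` and ignoring the maximum precision of 38 and any precision-loss adjustment Spark applies; the function names are hypothetical.

```python
from typing import Tuple


def add_result_type(p1: int, s1: int, p2: int, s2: int) -> Tuple[int, int]:
    """Precision and scale of e1 + e2 before any capping.

    scale     = max(s1, s2)
    precision = max(s1, s2) + max(p1 - s1, p2 - s2) + 1
    """
    scale = max(s1, s2)
    # Keep the larger integral part, plus one digit for a possible carry.
    precision = scale + max(p1 - s1, p2 - s2) + 1
    return precision, scale


def multiply_result_type(p1: int, s1: int, p2: int, s2: int) -> Tuple[int, int]:
    """Precision and scale of e1 * e2 before any capping.

    precision = p1 + p2 + 1
    scale     = s1 + s2
    """
    return p1 + p2 + 1, s1 + s2


if __name__ == "__main__":
    # DECIMAL(10, 2) combined with DECIMAL(5, 4)
    print("add:", add_result_type(10, 2, 5, 4))
    print("multiply:", multiply_result_type(10, 2, 5, 4))
```

A description in the docs would also need to cover what happens when the computed precision exceeds the maximum, which this sketch leaves out.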

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is 
the transformation of some data types into others in order to solve the 
expressions. 
+Spark supports both implicit conversions by type coercion and explicit 
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly 
converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType 
|DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType 
|DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType 
|DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType 
|DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType 
|DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        
|DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|-- 
        |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  
|DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no 
common type for double and decimal because double's range is larger than 
decimal, and yet decimal is more precise than double, but in an expresion, we 
would cast the decimal into double.
+
+Time expresions:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+|       col_name|data_type|comment|
++---------------+---------+-------+
+|     dateColumn|     date|   null|
+|timestampColumn|timestamp|   null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+

Review comment:
       Could you add the output, following the other doc examples?

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is 
the transformation of some data types into others in order to solve the 
expressions. 
+Spark supports both implicit conversions by type coercion and explicit 
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly 
converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType 
|DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType 
|DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType 
|DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType 
|DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType 
|DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        
|DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|-- 
        |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  
|DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no 
common type for double and decimal because double's range is larger than 
decimal, and yet decimal is more precise than double, but in an expresion, we 
would cast the decimal into double.
+
+Time expresions:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+|       col_name|data_type|comment|
++---------------+---------+-------+
+|     dateColumn|     date|   null|
+|timestampColumn|timestamp|   null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+
+```
+
+#### Explicit casting and store assignment casting

Review comment:
       cc: @gengliangwang 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



