maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r502785507
##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
| 3| Infinity|
+---------+---------+
```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is
the transformation of some data types into others in order to solve the
expressions.
+Spark supports both implicit conversions by type coercion and explicit
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types
+
Review comment:
Could you describe what' a type coercion at the beginning?
##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
| 3| Infinity|
+---------+---------+
```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is
the transformation of some data types into others in order to solve the
expressions.
+Spark supports both implicit conversions by type coercion and explicit
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types
+
+The following matrix shows the resulting type to which they are implicitly
converted to resolve an expression involving different data types
+
+Numeric expresions:
+
+| |ByteType |ShortType |IntegerType |LongType |FloatType
|DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType** |-- |ShortType |IntegerType |LongType |FloatType
|DoubleType |DoubleType |
+|**ShortType** |ShortType |-- |IntegerType |LongType |FloatType
|DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|-- |LongType |FloatType
|DoubleType |DoubleType |
+|**LongType** |LongType |LongType |LongType |-- |FloatType
|DoubleType |DoubleType |
+|**FloatType** |FloatType |FloatType |FloatType |FloatType |--
|DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType |DoubleType|DoubleType|--
|DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType
|DoubleType|DoubleType|DoubleType |-- |
Review comment:
This matrix means specific type coercion rules in binary arithmetic
operations, right? Cold you add a matrix for type coercion in more general
cases, too, like `Table 2-10 Implicit Type Conversion Matrix` in the Oracle
doc? You could refer to the rules in `TypeCoercion.scala`
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala#L983-L1064
##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
| 3| Infinity|
+---------+---------+
```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is
the transformation of some data types into others in order to solve the
expressions.
Review comment:
`in order to solve the expressions.` -> `in order to resolve type
mismatches.`?
##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
| 3| Infinity|
+---------+---------+
```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is
the transformation of some data types into others in order to solve the
expressions.
+Spark supports both implicit conversions by type coercion and explicit
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types
+
+The following matrix shows the resulting type to which they are implicitly
converted to resolve an expression involving different data types
+
+Numeric expresions:
+
+| |ByteType |ShortType |IntegerType |LongType |FloatType
|DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType** |-- |ShortType |IntegerType |LongType |FloatType
|DoubleType |DoubleType |
+|**ShortType** |ShortType |-- |IntegerType |LongType |FloatType
|DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|-- |LongType |FloatType
|DoubleType |DoubleType |
+|**LongType** |LongType |LongType |LongType |-- |FloatType
|DoubleType |DoubleType |
+|**FloatType** |FloatType |FloatType |FloatType |FloatType |--
|DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType |DoubleType|DoubleType|--
|DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType
|DoubleType|DoubleType|DoubleType |-- |
+
+The case of DecimalType, is treated differently, for example, there is no
common type for double and decimal because double's range is larger than
decimal, and yet decimal is more precise than double, but in an expresion, we
would cast the decimal into double.
+
+Time expresions:
+
+| |DateType |TimestampType |
+|------------------|-------------|--------------|
+|**DateType** |-- |TimestampType |
+|**TimestampType** |TimestampType|-- |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int |null |
+|doubleColumn |double |null |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+| result| double| null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+| col_name|data_type|comment|
++---------------+---------+-------+
+| dateColumn| date| null|
+|timestampColumn|timestamp| null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+
+```
+
+#### Explicit casting and store assignment casting
+
+When you are using explicit casting by CAST or doing INSERT INTO operations
that need to cast types to different store types, the following matrix shows if
the conversion is allowed
+
+| |ByteType |ShortType |IntegerType |LongType |FloatType
|DoubleType |StringType |BinaryType |BooleanType |TimestampType |DateType|
+|-------------|----------|----------|------------|---------|----------|-----------|-----------|-----------|------------|--------------|--------|
+|**ByteType** |-- |X |X |X |X |X
|X |X |X |X | |
+|**ShortType**|* |-- |X |X |X |X
|X |X |X |X | |
+|**IntegerType**|* |* |-- |X |X |X
|X |X |X |X | |
+|**LongType** |* |* |* |-- |X |X
|X |X |X |X | |
+|**FloatType** |* |* |* |* |-- |X
|X | |X |X | |
+|**DoubleType** |* |* |* |* |* |--
|X | |X |X | |
+|**StringType** |* |* |* |* |* |*
|-- |X |X |X |X |
+|**BinaryType** | | | | | |
| |-- | | | |
+|**BooleanType** |X |X |X |X |X |X
|X | |-- | | |
+|**TimestampType** |* |* |* |X |X |X
|X | | |-- |X |
+|**DateType** |* |* |X |X |X |X
|X | | |X |-- |
+
+X: Conversion allowed (cast ByteType in ShortType)
+*: An overflow can occur (cast ShortType in ByteType)
+
+If an overflow occurs and ANSI compliance is activated (spark.sql.ansi.enabled
is set to true for casting or spark.sql.storeAssignmentPolicy=ANSI for store
assignment casting) an exception will be thrown.
Review comment:
spark.sql.storeAssignmentPolicy=ANSI ->
\`spark.sql.storeAssignmentPolicy=ANSI\`
##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
| 3| Infinity|
+---------+---------+
```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is
the transformation of some data types into others in order to solve the
expressions.
Review comment:
`... can contain different data types, type conversion ...` -> `... can
contain different data types and type conversion ...`?
##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
| 3| Infinity|
+---------+---------+
```
+
+### Type conversion
Review comment:
`conversion` -> `Conversion`
##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
| 3| Infinity|
+---------+---------+
```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is
the transformation of some data types into others in order to solve the
expressions.
+Spark supports both implicit conversions by type coercion and explicit
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types
+
+The following matrix shows the resulting type to which they are implicitly
converted to resolve an expression involving different data types
+
+Numeric expresions:
+
+| |ByteType |ShortType |IntegerType |LongType |FloatType
|DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType** |-- |ShortType |IntegerType |LongType |FloatType
|DoubleType |DoubleType |
+|**ShortType** |ShortType |-- |IntegerType |LongType |FloatType
|DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|-- |LongType |FloatType
|DoubleType |DoubleType |
+|**LongType** |LongType |LongType |LongType |-- |FloatType
|DoubleType |DoubleType |
+|**FloatType** |FloatType |FloatType |FloatType |FloatType |--
|DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType |DoubleType|DoubleType|--
|DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType
|DoubleType|DoubleType|DoubleType |-- |
+
+The case of DecimalType, is treated differently, for example, there is no
common type for double and decimal because double's range is larger than
decimal, and yet decimal is more precise than double, but in an expresion, we
would cast the decimal into double.
+
+Time expresions:
+
+| |DateType |TimestampType |
+|------------------|-------------|--------------|
+|**DateType** |-- |TimestampType |
+|**TimestampType** |TimestampType|-- |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int |null |
+|doubleColumn |double |null |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+| result| double| null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+| col_name|data_type|comment|
++---------------+---------+-------+
+| dateColumn| date| null|
+|timestampColumn|timestamp| null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+
+```
+
+#### Explicit casting and store assignment casting
+
+When you are using explicit casting by CAST or doing INSERT INTO operations
that need to cast types to different store types, the following matrix shows if
the conversion is allowed
+
+| |ByteType |ShortType |IntegerType |LongType |FloatType
|DoubleType |StringType |BinaryType |BooleanType |TimestampType |DateType|
+|-------------|----------|----------|------------|---------|----------|-----------|-----------|-----------|------------|--------------|--------|
+|**ByteType** |-- |X |X |X |X |X
|X |X |X |X | |
+|**ShortType**|* |-- |X |X |X |X
|X |X |X |X | |
+|**IntegerType**|* |* |-- |X |X |X
|X |X |X |X | |
+|**LongType** |* |* |* |-- |X |X
|X |X |X |X | |
+|**FloatType** |* |* |* |* |-- |X
|X | |X |X | |
+|**DoubleType** |* |* |* |* |* |--
|X | |X |X | |
+|**StringType** |* |* |* |* |* |*
|-- |X |X |X |X |
+|**BinaryType** | | | | | |
| |-- | | | |
+|**BooleanType** |X |X |X |X |X |X
|X | |-- | | |
+|**TimestampType** |* |* |* |X |X |X
|X | | |-- |X |
+|**DateType** |* |* |X |X |X |X
|X | | |X |-- |
+
+X: Conversion allowed (cast ByteType in ShortType)
+*: An overflow can occur (cast ShortType in ByteType)
+
+If an overflow occurs and ANSI compliance is activated (spark.sql.ansi.enabled
is set to true for casting or spark.sql.storeAssignmentPolicy=ANSI for store
assignment casting) an exception will be thrown.
Review comment:
spark.sql.ansi.enabled -> \`spark.sql.ansi.enabled\`
##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
| 3| Infinity|
+---------+---------+
```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is
the transformation of some data types into others in order to solve the
expressions.
+Spark supports both implicit conversions by type coercion and explicit
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types
+
+The following matrix shows the resulting type to which they are implicitly
converted to resolve an expression involving different data types
+
+Numeric expresions:
+
+| |ByteType |ShortType |IntegerType |LongType |FloatType
|DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType** |-- |ShortType |IntegerType |LongType |FloatType
|DoubleType |DoubleType |
+|**ShortType** |ShortType |-- |IntegerType |LongType |FloatType
|DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|-- |LongType |FloatType
|DoubleType |DoubleType |
+|**LongType** |LongType |LongType |LongType |-- |FloatType
|DoubleType |DoubleType |
+|**FloatType** |FloatType |FloatType |FloatType |FloatType |--
|DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType |DoubleType|DoubleType|--
|DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType
|DoubleType|DoubleType|DoubleType |-- |
+
+The case of DecimalType, is treated differently, for example, there is no
common type for double and decimal because double's range is larger than
decimal, and yet decimal is more precise than double, but in an expresion, we
would cast the decimal into double.
Review comment:
Could you describe more about decimal type coercion by referring to
`DecimalPrecision`?
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala#L29-L63
##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
| 3| Infinity|
+---------+---------+
```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is
the transformation of some data types into others in order to solve the
expressions.
+Spark supports both implicit conversions by type coercion and explicit
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types
+
+The following matrix shows the resulting type to which they are implicitly
converted to resolve an expression involving different data types
+
+Numeric expresions:
+
+| |ByteType |ShortType |IntegerType |LongType |FloatType
|DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType** |-- |ShortType |IntegerType |LongType |FloatType
|DoubleType |DoubleType |
+|**ShortType** |ShortType |-- |IntegerType |LongType |FloatType
|DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|-- |LongType |FloatType
|DoubleType |DoubleType |
+|**LongType** |LongType |LongType |LongType |-- |FloatType
|DoubleType |DoubleType |
+|**FloatType** |FloatType |FloatType |FloatType |FloatType |--
|DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType |DoubleType|DoubleType|--
|DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType
|DoubleType|DoubleType|DoubleType |-- |
+
+The case of DecimalType, is treated differently, for example, there is no
common type for double and decimal because double's range is larger than
decimal, and yet decimal is more precise than double, but in an expresion, we
would cast the decimal into double.
+
+Time expresions:
+
+| |DateType |TimestampType |
+|------------------|-------------|--------------|
+|**DateType** |-- |TimestampType |
+|**TimestampType** |TimestampType|-- |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int |null |
+|doubleColumn |double |null |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+| result| double| null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+| col_name|data_type|comment|
++---------------+---------+-------+
+| dateColumn| date| null|
+|timestampColumn|timestamp| null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+
Review comment:
Could you put the output for following the other doc examples?
##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
| 3| Infinity|
+---------+---------+
```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is
the transformation of some data types into others in order to solve the
expressions.
+Spark supports both implicit conversions by type coercion and explicit
conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types
+
+The following matrix shows the resulting type to which they are implicitly
converted to resolve an expression involving different data types
+
+Numeric expresions:
+
+| |ByteType |ShortType |IntegerType |LongType |FloatType
|DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType** |-- |ShortType |IntegerType |LongType |FloatType
|DoubleType |DoubleType |
+|**ShortType** |ShortType |-- |IntegerType |LongType |FloatType
|DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|-- |LongType |FloatType
|DoubleType |DoubleType |
+|**LongType** |LongType |LongType |LongType |-- |FloatType
|DoubleType |DoubleType |
+|**FloatType** |FloatType |FloatType |FloatType |FloatType |--
|DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType |DoubleType|DoubleType|--
|DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType
|DoubleType|DoubleType|DoubleType |-- |
+
+The case of DecimalType, is treated differently, for example, there is no
common type for double and decimal because double's range is larger than
decimal, and yet decimal is more precise than double, but in an expresion, we
would cast the decimal into double.
+
+Time expresions:
+
+| |DateType |TimestampType |
+|------------------|-------------|--------------|
+|**DateType** |-- |TimestampType |
+|**TimestampType** |TimestampType|-- |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int |null |
+|doubleColumn |double |null |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+| result| double| null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+| col_name|data_type|comment|
++---------------+---------+-------+
+| dateColumn| date| null|
+|timestampColumn|timestamp| null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+
+```
+
+#### Explicit casting and store assignment casting
Review comment:
cc: @gengliangwang
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]