Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/19389#discussion_r150528207
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala
---
@@ -468,14 +460,16 @@ object PartitioningUtils {
}
/**
- * Given a collection of [[Literal]]s, resolves possible type conflicts by up-casting "lower"
- * types.
+ * Given a collection of [[Literal]]s, resolves possible type conflicts by
+ * [[TypeCoercion.findWiderCommonType]]. See [[TypeCoercion.findWiderTypeForTwo]].
*/
private def resolveTypeConflicts(literals: Seq[Literal], timeZone: TimeZone): Seq[Literal] = {
- val desiredType = {
- val topType = literals.map(_.dataType).maxBy(upCastingOrder.indexOf(_))
--- End diff --
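For context, a minimal sketch of what the removed `maxBy(upCastingOrder.indexOf(_))` line does. This is not Spark code: the type names are plain strings and the `order` list is an assumption for illustration, since the real `upCastingOrder` is not shown in this diff.

```scala
object UpCastingSketch {
  // Assumed ordering for illustration only; the real `upCastingOrder` is not shown in this diff.
  val order: Seq[String] =
    Seq("NullType", "IntegerType", "LongType", "DoubleType", "StringType")

  // Picks the type with the largest position in `order`, mirroring the removed `maxBy` line.
  // `indexOf` returns -1 for a type missing from `order`, so such a type loses to any listed one.
  def topType(types: Seq[String]): String = types.maxBy(order.indexOf(_))
}
```

With this assumed list, `UpCastingSketch.topType(Seq("IntegerType", "LongType"))` gives `"LongType"`, while a type absent from the list (say `"DateType"`) gets index -1 and is never chosen over a listed one. That is why the full set of possible input types matters here.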
I think the only possible input types for this `resolveTypeConflicts` are:
```scala
Seq(
NullType, IntegerType, LongType, DoubleType,
*DecimalType(...), DateType, TimestampType, StringType)
```
*`DecimalType` only when the value is too large to fit in `LongType`:
https://github.com/apache/spark/blob/04975a68b583a6175f93da52374108e5d4754d9a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala#L384-L393
This is because this particular `resolveTypeConflicts` seems to be called only through:
https://github.com/apache/spark/blob/04975a68b583a6175f93da52374108e5d4754d9a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala#L142
https://github.com/apache/spark/blob/04975a68b583a6175f93da52374108e5d4754d9a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala#L337
https://github.com/apache/spark/blob/04975a68b583a6175f93da52374108e5d4754d9a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala#L474
In the first call, I see that `pathsWithPartitionValues` is built from
`partitionValues`, which is the output of `parsePartition`:
https://github.com/apache/spark/blob/04975a68b583a6175f93da52374108e5d4754d9a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala#L108
which parses the input by `parsePartitionColumn`:
https://github.com/apache/spark/blob/04975a68b583a6175f93da52374108e5d4754d9a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala#L209
which calls this `inferPartitionColumnValue`:
https://github.com/apache/spark/blob/04975a68b583a6175f93da52374108e5d4754d9a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala#L384-L428
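To illustrate why the input types are limited to the list above, here is a rough, self-contained sketch of a try-in-order inference chain. It is not the Spark implementation: the `Tag` objects, `NullMarker`, and `infer` are hypothetical names, timestamps are left out, and `java.time.LocalDate.parse` stands in for whatever date parsing the real code uses.

```scala
import scala.util.Try

object PartitionTypeSketch {
  // Hypothetical tags standing in for Spark's data types.
  sealed trait Tag
  case object NullTag extends Tag
  case object IntegerTag extends Tag
  case object LongTag extends Tag
  case object DecimalTag extends Tag
  case object DoubleTag extends Tag
  case object DateTag extends Tag
  case object StringTag extends Tag

  // Assumed placeholder for a null partition value; illustrative only.
  val NullMarker = "__NULL__"

  // Runs `parse` and yields `tag` if it does not throw.
  private def attempt(tag: Tag)(parse: => Any): Try[Tag] = Try(parse).map(_ => tag)

  // Tries the narrowest type first and falls back step by step.
  def infer(raw: String): Tag =
    attempt(IntegerTag)(raw.toInt)
      .orElse(attempt(LongTag)(raw.toLong))
      .orElse(attempt(DecimalTag) {
        // Reached only when the value does not fit in a Long.
        val d = new java.math.BigDecimal(raw)
        require(d.scale <= 0) // reject values with a fractional part
      })
      .orElse(attempt(DoubleTag)(raw.toDouble))
      .orElse(attempt(DateTag)(java.time.LocalDate.parse(raw)))
      .getOrElse(if (raw == NullMarker) NullTag else StringTag)
}
```

Under these assumptions, a decimal can only appear for an integral value too large for a `Long`, because the integer and long attempts run first. Every value ends up with one of the tags in the chain, so no other type can reach the conflict-resolution step.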