Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19389#discussion_r150687775
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala ---
@@ -468,14 +460,16 @@ object PartitioningUtils {
}
    /**
-   * Given a collection of [[Literal]]s, resolves possible type conflicts by up-casting "lower"
-   * types.
+   * Given a collection of [[Literal]]s, resolves possible type conflicts by
+   * [[TypeCoercion.findWiderCommonType]]. See [[TypeCoercion.findWiderTypeForTwo]].
     */
    private def resolveTypeConflicts(literals: Seq[Literal], timeZone: TimeZone): Seq[Literal] = {
-     val desiredType = {
-       val topType = literals.map(_.dataType).maxBy(upCastingOrder.indexOf(_))
--- End diff ---
Partitioned columns are different from normal type coercion cases: the raw
partition values are all strings, and we are just trying to infer the most
reasonable type for them.

The previous behavior has been there since the very beginning and, I think,
never went through a proper discussion. This is the first time we seriously
design the type-merging logic for partition discovery, so I don't think it
needs to be blocked by the type coercion stabilization work; the two can
diverge.

@HyukjinKwon can you send the proposal to the dev list? I think we need more
feedback, e.g. people may want stricter rules and more cases that fall back
to string.
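
To illustrate the idea being discussed (in Python for readability; the actual
Spark code is Scala, and the real merging logic lives in
`TypeCoercion.findWiderTypeForTwo` / `findWiderCommonType`): a minimal sketch
of merging the types inferred from string partition values, widening pairwise
and falling back to string on conflict. The widening order and all names here
are hypothetical, not Spark's API.

```python
from functools import reduce

# Hypothetical widening order for illustration; wider types come later.
WIDENING_ORDER = ["int", "long", "double", "string"]

def find_wider_type_for_two(t1: str, t2: str) -> str:
    """Return the wider of two types; conflicts with string fall back to string."""
    if t1 == t2:
        return t1
    if "string" in (t1, t2):
        return "string"  # no safe common type => keep the raw string form
    return max(t1, t2, key=WIDENING_ORDER.index)

def find_wider_common_type(types: list[str]) -> str:
    """Fold pairwise widening over all types inferred for one partition column."""
    return reduce(find_wider_type_for_two, types)

# Partition values are parsed from path strings, e.g. part=1, part=1.5, part=foo:
print(find_wider_common_type(["int", "long"]))    # widens numerically
print(find_wider_common_type(["int", "string"]))  # falls back to string
```

A stricter rule set, as suggested above, would simply add more branches that
return `"string"` instead of widening.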
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]