Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19389#discussion_r150687775
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala ---
@@ -468,14 +460,16 @@ object PartitioningUtils {
}
    /**
-   * Given a collection of [[Literal]]s, resolves possible type conflicts by up-casting "lower"
-   * types.
+   * Given a collection of [[Literal]]s, resolves possible type conflicts by
+   * [[TypeCoercion.findWiderCommonType]]. See [[TypeCoercion.findWiderTypeForTwo]].
     */
    private def resolveTypeConflicts(literals: Seq[Literal], timeZone: TimeZone): Seq[Literal] = {
-     val desiredType = {
-       val topType = literals.map(_.dataType).maxBy(upCastingOrder.indexOf(_))
--- End diff ---
Partitioned columns are different from normal type coercion cases: the raw
partition values are all strings, and we are just trying to infer the most
reasonable type for them.

The previous behavior has been there since the very beginning and, I think,
never went through a proper discussion. This is the first time we seriously
design the type-merging logic for partition discovery, so I don't think it
needs to be blocked by the type coercion stabilization work; the two can
diverge.

@HyukjinKwon can you send the proposal to the dev list? I think we need more
feedback, e.g. people may want stricter rules and more cases that fall back
to string.
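
To illustrate the idea being discussed (in Python for readability; the actual
Spark code is Scala, and the real merging logic lives in
`TypeCoercion.findWiderTypeForTwo` / `findWiderCommonType`): a minimal sketch
of merging the types inferred from string partition values, widening pairwise
and falling back to string on conflict. The widening order and all names here
are hypothetical, not Spark's API.

```python
from functools import reduce

# Hypothetical widening order for illustration; wider types come later.
WIDENING_ORDER = ["int", "long", "double", "string"]

def find_wider_type_for_two(t1: str, t2: str) -> str:
    """Return the wider of two types; conflicts with string fall back to string."""
    if t1 == t2:
        return t1
    if "string" in (t1, t2):
        return "string"  # no safe common type => keep the raw string form
    return max(t1, t2, key=WIDENING_ORDER.index)

def find_wider_common_type(types: list[str]) -> str:
    """Fold pairwise widening over all types inferred for one partition column."""
    return reduce(find_wider_type_for_two, types)

# Partition values are parsed from path strings, e.g. part=1, part=1.5, part=foo:
print(find_wider_common_type(["int", "long"]))    # widens numerically
print(find_wider_common_type(["int", "string"]))  # falls back to string
```

A stricter rule set, as suggested above, would simply add more branches that
return `"string"` instead of widening.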
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]