Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19602#discussion_r191078009
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
@@ -657,18 +656,46 @@ private[client] class Shim_v0_13 extends Shim_v0_12 {
     val useAdvanced = SQLConf.get.advancedPartitionPredicatePushdownEnabled
+    object ExtractAttribute {
+      def unapply(expr: Expression): Option[Attribute] = {
+        expr match {
+          case attr: Attribute => Some(attr)
+          case cast @ Cast(child, dt: StringType, _)
+              if child.dataType.isInstanceOf[NumericType] =>
+            unapply(child)
+          case cast @ Cast(child, dt: NumericType, _) if child.dataType == StringType =>
--- End diff ---
Are you worried that Spark can successfully cast some invalid string to a
number where Hive cannot? Yes, in that case this change would be dangerous.
I couldn't come up with such an example, though; or might such a case appear
in the future? Could you give an example?
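To illustrate what I mean by invalid strings, here is a quick standalone
sketch (a local session with legacy, non-ANSI cast semantics; the object name
and data are just illustrative, this is not the Shim code):

```scala
import org.apache.spark.sql.SparkSession

object InvalidStringCastSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // "abc" is not a valid number; with legacy (non-ANSI) semantics,
    // cast('abc' AS int) evaluates to null, so the filter drops that row.
    val df = Seq("1", "abc").toDF("p")
    df.filter("cast(p AS int) = 1").show()
    // Only the row with p = "1" survives. If the cast were stripped and the
    // raw string attribute pushed into the Hive metastore filter, Hive might
    // evaluate the comparison differently.

    spark.stop()
  }
}
```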
On the other hand, it occurs to me that this change is incorrect when there
is precision truncation, e.g. `cast("1.234" as int)`. But on second thought,
everything `ExtractAttribute` does here happens within the scope of
`getPartitionsByFilter`, so such truncation should not occur?
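To make the truncation concern concrete, a similar sketch (again illustrative
only, not the Shim code):

```scala
import org.apache.spark.sql.SparkSession

object TruncationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // cast('1.234' AS int) truncates to 1, so rewriting cast(p AS int) = 1
    // into a comparison against the raw string column would not be
    // equivalent in general.
    spark.sql("SELECT cast('1.234' AS int) AS v").show()
    // prints a single row with v = 1

    spark.stop()
  }
}
```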
---