turboFei commented on a change in pull request #24685: [SPARK-27814][SQL] The 
cast operation for partition key may push down incorrect filter, which is 
fatal.
URL: https://github.com/apache/spark/pull/24685#discussion_r287711000
 
 

 ##########
 File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala
 ##########
 @@ -675,12 +675,23 @@ private[client] class Shim_v0_13 extends Shim_v0_12 {
     val useAdvanced = SQLConf.get.advancedPartitionPredicatePushdownEnabled
 
     object ExtractAttribute {
+      val partitionKeys = table.getPartitionKeys.asScala.map(_.getName).toSet
+      var castToStr = false
+
       def unapply(expr: Expression): Option[Attribute] = {
         expr match {
-          case attr: Attribute => Some(attr)
+          case attr: Attribute
 
 Review comment:
   For example, this test in PartitionedTablePerfStatsSuite fails:
   ```
   genericTest("lazy partition pruning reads only necessary partition data")
   ```
   The corresponding query is (partCol1 is an Int-type partition key):
   ```
   spark.sql("select * from test where partCol1 = 999").count()
   ```
   The corresponding log is:
   ```
   5 did not equal 0
   ScalaTestFailureLocation: org.apache.spark.sql.hive.PartitionedTablePerfStatsSuite at (PartitionedTablePerfStatsSuite.scala:139)
   Expected :0
   Actual   :5
   
   org.scalatest.exceptions.TestFailedException: 5 did not equal 0
   ```
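
To make the failure easier to read, here is a minimal sketch of what the assertion appears to be checking. It is a toy model in plain Scala, not Spark or Hive code: `Partition`, `fetchPartitions`, and `pushedFilter` are hypothetical names, and the table is assumed to have five partitions with `partCol1` values 0 to 4, which is consistent with `Actual :5` in the log but is not stated in the comment.

```scala
// Toy model only: these names are hypothetical and are not Spark or Hive APIs.
object PartitionPruningSketch {
  // A partition identified by the value of its Int partition key, partCol1.
  final case class Partition(partCol1: Int)

  // Returns the partitions a metastore would hand back.
  // `pushedFilter` is None when no predicate could be pushed down.
  def fetchPartitions(
      all: Seq[Partition],
      pushedFilter: Option[Partition => Boolean]): Seq[Partition] =
    pushedFilter match {
      case Some(pred) => all.filter(pred) // pruning happens before any data is read
      case None       => all              // nothing pushed: every partition is fetched
    }

  def main(args: Array[String]): Unit = {
    // Assumption: five partitions with partCol1 = 0, 1, 2, 3, 4.
    val partitions = (0 until 5).map(i => Partition(i))

    // Predicate for `partCol1 = 999`, pushed down to the metastore.
    val pushed =
      fetchPartitions(partitions, Some((p: Partition) => p.partCol1 == 999))

    // The same query when the predicate is not pushed down.
    val notPushed = fetchPartitions(partitions, None)

    println(s"pushed down: ${pushed.size} partitions fetched")
    println(s"not pushed down: ${notPushed.size} partitions fetched")
  }
}
```

Under these assumptions, the pushed-down case fetches no partitions and the other case fetches all five, which matches the shape of "5 did not equal 0": the test seems to expect that a predicate on an Int partition key which matches nothing is still pushed down.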

