wangyum commented on code in PR #47998:
URL: https://github.com/apache/spark/pull/47998#discussion_r1980741404


##########
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala:
##########
@@ -417,8 +417,11 @@ private[client] class Shim_v2_0 extends Shim with Logging {
       try {
         val partitionSchema = CharVarcharUtils.replaceCharVarcharWithStringInSchema(
           catalogTable.partitionSchema)
+        val lowerCasePredicates = predicates.map(_.transform {
+          case a: AttributeReference => a.withName(a.name.toLowerCase(Locale.ROOT))
+        })
         val boundPredicate = ExternalCatalogUtils.generatePartitionPredicateByFilter(
-          catalogTable, partitionSchema, predicates)
+          catalogTable, partitionSchema, lowerCasePredicates)

Review Comment:
   1. This `catalogTable` is converted from `rawHiveTable`, so its partition
column names are lowercase.
   
   
https://github.com/apache/spark/blob/c3fcd0a1ea11fc1d47119f4f94b9d80221484ebe/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L831-L832
   
   So we need to convert the column names in the filters to lowercase.
Otherwise `generatePartitionPredicateByFilter` cannot correctly tell whether a
filter is a partition filter.
   
https://github.com/apache/spark/blob/19509d07983d050d0234c03760433bb67c823055/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogUtils.scala#L194-L202
   
   ---
   
   2. Even if we could identify partition filters using the uppercase names,
we would still need the lowercase names later.
   
https://github.com/apache/spark/blob/c2343f76325141aaff0279de2f29c77d0af0e311/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala#L435-L436
   
   So it is better to convert the attribute names in all predicates to lowercase up front.
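
To illustrate the point, here is a minimal, self-contained sketch. It does not use Spark's classes: `Expr`, `Attr`, `Lit`, `EqualTo`, `isPartitionFilter`, and `LowerCasePredicateSketch` are hypothetical stand-ins, and the sketch assumes the partition-filter check compares column names case-sensitively. It only shows why a filter on `DT` would not be recognized against a lowercase partition column `dt` until its attribute names are lowercased.

```scala
import java.util.Locale

// Hypothetical stand-ins for Catalyst expressions; these are NOT Spark's classes.
sealed trait Expr {
  // Rewrite the tree bottom-up, applying `rule` wherever it is defined.
  def transform(rule: PartialFunction[Expr, Expr]): Expr = {
    val withNewChildren: Expr = this match {
      case EqualTo(l, r) => EqualTo(l.transform(rule), r.transform(rule))
      case leaf          => leaf
    }
    rule.applyOrElse(withNewChildren, identity[Expr])
  }

  // Names of all attributes referenced by this expression.
  def references: Set[String] = this match {
    case Attr(name)    => Set(name)
    case EqualTo(l, r) => l.references ++ r.references
    case Lit(_)        => Set.empty
  }
}
case class Attr(name: String) extends Expr
case class Lit(value: Any) extends Expr
case class EqualTo(left: Expr, right: Expr) extends Expr

object LowerCasePredicateSketch {
  // Partition column names as stored in the table metadata: lowercase.
  val partitionColumns: Set[String] = Set("dt")

  // Assumed case-sensitive check: every referenced name must be a partition column.
  def isPartitionFilter(e: Expr): Boolean =
    e.references.nonEmpty && e.references.subsetOf(partitionColumns)

  // Lowercase every attribute name, mirroring the idea of the change in this PR.
  def lowerCase(e: Expr): Expr = e.transform {
    case Attr(name) => Attr(name.toLowerCase(Locale.ROOT))
  }

  // A filter written against an uppercase column name.
  val original: Expr = EqualTo(Attr("DT"), Lit("2024-01-01"))
  val lowered: Expr = lowerCase(original)
}
```

In this model, `isPartitionFilter(original)` is false while `isPartitionFilter(lowered)` is true, and `lowered` can also be reused by any later step that expects lowercase names.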



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

