peter-toth commented on code in PR #36027:
URL: https://github.com/apache/spark/pull/36027#discussion_r966199347


##########
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala:
##########
@@ -1095,7 +1095,11 @@ private[hive] object HiveClientImpl extends Logging {
     table.bucketSpec match {
       case Some(bucketSpec) if !HiveExternalCatalog.isDatasourceTable(table) =>
         hiveTable.setNumBuckets(bucketSpec.numBuckets)
-        hiveTable.setBucketCols(bucketSpec.bucketColumnNames.toList.asJava)

Review Comment:
   The issue here is that `toHiveTable()` is called twice during the test below.
   The first time is when the table is created. At that point both `table.schema` and `table.bucketSpec` contain the uppercase `B_C`. So simply lowercasing `bucketColumnNames` here for `setBucketCols()` would throw an exception similar to the one in the description, only with the bucket spec lowercase and the schema column uppercase.
   
   The second time is during the `collect()`, when the Hive table is restored from the metastore for a `listPartitionsByFilter()` Hive call. This time `schema` contains the lowercase `b_c` (column names are not case preserved) but `bucketSpec` contains the uppercase `B_C` (the bucket spec is case preserved for some reason), so `setBucketCols()` throws the exception in the description. I'm trying to fix this issue.
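   To illustrate the mismatch, here is a minimal sketch (not the actual fix in this PR; `BucketColCase` and `normalize` are hypothetical names) of resolving bucket column names against the schema case-insensitively, so the names passed to `setBucketCols()` use the schema's spelling regardless of which casing the metastore preserved:

```scala
// Hypothetical helper, for illustration only: align bucket column names with
// the table schema's spelling via a case-insensitive lookup, so a schema with
// lowercase `b_c` and a bucket spec with uppercase `B_C` no longer conflict.
object BucketColCase {
  def normalize(schemaCols: Seq[String], bucketCols: Seq[String]): Seq[String] = {
    // Map lowercased name -> the schema's canonical spelling.
    val byLower = schemaCols.map(c => c.toLowerCase -> c).toMap
    // Fall back to the original name if it is not found in the schema.
    bucketCols.map(c => byLower.getOrElse(c.toLowerCase, c))
  }
}
```

   With a restored schema of `Seq("b_c")` and a case-preserved bucket spec of `Seq("B_C")`, this would yield `Seq("b_c")`, matching the schema that Hive validates against.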



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
