GitHub user tejasapatil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16898#discussion_r100697621
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala ---
    @@ -189,7 +215,7 @@ object FileFormatWriter extends Logging {
         committer.setupTask(taskAttemptContext)
     
         val writeTask =
    -      if (description.partitionColumns.isEmpty && description.bucketSpec.isEmpty) {
    +      if (description.partitionColumns.isEmpty && description.numBuckets == 0) {
    --- End diff --
    
    For someone reading the code, it might not be intuitive that this is
    checking for the absence of bucketing. `0` is used in many places in this
    PR to check whether a table has bucketing. Maybe orthogonal to this PR,
    but in general we could have a util method for this check; a minimal
    sketch is below. I can send a tiny PR for it if you agree that it's a
    good thing to do.
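
    Something along these lines (the `BucketingUtils` object and `isBucketed`
    method names here are hypothetical; `numBuckets` is the field this PR adds):

    ```scala
    // Hypothetical helper: the names are illustrative, not part of this PR.
    object BucketingUtils {
      // A write is bucketed iff it was declared with at least one bucket.
      def isBucketed(numBuckets: Int): Boolean = numBuckets > 0
    }

    // Call sites would then state the intent instead of comparing to 0:
    // if (description.partitionColumns.isEmpty &&
    //     !BucketingUtils.isBucketed(description.numBuckets)) { ... }
    ```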
    
    PS: Having 0 buckets is a thing in Hive, although logically it makes no
    sense and is confusing. Under the hood, Hive treats that as a table with a
    single bucket. It's good that Spark does not allow this.
    
    ```
    # hive-1.2.1
    
    hive> CREATE TABLE tejasp_temp_can_be_deleted (key string, value string) CLUSTERED BY (key) INTO 0 BUCKETS;
    Time taken: 1.144 seconds
    
    hive> desc formatted tejasp_temp_can_be_deleted;
    
    # Storage Information
    ...
    Num Buckets:                0
    Bucket Columns:             [key]
    Sort Columns:               []
    
    hive> INSERT OVERWRITE TABLE tejasp_temp_can_be_deleted SELECT * FROM ....;
    # doing `ls` on the output directory shows a single file
    ```
    


