stevenzwu commented on PR #7161:
URL: https://github.com/apache/iceberg/pull/7161#issuecomment-1861850713

   This change was reverted because some users depend on the previous behavior of 
`keyBy` on all partition columns. 
https://github.com/apache/iceberg/pull/7161#issuecomment-1761169778
   
   We had assumed that if there is a bucket column, users only want to 
shuffle by the bucketing column. The user report linked in the comment above 
shows that this is not the case, so we decided to roll back for backward 
compatibility.
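
To illustrate the difference between the two behaviors, here is a minimal sketch. It assumes a hypothetical table partitioned by a day value and a bucket id; `ShuffleKeySketch`, `Row`, and both helper methods are made-up names for illustration and are not part of Iceberg or Flink.

```java
import java.util.Arrays;
import java.util.List;

/** Illustration only: contrasts the two shuffle keys discussed above. */
final class ShuffleKeySketch {

    /** Hypothetical row carrying two partition values: a day and a bucket id. */
    static final class Row {
        final String day;
        final int bucket;

        Row(String day, int bucket) {
            this.day = day;
            this.bucket = bucket;
        }
    }

    /** Previous (restored) behavior: the key is built from every partition column. */
    static List<Object> keyByAllPartitionColumns(Row row) {
        return Arrays.<Object>asList(row.day, row.bucket);
    }

    /** Reverted behavior: the key is built from the bucket column alone. */
    static List<Object> keyByBucketOnly(Row row) {
        return Arrays.<Object>asList(row.bucket);
    }

    public static void main(String[] args) {
        Row a = new Row("2023-12-18", 3);
        Row b = new Row("2023-12-19", 3);
        System.out.println("all columns equal: "
            + keyByAllPartitionColumns(a).equals(keyByAllPartitionColumns(b)));
        System.out.println("bucket only equal: "
            + keyByBucketOnly(a).equals(keyByBucketOnly(b)));
    }
}
```

Two rows that share a bucket but differ in day get the same key only under the bucket-only scheme, so the two schemes can route the same data to different writers.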
   
   @bendevera you are right that `BucketPartitioner` isn't public and can't be 
used at the moment. Now we need to discuss the best way forward. 
We are working on a more comprehensive smart shuffling (range partition) 
feature: https://github.com/apache/iceberg/projects/27. I am thinking we could 
expose this in the `range` distribution mode.
   
   Until then, you may have to copy the code and apply the bucket 
shuffling manually:
   ```
   input.partitionCustom(
       new BucketPartitioner(partitionSpec),
       new BucketPartitionKeySelector(partitionSpec, iSchema, flinkRowType));
   ```
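
For readers unfamiliar with custom partitioning, the following is a rough, dependency-free sketch of the general idea: a partitioner receives a key (here a bucket id) and the number of downstream subtasks, and returns the subtask index. `SimplePartitioner` is a local stand-in that only mirrors the shape of Flink's `Partitioner#partition(key, numPartitions)`, and `ModuloBucketPartitioner` is a hypothetical example. It is not the actual `BucketPartitioner` from the Iceberg code base, whose assignment logic may differ.

```java
/** Illustration only: routes records to writer subtasks by bucket id. */
final class BucketShuffleSketch {

    /** Local stand-in with the same shape as Flink's Partitioner interface. */
    interface SimplePartitioner<K> {
        int partition(K key, int numPartitions);
    }

    /** Hypothetical partitioner: bucket id modulo the number of writer subtasks. */
    static final class ModuloBucketPartitioner implements SimplePartitioner<Integer> {
        private final int numBuckets;

        ModuloBucketPartitioner(int numBuckets) {
            if (numBuckets <= 0) {
                throw new IllegalArgumentException("numBuckets must be positive");
            }
            this.numBuckets = numBuckets;
        }

        @Override
        public int partition(Integer bucketId, int numPartitions) {
            if (bucketId < 0 || bucketId >= numBuckets) {
                throw new IllegalArgumentException("bucket id out of range: " + bucketId);
            }
            // Every record of a given bucket lands on the same subtask.
            return bucketId % numPartitions;
        }
    }

    public static void main(String[] args) {
        SimplePartitioner<Integer> partitioner = new ModuloBucketPartitioner(8);
        for (int bucket = 0; bucket < 8; bucket++) {
            System.out.println("bucket " + bucket + " -> subtask "
                + partitioner.partition(bucket, 3));
        }
    }
}
```

In a real job, the key selector would extract the bucket id from each row and the partitioner would be passed to `partitionCustom`, as in the snippet above.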
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

