rdblue commented on a change in pull request #2064:
URL: https://github.com/apache/iceberg/pull/2064#discussion_r560365972



##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -138,6 +138,9 @@ private TableProperties() {
   public static final String ENGINE_HIVE_ENABLED = "engine.hive.enabled";
   public static final boolean ENGINE_HIVE_ENABLED_DEFAULT = false;
 
+  public static final String WRITE_SHUFFLE_BY_PARTITION = "write.shuffle-by.partition";

Review comment:
       The sort order of a table is a recommendation, not a requirement. And 
you're right that it is for writing. That's why the DDL to update it is `WRITE 
ORDERED BY ...`.
   
   We don't guarantee a sort order on read except when a data or equality 
delete file has a sort order in its metadata (see #1975). The sort order for a 
table may change, and even if each write is globally sorted, multiple writes to 
the same partitions produce separate sets of files that can't simply be read in 
sequence to yield sorted records. That's why we don't make guarantees about 
reads. Ordering on write is primarily a way to cluster rows for efficient 
filtering.
   
   If row-level ordering is expensive, as it is for Flink, then it is perfectly 
fine to ignore the recommendation. Flink may eventually provide a way to order 
within data files, but I think that is less important than clustering data 
across files so that data files can be skipped in queries. That's what Steven's 
idea would achieve, along with handling skew.
   
   It is still valuable to have a write order, even if Flink doesn't guarantee 
it. If Flink can cluster data by that order, then that's really helpful. And 
other services can rewrite those data files after the data is available if row 
ordering is needed for page skipping within data files. A service that sorts 
data files after Flink writes them also needs to know the desired order.
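   
   To illustrate the file-skipping point above, here is a rough, self-contained 
sketch. It is not Iceberg's API: `ClusteringSketch`, `FileStats`, `writeFiles`, 
and `filesToScan` are hypothetical names, and the only assumption is that each 
data file records min/max values for one integer column. It compares how many 
files a point lookup has to open when rows are written in arrival order versus 
clustered by the write order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ClusteringSketch {
  /** Hypothetical per-file stats: min and max of a single int column. */
  static final class FileStats {
    final int min;
    final int max;

    FileStats(int min, int max) {
      this.min = min;
      this.max = max;
    }
  }

  /** Splits rows into files of rowsPerFile rows, in the given order, recording min/max. */
  static List<FileStats> writeFiles(List<Integer> rows, int rowsPerFile) {
    List<FileStats> files = new ArrayList<>();
    for (int start = 0; start < rows.size(); start += rowsPerFile) {
      List<Integer> chunk = rows.subList(start, Math.min(start + rowsPerFile, rows.size()));
      files.add(new FileStats(Collections.min(chunk), Collections.max(chunk)));
    }
    return files;
  }

  /** Counts files whose [min, max] range could contain the value; the rest can be skipped. */
  static int filesToScan(List<FileStats> files, int value) {
    int count = 0;
    for (FileStats file : files) {
      if (value >= file.min && value <= file.max) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    // Rows in arrival order: every file ends up covering almost the whole key range.
    List<Integer> arrival = Arrays.asList(0, 3, 6, 9, 1, 4, 7, 10, 2, 5, 8, 11);

    // The same rows clustered by the write order before being split into files.
    List<Integer> clustered = new ArrayList<>(arrival);
    Collections.sort(clustered);

    System.out.println(filesToScan(writeFiles(arrival, 4), 5));   // 3: no file can be skipped
    System.out.println(filesToScan(writeFiles(clustered, 4), 5)); // 1: two files are skipped
  }
}
```

   Rows inside each file do not need to be sorted for this to work; only the 
assignment of rows to files matters, which is why clustering across files is 
useful even when row-level ordering is too expensive.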



