openinx commented on a change in pull request #2064:
URL: https://github.com/apache/iceberg/pull/2064#discussion_r560006175
##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -138,6 +138,9 @@ private TableProperties() {
public static final String ENGINE_HIVE_ENABLED = "engine.hive.enabled";
public static final boolean ENGINE_HIVE_ENABLED_DEFAULT = false;
+ public static final String WRITE_SHUFFLE_BY_PARTITION = "write.shuffle-by.partition";
Review comment:
@rdblue I like the table you provided, but I have a few questions. For an
Iceberg table that defines __SortOrder__ columns, a Spark job will write
records sorted by the sort keys into the Parquet files. Should a Flink job
also write sorted records into its Parquet files? In other words, should we
keep the same __SortOrder__ semantics across engines, even though
accomplishing that is not cheap? (I raise this question because I noticed
the __Flink__ column in the table requires neither a local sort nor a
global sort.)
Or is __SortOrder__ meant to define only the write behavior and not the
read behavior, i.e. records read from a Parquet file are not guaranteed to
be ordered by the sort keys? Based on the table above, my understanding is
that defining the write behavior is what you are trying to accomplish.
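To make the local-vs-global distinction in my question concrete, here is a small self-contained Java sketch (this is hypothetical illustration code, not Iceberg's API; the class and helper names are made up). It models two data files that are each locally sorted by the sort key, then shows that a scan reading them back-to-back is not globally sorted, which is why a reader cannot rely on __SortOrder__ unless the spec pins down read-side guarantees:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: per-file ("local") sorting does not imply
// globally sorted reads across multiple data files.
public class SortSemanticsSketch {

    // Returns true if the values are in non-decreasing order.
    static boolean isGloballySorted(List<Integer> values) {
        for (int i = 1; i < values.size(); i++) {
            if (values.get(i - 1) > values.get(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two data files, each locally sorted by the sort key.
        List<Integer> fileA = List.of(1, 4, 9);
        List<Integer> fileB = List.of(2, 3, 8);

        // A scan that simply reads the files one after another.
        List<Integer> scan = new ArrayList<>(fileA);
        scan.addAll(fileB);

        System.out.println("fileA locally sorted:  " + isGloballySorted(fileA));
        System.out.println("fileB locally sorted:  " + isGloballySorted(fileB));
        System.out.println("scan globally sorted:  " + isGloballySorted(scan));
    }
}
```

A write-only semantic would accept the last line being false; a read-side ordering guarantee would require either a global sort at write time or a merge at read time.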
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]