aokolnychyi commented on a change in pull request #2276:
URL: https://github.com/apache/iceberg/pull/2276#discussion_r588793861
##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -78,6 +78,9 @@ private TableProperties() {
public static final String SPLIT_OPEN_FILE_COST = "read.split.open-file-cost";
public static final long SPLIT_OPEN_FILE_COST_DEFAULT = 4 * 1024 * 1024; // 4MB
+ public static final String SPLIT_BY_PARTITION = "read.split.by-partition";
Review comment:
The problem with read options is that changing the value requires modifying
code. I am also not sure that a table property will help us much: setting this
at the table level will probably also affect other engines that may not
benefit from bucketed joins.
I am not sure how we can detect whether a table is used in a join or not,
though. I don't think Spark propagates that info to sources. Are there any
ideas for that?
Overall, I am fine either way. I think we will need a read option, though,
because it gives us a way to force a particular value. We may want to default
it to true for bucketed tables in Spark unless the user sets it in the options.
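To make the precedence concrete, here is a minimal sketch of how the value could be resolved: an explicit read option wins, then the table property, and otherwise it defaults to true only for bucketed tables. The class `SplitByPartitionResolver`, the read option name `split-by-partition`, and the `isBucketed` flag are assumptions for illustration and are not part of Iceberg. Only the `read.split.by-partition` key comes from the diff above.

```java
import java.util.Map;

// Hypothetical helper, not part of Iceberg: resolves the effective setting.
final class SplitByPartitionResolver {
  // Table property name proposed in the diff above.
  static final String TABLE_PROPERTY = "read.split.by-partition";
  // Assumed read option name, chosen only for this sketch.
  static final String READ_OPTION = "split-by-partition";

  private SplitByPartitionResolver() {
  }

  static boolean resolve(
      Map<String, String> readOptions,
      Map<String, String> tableProperties,
      boolean isBucketed) {
    // 1. A read option set by the user forces a particular value.
    String fromOption = readOptions.get(READ_OPTION);
    if (fromOption != null) {
      return Boolean.parseBoolean(fromOption);
    }

    // 2. Otherwise fall back to the table property, if it is set.
    String fromProperty = tableProperties.get(TABLE_PROPERTY);
    if (fromProperty != null) {
      return Boolean.parseBoolean(fromProperty);
    }

    // 3. Otherwise default to true only for bucketed tables.
    return isBucketed;
  }
}
```

With this ordering, a Spark user could still pass the option as false to opt out for a single read of a bucketed table without touching table metadata.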