rdblue commented on a change in pull request #3461:
URL: https://github.com/apache/iceberg/pull/3461#discussion_r744306962
##########
File path:
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkWriteBuilder.java
##########
@@ -112,42 +136,74 @@ public BatchWrite buildForBatch() {
// Get application id
String appId = spark.sparkContext().applicationId();
- SparkWrite write = new SparkWrite(spark, table, writeConf, writeInfo, appId, writeSchema, dsSchema);
- if (overwriteByFilter) {
- return write.asOverwriteByFilter(overwriteExpr);
- } else if (overwriteDynamic) {
- return write.asDynamicOverwrite();
- } else if (overwriteFiles) {
- return write.asCopyOnWriteMergeWrite(mergeScan, isolationLevel);
+ Distribution distribution;
+ SortOrder[] ordering;
+
+ if (requestDistributionAndOrdering) {
+ distribution = buildRequiredDistribution();
+ ordering = buildRequiredOrdering(distribution);
} else {
- return write.asBatchAppend();
+ LOG.warn("Can't request distribution/ordering as extensions are disabled and spec has non-identity transforms");
Review comment:
I think we generally try to hide from end users that partitioning by a
column actually uses an identity transform. Can we update this to state that
the table partitioning includes transforms? Like "Skipping distribution and
ordering request: extensions are disabled and partitioning uses unsupported
transforms"?
I also prefer "Skipping" to "Can't" because the write still goes ahead.
When we "can't" do something, we normally fail.
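To make the suggestion concrete, here is a minimal sketch of the warn-and-continue pattern with the proposed wording. It is not the PR's code: the class `SkipVsFailSketch`, the constant `SKIP_MESSAGE`, and the helper `shouldRequest` are hypothetical names, and it uses `java.util.logging` rather than the logger in `SparkWriteBuilder` so that it runs on its own.

```java
import java.util.logging.Logger;

// Hypothetical, standalone illustration; not part of SparkWriteBuilder.
class SkipVsFailSketch {
  // Assumption: java.util.logging stands in for the project's logger.
  private static final Logger LOG = Logger.getLogger(SkipVsFailSketch.class.getName());

  // Wording proposed in the review comment. It says "Skipping" because the
  // write continues, and it avoids mentioning identity transforms.
  static final String SKIP_MESSAGE =
      "Skipping distribution and ordering request: "
          + "extensions are disabled and partitioning uses unsupported transforms";

  // Returns true when distribution/ordering should be requested.
  // Otherwise logs a warning and returns false; it does not throw,
  // so the caller can go ahead with the write.
  static boolean shouldRequest(boolean requestDistributionAndOrdering) {
    if (requestDistributionAndOrdering) {
      return true;
    }
    LOG.warning(SKIP_MESSAGE);
    return false;
  }
}
```

A message starting with "Can't" would fit better next to a thrown exception. Here the method only warns and returns, which is what "Skipping" conveys.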
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]