Guosmilesmile commented on code in PR #15566:
URL: https://github.com/apache/iceberg/pull/15566#discussion_r2935263805


##########
flink/v2.1/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSink.java:
##########
@@ -626,6 +633,92 @@ public Builder setSnapshotProperty(String property, String value) {
       return this;
     }
 
+    /**
+     * Enables or disables compaction (rewriting data files) as a post-commit maintenance task.
+     *
+     * @param enabled whether to enable compaction
+     * @see RewriteDataFilesConfig for the default config.
+     * @deprecated See {@code rewriteDatafiles(..)}
+     */
+    @Deprecated
+    public Builder compaction(boolean enabled) {
+      writeOptions.put(FlinkWriteOptions.COMPACTION_ENABLE.key(), Boolean.toString(enabled));
+      return this;
+    }
+
+    /**
+     * Enables or disables compaction (rewriting data files) as a post-commit maintenance task.
+     *
+     * @param enabled whether to enable compaction
+     * @see RewriteDataFilesConfig for the default config.
+     */
+    public Builder rewriteDataFiles(boolean enabled) {
+      writeOptions.put(FlinkWriteOptions.COMPACTION_ENABLE.key(), Boolean.toString(enabled));
+      return this;
+    }
+
+    /**
+     * Enables or disables compaction (rewriting data files) as a post-commit maintenance task.
+     *
+     * @param enabled whether to enable compaction
+     * @param config task-specific configuration, see {@link RewriteDataFilesConfig} for available
+     *     keys
+     */
+    public Builder rewriteDataFiles(boolean enabled, Map<String, String> config) {
+      rewriteDataFiles(enabled);
+      writeOptions.putAll(config);
+      return this;
+    }
+
+    /**
+     * Enables or disables expire snapshots as a post-commit maintenance task.
+     *
+     * @param enabled whether to enable expire snapshots
+     * @see ExpireSnapshotsConfig for the default config.
+     */
+    public Builder expireSnapshots(boolean enabled) {
+      writeOptions.put(FlinkWriteOptions.EXPIRE_SNAPSHOTS_ENABLE.key(), Boolean.toString(enabled));

Review Comment:
   **ExpireSnapshotsConfig**
   
   - `schedule.commit-count`  
   - `schedule.interval-second`  
   - `max-snapshot-age-seconds`  
   - `retain-last`  
   - `delete-batch-size`  
   - `clean-expired-metadata`  
   - `planning-worker-pool-size`  
   
   Regarding `clean-expired-metadata`, I think the default value could be `true`.  
   For `max-snapshot-age-seconds` and `retain-last`, the defaults are both `null`. Do we need to set default values for them?
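
   To make the `null` question concrete, here is a minimal sketch of the "leave it unset" option. It assumes the two keys are read from a plain `Map<String, String>`; the class and method names are hypothetical and not part of Iceberg. An absent key resolves to an empty `Optional`, so the caller would only override the downstream behaviour when the user actually supplied a value.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch; class and method names are illustrative, not Iceberg API. */
public class OptionalExpireConfigSketch {
  // Key names as listed in this comment; any real key prefix is omitted here.
  static final String MAX_SNAPSHOT_AGE_SECONDS = "max-snapshot-age-seconds";
  static final String RETAIN_LAST = "retain-last";

  /** Returns the configured max age, or empty when the key is not set. */
  static Optional<Long> maxSnapshotAgeSeconds(Map<String, String> config) {
    return Optional.ofNullable(config.get(MAX_SNAPSHOT_AGE_SECONDS)).map(Long::parseLong);
  }

  /** Returns the configured retain-last count, or empty when the key is not set. */
  static Optional<Integer> retainLast(Map<String, String> config) {
    return Optional.ofNullable(config.get(RETAIN_LAST)).map(Integer::parseInt);
  }

  public static void main(String[] args) {
    Map<String, String> config = Map.of(RETAIN_LAST, "5");

    // Only values that are present would be forwarded to the task builder.
    maxSnapshotAgeSeconds(config).ifPresent(age -> System.out.println("max age: " + age));
    retainLast(config).ifPresent(count -> System.out.println("retain last: " + count));
  }
}
```

   With this shape no explicit default is needed in the config class; whether that is acceptable depends on what the task does when neither value is supplied.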
   
   ---
   
   **DeleteOrphanFilesConfig**
   
   - `schedule.interval-second`  
   - `min-age-seconds`  
   - `delete-batch-size`  
   - `use-prefix-listing`  
   - `planning-worker-pool-size`  
   - `equal-schemes`  
   - `equal-authorities`  
   - `prefix-mismatch-mode`  
   
   Is it enough to keep just these?
   
   From my perspective, I'd like to set `use-prefix-listing` to `true`. We discussed before that we'd like to eventually drop the dependency on the Hadoop library, and setting this to `true` aligns better with that direction.
   My concern is that the default could end up being `true` in SQL but `false` when someone manually creates a `DeleteOrphanFiles` task in a single table maintenance job, which could be misleading. If the documentation clearly states the default value, that could avoid the issue, but I'd like to know your thoughts.
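
   One way to avoid the mismatch, sketched below under stated assumptions: both entry points read the same default constant. Everything here (`SharedDefaultSketch`, `TaskBuilder`, `fromOptions`) is hypothetical and only illustrates the idea; it is not the existing `DeleteOrphanFiles` builder or config class.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these names exist in Iceberg. */
public class SharedDefaultSketch {
  // Key name as listed in this comment; any real key prefix is omitted here.
  static final String USE_PREFIX_LISTING = "use-prefix-listing";
  // Single source of truth for the default, read by both entry points.
  static final boolean USE_PREFIX_LISTING_DEFAULT = true;

  /** SQL / write-option path: resolve the flag from string options. */
  static boolean fromOptions(Map<String, String> options) {
    String value = options.get(USE_PREFIX_LISTING);
    return value == null ? USE_PREFIX_LISTING_DEFAULT : Boolean.parseBoolean(value);
  }

  /** Programmatic path: the builder field starts from the same constant. */
  static final class TaskBuilder {
    private boolean usePrefixListing = USE_PREFIX_LISTING_DEFAULT;

    TaskBuilder usePrefixListing(boolean value) {
      this.usePrefixListing = value;
      return this;
    }

    boolean usePrefixListing() {
      return usePrefixListing;
    }
  }

  public static void main(String[] args) {
    Map<String, String> options = new HashMap<>();
    // With nothing configured, both paths agree on the default.
    System.out.println(fromOptions(options) == new TaskBuilder().usePrefixListing());

    // An explicit option still overrides the shared default.
    options.put(USE_PREFIX_LISTING, "false");
    System.out.println(fromOptions(options));
  }
}
```

   If changing the builder default is not acceptable for compatibility reasons, documenting the two defaults side by side would be the fallback.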



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

