stream2000 commented on code in PR #8905:
URL: https://github.com/apache/hudi/pull/8905#discussion_r1226054407


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCleanConfig.java:
##########
@@ -179,6 +179,12 @@ public class HoodieCleanConfig extends HoodieConfig {
           + " table receives updates/deletes. Another reason to turn this on, would be to ensure data residing in bootstrap "
           + " base files are also physically deleted, to comply with data privacy enforcement processes.");
 
+  public static final ConfigProperty<Boolean> CLEANER_IGNORE_APPEND_WRITE_COMMITS = ConfigProperty
+      .key("hoodie.cleaner.ignore.append.write.commits")
+      .defaultValue(false)
+      .markAdvanced()
+      .withDocumentation("When set to true, cleaner will ignore partition affected by commits/delta commits. This is usefule for append write mode");
+

Review Comment:
   > Yeah, but we have no good manner to infer that config if the table service 
operations are all async. How about just config to disable the cleaning 
manually in the ingestion job?
   
   That's why we introduced the config
`hoodie.cleaner.ignore.append.write.commits` above in our internal version. With
it, we don't need to care whether the table services run sync or async, or
whether they run in a streaming ingestion job or in a batch job such as a
partition delete through Spark SQL. Users just add a single config wherever the
clean service is running. What do you think?
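
   For illustration, a minimal sketch of what "add a single config" could look like in the job that runs the cleaner. It assumes the key proposed in this PR, which may not exist in a released Hudi version. The class `CleanerConfigSketch` and the method `cleanerProps` are hypothetical names, and only `java.util.Properties` is used, not Hudi's own config builders.

```java
import java.util.Properties;

// Hypothetical helper, not part of Hudi: builds the properties passed to
// whichever job runs the clean service (streaming, batch, or async).
final class CleanerConfigSketch {
    // Key proposed in this PR; treat its availability as an assumption.
    static final String IGNORE_APPEND_WRITE_COMMITS =
        "hoodie.cleaner.ignore.append.write.commits";

    static Properties cleanerProps() {
        Properties props = new Properties();
        // The single setting the cleaning job would add.
        props.setProperty(IGNORE_APPEND_WRITE_COMMITS, "true");
        return props;
    }

    public static void main(String[] args) {
        Properties props = cleanerProps();
        System.out.println(IGNORE_APPEND_WRITE_COMMITS + "="
            + props.getProperty(IGNORE_APPEND_WRITE_COMMITS));
    }
}
```

   The ingestion job's own configuration stays untouched in this sketch; only the job that hosts the cleaner carries the extra property.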



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.