scxwhite commented on code in PR #7309:
URL: https://github.com/apache/hudi/pull/7309#discussion_r1035551565
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java:
##########
@@ -265,6 +265,13 @@ public class HoodieWriteConfig extends HoodieConfig {
       .withDocumentation("When inserted records share same key, controls whether they should be first combined (i.e de-duplicated) before"
           + " writing to storage.");
+  public static final ConfigProperty<String> PERSIST_BEFORE_INSERT = ConfigProperty
Review Comment:
Thank you for your reply.
You mentioned two points here.
- The first is: "we should refrain from adding another config here"
The reason for adding this configuration is that for non-sorted bulk_insert and other write operations without additional intermediate operations, persistence may not be necessary.
- The second is: "We should optimize the DAG instead"
The underlying principles of Hudi are not clear to data developers. They may not realize that even if they only need to upsert once, they need to cache the data to speed up writing. This will also increase the development cost for data developers.
Just my own opinion; I look forward to your reply.
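To make the caching point above concrete, here is a minimal plain-Java analogy (not Hudi or Spark code; the class and method names are invented for illustration). A lazily defined computation is re-executed on every use unless its result is materialized once, just as a Spark Dataset is recomputed by each action unless it is persisted before the upsert pipeline runs over it more than once.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Analogy for the persist-before-insert discussion: counts how many
// times an "expensive" lazy computation actually runs when it is
// consumed twice (e.g. once for de-duplication, once for the write).
public class PersistAnalogy {

    static int computationsFor(boolean persist) {
        AtomicInteger count = new AtomicInteger();

        // Simulates an expensive, lazily evaluated transformation.
        Supplier<Integer> base = () -> {
            count.incrementAndGet();   // count each recomputation
            return 42;
        };

        Supplier<Integer> source;
        if (persist) {
            int value = base.get();    // materialize once, like persist()
            source = () -> value;      // later uses reuse the cached value
        } else {
            source = base;             // left lazy: recomputed per use
        }

        source.get();                  // first "action" over the data
        source.get();                  // second "action" over the data
        return count.get();            // 2 without persist, 1 with it
    }

    public static void main(String[] args) {
        System.out.println("lazy: " + computationsFor(false)
            + ", persisted: " + computationsFor(true));
    }
}
```

The same trade-off motivates making persistence configurable: when the plan is only consumed once (e.g. a straight bulk_insert), the materialization step is pure overhead.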
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]