bhasudha commented on code in PR #9372:
URL: https://github.com/apache/hudi/pull/9372#discussion_r1307298317


##########
website/docs/concurrency_control.md:
##########
@@ -186,18 +221,32 @@ A Hudi Streamer job can then be triggered as follows:
   --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
   --source-ordering-field impresssiontime \
   --target-base-path file:\/\/\/tmp/hudi-streamer-op \ 
-  --target-table uber.impressions \
+  --target-table tableName \
   --op BULK_INSERT
 ```
 
+## Early Conflict Detection
+
+Multi-writing using OCC allows multiple writers to write concurrently and commit atomically to the Hudi table as long as they do not write overlapping data files, which guarantees data consistency, integrity and correctness. Prior to the 0.13.0 release, this conflict detection for overlapping data files was performed after data writing completed and before the metadata was committed. If a conflict was detected at this final stage, the compute resources spent on the write were wasted because the data writing had already finished.
+
+To improve concurrency control, the 0.13.0 release introduced a new feature, early conflict detection in OCC, which uses Hudi's marker mechanism to detect conflicts during the data writing phase and abort the write as soon as a conflict is detected. Hudi can now stop a conflicting writer much earlier and release its computing resources back to the cluster, improving resource utilization.
+
+By default, this feature is turned off. To try it out, set `hoodie.write.concurrency.early.conflict.detection.enable` to `true` when using OCC for concurrency control (refer to the [configs](https://hudi.apache.org/docs/next/configurations#Write-Configurations-advanced-configs) page for all relevant configs).

Review Comment:
   Done
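
   For readers of the new section, a minimal sketch of the writer properties involved might look like the following. Only `hoodie.write.concurrency.early.conflict.detection.enable` comes from the paragraph in the diff. The other two keys are the usual OCC prerequisites as I understand them and are assumptions here, and a lock provider would still need to be configured as described elsewhere on the page.

   ```properties
   # Assumed OCC prerequisites (not part of this diff hunk)
   hoodie.write.concurrency.mode=optimistic_concurrency_control
   hoodie.cleaner.policy.failed.writes=LAZY
   # From the section in the diff: off by default, so opt in explicitly
   hoodie.write.concurrency.early.conflict.detection.enable=true
   ```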



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
