[GitHub] [hudi] bhasudha commented on a diff in pull request #9372: [DOCS]Update Concurrency page

via GitHub Mon, 28 Aug 2023 04:20:58 -0700


bhasudha commented on code in PR #9372:
URL: https://github.com/apache/hudi/pull/9372#discussion_r1307296531



##########
website/docs/concurrency_control.md:
##########
@@ -186,18 +221,32 @@ A Hudi Streamer job can then be triggered as follows:
   --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
   --source-ordering-field impresssiontime \
   --target-base-path file:\/\/\/tmp/hudi-streamer-op \ 
-  --target-table uber.impressions \
+  --target-table taableName \
   --op BULK_INSERT
 ```
 
+## Early conflict Detection
+
+Multi writing using OCC allows multiple writers to concurrently write and 
atomically commit to the Hudi table if there is no overlapping data file to be 
written, to guarantee data consistency, integrity and correctness. Prior to the 
0.13.0 release, such conflict detection of overlapping data files is performed 
before commit metadata and after the data writing is completed. If any conflict 
is detected in the final stage, it could have wasted compute resources because 
the data writing is finished already.

Review Comment:
   Thanks. Taking it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
