[GitHub] [hudi] bithw1 opened a new issue, #7989: [SUPPORT]Is there only 1 commit when using AWSDmsAvroPayload

via GitHub Fri, 17 Feb 2023 20:41:44 -0800


bithw1 opened a new issue, #7989:
URL: https://github.com/apache/hudi/issues/7989


   
   Hi,
   
   I am using AWSDmsAvroPayload as the payload class, and I have prepared 5 
small datasets (each is a Seq of Scala case class instances).
   
   
   I then kick off 5 Spark jobs to write the data to the Hudi table; each 
job writes one dataset to the table.
   The pseudocode is
   `(1 to 5).foreach(do_insert_update_delete)`
   
   When the 5 jobs are done, I run the following query to see the distinct 
commit times.
   
   `spark.sql("select distinct _hoodie_commit_time from t order by _hoodie_commit_time desc").show(truncate = false)`
   
   
   I am surprised to find that there is only one _hoodie_commit_time. Is this 
the expected behavior?
   
   I had thought each Spark job would create its own commit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
