[ 
https://issues.apache.org/jira/browse/HUDI-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886042#comment-17886042
 ] 

Y Ethan Guo commented on HUDI-7740:
-----------------------------------

This is no longer necessary, as we have added new lock provider implementations 
that use only the table's base path to guarantee that the same lock is used for 
the same table (see the new DynamoDB-based lock provider, HUDI-8005 
https://github.com/apache/hudi/pull/11667, and the Zookeeper-based lock 
provider, HUDI-8090 https://github.com/apache/hudi/pull/11790).
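
To illustrate the idea of keying a lock on the base path alone, here is a 
minimal sketch. It is not taken from either pull request: the class name 
BasePathLockKey, the method lockKeyFor, the trailing-slash normalization, and 
the choice of SHA-256 are all assumptions made for illustration. The point is 
only that writers which agree on the base path can compute the same lock name 
independently, with no lock metadata shared through the `.hoodie` folder.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Hudi: derives a deterministic lock key
// from a table's base path so that independent writers agree on the lock.
class BasePathLockKey {

    // Assumption: trailing slashes are not significant, so they are stripped
    // before hashing. The actual providers may normalize paths differently.
    static String lockKeyFor(String basePath) {
        String normalized = basePath.trim();
        while (normalized.length() > 1 && normalized.endsWith("/")) {
            normalized = normalized.substring(0, normalized.length() - 1);
        }
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(normalized.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                // Mask to an unsigned value and print two hex digits per byte.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        // Two writers configured with the same table path (one with a trailing
        // slash) compute the same key; a different table gets a different key.
        String writerA = lockKeyFor("s3://bucket/warehouse/orders");
        String writerB = lockKeyFor("s3://bucket/warehouse/orders/");
        String other = lockKeyFor("s3://bucket/warehouse/customers");
        System.out.println(writerA.equals(writerB));
        System.out.println(writerA.equals(other));
    }
}
```

A key derived this way could then serve as, for example, a lock node name or 
a partition key value in whichever lock service is used. How the actual 
providers map the base path to a lock is defined in the linked pull requests.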

> Sharing locks across distributed Hudi writers
> ---------------------------------------------
>
>                 Key: HUDI-7740
>                 URL: https://issues.apache.org/jira/browse/HUDI-7740
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: metadata
>            Reporter: Lin Liu
>            Assignee: Ethan Guo (this is the old account; please use "yihua")
>            Priority: Critical
>
> One Hudi table can be ingested by multiple writers concurrently. Without 
> proper synchronization, data corruption can easily occur, and the most 
> straightforward safeguard is to use locks. In-process locks can be used for 
> writers in the same JVM. For distributed writers, a distributed lock should 
> be used.
> Given that a distributed lock has been created for these writers, the 
> immediate question is how to share the lock across all of them reliably. In 
> this effort, we aim to use the `.hoodie` folder as the central place to share 
> lock information. 
> The goal of this effort is to make sure that, in this concurrent scenario,
>  # Each write operation is guarded by a lock.
>  # All writers use the same lock at any moment.
>  # When the lock is updated, the update operation itself satisfies the 
> above two conditions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)