[ 
https://issues.apache.org/jira/browse/HUDI-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated HUDI-6123:
-----------------------------
    Description: 
This is a follow-up fix for [#8111|https://github.com/apache/hudi/pull/8111].

Currently, the {{hoodie.auto.adjust.lock.configs}} option defaults to false 
for batch mode ingestion
and to true for the Spark streaming sink and delta_streamer, while the metadata 
table (MDT) is enabled by default.

For multiple streaming writers with no explicit lock provider set up, 
{{InProcessLockProvider}} should not be used.
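
For illustration, a multi-writer job would set an explicit lock provider rather than rely on the auto-adjusted one. The sketch below only assembles the documented Hudi option keys into a {{Properties}} object; the class name, method name, ZooKeeper host, lock key, and base path are hypothetical placeholders, not values taken from this ticket.

```java
import java.util.Properties;

public class MultiWriterLockConfigSketch {

    // Builds write options for a multi-writer job with an explicit lock provider.
    // zkHost, zkPort and tableName are placeholders supplied by the caller.
    static Properties multiWriterOptions(String zkHost, String zkPort, String tableName) {
        Properties props = new Properties();
        // Multi-writer mode uses optimistic concurrency control.
        props.setProperty("hoodie.write.concurrency.mode", "optimistic_concurrency_control");
        props.setProperty("hoodie.cleaner.policy.failed.writes", "LAZY");
        // An external lock provider that is visible to every writer process.
        props.setProperty("hoodie.write.lock.provider",
            "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider");
        props.setProperty("hoodie.write.lock.zookeeper.url", zkHost);
        props.setProperty("hoodie.write.lock.zookeeper.port", zkPort);
        props.setProperty("hoodie.write.lock.zookeeper.lock_key", tableName);
        // Hypothetical base path; pick one that suits the deployment.
        props.setProperty("hoodie.write.lock.zookeeper.base_path", "/hudi/locks");
        return props;
    }

    public static void main(String[] args) {
        Properties props = multiWriterOptions("zk-host.example.com", "2181", "my_table");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Any lock provider shared by all writers would serve the same purpose; ZooKeeper is used here only as an example.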

Change list:
 # Restrict the option {{hoodie.auto.adjust.lock.configs}} to take effect only in 
single-writer scope, because with multiple writers the
{{InProcessLockProvider}} cannot coordinate across hosts/processes;
 # LockManager#lock and LockManager#unlock are invoked from the TransactionManager,
which already checks whether an explicit lock is required, so remove 
the redundant check in LockManager.
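
The first item can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Hudi code: the class and method names are hypothetical, and the conditions on MDT and async table services mentioned in this ticket are left out for brevity. Only the option keys, the concurrency mode value, and the {{InProcessLockProvider}} class name are real.

```java
import java.util.Properties;

public class AutoAdjustLockSketch {
    static final String AUTO_ADJUST = "hoodie.auto.adjust.lock.configs";
    static final String CONCURRENCY_MODE = "hoodie.write.concurrency.mode";
    static final String LOCK_PROVIDER = "hoodie.write.lock.provider";
    static final String IN_PROCESS =
        "org.apache.hudi.client.transaction.lock.InProcessLockProvider";

    // Single writer is assumed when the concurrency mode is unset.
    static boolean isSingleWriter(Properties props) {
        String mode = props.getProperty(CONCURRENCY_MODE, "single_writer");
        return "single_writer".equalsIgnoreCase(mode);
    }

    // Returns a copy of the options; the in-process lock provider is filled in
    // only when auto-adjust is on, the job is a single writer, and the user
    // has not configured a lock provider explicitly.
    static Properties adjust(Properties in) {
        Properties out = new Properties();
        out.putAll(in);
        boolean autoAdjust = Boolean.parseBoolean(out.getProperty(AUTO_ADJUST, "false"));
        boolean explicitProvider = out.getProperty(LOCK_PROVIDER) != null;
        if (autoAdjust && isSingleWriter(out) && !explicitProvider) {
            out.setProperty(LOCK_PROVIDER, IN_PROCESS);
        }
        return out;
    }

    public static void main(String[] args) {
        Properties single = new Properties();
        single.setProperty(AUTO_ADJUST, "true");
        System.out.println("single writer -> " + adjust(single).getProperty(LOCK_PROVIDER));

        Properties multi = new Properties();
        multi.setProperty(AUTO_ADJUST, "true");
        multi.setProperty(CONCURRENCY_MODE, "optimistic_concurrency_control");
        System.out.println("multi writer  -> " + adjust(multi).getProperty(LOCK_PROVIDER));
    }
}
```

In the multi-writer case the sketch leaves the lock provider unset, so the missing configuration surfaces instead of being masked by a lock that only works within one process.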

  was:
Currently, the `hoodie.auto.adjust.lock.configs` option is false by default, 
while MDT is enabled by default.
For a single writer with any async table services enabled, the MDT commit is not 
protected by any lock provider,
which could cause inconsistency between the dataset and the metadata.

At least by default, we should make it work. Imagine a simple use case: MOR 
single writer + async compaction.

 

Change list:
1. Change the default of option `hoodie.auto.adjust.lock.configs` to true.

2. Restrict the option `hoodie.auto.adjust.lock.configs` to take effect only 
for a single writer, because in multi-writer scenarios the 
`InProcessLockProvider` cannot work as expected across multiple processes.


> Auto adjust lock configs only for single writer
> -----------------------------------------------
>
>                 Key: HUDI-6123
>                 URL: https://issues.apache.org/jira/browse/HUDI-6123
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: writer-core
>    Affects Versions: 0.13.0
>            Reporter: Danny Chen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.13.1, 0.14.0
>
>
> This is a follow-up fix for [#8111|https://github.com/apache/hudi/pull/8111].
> Currently, the {{hoodie.auto.adjust.lock.configs}} option defaults to false 
> for batch mode ingestion
> and to true for the Spark streaming sink and delta_streamer, while the metadata 
> table (MDT) is enabled by default.
> For multiple streaming writers with no explicit lock provider set up, 
> {{InProcessLockProvider}} should not be used.
> Change list:
>  # Restrict the option {{hoodie.auto.adjust.lock.configs}} to take effect only in 
> single-writer scope, because with multiple writers the
> {{InProcessLockProvider}} cannot coordinate across hosts/processes;
>  # LockManager#lock and LockManager#unlock are invoked from the TransactionManager,
> which already checks whether an explicit lock is required, so remove 
> the redundant check in LockManager.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
