[
https://issues.apache.org/jira/browse/HUDI-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706343#comment-17706343
]
Sagar Sumit commented on HUDI-2461:
-----------------------------------
Related to RFC-61 [https://github.com/apache/hudi/pull/7907]
> Support lock free multi-writer for metadata table
> -------------------------------------------------
>
> Key: HUDI-2461
> URL: https://issues.apache.org/jira/browse/HUDI-2461
> Project: Apache Hudi
> Issue Type: Improvement
> Components: metadata, multi-writer, writer-core
> Reporter: sivabalan narayanan
> Assignee: sivabalan narayanan
> Priority: Critical
> Fix For: 1.0.0
>
>
> Even with the synchronous patch, we instantiate the metadata table in
> single-writer mode only.
> But we need to support async compaction and cleaning, and hence we need to
> think about supporting multi-writer down the line.
> Details:
> All writes to the metadata table happen within the data table lock,
> including compaction and cleaning of the metadata table, since we run them
> inline. But as we scale the metadata table infra with more indexes, we need
> to support async compaction and cleaning, and so we need multi-writer support.
> One possibility:
> - Special transaction management for metadata table.
> Data table commits: all writes to the metadata table will be guarded by the
> data table lock (regular writes, clustering, compaction, everything). Regular
> writes will do the usual conflict resolution, whereas compaction and
> clustering may not.
> Metadata table commits: in general there won't be any conflict resolution
> for the metadata table as a whole, but we will ensure every commit happens
> while holding a lock. Our presumption is that all conflict resolution will
> already have happened within the data table before proceeding to commit to
> the metadata table, so the metadata table needs no conflict resolution of
> its own.
> Scheduling of compaction and cleaning will happen along with regular upserts,
> and we will have async compaction and cleaning support. So when these async
> operations are ready to commit to the metadata table, they will acquire the
> lock, make the commit, and release the lock. Only one writer will be in
> progress during a metadata commit.
>
>
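The commit protocol proposed in the description can be sketched roughly as below. This is only an illustration under stated assumptions, not Hudi code: the class `MetadataCommitSketch`, its methods, the in-memory timelines, and the use of a single `ReentrantLock` shared by data table and metadata table commits are all hypothetical, and the relative order of the two commits inside the lock is assumed rather than taken from the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of the proposal; none of these names are Hudi APIs.
final class MetadataCommitSketch {
    // Assumption: one lock guards every commit to either table.
    private final ReentrantLock tableLock = new ReentrantLock();
    private final List<String> dataTimeline = new ArrayList<>();
    private final List<String> metadataTimeline = new ArrayList<>();

    // Regular data table write: conflict resolution happens under the lock,
    // so the metadata table commit that follows needs no resolution of its own.
    boolean commitDataWrite(String instant, boolean conflictsWithConcurrentWrite) {
        tableLock.lock();
        try {
            if (conflictsWithConcurrentWrite) {
                return false; // rejected by data table conflict resolution
            }
            metadataTimeline.add(instant); // assumed order: metadata first
            dataTimeline.add(instant);
            return true;
        } finally {
            tableLock.unlock();
        }
    }

    // Async metadata table compaction or cleaning: acquire the lock,
    // commit, release. No conflict resolution is performed here.
    void commitMetadataTableService(String instant) {
        tableLock.lock();
        try {
            metadataTimeline.add(instant);
        } finally {
            tableLock.unlock();
        }
    }

    // Snapshot accessors; they also take the lock for a consistent view.
    List<String> metadataTimeline() {
        tableLock.lock();
        try {
            return new ArrayList<>(metadataTimeline);
        } finally {
            tableLock.unlock();
        }
    }

    List<String> dataTimeline() {
        tableLock.lock();
        try {
            return new ArrayList<>(dataTimeline);
        } finally {
            tableLock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MetadataCommitSketch table = new MetadataCommitSketch();
        // A regular writer and an async compactor race to commit;
        // the lock serializes them, in whichever order they arrive.
        Thread writer = new Thread(() -> table.commitDataWrite("001", false));
        Thread compactor =
            new Thread(() -> table.commitMetadataTableService("002.compaction"));
        writer.start();
        compactor.start();
        writer.join();
        compactor.join();
        System.out.println(table.metadataTimeline());
    }
}
```

Both commit paths take the same lock, so only one writer is in progress during a metadata commit, which is the property the description asks for. Whether a real implementation would share the data table lock or use a separate one is left open by the issue.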