[ 
https://issues.apache.org/jira/browse/HUDI-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-2461:
--------------------------------------
    Description: 
Even with the synchronous patch, we instantiate the metadata table in 
single-writer mode only. 

But we need to support async compaction and cleaning, so we need to think 
about supporting multi-writer down the line. 

 

Details:

All writes to the metadata table happen within the data table lock, including 
compaction and cleaning in the metadata table, since we run them inline. But as 
we scale the metadata table infra w/ more indexes, we need to support async 
compaction and cleaning, and so we need multi-writer support. 

One possibility:

- Special transaction management for the metadata table. 

Data table commits: all writes to the metadata table will be guarded by the 
data table lock (regular writes, clustering, compaction, everything). Regular 
writes will do the usual conflict resolution, whereas compaction and clustering 
may not. 
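
A minimal sketch of the data table side of this proposal, to make the rule 
concrete. Everything here is hypothetical and is not a Hudi API: the class 
DataTableCommitSketch, the Kind enum, the version counter, and the file-group 
based conflict check are assumptions for illustration only. An in-process 
ReentrantLock stands in for whatever lock provider the data table actually 
uses. 

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: names and conflict check are illustrative, not Hudi APIs.
class DataTableCommitSketch {
    enum Kind { REGULAR_WRITE, COMPACTION, CLUSTERING }

    // Stand-in for the data table lock that guards every metadata table write.
    private final ReentrantLock dataTableLock = new ReentrantLock();
    // Version of the last commit that touched each file group.
    private final Map<String, Long> lastCommitPerFileGroup = new HashMap<>();
    // Stand-in for the metadata table: one entry per applied data table commit.
    private final List<String> metadataTableLog = new ArrayList<>();
    private long version = 0;

    // A writer records the table version it started from.
    long begin() {
        dataTableLock.lock();
        try {
            return version;
        } finally {
            dataTableLock.unlock();
        }
    }

    // Commits under the data table lock. Only regular writes run the
    // (simplified) conflict check; compaction and clustering skip it.
    boolean commit(Kind kind, long startVersion, Set<String> fileGroups) {
        dataTableLock.lock();
        try {
            if (kind == Kind.REGULAR_WRITE) {
                for (String fileGroup : fileGroups) {
                    Long last = lastCommitPerFileGroup.get(fileGroup);
                    if (last != null && last > startVersion) {
                        return false; // another writer committed to this file group
                    }
                }
            }
            version++;
            for (String fileGroup : fileGroups) {
                lastCommitPerFileGroup.put(fileGroup, version);
            }
            // The metadata table write happens while the lock is still held.
            metadataTableLog.add(kind + "@" + version);
            return true;
        } finally {
            dataTableLock.unlock();
        }
    }

    List<String> metadataTableLog() {
        dataTableLock.lock();
        try {
            return new ArrayList<>(metadataTableLog);
        } finally {
            dataTableLock.unlock();
        }
    }
}
```

In this sketch, two regular writers that start from the same version and touch 
the same file group cannot both commit: the second one is rejected. A 
compaction that started from that same version still commits, because it skips 
the check. 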

Now coming to metadata table commits: in general there won't be any conflict 
resolution for the metadata table as a whole. But we will ensure that every 
commit happens only after acquiring a lock. 

Scheduling of compaction and cleaning will happen along w/ regular upserts, and 
we will have async compaction and cleaning support. So, when these async 
operations are looking to commit to the metadata table, they will acquire the 
lock, make the commit, and release the lock. Only one writer will be in 
progress during a metadata commit. 
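
The acquire, commit, release sequence for the metadata table can be sketched 
as below. This is an illustration under assumptions, not Hudi code: the class 
MetadataTableCommitSketch and its methods are hypothetical, and an in-process 
ReentrantLock stands in for the lock. The description does not say whether 
this is the data table lock or a separate one, so the sketch simply uses a 
single lock. 

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: names are illustrative, not Hudi APIs.
class MetadataTableCommitSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final List<String> timeline = new ArrayList<>();

    // No conflict resolution: acquire the lock, commit, release the lock.
    // At most one writer is inside this method at any time.
    void commit(String action) {
        lock.lock();
        try {
            timeline.add(action);
        } finally {
            lock.unlock();
        }
    }

    List<String> timeline() {
        lock.lock();
        try {
            return new ArrayList<>(timeline);
        } finally {
            lock.unlock();
        }
    }

    // A regular upsert commits first (it is also where compaction and
    // cleaning would be scheduled); the two table services then run
    // asynchronously and commit through the same lock.
    static List<String> runDemo() {
        MetadataTableCommitSketch metadataTable = new MetadataTableCommitSketch();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        metadataTable.commit("upsert");
        pool.execute(() -> metadataTable.commit("compaction"));
        pool.execute(() -> metadataTable.commit("clean"));
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return metadataTable.timeline();
    }
}
```

The relative order of the compaction and clean commits is not fixed in this 
sketch; the lock only guarantees that the commits never overlap. 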

 

 

  was:
Even with synchronous patch, we instantiate metadata table with single writer 
mode only. 

But we need to support async compaction and cleaning and hence we need to think 
about supporting multi-writer down the line. 


> Support multi-writer for metadata table
> ---------------------------------------
>
>                 Key: HUDI-2461
>                 URL: https://issues.apache.org/jira/browse/HUDI-2461
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Writer Core
>            Reporter: sivabalan narayanan
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
