[ 
https://issues.apache.org/jira/browse/HUDI-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-2475:
--------------------------------------
    Description: 
Upgrade downgrade infra for enabling metadata.

 

Scenario: the user has a writer process, with clustering/compaction running
asynchronously.

 

- The new synchronous metadata design has a constraint: once the metadata
table is bootstrapped, every commit updates it synchronously. In other words,
the metadata table never catches up with the data table after the fact.

So, it may not be feasible to do a rolling upgrade (i.e. upgrade the writer
first while async compaction is still running, and then upgrade async
compaction).

Bootstrap has to be done with all processes stopped. We can then restart the
processes one by one, using the upgraded hudi library, w/ metadata enabled.
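
To make the constraint concrete, here is a toy model in Python. It is not Hudi
code: the class, the method names and the file names are all hypothetical. It
only assumes what is stated above, namely that metadata is updated as part of
a commit and that there is no later catch-up. Under that assumption, a single
commit from a process that is still running without metadata leaves a file the
metadata table never learns about.

```python
# Toy model, not Hudi code: every name here is hypothetical.
class ToyTable:
    def __init__(self):
        self.data_files = set()      # files actually present in the data table
        self.metadata_files = None   # listing kept by the metadata table (None = not bootstrapped)

    def bootstrap_metadata(self):
        # Bootstrap takes a one-time snapshot of the data table's file listing.
        self.metadata_files = set(self.data_files)

    def commit(self, new_file, metadata_enabled):
        self.data_files.add(new_file)
        # Synchronous design: metadata is updated only as part of the commit.
        # A commit that skips this step is never caught up later.
        if metadata_enabled and self.metadata_files is not None:
            self.metadata_files.add(new_file)

    def files_missing_from_metadata(self):
        return self.data_files - self.metadata_files


def simulate_rolling_upgrade():
    table = ToyTable()
    table.commit("f1", metadata_enabled=False)  # before any upgrade
    table.bootstrap_metadata()                  # upgraded writer bootstraps metadata
    table.commit("f2", metadata_enabled=True)   # upgraded writer
    table.commit("f3", metadata_enabled=False)  # old async compaction, still running
    return table.files_missing_from_metadata()


def simulate_stop_all_upgrade():
    table = ToyTable()
    table.commit("f1", metadata_enabled=False)  # before any upgrade
    # All processes are stopped here, then restarted w/ metadata enabled.
    table.bootstrap_metadata()
    table.commit("f2", metadata_enabled=True)   # upgraded writer
    table.commit("f3", metadata_enabled=True)   # upgraded async compaction
    return table.files_missing_from_metadata()
```

In this model the rolling upgrade leaves "f3" out of the metadata listing,
while the stop-all upgrade leaves nothing out.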

This is the only viable option I can think of. 

1. Stop all processes. Upgrade to the hudi library w/ synchronous metadata.
Bring up the writer process w/ the metadata config enabled. This will
bootstrap the metadata table, and from there on, any new commit by the writer
will update the metadata table synchronously.

Note: users can choose to upgrade via hudi-cli if need be, but it is easier to
just start the writer. Expect some delay on the first commit, since the
bootstrap will be happening then.

2. Upgrade the async table service and restart it. *Ensure the metadata table
is enabled across all processes.* Missing it on even one process could result
in data loss.
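
As a guard for step 2, a pre-restart check along these lines could flag any
process that would come up without metadata. This is only a sketch: the
process names and the shape of `process_configs` are hypothetical, and it
treats an absent `hoodie.metadata.enable` key as disabled to stay on the safe
side, whatever the library default is.

```python
# Sketch only: process names and the config layout are hypothetical.
METADATA_ENABLE_KEY = "hoodie.metadata.enable"


def processes_missing_metadata(process_configs):
    """Return the sorted names of processes whose config does not enable metadata."""
    missing = []
    for name, conf in process_configs.items():
        # Conservative assumption: an absent key counts as disabled.
        value = str(conf.get(METADATA_ENABLE_KEY, "false")).strip().lower()
        if value != "true":
            missing.append(name)
    return sorted(missing)


example_configs = {
    "writer": {"hoodie.metadata.enable": "true"},
    "async_clustering": {"hoodie.metadata.enable": "true"},
    "async_compaction": {},  # config missed on this process
}
```

Here `processes_missing_metadata(example_configs)` reports "async_compaction",
which is the process that should not be restarted until its config is fixed.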

 

With this, once the metadata table is bootstrapped, every new commit from
every process is synced to the metadata table.

  was:Upgrade downgrade infra for enabling metadata


> Upgrade downgrade infra for enabling metadata
> ---------------------------------------------
>
>                 Key: HUDI-2475
>                 URL: https://issues.apache.org/jira/browse/HUDI-2475
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Major
>             Fix For: 0.10.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
