[ 
https://issues.apache.org/jira/browse/HUDI-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17841372#comment-17841372
 ] 

Vinoth Chandar edited comment on HUDI-1045 at 4/30/24 5:54 PM:
---------------------------------------------------------------

At first it may seem trivial to have clustering always fail, ceding 
preference to the incoming writes. But, aside from wasting resources, 
clustering can finish before the writes do, and we cannot atomically both 
roll back the completed clustering (note that restoring a completed action 
is treated as, and recommended to be, an offline maintenance operation) and 
finish the write.
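
To make the race concrete, here is a minimal toy sketch of the naive
"clustering always yields" policy. It is not Hudi's timeline API: the
`Timeline` class, the `Outcome` values, and `try_commit_write` are
hypothetical names invented for illustration, and file groups are plain
strings.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    # Hypothetical outcomes of a write commit under the naive policy.
    WRITE_COMMITS = "write commits; pending clustering is aborted"
    NEEDS_RESOLUTION = "clustering already completed; cannot simply yield"
    NO_CONFLICT = "no overlapping file groups"


@dataclass
class Timeline:
    """Toy stand-in for a table timeline (an assumption, not Hudi's API)."""
    pending_clustering_inputs: set = field(default_factory=set)
    completed_clustering_inputs: set = field(default_factory=set)

    def complete_clustering(self):
        # Clustering commits: its input file groups are now replaced.
        self.completed_clustering_inputs |= self.pending_clustering_inputs
        self.pending_clustering_inputs = set()


def try_commit_write(timeline, written_file_groups):
    """Apply the 'clustering always yields' policy when the writer commits."""
    written = set(written_file_groups)
    if written & timeline.completed_clustering_inputs:
        # Clustering finished first. Undoing it and committing the write
        # would have to happen as one atomic step, which is the hard part.
        return Outcome.NEEDS_RESOLUTION
    if written & timeline.pending_clustering_inputs:
        # Clustering is still in flight, so it can be failed cheaply.
        timeline.pending_clustering_inputs = set()
        return Outcome.WRITE_COMMITS
    return Outcome.NO_CONFLICT


# Writer finishes first: the pending clustering is simply failed.
early = Timeline(pending_clustering_inputs={"f1", "f2", "f3"})
print(try_commit_write(early, ["f1"]))

# Clustering finishes first: the naive policy has no safe answer.
late = Timeline(pending_clustering_inputs={"f1", "f2", "f3"})
late.complete_clustering()
print(try_commit_write(late, ["f2"]))
```

The second case is the one the comment is about: once clustering has
completed, "fail the clustering" is no longer a cheap abort but a restore of
a completed action.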


was (Author: vc):
At first it may seem trivial to have clustering always fail, ceding 
preference to the incoming writes. But, aside from wasting resources, 
clustering can finish before the writes do, and we cannot atomically both 
roll back clustering and finish the write.

> Support updates during clustering
> ---------------------------------
>
>                 Key: HUDI-1045
>                 URL: https://issues.apache.org/jira/browse/HUDI-1045
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: clustering, table-service
>            Reporter: leesf
>            Assignee: Vinoth Chandar
>            Priority: Blocker
>             Fix For: 1.0.0
>
>
> We need to allow a writer W to write to file groups f1, f2, f3 concurrently 
> while a clustering service C reclusters them into f4, f5.
>  * Writes can be updates, deletes, or inserts.
>  * Either the clustering service C or the writer W can finish first.
>  * Both W and C need to be able to complete their actions without much 
> redoing of work.
>  * The number of output file groups for C can be higher or lower than the 
> number of input file groups.
>  * Needs to work across, and be oblivious to, whether the writers are 
> operating in OCC or NBCC mode.
>  * Needs to interplay well with the cleaning and compaction services.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
