[
https://issues.apache.org/jira/browse/HUDI-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378969#comment-17378969
]
ASF GitHub Bot commented on HUDI-1483:
--------------------------------------
zhangyue19921010 edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-878004113
Hi @codope, I just want to know whether this async clustering function can
handle the following scenario without losing data:
There are 3 small file groups named fg1, fg2 and fg3, containing file slice1,
file slice2 and file slice3 respectively.
When the async scheduler **has started to build a clustering plan but has not
finished**, there is an inflight or requested commit for fg1 that will create
file slice11 based on file slice1. In other words, **file slice11 is being
created but is not yet committed**. I believe this scenario is similar to the
multi-writer case.
What will the async clustering function do here?
Will the clustering plan contain file slice1? If it does, I think the
new data in file slice11 will be lost.
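To make the scenario concrete, here is a minimal toy model in Python. It is only a sketch of the question, not Hudi's actual planner: `FileGroup`, `plan_clustering`, `slices_at_risk` and the `skip_groups_with_pending_writes` flag are hypothetical names I made up for illustration. Which of the two behaviours the PR implements, if either, is exactly what I am asking.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileGroup:
    """Hypothetical stand-in for a file group; not a Hudi class."""
    group_id: str
    committed_slices: List[str]
    # Slices being written by an inflight or requested commit.
    pending_slices: List[str] = field(default_factory=list)


def plan_clustering(file_groups, skip_groups_with_pending_writes):
    """Return the ids of the file groups a clustering plan would rewrite."""
    plan = []
    for fg in file_groups:
        if skip_groups_with_pending_writes and fg.pending_slices:
            # Leave this group alone until its pending write completes.
            continue
        plan.append(fg.group_id)
    return plan


def slices_at_risk(file_groups, plan):
    """Pending slices whose file group is replaced by the plan."""
    planned = set(plan)
    return [
        s
        for fg in file_groups
        if fg.group_id in planned
        for s in fg.pending_slices
    ]


file_groups = [
    FileGroup("fg1", ["slice1"], pending_slices=["slice11"]),
    FileGroup("fg2", ["slice2"]),
    FileGroup("fg3", ["slice3"]),
]

# Plan that ignores pending writes: fg1 is included, so slice11 is at risk.
naive_plan = plan_clustering(file_groups, skip_groups_with_pending_writes=False)
# Plan that skips file groups with pending writes: fg1 is left out.
safe_plan = plan_clustering(file_groups, skip_groups_with_pending_writes=True)

print(naive_plan, slices_at_risk(file_groups, naive_plan))
print(safe_plan, slices_at_risk(file_groups, safe_plan))
```

In this toy model the first plan rewrites fg1 while slice11 is still pending, which is the data-loss case I am worried about; the second plan avoids it by not touching fg1 at all.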
Looking forward to your reply, thanks a lot.
> async clustering for deltastreamer
> ----------------------------------
>
> Key: HUDI-1483
> URL: https://issues.apache.org/jira/browse/HUDI-1483
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: liwei
> Assignee: liwei
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.9.0
>
>