[
https://issues.apache.org/jira/browse/HUDI-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-1590:
--------------------------------------
Sprint: 2022/05/02 (was: 2022/05/02, 2022/05/16)
> Support async clustering w/ test suite job
> ------------------------------------------
>
> Key: HUDI-1590
> URL: https://issues.apache.org/jira/browse/HUDI-1590
> Project: Apache Hudi
> Issue Type: Test
> Components: Testing, tests-ci
> Affects Versions: 0.9.0
> Reporter: sivabalan narayanan
> Assignee: Sagar Sumit
> Priority: Major
> Fix For: 0.12.0
>
> Original Estimate: 8h
> Remaining Estimate: 8h
>
> As of now, the hoodie test suite job supports only inline clustering; we
> need to add support for async clustering.
> This might be tricky, since regular writes must not overlap with clustering;
> otherwise the pipeline will fail. So data generation has to go hand in hand
> with the clustering configs. For example, if clustering is triggered every 4
> commits, data generation should switch partitions every 4 batches of input.
> That way there won't be any overlap, and the pipeline can run for as many
> iterations as needed.
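
The partition-switching idea in the description can be sketched roughly as follows. This is only an illustration under assumptions, not code from the Hudi test suite: the function names and the `partition_<n>` naming are hypothetical, and the constant is assumed to be kept equal to whatever commit interval the clustering config uses (for example `hoodie.clustering.async.max.commits`).

```python
# Hypothetical sketch: none of these names come from the Hudi test suite.
# Assumption: this value is kept equal to the clustering trigger interval
# (e.g. hoodie.clustering.async.max.commits) used by the job under test.
CLUSTERING_MAX_COMMITS = 4


def partition_for_batch(batch_index: int,
                        commits_per_clustering: int = CLUSTERING_MAX_COMMITS) -> str:
    """Return the partition that the given 0-based input batch should write to.

    Batches are grouped into windows of `commits_per_clustering`; every batch
    in a window targets the same partition, and the next window moves on to a
    new partition, so new writes never land in a partition that an earlier
    window has already handed over to clustering.
    """
    if batch_index < 0:
        raise ValueError("batch_index must be non-negative")
    if commits_per_clustering <= 0:
        raise ValueError("commits_per_clustering must be positive")
    window = batch_index // commits_per_clustering
    return f"partition_{window}"


def plan_batches(num_batches: int,
                 commits_per_clustering: int = CLUSTERING_MAX_COMMITS) -> list[str]:
    """Return the target partition for each of `num_batches` consecutive batches."""
    return [partition_for_batch(i, commits_per_clustering)
            for i in range(num_batches)]


if __name__ == "__main__":
    # Print the batch-to-partition plan for ten batches.
    for batch, partition in enumerate(plan_batches(10)):
        print(batch, partition)
```

With the interval set to 4, batches 0 to 3 go to one partition and batches 4 to 7 go to the next, which matches the "switch partitions for every 4 batches" example above. A real test suite job would need to derive the interval from its actual clustering configuration rather than from a hard-coded constant.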
--
This message was sent by Atlassian Jira
(v8.20.7#820007)