[
https://issues.apache.org/jira/browse/HUDI-6979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17791711#comment-17791711
]
sivabalan narayanan commented on HUDI-6979:
-------------------------------------------
This will definitely be a good addition.
> support EventTimeBasedCompactionStrategy
> ----------------------------------------
>
> Key: HUDI-6979
> URL: https://issues.apache.org/jira/browse/HUDI-6979
> Project: Apache Hudi
> Issue Type: New Feature
> Components: compaction
> Reporter: Kong Wei
> Assignee: Kong Wei
> Priority: Major
>
> The current compaction strategies are based on log file size, the number
> of log files, etc. With these strategies, the event-time range covered by
> the resulting RO table cannot be controlled. Hudi also has a DayBased
> strategy, but it relies on a day-based partition path and its time
> granularity is coarse.
> The proposed *EventTimeBasedCompactionStrategy* can generate event
> time-friendly RO tables, whether or not the table is partitioned by day.
> For example, the strategy can select all log files whose data time is
> before 3 am for compaction, so that the generated RO table contains the
> data up to 3 am. If we only want to query data before 3 am, we can query
> the RO table alone, which is much faster.
> With this strategy, I think we can expand the application scenarios of RO
> tables.
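
The selection rule in the description could be sketched roughly as follows. This is only an illustration and does not use Hudi's actual compaction strategy API: the class EventTimeCompactionSketch, the LogFileInfo descriptor, and the selectForCompaction method are all hypothetical names, and the sketch assumes that the maximum event time of the records in each log file is known.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EventTimeCompactionSketch {

    // Hypothetical descriptor of a log file; not a Hudi class.
    // Assumption: the latest event time contained in the file is available.
    static final class LogFileInfo {
        final String path;
        final Instant maxEventTime;

        LogFileInfo(String path, Instant maxEventTime) {
            this.path = path;
            this.maxEventTime = maxEventTime;
        }
    }

    // Keep only the log files whose data lies entirely before the cutoff,
    // so the compacted (RO) view holds data strictly before the cutoff.
    static List<LogFileInfo> selectForCompaction(List<LogFileInfo> candidates,
                                                 Instant cutoff) {
        List<LogFileInfo> selected = new ArrayList<>();
        for (LogFileInfo file : candidates) {
            if (file.maxEventTime.isBefore(cutoff)) {
                selected.add(file);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Cutoff of 3 am (UTC) on an arbitrary example day.
        Instant cutoff = Instant.parse("2023-11-30T03:00:00Z");
        List<LogFileInfo> files = Arrays.asList(
            new LogFileInfo("log.1", Instant.parse("2023-11-30T01:15:00Z")),
            new LogFileInfo("log.2", Instant.parse("2023-11-30T02:59:59Z")),
            new LogFileInfo("log.3", Instant.parse("2023-11-30T03:20:00Z")));

        for (LogFileInfo file : selectForCompaction(files, cutoff)) {
            System.out.println(file.path);
        }
    }
}
```

In this sketch a log file that straddles the cutoff is left out entirely, which keeps the RO table free of later data at the cost of deferring that file to a later compaction. Whether the real strategy should behave this way is a design decision for the issue itself.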