[
https://issues.apache.org/jira/browse/HUDI-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Surya Prasanna Yalla updated HUDI-3580:
---------------------------------------
Summary: [RFC-48] Support LogCompaction action for MOR tables (was:
[RFC-TBD] Support LogCompaction action for MOR tables)
> [RFC-48] Support LogCompaction action for MOR tables
> ----------------------------------------------------
>
> Key: HUDI-3580
> URL: https://issues.apache.org/jira/browse/HUDI-3580
> Project: Apache Hudi
> Issue Type: Epic
> Components: compaction, metadata
> Reporter: Surya Prasanna Yalla
> Priority: Major
> Labels: pull-request-available
>
> The record-level index uses the metadata table, which is a MOR table.
> Each delta commit in the metadata table creates multiple HFile log blocks, so
> reading them requires opening multiple file handles, which can hurt read
> performance. To improve read performance, compaction can be run
> frequently; it merges all the log blocks into the base file and creates
> another base file. Doing this frequently, however, causes write
> amplification.
> Instead of merging all the log blocks into the base file in a full
> compaction, a minor compaction can be done, which stitches the log blocks
> together and creates one log block.
> This can be achieved by adding a new action to Hudi called logcompaction,
> which operates at the log file level. Compaction creates base files and
> issues a .commit upon completion; similarly, minor compaction, which creates
> a new log block, can issue a .deltacommit on the timeline after
> completion.
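
The stitching step described above can be illustrated with a minimal sketch. This is not Hudi's implementation or API: `LogBlock`, `log_compact`, the string instant times, and the use of `None` as a delete marker are all hypothetical names and assumptions chosen for illustration, and Python is used only for brevity. The sketch assumes that a later block's record overrides an earlier one for the same key.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LogBlock:
    """Hypothetical stand-in for one log block written by a delta commit."""
    instant: str                        # assumed sortable commit time
    records: Dict[str, Optional[str]]   # key -> value; None marks a delete


def log_compact(blocks: List[LogBlock], new_instant: str) -> LogBlock:
    """Stitch several log blocks into one block without touching the base file."""
    merged: Dict[str, Optional[str]] = {}
    # Apply blocks oldest first so the latest write for each key wins.
    for block in sorted(blocks, key=lambda b: b.instant):
        merged.update(block.records)
    # Delete markers are kept: the base file is not rewritten here, so a
    # marker must still mask any older copy of the record in the base file.
    return LogBlock(instant=new_instant, records=merged)


blocks = [
    LogBlock("002", {"a": "2", "b": None}),
    LogBlock("001", {"a": "1", "b": "1", "c": "1"}),
]
compacted = log_compact(blocks, new_instant="003")
print(compacted.records)
```

A reader then opens one block instead of two, and the base file is left as-is, which is the write-amplification saving the description argues for.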
--
This message was sent by Atlassian Jira
(v8.20.1#820001)