[ 
https://issues.apache.org/jira/browse/HUDI-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surya Prasanna Yalla updated HUDI-3580:
---------------------------------------
    Summary: [Umbrella] RFC-48 : Support LogCompaction action for MOR tables  
(was: [RFC-48] Support LogCompaction action for MOR tables)

> [Umbrella] RFC-48 : Support LogCompaction action for MOR tables
> ---------------------------------------------------------------
>
>                 Key: HUDI-3580
>                 URL: https://issues.apache.org/jira/browse/HUDI-3580
>             Project: Apache Hudi
>          Issue Type: Epic
>          Components: compaction, metadata
>            Reporter: Surya Prasanna Yalla
>            Priority: Major
>              Labels: pull-request-available
>
> The record-level index uses the metadata table, which is a MOR table.
> Each delta commit in the metadata table creates multiple HFile log blocks,
> so reading them requires opening multiple file handles, which can hurt
> read performance. To improve read performance, compaction can be run
> frequently; it merges all the log blocks with the base file and creates
> a new base file. If this is done frequently, it causes write
> amplification.
> Instead of merging all the log blocks into the base file with a full
> compaction, a minor compaction can be done, which stitches the log blocks
> together and creates a single log block.
> This can be achieved by adding a new action to Hudi called logcompaction,
> which operates at the log file level. Compaction creates base files and
> issues a .commit upon completion; similarly, minor compaction, which
> creates a new log block, can issue a .deltacommit on the timeline after
> completion.
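
The difference between the two actions described above can be sketched with a toy in-memory model. This is only an illustration, not Hudi code: the names LogBlock, log_compact, and full_compact are hypothetical, a log block is modeled as a plain dict of record key to value, and a value of None is assumed to stand for a delete.

```python
from typing import Dict, List, Optional

# Hypothetical toy model, not Hudi's classes or storage format.
Record = Optional[str]            # None models a delete (tombstone)
LogBlock = Dict[str, Record]      # record key -> value written in that block


def log_compact(blocks: List[LogBlock]) -> LogBlock:
    """Minor compaction: stitch many log blocks into a single log block."""
    merged: LogBlock = {}
    for block in blocks:          # blocks are ordered oldest to newest
        merged.update(block)      # the newest write for a key wins
    # Tombstones are kept: the base file is untouched and may still hold the key.
    return merged


def full_compact(base: Dict[str, str], blocks: List[LogBlock]) -> Dict[str, str]:
    """Full compaction: apply every log block to the base and emit a new base."""
    new_base = dict(base)         # the whole base file is rewritten
    for block in blocks:
        for key, value in block.items():
            if value is None:
                new_base.pop(key, None)
            else:
                new_base[key] = value
    return new_base


base = {"k1": "a", "k2": "b"}
blocks = [{"k1": "a1"}, {"k3": "c"}, {"k2": None, "k1": "a2"}]

# One block to open at read time instead of three; the base is not rewritten.
stitched = log_compact(blocks)

# A reader sees the same merged view before and after log compaction.
view_before = full_compact(base, blocks)
view_after = full_compact(base, [stitched])
```

In this sketch log_compact touches only the records present in the log blocks, whereas full_compact copies every record of the base file, which is where the write amplification of frequent full compaction comes from.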



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
