[ 
https://issues.apache.org/jira/browse/HUDI-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17484404#comment-17484404
 ] 

sivabalan narayanan commented on HUDI-3302:
-------------------------------------------

[~alexey.kudinkin]: May I know whether there is a correctness issue here, or 
whether this is about layering and abstractions? If it is a correctness issue, 
I would like to target it for 0.11. If not, can you tag it with 0.12 (fix 
version)?

 

> Re-evaluate handling of LogBlock appends when Compaction is pending
> -------------------------------------------------------------------
>
>                 Key: HUDI-3302
>                 URL: https://issues.apache.org/jira/browse/HUDI-3302
>             Project: Apache Hudi
>          Issue Type: Task
>            Reporter: Alexey Kudinkin
>            Priority: Major
>
> Currently, when (async) Compaction for a particular File Group has been 
> scheduled but not yet completed, and a writer tries to append additional Log 
> Blocks to the same file-group, the following occurs:
>  # FileSystemView (when fetched) will check whether any compaction is 
> pending and, if one is, will inject a "phantom" (i.e. non-existent) log-file 
> into the existing FileSlice. That log-file has the same FileGroup name, but 
> bears the instant of the scheduled Compaction commit (on the timeline) in its 
> name (as opposed to the instant of the base-file)
>  # Writer will pick up this log-file as the latest
>  # Writer will write into this "phantom" log-file
> [REF: 
> https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/common/table/view/AbstractTableFileSystemView.java#L199]
>  
> This poses the following problems: 
>  * The reader now has to be aware of this handling and must therefore always 
> include pending compaction instants in its timeline when fetching the 
> FileSystemView, as otherwise it will miss newly added log-files.
>  * This pushes the decision-making point of where writes should be channeled 
> down into FileSystemView, which is clearly outside its scope of 
> responsibilities.
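
To make the behaviour described above concrete, here is a minimal toy model in Python. It is only a sketch and does not use Hudi's actual classes or API: `LogFile`, `view_log_files`, and `append_log_block` are hypothetical names, storage is modelled as a plain set, and instants are assumed to be zero-padded strings that sort chronologically.

```python
from typing import List, NamedTuple, Optional, Set


class LogFile(NamedTuple):
    """Hypothetical stand-in for a log-file, identified by the parts of its name."""
    file_group_id: str
    instant: str  # instant embedded in the log-file name
    version: int


def view_log_files(storage: Set[LogFile], file_group_id: str, base_instant: str,
                   pending_compaction: Optional[str] = None) -> List[LogFile]:
    """Toy file-system view: list the log-files of the latest file slice.

    If a pending compaction instant is visible to the caller and no log-file
    carries that instant yet, a "phantom" (not yet existing) log-file is
    injected into the result.
    """
    visible_instants = {base_instant}
    if pending_compaction is not None:
        visible_instants.add(pending_compaction)
    logs = sorted(
        (f for f in storage
         if f.file_group_id == file_group_id and f.instant in visible_instants),
        key=lambda f: (f.instant, f.version),
    )
    if pending_compaction is not None and all(
            f.instant != pending_compaction for f in logs):
        logs.append(LogFile(file_group_id, pending_compaction, 1))  # phantom
    return logs


def append_log_block(storage: Set[LogFile], file_group_id: str, base_instant: str,
                     pending_compaction: Optional[str] = None) -> LogFile:
    """Toy writer: write into whatever the view reports as the latest log-file."""
    logs = view_log_files(storage, file_group_id, base_instant, pending_compaction)
    target = logs[-1] if logs else LogFile(file_group_id, base_instant, 1)
    storage.add(target)  # writing makes the phantom file real
    return target


# One existing log-file on base instant "001"; compaction scheduled at "002".
storage = {LogFile("fg1", "001", 1)}
written = append_log_block(storage, "fg1", "001", pending_compaction="002")

# A reader that ignores pending compactions does not see the new log-file.
reader_without_pending = view_log_files(storage, "fg1", "001")
# A reader that includes the pending compaction instant does see it.
reader_with_pending = view_log_files(storage, "fg1", "001", pending_compaction="002")
```

In this model the writer ends up in a log-file named after the compaction instant, so a reader that omits pending compaction instants silently misses it. That is the first problem listed above. The second shows up in that the routing decision lives inside `view_log_files` rather than in the writer.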



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
