[ 
https://issues.apache.org/jira/browse/HUDI-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7034:
----------------------------
    Fix Version/s: 0.14.1

> Refresh view does not work (due to cache)
> -----------------------------------------
>
>                 Key: HUDI-7034
>                 URL: https://issues.apache.org/jira/browse/HUDI-7034
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Vitali Makarevich
>            Priority: Blocker
>             Fix For: 0.14.1
>
>
> Starting from 0.13.1, `spark.catalog.refreshTable` no longer works correctly. 
> In 0.12.3 it works as expected.
> A reproduction is 
> [here|https://github.com/VitoMakarevich/hudi-incremental-issue/blob/master/src/main/scala/com/example/hudi/HudiRefreshBug.scala].
> What is happening: Hudi has a `BaseHoodieTableFileIndex` class, and an 
> instance of it is stored in the Spark plan once the table is created. When I 
> call refresh, its `doRefresh` method is called. This method reloads the 
> metadata view and the list of partitions, but it no longer refreshes the list 
> of files in those partitions. As a result, each partition stays stuck at its 
> first file version: updates are not picked up, and after a couple of commits 
> (depending on the cleaner settings) Spark starts to throw a file-not-found 
> exception.
> More precisely, it looks to have been broken in 
> [this commit|https://github.com/apache/hudi/commit/34b226c0cba7ff022eb8c02246f46c5f9cbe7ec5].
> I can try to provide a fix.
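
The failure mode described above can be illustrated with a small toy model. This is only a sketch of the caching pattern, not Hudi's actual code: `ToyFileIndex`, `refresh_buggy`, `refresh_fixed`, and the dict-based `storage` are all hypothetical names made up for illustration. It assumes a refresh that reloads the partition list but keeps the per-partition file listing cached.

```python
class ToyFileIndex:
    """Toy stand-in for a cached file index. Hypothetical, not Hudi's implementation."""

    def __init__(self, storage):
        # storage: dict mapping partition name -> list of current file names
        self.storage = storage
        self.partitions = []
        self.files_by_partition = {}
        self._load_partitions()
        self._load_files()

    def _load_partitions(self):
        self.partitions = sorted(self.storage)

    def _load_files(self):
        self.files_by_partition = {p: list(self.storage[p]) for p in self.partitions}

    def refresh_buggy(self):
        # Reloads only the partition list; the cached file listing is left untouched.
        self._load_partitions()

    def refresh_fixed(self):
        # Reloads the partition list and the file listing of every partition.
        self._load_partitions()
        self._load_files()


# One partition holding the first version of a file.
storage = {"2023-11-01": ["f1_v1.parquet"]}
index = ToyFileIndex(storage)

# A later commit writes a new file version and the cleaner removes the old one.
storage["2023-11-01"] = ["f1_v2.parquet"]

index.refresh_buggy()
stale = list(index.files_by_partition["2023-11-01"])
# Files the index still points at that no longer exist in storage.
missing = [f for f in stale if f not in storage["2023-11-01"]]

index.refresh_fixed()
fresh = list(index.files_by_partition["2023-11-01"])
```

In this model, `stale` still names the first file version after the buggy refresh, and `missing` contains the file a reader would fail to find, which mirrors the reported symptom. After `refresh_fixed`, `fresh` reflects the current file version.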



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
