[ 
https://issues.apache.org/jira/browse/IMPALA-11904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-11904 started by Ye Zihao.
-----------------------------------------
> Data cache should support dumping metadata for reloading
> --------------------------------------------------------
>
>                 Key: IMPALA-11904
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11904
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 4.3.0
>            Reporter: Ye Zihao
>            Assignee: Ye Zihao
>            Priority: Major
>
> The data cache consists of cache metadata and cache files. The cache files 
> are located on disk and store the cached data content, while the cache 
> metadata is held in memory and maps each cache key to its location in the 
> cache files.
> Currently, if the impalad process exits, the cache metadata is lost. 
> After the impalad process restarts, the cache files cannot be reused even 
> though they are still on disk, because there is no corresponding cache 
> metadata to index them.
> If we support dumping the cache metadata to disk when the process exits, 
> then the next time the process starts the metadata can be reloaded into 
> memory and the previous cache files can be reused. This would be helpful in 
> a real production environment, where cached data often exceeds a terabyte 
> per process, and cache data lost to a configuration change or version 
> upgrade can take days to rebuild.
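
The dump-and-reload flow described above might be sketched roughly as follows. This is an illustration only, not Impala's implementation: the names CacheEntry, dump_metadata, load_metadata and read_cached, the JSON dump format, and the validation rule are all assumptions made for the example.

```python
import json
import os
from dataclasses import dataclass, asdict


@dataclass
class CacheEntry:
    """Hypothetical metadata record: where a cached value lives on disk."""
    path: str    # cache file holding the data
    offset: int  # byte offset of the value inside the cache file
    length: int  # number of bytes in the value


def dump_metadata(index, dump_path):
    """Write the in-memory index to disk via a temp file and an atomic rename."""
    payload = {key: asdict(entry) for key, entry in index.items()}
    tmp_path = dump_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(payload, f)
        f.flush()
        os.fsync(f.fileno())
    # A crash before this point leaves any previous dump untouched.
    os.replace(tmp_path, dump_path)


def load_metadata(dump_path):
    """Rebuild the index, skipping entries whose cache file is missing or too short."""
    if not os.path.exists(dump_path):
        return {}
    with open(dump_path) as f:
        payload = json.load(f)
    index = {}
    for key, fields in payload.items():
        entry = CacheEntry(**fields)
        if (os.path.isfile(entry.path)
                and os.path.getsize(entry.path) >= entry.offset + entry.length):
            index[key] = entry
    return index


def read_cached(index, key):
    """Return the cached bytes for key, or None on a cache miss."""
    entry = index.get(key)
    if entry is None:
        return None
    with open(entry.path, "rb") as f:
        f.seek(entry.offset)
        return f.read(entry.length)
```

In this sketch, dump_metadata would be called on shutdown and load_metadata on startup. Entries that no longer match a file on disk are dropped at load time, so a stale dump results in cache misses instead of reads of wrong data.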



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
