[jira] [Updated] (HDDS-7935) [Snapshot] LRU Cache entries may get evicted/closed during long running processes

ASF GitHub Bot (Jira) Fri, 14 Apr 2023 06:58:04 -0700


     [ 
https://issues.apache.org/jira/browse/HDDS-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


ASF GitHub Bot updated HDDS-7935:
---------------------------------
    Labels: pull-request-available  (was: )

> [Snapshot] LRU Cache entries may get evicted/closed during long running 
> processes
> ---------------------------------------------------------------------------------
>
>                 Key: HDDS-7935
>                 URL: https://issues.apache.org/jira/browse/HDDS-7935
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: George Jahad
>            Assignee: Siyao Meng
>            Priority: Major
>              Labels: pull-request-available
>
> As the snapshot LRU cache is currently implemented, when the oldest snapshot
> is evicted, the corresponding RocksDB instance is closed: 
> https://github.com/apache/ozone/blob/3f7ded2a34c0c35b89901e222ceaee0d1fdf08b6/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java#L124
> That is probably fine for short-lived tasks such as users reading snapshots,
> but it is probably not safe for long-lived tasks such as snapshot diff and
> possibly snapshot delete.
> The problem is that a cache entry is currently refreshed only when the
> snapshot is initially retrieved from the cache; subsequent reads from the
> snapshot itself don't refresh it.  Thus a RocksDB instance can be evicted
> and closed in the middle of snapshot diff processing.
> One alternative I can think of is to add some kind of reference counting 
> scheme so that RocksDB instances aren't closed automatically on eviction.
> Another possibility is to have an entirely separate pool of snapshot entries, 
> outside of the cache, that are explicitly opened and closed by long-running 
> tasks like snapshot diff.
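
The reference counting alternative described above could look roughly like
the following. This is only a minimal sketch, not the actual
OmSnapshotManager code or the change made in the linked pull request: the
class RefCountedHandle and its methods acquire, release and markEvicted are
hypothetical names introduced for illustration, and the resource is modeled
as a plain AutoCloseable rather than a real RocksDB instance.

```java
/**
 * Hypothetical wrapper (not part of Ozone): the wrapped resource is closed
 * only once it has been evicted from the cache AND no task still holds a
 * reference to it.
 */
final class RefCountedHandle<T extends AutoCloseable> {
  private final T resource;
  private int refCount = 0;
  private boolean evicted = false;
  private boolean closed = false;

  RefCountedHandle(T resource) {
    this.resource = resource;
  }

  /** A task calls this before it starts using the resource. */
  synchronized T acquire() {
    if (closed) {
      throw new IllegalStateException("resource already closed");
    }
    refCount++;
    return resource;
  }

  /** A task calls this (typically in a finally block) when it is done. */
  synchronized void release() {
    if (refCount <= 0) {
      throw new IllegalStateException("release without matching acquire");
    }
    refCount--;
    closeIfUnused();
  }

  /** The cache's eviction hook calls this instead of closing directly. */
  synchronized void markEvicted() {
    evicted = true;
    closeIfUnused();
  }

  synchronized boolean isClosed() {
    return closed;
  }

  // Closes the resource only when it is both evicted and unreferenced.
  private void closeIfUnused() {
    if (evicted && refCount == 0 && !closed) {
      closed = true;
      try {
        resource.close();
      } catch (Exception e) {
        throw new IllegalStateException("failed to close resource", e);
      }
    }
  }
}
```

Under this sketch, a long-running task such as snapshot diff would call
acquire() when it starts and release() in a finally block, while the cache's
eviction hook would call markEvicted() rather than closing the instance
itself. An eviction that happens in the middle of the task then only defers
the close until the last reference is released.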



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
