[ 
https://issues.apache.org/jira/browse/FLINK-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701939#comment-16701939
 ] 

ASF GitHub Bot commented on FLINK-10473:
----------------------------------------

azagrebin opened a new pull request #7188: [FLINK-10473][State TTL] TTL state 
incremental cleanup for heap backend
URL: https://github.com/apache/flink/pull/7188
 
 
   ## What is the purpose of the change
   
   Introduce incremental cleanup of TTL state for the heap backend.
   See the docs in this PR for a high-level description of the idea.
   
   ## Brief change log
   
    - Add a namespace/key iterator method to the internal KV state interface.
    - Implement it for CopyOnWriteStateTable and NestedMapsStateTable in the 
heap backend.
    - Add the TtlIncrementalCleanup class, which is called back on each state 
access and optionally on each keyed context change (record processing), and 
performs the cleanup using the state namespace/key iterator.
    - Add config for incremental cleanup to StateTtlConfig.
    - Construct TtlIncrementalCleanup in TtlStateFactory.
    - Add unit test TtlStateTestBase.testIncrementalCleanup.
    - Make smaller improvements to the unit/e2e TTL tests.
   
   ## Verifying this change
   
   Unit test TtlStateTestBase.testIncrementalCleanup.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (yes / no)
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
     - The serializers: (yes / no / don't know)
     - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
     - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes / no)
     - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> State TTL incremental cleanup using Heap backend key iterator
> -------------------------------------------------------------
>
>                 Key: FLINK-10473
>                 URL: https://issues.apache.org/jira/browse/FLINK-10473
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.7.0
>            Reporter: Andrey Zagrebin
>            Assignee: Andrey Zagrebin
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.8.0
>
>
> This feature enables lazy background cleanup of state with time-to-live in 
> the keyed state backend that stores state on the JVM heap. The idea is to 
> keep a global lazy state iterator with loose consistency. Every time a state 
> value for some key is accessed or a record is processed, the iterator is 
> advanced, the TTL of the iterated state entries is checked, and the expired 
> entries are cleaned up. When the iterator reaches the end of the state 
> storage, it simply starts over. This way, state with TTL is cleaned up 
> regularly, which prevents ever-growing memory consumption. The caveat of this 
> cleanup strategy is that if state is not accessed and no records are 
> processed, the accumulated expired state still occupies storage.
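
The cleanup loop described in the issue can be illustrated with a small,
Flink-independent sketch in plain Java. This is not the code from the pull
request: the class name `IncrementalTtlCleanupSketch`, the `cleanupSize`
parameter, and the explicit `now` argument are all assumptions made for the
example. A `ConcurrentHashMap` stands in for the heap state table because its
iterators are weakly consistent, which roughly mirrors the "loose consistency"
mentioned above.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of incremental TTL cleanup; not Flink's TtlIncrementalCleanup. */
class IncrementalTtlCleanupSketch<K, V> {

    /** A value together with the timestamp of its last update. */
    private static final class Timestamped<V> {
        final V value;
        final long lastUpdate;

        Timestamped(V value, long lastUpdate) {
            this.value = value;
            this.lastUpdate = lastUpdate;
        }
    }

    // Weakly consistent iterators let the cursor survive writes between steps.
    private final Map<K, Timestamped<V>> storage = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final int cleanupSize; // assumed knob: entries checked per access
    private Iterator<Map.Entry<K, Timestamped<V>>> cursor; // the "global" iterator

    IncrementalTtlCleanupSketch(long ttlMillis, int cleanupSize) {
        this.ttlMillis = ttlMillis;
        this.cleanupSize = cleanupSize;
    }

    /** Stores a value and advances the cleanup cursor. */
    void put(K key, V value, long now) {
        storage.put(key, new Timestamped<>(value, now));
        cleanupStep(now);
    }

    /** Advances the cleanup cursor, then returns the value unless it has expired. */
    V get(K key, long now) {
        cleanupStep(now);
        Timestamped<V> entry = storage.get(key);
        return (entry == null || expired(entry, now)) ? null : entry.value;
    }

    /** Number of entries physically held, including expired ones not yet cleaned. */
    int size() {
        return storage.size();
    }

    private boolean expired(Timestamped<V> entry, long now) {
        return now - entry.lastUpdate >= ttlMillis;
    }

    /** Checks up to cleanupSize entries, starting over when the end is reached. */
    private void cleanupStep(long now) {
        for (int i = 0; i < cleanupSize; i++) {
            if (cursor == null || !cursor.hasNext()) {
                cursor = storage.entrySet().iterator(); // start over
                if (!cursor.hasNext()) {
                    return; // nothing stored at all
                }
            }
            Map.Entry<K, Timestamped<V>> e = cursor.next();
            if (expired(e.getValue(), now)) {
                // Conditional remove: drops the entry only if it was not
                // overwritten with a fresh value since the cursor saw it.
                storage.remove(e.getKey(), e.getValue());
            }
        }
    }
}
```

Passing the timestamp explicitly keeps the sketch deterministic. It also makes
the caveat visible: `size()` only shrinks when `put` or `get` is called, so
expired entries stay in the map for as long as nothing touches the state.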



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
