[ 
https://issues.apache.org/jira/browse/FLINK-9938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16565237#comment-16565237
 ] 

ASF GitHub Bot commented on FLINK-9938:
---------------------------------------

StefanRRichter commented on a change in pull request #6460: [FLINK-9938] Clean 
up full snapshot from expired state with TTL
URL: https://github.com/apache/flink/pull/6460#discussion_r206855467
 
 

 ##########
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/heap/CopyOnWriteStateTableSnapshot.java
 ##########
 @@ -217,12 +223,29 @@ protected void reportAllElementKeyGroups() {
                        int flattenIndex = 0;
                        for (CopyOnWriteStateTable.StateTableEntry<K, N, S> entry : partitioningDestination) {
                                while (null != entry) {
-                                       final int keyGroup = KeyGroupRangeAssignment.assignToKeyGroup(entry.key, totalKeyGroups);
-                                       reportKeyGroupOfElementAtIndex(flattenIndex, keyGroup);
-                                       partitioningSource[flattenIndex++] = entry;
+                                       CopyOnWriteStateTable.StateTableEntry<K, N, S> filteredEntry = filterEntry(entry);
 
 Review comment:
   What I don't like about this line in particular is that it is a bit invasive 
and adds branching to an otherwise very tight loop, and I am not sure that the 
JIT will simply remove it. Can we do this in a way that does not involve 
filtering calls when it is clear that no filtering is done, similar to the 
`NestedMapsStateTable`?
   
   Overall, maybe we can keep in mind that there will be a unified format for 
the different state backends, and that it probably will not start by writing a 
count. Once that is introduced, we can also push the filtering to right before 
the writing to the stream for a nice separation.
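
   To illustrate the request above, here is a minimal sketch of choosing the 
loop once, outside the hot path, so that the unfiltered case never makes a 
filtering call. It is not the Flink implementation: `SnapshotFilterSketch`, 
`Entry`, `EntryFilter` and `flatten` are hypothetical names, and a `null` 
filter standing for "no filtering" is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Flink code: all names here are invented for illustration.
final class SnapshotFilterSketch {

    /** Stand-in for a state table entry chained inside a hash bucket. */
    static final class Entry {
        final String key;
        final long expiresAt;
        final Entry next;

        Entry(String key, long expiresAt, Entry next) {
            this.key = key;
            this.expiresAt = expiresAt;
            this.next = next;
        }
    }

    /** Returns the entry to keep, or null to drop it. */
    interface EntryFilter {
        Entry filter(Entry entry);
    }

    /** Decides once whether filtering is needed (assumption: null means no filtering). */
    static List<Entry> flatten(Entry[] buckets, EntryFilter filter) {
        return filter == null
            ? flattenUnfiltered(buckets)
            : flattenFiltered(buckets, filter);
    }

    /** Tight loop: no filter call and no extra branch per entry. */
    private static List<Entry> flattenUnfiltered(Entry[] buckets) {
        List<Entry> out = new ArrayList<>();
        for (Entry entry : buckets) {
            while (entry != null) {
                out.add(entry);
                entry = entry.next;
            }
        }
        return out;
    }

    /** Filtering loop: only used when a filter was actually supplied. */
    private static List<Entry> flattenFiltered(Entry[] buckets, EntryFilter filter) {
        List<Entry> out = new ArrayList<>();
        for (Entry entry : buckets) {
            while (entry != null) {
                Entry kept = filter.filter(entry);
                if (kept != null) {
                    out.add(kept);
                }
                entry = entry.next;
            }
        }
        return out;
    }
}
```

   The trade-off in this sketch is some duplicated loop code in exchange for 
an unfiltered path that does not rely on the JIT to remove the extra call.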

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> State TTL cleanup during full state scan upon checkpointing
> -----------------------------------------------------------
>
>                 Key: FLINK-9938
>                 URL: https://issues.apache.org/jira/browse/FLINK-9938
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.6.0
>            Reporter: Andrey Zagrebin
>            Assignee: Andrey Zagrebin
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.6.0
>
>
> We can try to piggyback on the full state scan that certain checkpoint 
> processes perform in the backends: check TTL expiration for every entry and 
> evict expired entries to speed up cleanup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
