[jira] [Updated] (FLINK-23461) Consider disallowing in-memory state handles for materialized state

Roman Khachatryan (Jira) Wed, 21 Jul 2021 09:14:31 -0700


     [ 
https://issues.apache.org/jira/browse/FLINK-23461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Roman Khachatryan updated FLINK-23461:
--------------------------------------
    Description: 
*For non-mateialized part*, FLINK-21353 doesn't use nor 
FsCheckpointStreamFactory neither PlaceholderStreamStateHandle so it's not an 
issue. Adding it in the future doesn't make sense as for such small changes 
incremental checkpoint might work better.

 

*For materialized part, ByteStreamStateHandle* can be currently used. This can 
bring back to life issues like FLINK-21351 - if checkpoint subsumption on TM 
*will* be decoupled from the state backends state. Removing those assumptions 
is one of the goals of changing the ownership.
 An easy way to solve it is to just enforce zero threshold for writing to DFS 
instead of memory.

 

*PlaceholderStreamStateHandle can be used for the materialized state* 
(regardless of ByteStreamStateHandle; unless SnapshotStrategy is changed). 
However, it shouldn't cause any issues:
 - if the file is shared (i.e. after recovery) then by definition it should be 
managed by JM
 - otherwise, JM should still replace placeholders (FLINK-23137); and it should 
have received the original state objects before; no re-upload should happen 
(FLINK-23344) - so JM and TM will always refer to the same file

  was:
*For non-mateialized part*, FLINK-21353 doesn't use nor 
FsCheckpointStreamFactory neither PlaceholderStreamStateHandle so it's not an 
issue. Adding it in the future doesn't make sense as for such small changes 
incremental checkpoint might work better.

*For materialized part, ByteStreamStateHandle* can be currently used. This can 
bring back to life issues like FLINK-21351 - if checkpoint subsumption on TM 
*will* be decoupled from the state backends state. Removing those assumptions 
is one of the goals of changing the ownership.
 An easy way to solve it is to just enforce zero threshold for writing to DFS 
instead of memory.

 

*PlaceholderStreamStateHandle* can be used for the materialized state 
(regardless of ByteStreamStateHandle; unless SnapshotStrategy is changed). 
However, it shouldn't cause any issues:
 - if the file is shared (i.e. after recovery) then by definition it should be 
managed by JM
 - otherwise, JM should still replace placeholders (FLINK-23137); and it should 
have received the original state objects before; no re-upload should happen 
(FLINK-23344) - so JM and TM will always refer to the same file


> Consider disallowing in-memory state handles for materialized state
> -------------------------------------------------------------------
>
>                 Key: FLINK-23461
>                 URL: https://issues.apache.org/jira/browse/FLINK-23461
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / State Backends
>            Reporter: Roman Khachatryan
>            Priority: Major
>             Fix For: 1.14.0
>
>
> *For non-mateialized part*, FLINK-21353 doesn't use nor 
> FsCheckpointStreamFactory neither PlaceholderStreamStateHandle so it's not an 
> issue. Adding it in the future doesn't make sense as for such small changes 
> incremental checkpoint might work better.
>  
> *For materialized part, ByteStreamStateHandle* can be currently used. This 
> can bring back to life issues like FLINK-21351 - if checkpoint subsumption on 
> TM *will* be decoupled from the state backends state. Removing those 
> assumptions is one of the goals of changing the ownership.
>  An easy way to solve it is to just enforce zero threshold for writing to DFS 
> instead of memory.
>  
> *PlaceholderStreamStateHandle can be used for the materialized state* 
> (regardless of ByteStreamStateHandle; unless SnapshotStrategy is changed). 
> However, it shouldn't cause any issues:
>  - if the file is shared (i.e. after recovery) then by definition it should 
> be managed by JM
>  - otherwise, JM should still replace placeholders (FLINK-23137); and it 
> should have received the original state objects before; no re-upload should 
> happen (FLINK-23344) - so JM and TM will always refer to the same file



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-23461) Consider disallowing in-memory state handles for materialized state

Reply via email to