[
https://issues.apache.org/jira/browse/FLINK-23461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Roman Khachatryan updated FLINK-23461:
--------------------------------------
Description:
*For the non-materialized part*, FLINK-21353 uses neither
FsCheckpointStreamFactory nor PlaceholderStreamStateHandle, so it's not an
issue. Adding it in the future doesn't make sense, because for such small
changes incremental checkpoints might work better.
*For the materialized part, ByteStreamStateHandle* can currently be used. This
can reintroduce issues like FLINK-21351 if checkpoint subsumption on TM
*becomes* decoupled from the state backend's state. Removing those assumptions
is one of the goals of changing the ownership.
An easy way to solve this is to enforce a zero threshold, so that state is
always written to DFS instead of being kept in memory.
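To illustrate the zero-threshold idea, here is a minimal sketch of a size-based choice between an in-memory handle and a file handle. All types, the method name toHandle, and the path are hypothetical stand-ins, not Flink classes; the real decision is made inside Flink's checkpoint output stream and may differ in detail (for example, in how empty state is treated).

```java
public class ThresholdSketch {

    /** Hypothetical stand-in for a state handle; not a Flink type. */
    interface Handle {}

    /** State bytes kept inline, analogous in spirit to ByteStreamStateHandle. */
    static final class InMemoryHandle implements Handle {
        final byte[] data;

        InMemoryHandle(byte[] data) {
            this.data = data;
        }
    }

    /** State written to a file; only the path is kept in the handle. */
    static final class FileHandle implements Handle {
        final String path;

        FileHandle(String path) {
            this.path = path;
        }
    }

    /**
     * State of at most thresholdBytes stays in memory; anything larger goes
     * to a file. With a threshold of 0, every non-empty state goes to a file.
     */
    static Handle toHandle(byte[] state, int thresholdBytes, String path) {
        if (state.length <= thresholdBytes) {
            return new InMemoryHandle(state);
        }
        return new FileHandle(path);
    }

    public static void main(String[] args) {
        byte[] state = {1, 2, 3};
        // The path below is made up for the example.
        System.out.println(toHandle(state, 1024, "/dfs/chk-1/state").getClass().getSimpleName());
        System.out.println(toHandle(state, 0, "/dfs/chk-1/state").getClass().getSimpleName());
    }
}
```

With a non-zero threshold, the small state in this sketch ends up in an in-memory handle; with a threshold of zero, the same state is represented by a file handle, which is the behaviour proposed above for materialized state.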
*PlaceholderStreamStateHandle can be used for the materialized state*
(regardless of ByteStreamStateHandle; unless SnapshotStrategy is changed).
However, it shouldn't cause any issues:
- if the file is shared (i.e. after recovery), then by definition it should be
managed by JM
- otherwise, JM should still replace placeholders (FLINK-23137), and it should
already have received the original state objects; no re-upload should happen
(FLINK-23344), so JM and TM will always refer to the same file
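The placeholder argument above can be sketched as follows. This is only an illustration under assumptions: Registry, Placeholder, FileHandle, the key "sst-1" and the path are hypothetical names, not Flink's actual classes or API. The sketch shows why a placeholder reported by the TM resolves to the file the JM already knows about.

```java
import java.util.HashMap;
import java.util.Map;

public class PlaceholderSketch {

    /** Hypothetical stand-in for a state handle; not a Flink type. */
    interface Handle {}

    /** Handle pointing at a file that was uploaded earlier. */
    static final class FileHandle implements Handle {
        final String path;

        FileHandle(String path) {
            this.path = path;
        }
    }

    /** Marker sent instead of a handle the JM is assumed to have already. */
    static final class Placeholder implements Handle {}

    /** Hypothetical JM-side registry of shared state, keyed by a state key. */
    static final class Registry {
        private final Map<String, Handle> registered = new HashMap<>();

        /** Returns the handle the JM will actually keep for this key. */
        Handle register(String key, Handle reported) {
            Handle existing = registered.get(key);
            if (reported instanceof Placeholder) {
                if (existing == null) {
                    throw new IllegalStateException("Placeholder for unknown state: " + key);
                }
                // Replace the placeholder with the original state object.
                return existing;
            }
            if (existing != null) {
                // No re-upload is expected, so keep the first registration.
                return existing;
            }
            registered.put(key, reported);
            return reported;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        Handle original = registry.register("sst-1", new FileHandle("/dfs/shared/sst-1"));
        Handle resolved = registry.register("sst-1", new Placeholder());
        System.out.println(resolved == original);
    }
}
```

In this sketch the second registration returns the very same object as the first, so both sides end up referring to one file.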
> Consider disallowing in-memory state handles for materialized state
> -------------------------------------------------------------------
>
> Key: FLINK-23461
> URL: https://issues.apache.org/jira/browse/FLINK-23461
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / State Backends
> Reporter: Roman Khachatryan
> Priority: Major
> Fix For: 1.14.0