[
https://issues.apache.org/jira/browse/FLINK-17972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Feifan Wang updated FLINK-17972:
--------------------------------
Description:
(depends on rescaling for unaligned checkpoints (FLINK-17979))
The current structure is as follows (this PR doesn't change it):
{code:java}
Each subtask reports to JM a TaskStateSnapshot,
  each with zero or more OperatorSubtaskState,
    each with zero or more InputChannelStateHandle and
    ResultSubpartitionStateHandle,
      each referencing an underlying StreamStateHandle
{code}
The underlying {{StreamStateHandle}} duplicates the filename
({{ByteStreamStateHandle}} has it too, at least because of
{{equals/hashCode}}, I guess).
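To make the duplication concrete, here is a minimal sketch of the current layout. It uses simplified stand-in records, not Flink's real classes: {{CurrentLayoutSketch}}, {{FileHandle}}, {{ChannelHandle}}, their fields, and the path are all assumptions made up for illustration.

```java
import java.util.List;

public class CurrentLayoutSketch {
    // Hypothetical stand-in for a file-backed StreamStateHandle.
    record FileHandle(String filePath) {}

    // Hypothetical stand-in for InputChannelStateHandle / ResultSubpartitionStateHandle:
    // every channel handle carries its own reference to the underlying handle.
    record ChannelHandle(int channelIndex, List<Long> offsets, FileHandle delegate) {}

    // Counts how many channel handles carry the given file path.
    static long pathCopies(List<ChannelHandle> handles, String path) {
        return handles.stream()
                .filter(h -> h.delegate().filePath().equals(path))
                .count();
    }

    public static void main(String[] args) {
        String path = "chk-1/channel-state"; // made-up path
        List<ChannelHandle> handles = List.of(
                new ChannelHandle(0, List.of(0L), new FileHandle(path)),
                new ChannelHandle(1, List.of(128L), new FileHandle(path)),
                new ChannelHandle(2, List.of(256L), new FileHandle(path)));
        // One copy of the path per channel handle.
        System.out.println(pathCopies(handles, path)); // prints 3
    }
}
```

In this simplified model the path is repeated once per channel, so the reported metadata grows with the number of channels even when all of them point into the same file.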
An alternative would be something like
{code:java}
Each subtask reports to JM a TaskStateSnapshot,
  each with zero or more OperatorSubtaskState,
    each with zero or one StreamStateHandle (for channel state),
      each with zero or more InputChannelStateHandle and
      ResultSubpartitionStateHandle
{code}
(probably with {{StreamStateHandle}}, {{InputChannelStateHandle}} and
{{ResultSubpartitionStateHandle}} encapsulated)
It would be more efficient (less data duplication), but probably also more
error-prone (implicit structure) and less flexible (re-scaling).
(As discussed during the introduction of {{StreamStateHandle.asBytesIfInMemory}}
[here|https://github.com/apache/flink/pull/12292#discussion_r429925802].)
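The trade-off of the alternative can be sketched in the same simplified style. Again, none of these types are Flink's: {{AlternativeLayoutSketch}}, {{ChannelState}}, {{ChannelOffsets}} and {{FileHandle}} are hypothetical names, and the invariant check is only one possible way to guard the implicit structure.

```java
import java.util.List;
import java.util.Optional;

public class AlternativeLayoutSketch {
    // Hypothetical stand-in for a file-backed StreamStateHandle.
    record FileHandle(String filePath) {}

    // Per-channel entry that holds offsets only; it has no handle of its own.
    record ChannelOffsets(int channelIndex, List<Long> offsets) {}

    // Zero or one shared handle per operator subtask, plus the channel entries.
    record ChannelState(Optional<FileHandle> shared, List<ChannelOffsets> channels) {
        ChannelState {
            // The implicit structure: offsets are meaningless without the shared handle.
            if (shared.isEmpty() && !channels.isEmpty()) {
                throw new IllegalArgumentException("channel offsets require a shared handle");
            }
        }

        // The path is stored at most once, regardless of the number of channels.
        long pathCopies() {
            return shared.isPresent() ? 1 : 0;
        }
    }

    public static void main(String[] args) {
        ChannelState state = new ChannelState(
                Optional.of(new FileHandle("chk-1/channel-state")), // made-up path
                List.of(new ChannelOffsets(0, List.of(0L)),
                        new ChannelOffsets(1, List.of(128L)),
                        new ChannelOffsets(2, List.of(256L))));
        System.out.println(state.pathCopies()); // prints 1
    }
}
```

The path is stored once, but a channel entry can no longer be moved on its own: redistributing channels across subtasks (re-scaling) would have to carry or re-attach the shared handle, which is the loss of flexibility noted above.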
> Consider restructuring channel state
> ------------------------------------
>
> Key: FLINK-17972
> URL: https://issues.apache.org/jira/browse/FLINK-17972
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.11.0
> Reporter: Roman Khachatryan
> Priority: Not a Priority
> Labels: auto-deprioritized-major, auto-deprioritized-minor,
> auto-unassigned
>