[ https://issues.apache.org/jira/browse/FLINK-17972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Piotr Nowojski updated FLINK-17972:
-----------------------------------
    Description: 
(depends on rescaling for unaligned checkpoints)

Current structure is the following (this PR doesn't change it):
{code:java}
Each subtask reports to JM TaskStateSnapshot
  each with zero or more OperatorSubtaskState,
    each with zero or more InputChannelStateHandle and ResultSubpartitionStateHandle
      each referencing an underlying StreamStateHandle
{code}
The underlying {{StreamStateHandle}} duplicates the filename ({{ByteStreamStateHandle}} has it too, at least because of {{equals/hashcode}}, I guess).
An alternative would be something like
{code:java}
Each subtask reports to JM TaskStateSnapshot
  each with zero or more OperatorSubtaskState
    each with zero or one StreamStateHandle (for channel state)
      each with zero or more InputChannelStateHandle and ResultSubpartitionStateHandle
{code}
(probably, with {{StreamStateHandle}} and {{InputChannelStateHandle}} and {{ResultSubpartitionStateHandle}} encapsulated)

It would be more efficient (less data duplication) but probably also more error-prone (implicit structure) and less flexible (re-scaling).
(as discussed during introduction of {{StreamStateHandle.asBytesIfInMemory}} [here|https://github.com/apache/flink/pull/12292#discussion_r429925802])

  was:
(depends on rescaling for unaligned checkpoints)

Current structure is the following (this PR doesn't change it):
{code:java}
Each subtask reports to JM TaskStateSnapshot
  each with zero or more OperatorSubtaskState,
    each with zero or more InputChannelStateHandle and ResultSubpartitionStateHandle
      each referencing an underlying StreamStateHandle
{code}
The underlying {{StreamStateHandle}} duplicates the filename ({{ByteStreamStateHandle}} has it too, at least because of {{equals/hashcode}}, I guess).
An alternative would be something like
{code:java}
Each subtask reports to JM TaskStateSnapshot
  each with zero or more OperatorSubtaskState
    each with zero or one StreamStateHandle (for channel state)
      each with zero or more InputChannelStateHandle and ResultSubpartitionStateHandle
{code}
(probably, with {{StreamStateHandle}} and {{InputChannelStateHandle}} and {{ResultSubpartitionStateHandle}} encapsulated)

It would be more efficient (less data duplication) but probably also more error-prone (implicit structure) and less flexible (re-scaling).


> Consider restructuring channel state
> ------------------------------------
>
>                 Key: FLINK-17972
>                 URL: https://issues.apache.org/jira/browse/FLINK-17972
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.11.0
>            Reporter: Roman Khachatryan
>            Assignee: Roman Khachatryan
>            Priority: Major
>             Fix For: 1.12.0
>
>
> (depends on rescaling for unaligned checkpoints)
>
> Current structure is the following (this PR doesn't change it):
> {code:java}
> Each subtask reports to JM TaskStateSnapshot
>   each with zero or more OperatorSubtaskState,
>     each with zero or more InputChannelStateHandle and ResultSubpartitionStateHandle
>       each referencing an underlying StreamStateHandle
> {code}
> The underlying {{StreamStateHandle}} duplicates the filename
> ({{ByteStreamStateHandle}} has it too, at least because of {{equals/hashcode}},
> I guess).
> An alternative would be something like
> {code:java}
> Each subtask reports to JM TaskStateSnapshot
>   each with zero or more OperatorSubtaskState
>     each with zero or one StreamStateHandle (for channel state)
>       each with zero or more InputChannelStateHandle and ResultSubpartitionStateHandle
> {code}
> (probably, with {{StreamStateHandle}} and {{InputChannelStateHandle}} and
> {{ResultSubpartitionStateHandle}} encapsulated)
>
> It would be more efficient (less data duplication) but probably also more
> error-prone (implicit structure) and less flexible (re-scaling).
> (as discussed during introduction of {{StreamStateHandle.asBytesIfInMemory}}
> [here|https://github.com/apache/flink/pull/12292#discussion_r429925802])
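
To make the trade-off in the description concrete, here is a rough sketch of the two layouts. All classes below are simplified, hypothetical stand-ins (FileHandle, ChannelHandle, ChannelEntry, GroupedChannelState), not the real Flink types, and the single offset per channel is an assumption made for brevity. The sketch also assumes that each handle is serialized on its own, so a file name referenced from N channel handles is written N times.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the two channel-state layouts; not Flink code. */
public class ChannelStateLayouts {

    /** Stand-in for the underlying stream handle, which carries the file name. */
    static final class FileHandle {
        final String path;
        FileHandle(String path) { this.path = path; }
    }

    /** Current layout: every channel handle references the underlying handle itself. */
    static final class ChannelHandle {
        final int channelIndex;
        final long offset;
        final FileHandle delegate;
        ChannelHandle(int channelIndex, long offset, FileHandle delegate) {
            this.channelIndex = channelIndex;
            this.offset = offset;
            this.delegate = delegate;
        }
    }

    /** Alternative layout: a channel entry keeps only its index and offset. */
    static final class ChannelEntry {
        final int channelIndex;
        final long offset;
        ChannelEntry(int channelIndex, long offset) {
            this.channelIndex = channelIndex;
            this.offset = offset;
        }
    }

    /** Alternative layout: zero or one shared handle with the channel entries below it. */
    static final class GroupedChannelState {
        final FileHandle shared; // null when there is no channel state
        final List<ChannelEntry> entries;
        GroupedChannelState(FileHandle shared, List<ChannelEntry> entries) {
            this.shared = shared;
            this.entries = entries;
        }
    }

    static List<ChannelHandle> currentLayout(String path, int channels) {
        List<ChannelHandle> handles = new ArrayList<>();
        for (int i = 0; i < channels; i++) {
            // Each channel handle gets its own handle object carrying the same path.
            handles.add(new ChannelHandle(i, 100L * i, new FileHandle(path)));
        }
        return handles;
    }

    static GroupedChannelState alternativeLayout(String path, int channels) {
        List<ChannelEntry> entries = new ArrayList<>();
        for (int i = 0; i < channels; i++) {
            entries.add(new ChannelEntry(i, 100L * i));
        }
        return new GroupedChannelState(channels == 0 ? null : new FileHandle(path), entries);
    }

    /** Path copies written if every channel handle is serialized independently. */
    static int pathCopiesCurrent(List<ChannelHandle> handles) {
        return handles.size();
    }

    /** Path copies written when the shared handle is serialized once per group. */
    static int pathCopiesAlternative(GroupedChannelState state) {
        return state.shared == null ? 0 : 1;
    }

    public static void main(String[] args) {
        String path = "chk-1/channel-state"; // made-up file name
        System.out.println("current: " + pathCopiesCurrent(currentLayout(path, 4)));
        System.out.println("alternative: " + pathCopiesAlternative(alternativeLayout(path, 4)));
    }
}
```

In this model the grouped layout stores the path once, but a ChannelEntry is meaningless without its enclosing GroupedChannelState. That is the implicit structure the description warns about, and it is also why redistributing individual channels on re-scaling would be harder.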