[ https://issues.apache.org/jira/browse/FLINK-17972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piotr Nowojski updated FLINK-17972:
-----------------------------------
    Description: 
(depends on rescaling for unaligned checkpoints)

The current structure is the following (this PR doesn't change it):
{code:java}
Each subtask reports to JM a TaskStateSnapshot
  each with zero or more OperatorSubtaskState,
    each with zero or more InputChannelStateHandle and ResultSubpartitionStateHandle
      each referencing an underlying StreamStateHandle
{code}
The underlying {{StreamStateHandle}} duplicates the filename 
({{ByteStreamStateHandle}} has it too, at least because of {{equals/hashCode}}, 
I guess).
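
To make the duplication concrete, here is a rough sketch of the current layout. The classes are simplified stand-ins that I made up for illustration ({{StreamHandle}}, {{ChannelHandle}}, {{OperatorState}}, the {{offset}} field and {{reportedFileNames}} are assumptions, not the real Flink classes or API). It only shows that each channel handle carries its own reference to the stream handle, so the file name is reported once per channel.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-ins for the classes named above; NOT the real Flink API. */
final class CurrentChannelStateLayout {

    /** Stand-in for StreamStateHandle: identified here only by a file name. */
    static final class StreamHandle {
        final String fileName;

        StreamHandle(String fileName) {
            this.fileName = fileName;
        }
    }

    /** Stand-in for InputChannelStateHandle / ResultSubpartitionStateHandle. */
    static final class ChannelHandle {
        final int channelIndex;
        final long offset;         // hypothetical: where this channel's data starts
        final StreamHandle stream; // every channel handle carries its own reference

        ChannelHandle(int channelIndex, long offset, StreamHandle stream) {
            this.channelIndex = channelIndex;
            this.offset = offset;
            this.stream = stream;
        }
    }

    /** Stand-in for OperatorSubtaskState. */
    static final class OperatorState {
        final List<ChannelHandle> inputChannels = new ArrayList<>();
        final List<ChannelHandle> resultSubpartitions = new ArrayList<>();
    }

    /** Collects the file name that each channel handle would report. */
    static List<String> reportedFileNames(OperatorState state) {
        List<String> names = new ArrayList<>();
        for (ChannelHandle h : state.inputChannels) {
            names.add(h.stream.fileName);
        }
        for (ChannelHandle h : state.resultSubpartitions) {
            names.add(h.stream.fileName);
        }
        return names;
    }
}
```

With two input channels and one result subpartition written to the same file, {{reportedFileNames}} in this sketch returns that file name three times.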

An alternative would be something like
{code:java}
Each subtask reports to JM a TaskStateSnapshot
  each with zero or more OperatorSubtaskState
    each with zero or one StreamStateHandle (for channel state)
    each with zero or more InputChannelStateHandle and ResultSubpartitionStateHandle
{code}
(probably with {{StreamStateHandle}}, {{InputChannelStateHandle}} and 
{{ResultSubpartitionStateHandle}} encapsulated)

 

It would be more efficient (less data duplication), but probably also more 
error-prone (implicit structure) and less flexible (re-scaling).
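
A rough sketch of the alternative layout, again with made-up stand-in classes ({{OperatorState}}, {{ChannelHandle}}, {{locate}} and {{storedFileNames}} are assumptions for illustration, not the real Flink API). The file name is kept once per operator subtask state, and a channel handle can only be resolved through its parent, which is the "implicit structure" mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-ins for the alternative layout; NOT the real Flink API. */
final class AlternativeChannelStateLayout {

    /** Channel handle without its own stream reference: just an index and an offset. */
    static final class ChannelHandle {
        final int channelIndex;
        final long offset; // hypothetical: where this channel's data starts

        ChannelHandle(int channelIndex, long offset) {
            this.channelIndex = channelIndex;
            this.offset = offset;
        }
    }

    /** Stand-in for OperatorSubtaskState holding zero or one shared stream handle. */
    static final class OperatorState {
        final String channelStateFile; // null means "zero": no channel state
        final List<ChannelHandle> inputChannels = new ArrayList<>();
        final List<ChannelHandle> resultSubpartitions = new ArrayList<>();

        OperatorState(String channelStateFile) {
            this.channelStateFile = channelStateFile;
        }

        /** Resolves a channel handle against the shared file of this state. */
        String locate(ChannelHandle handle) {
            if (channelStateFile == null) {
                throw new IllegalStateException("no channel state file for this subtask");
            }
            return channelStateFile + "@" + handle.offset;
        }

        /** Number of file names stored for channel state: zero or one. */
        int storedFileNames() {
            return channelStateFile == null ? 0 : 1;
        }
    }
}
```

In this sketch a handle taken out of its {{OperatorState}} no longer says which file it belongs to, so anything that moves handles between subtasks (such as re-scaling) would have to carry the file reference along separately.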

(as discussed during introduction of {{StreamStateHandle.asBytesIfInMemory}} 
[here|https://github.com/apache/flink/pull/12292#discussion_r429925802])

  was:
(depends on rescaling for unaligned checkpoints)

 

The current structure is the following (this PR doesn't change it):
{code:java}
Each subtask reports to JM a TaskStateSnapshot
  each with zero or more OperatorSubtaskState,
    each with zero or more InputChannelStateHandle and ResultSubpartitionStateHandle
      each referencing an underlying StreamStateHandle
{code}
The underlying {{StreamStateHandle}} duplicates the filename 
({{ByteStreamStateHandle}} has it too, at least because of {{equals/hashCode}}, 
I guess).

An alternative would be something like
{code:java}
Each subtask reports to JM a TaskStateSnapshot
  each with zero or more OperatorSubtaskState
    each with zero or one StreamStateHandle (for channel state)
    each with zero or more InputChannelStateHandle and ResultSubpartitionStateHandle
{code}
(probably with {{StreamStateHandle}}, {{InputChannelStateHandle}} and 
{{ResultSubpartitionStateHandle}} encapsulated)

It would be more efficient (less data duplication), but probably also more 
error-prone (implicit structure) and less flexible (re-scaling).


> Consider restructuring channel state
> ------------------------------------
>
>                 Key: FLINK-17972
>                 URL: https://issues.apache.org/jira/browse/FLINK-17972
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.11.0
>            Reporter: Roman Khachatryan
>            Assignee: Roman Khachatryan
>            Priority: Major
>             Fix For: 1.12.0



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
