[jira] [Updated] (FLINK-23372) Disable AllVerticesInSameSlotSharingGroupByDefault in DataStream batch mode

Timo Walther (Jira) Tue, 13 Jul 2021 05:53:16 -0700


     [ 
https://issues.apache.org/jira/browse/FLINK-23372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Timo Walther updated FLINK-23372:
---------------------------------
    Description: 
In order to unify the behavior of DataStream API and Table API batch mode, we 
should disable AllVerticesInSameSlotSharingGroupByDefault also in DataStream 
API.

FLINK-20001 reverted setting this flag, but without concrete arguments and 
with only the following comment: {{reconsider actually setting this flag in the future}}

After an offline chat with [~zhuzh], we should introduce this again for 
consistency:

{code}
The goal of assigning different regions to different slot sharing groups by 
default is to reduce waste of resources. In batch jobs, one region can have a 
data dependency on another region, which affects the resource computation for 
slots and managed memory:
  1. If these regions are in the same slot sharing group, the group will 
require a large slot which can host tasks from both regions.
  2. In managed memory fraction computing, tasks from both regions will be 
considered to compete for managed memory, so each task will be assigned a 
smaller managed memory fraction (FLIP-53).

However, those regions will not run at the same time, so sharing a group 
results in a waste of resources.

For streaming jobs, all tasks will run at the same time, so assigning them to 
the same slot sharing group will not result in resource waste.
{code}

  was:
In order to unify the behavior of DataStream API and Table API batch mode, we 
should disable AllVerticesInSameSlotSharingGroupByDefault also in DataStream 
API.

FLINK-20001 reverted setting this flag but without concrete arguments and the 
following comment: {{reconsider actually setting this flag in the future}}

After a offline chat with [~zhuzh], we should introduce this again for 
consistency:

{code}
The goal to assign different regions to different slot sharing groups by 
default is to reduce waste of resources. In batch jobs, there can be one region 
which has data dependency on another region. And the resource computation for 
slots and managed memory will be affected:
  1 . If these regions are in the same slot sharing group, the group will 
require a large slot which can host tasks from both the regions.
  2. In managed memory fraction computing, tasks from both regions will be 
considered to compete for managed memory, so each task will be assigned with a 
smaller managed memory fraction (FLIP-53).

However, those regions will not run at the same time and results in a waste of 
resources. 

For streaming jobs, all tasks will run at the same time. So assigning them to 
the same slot sharing group will not result resource waste.
{code}


> Disable AllVerticesInSameSlotSharingGroupByDefault in DataStream batch mode
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-23372
>                 URL: https://issues.apache.org/jira/browse/FLINK-23372
>             Project: Flink
>          Issue Type: Sub-task
>          Components: API / DataStream
>            Reporter: Timo Walther
>            Assignee: Timo Walther
>            Priority: Major
>
> In order to unify the behavior of DataStream API and Table API batch mode, we 
> should disable AllVerticesInSameSlotSharingGroupByDefault also in DataStream 
> API.
> FLINK-20001 reverted setting this flag, but without concrete arguments and 
> with only the following comment: {{reconsider actually setting this flag in the future}}
> After an offline chat with [~zhuzh], we should introduce this again for 
> consistency:
> {code}
> The goal of assigning different regions to different slot sharing groups by 
> default is to reduce waste of resources. In batch jobs, one region can have a 
> data dependency on another region, which affects the resource computation for 
> slots and managed memory:
>   1. If these regions are in the same slot sharing group, the group will 
> require a large slot which can host tasks from both regions.
>   2. In managed memory fraction computing, tasks from both regions will be 
> considered to compete for managed memory, so each task will be assigned a 
> smaller managed memory fraction (FLIP-53).
> However, those regions will not run at the same time, so sharing a group 
> results in a waste of resources.
> For streaming jobs, all tasks will run at the same time, so assigning them to 
> the same slot sharing group will not result in resource waste.
> {code}
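
To make the two points in the quoted explanation concrete, here is a minimal toy model in plain Java. It is only a sketch under stated assumptions: the class and method names are hypothetical and are not Flink APIs, each region is assumed to contribute one task per slot, and managed memory is assumed to be split evenly among the tasks that share a group. The real computation described in FLIP-53 is more detailed.

```java
import java.util.Arrays;

// Toy model only: SlotSharingSketch and its methods are hypothetical names
// for illustration and are not part of Flink.
public class SlotSharingSketch {

    // Point 1: if all regions share one slot sharing group, the slot must be
    // large enough to host one task from every region at once.
    public static int slotSizeSharedGroup(int[] perRegionTaskMemoryMb) {
        return Arrays.stream(perRegionTaskMemoryMb).sum();
    }

    // With one group per region, each slot only hosts tasks of its own
    // region, so the largest slot needed is the largest single requirement.
    public static int largestSlotSeparateGroups(int[] perRegionTaskMemoryMb) {
        return Arrays.stream(perRegionTaskMemoryMb).max().orElse(0);
    }

    // Point 2: assumed even split of managed memory among all tasks that
    // are considered to compete within one slot sharing group.
    public static double managedMemoryFraction(int tasksInGroup) {
        return 1.0 / tasksInGroup;
    }

    public static void main(String[] args) {
        // Two batch regions that never run at the same time (assumed sizes).
        int[] regions = {512, 256};

        System.out.println("shared group slot (MB):   " + slotSizeSharedGroup(regions));
        System.out.println("separate groups max (MB): " + largestSlotSeparateGroups(regions));
        System.out.println("fraction, shared group:   " + managedMemoryFraction(regions.length));
        System.out.println("fraction, own group:      " + managedMemoryFraction(1));
    }
}
```

In this model the shared group asks for a slot sized for both regions and halves each task's managed memory fraction, even though only one region runs at a time. For a streaming job both tasks would run concurrently, so the shared sizing would not be wasted.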



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
