[jira] [Commented] (FLINK-13707) Make max parallelism configurable

2021-04-22 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17329490#comment-17329490
 ] 

Flink Jira Bot commented on FLINK-13707:


This issue is assigned but has not received an update in 7 days so it has been 
labeled "stale-assigned". If you are still working on the issue, please give an 
update and remove the label. If you are no longer working on the issue, please 
unassign so someone else may work on it. In 7 days the issue will be 
automatically unassigned.

> Make max parallelism configurable
> ---------------------------------
>
>         Key: FLINK-13707
>         URL: https://issues.apache.org/jira/browse/FLINK-13707
>     Project: Flink
>  Issue Type: New Feature
>  Components: API / DataStream
>    Reporter: xuekang
>    Assignee: xuekang
>    Priority: Minor
>      Labels: stale-assigned, stale-minor
>
> Currently, if a user sets a parallelism larger than 128 and does not set the 
> max parallelism explicitly, the system computes a max parallelism of 
> 1.5 * parallelism. When the job later changes its parallelism and recovers 
> from a savepoint, restoring state from the savepoint may run into problems 
> because the number of key groups has changed.
> To avoid this problem without modifying the code of existing jobs, we want 
> to configure the default max parallelism in flink-conf.yaml, but it is not 
> configurable now.
> Should we make it configurable? Any comments would be appreciated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-13707) Make max parallelism configurable

2021-04-14 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17321473#comment-17321473
 ] 

Flink Jira Bot commented on FLINK-13707:


This issue and all of its Sub-Tasks have not been updated for 180 days. So, it 
has been labeled "stale-minor". If you are still affected by this bug or are 
still interested in this issue, please give an update and remove the label. In 
7 days the issue will be closed automatically.



[jira] [Commented] (FLINK-13707) Make max parallelism configurable

2019-08-16 Thread Jark Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909053#comment-16909053
 ] 

Jark Wu commented on FLINK-13707:
-

Thanks [~stukid], I assigned this issue to you.



[jira] [Commented] (FLINK-13707) Make max parallelism configurable

2019-08-16 Thread xuekang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909020#comment-16909020
 ] 

xuekang commented on FLINK-13707:
-

OK, thanks for the comments.

I will upload a patch soon.



[jira] [Commented] (FLINK-13707) Make max parallelism configurable

2019-08-15 Thread Stephan Ewen (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908149#comment-16908149
 ] 

Stephan Ewen commented on FLINK-13707:
--

I think that is a fair point. Similar to the default parallelism in the 
{{flink-conf.yaml}}, there could also be a default max parallelism.



[jira] [Commented] (FLINK-13707) Make max parallelism configurable

2019-08-15 Thread xuekang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908043#comment-16908043
 ] 

xuekang commented on FLINK-13707:
-

[~StephanEwen] Yes. We provide a real-time computing platform for the whole 
company, and with a cluster-wide default our users would not need to 
understand max parallelism in detail.



[jira] [Commented] (FLINK-13707) Make max parallelism configurable

2019-08-15 Thread Stephan Ewen (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907887#comment-16907887
 ] 

Stephan Ewen commented on FLINK-13707:
--

Just to double check: You can set the max parallelism explicitly in the 
execution environment.

Are you looking for a way to define a cluster-wide default?



[jira] [Commented] (FLINK-13707) Make max parallelism configurable

2019-08-14 Thread xuekang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907752#comment-16907752
 ] 

xuekang commented on FLINK-13707:
-

Hi [~jark], thanks for the reply.

Yes, that is how the max parallelism is computed, and its range, [128, 32768], 
is large enough. Since the largest operator parallelism I have ever seen is 
2000+, 4096 may be a suitable default value in flink-conf.yaml.



[jira] [Commented] (FLINK-13707) Make max parallelism configurable

2019-08-14 Thread Jark Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907001#comment-16907001
 ] 

Jark Wu commented on FLINK-13707:
-

Hi [~stukid], I just want to confirm the logic for the default max parallelism. 
As far as I can see, it is computed [in this 
way|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/KeyGroupRangeAssignment.java#L127]:


{code:java}
Math.min(
        Math.max(
                MathUtils.roundUpToPowerOfTwo(operatorParallelism + (operatorParallelism / 2)),
                DEFAULT_LOWER_BOUND_MAX_PARALLELISM),
        UPPER_BOUND_MAX_PARALLELISM)
{code}


So the max parallelism is {{roundUpToPowerOfTwo(1.5 * operatorParallelism)}}, 
clamped to the lower and upper bounds, which leaves some headroom to increase 
the parallelism. 

In any case, I think it makes sense to have a global default max parallelism 
configuration.
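
To make the derivation concrete, here is a minimal, self-contained sketch of the expression quoted above. It is not Flink's actual class: the class name, the constant names, and the local {{roundUpToPowerOfTwo}} helper are stand-ins for illustration, and the bounds 128 and 32768 are assumed from the range mentioned elsewhere in this thread.

```java
public class DefaultMaxParallelismSketch {

    // Assumed bounds, taken from the range [128, 32768] mentioned in this thread.
    public static final int LOWER_BOUND = 1 << 7;  // 128
    public static final int UPPER_BOUND = 1 << 15; // 32768

    // Illustrative stand-in for MathUtils.roundUpToPowerOfTwo; assumes x >= 1.
    public static int roundUpToPowerOfTwo(int x) {
        int highest = Integer.highestOneBit(x);
        return highest == x ? x : highest << 1;
    }

    // Mirrors the quoted expression: round 1.5 * parallelism (integer arithmetic)
    // up to a power of two, then clamp the result to [LOWER_BOUND, UPPER_BOUND].
    public static int computeDefaultMaxParallelism(int operatorParallelism) {
        int candidate =
                roundUpToPowerOfTwo(operatorParallelism + (operatorParallelism / 2));
        return Math.min(Math.max(candidate, LOWER_BOUND), UPPER_BOUND);
    }

    public static void main(String[] args) {
        // Print the derived value for a few sample operator parallelisms.
        for (int p : new int[] {1, 100, 200, 400, 2000, 30000}) {
            System.out.println(p + " -> " + computeDefaultMaxParallelism(p));
        }
    }
}
```

Under these assumptions, parallelism 200 derives 512 while parallelism 400 derives 1024, so a job that relies on the derived value sees it change when it is rescaled. That is the drift described in the issue, and it is what a fixed, configured default would avoid.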
