[jira] [Commented] (FLINK-13707) Make max parallelism configurable
[ https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17329490#comment-17329490 ]

Flink Jira Bot commented on FLINK-13707:
----------------------------------------

This issue is assigned but has not received an update in 7 days, so it has been labeled "stale-assigned". If you are still working on the issue, please give an update and remove the label. If you are no longer working on the issue, please unassign yourself so someone else may work on it. In 7 days the issue will be automatically unassigned.

> Make max parallelism configurable
> ---------------------------------
>
>                 Key: FLINK-13707
>                 URL: https://issues.apache.org/jira/browse/FLINK-13707
>             Project: Flink
>          Issue Type: New Feature
>          Components: API / DataStream
>            Reporter: xuekang
>            Assignee: xuekang
>            Priority: Minor
>              Labels: stale-assigned, stale-minor
>
> For now, if a user sets a parallelism larger than 128 and does not set the
> max parallelism explicitly, the system computes a max parallelism of
> 1.5 * parallelism. When the job changes its parallelism and recovers from a
> savepoint, there may be problems restoring state from the savepoint,
> because the number of key groups has changed.
> To avoid this problem without modifying the code of existing jobs, we want
> to configure the default max parallelism in flink-conf.yaml, but it is not
> configurable now.
> Should we make it configurable? Any comments would be appreciated.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
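Editorial illustration of the key-group concern raised in the issue description: a minimal, standalone Java sketch of how a key maps to a key group and how a key group maps to an operator subtask. It does not use Flink classes. The class and method names are hypothetical, and a plain `hashCode()` modulo stands in for the murmur hashing that Flink applies, so the concrete numbers are illustrative only. The values 512 and 1024 are example max parallelisms (what the derivation discussed in this thread would give for parallelism 200 and 400), and 4096 is the fixed default proposed in the thread.

```java
/** Hypothetical, simplified sketch of key-group assignment (not Flink's actual classes). */
final class KeyGroupSketch {

    /** Key group of a key; the number of key groups equals the max parallelism. */
    static int assignToKeyGroup(Object key, int maxParallelism) {
        // Assumption: plain hashCode() modulo instead of Flink's murmur hash.
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    /** Index of the operator subtask that owns a given key group. */
    static int operatorIndexForKeyGroup(int maxParallelism, int parallelism, int keyGroup) {
        return keyGroup * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        Integer key = 700;

        // If the max parallelism differs between two runs, the same key
        // falls into a different key group.
        System.out.println("key group with 512 groups:  " + assignToKeyGroup(key, 512));
        System.out.println("key group with 1024 groups: " + assignToKeyGroup(key, 1024));

        // With one fixed max parallelism, the key group stays the same and
        // only its owning subtask changes when the parallelism changes.
        int fixedMax = 4096;
        int group = assignToKeyGroup(key, fixedMax);
        System.out.println("subtask at parallelism 200: "
                + operatorIndexForKeyGroup(fixedMax, 200, group));
        System.out.println("subtask at parallelism 400: "
                + operatorIndexForKeyGroup(fixedMax, 400, group));
    }
}
```

In this sketch, state organized per key group can be handed to a different subtask after rescaling only as long as the key-to-group mapping is unchanged, which is why a stable max parallelism matters for savepoints.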
[jira] [Commented] (FLINK-13707) Make max parallelism configurable
[ https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17321473#comment-17321473 ]

Flink Jira Bot commented on FLINK-13707:
----------------------------------------

This issue and all of its Sub-Tasks have not been updated for 180 days, so it has been labeled "stale-minor". If you are still affected by this bug or are still interested in this issue, please give an update and remove the label. In 7 days the issue will be closed automatically.
[jira] [Commented] (FLINK-13707) Make max parallelism configurable
[ https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909053#comment-16909053 ]

Jark Wu commented on FLINK-13707:
---------------------------------

Thanks [~stukid], I assigned this issue to you.
[jira] [Commented] (FLINK-13707) Make max parallelism configurable
[ https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909020#comment-16909020 ]

xuekang commented on FLINK-13707:
---------------------------------

OK, thanks for the comments. I will upload a patch soon.
[jira] [Commented] (FLINK-13707) Make max parallelism configurable
[ https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908149#comment-16908149 ]

Stephan Ewen commented on FLINK-13707:
--------------------------------------

I think that is a fair point. Similar to the default parallelism in the {{flink-conf.yaml}}, there could also be a default max parallelism.
[jira] [Commented] (FLINK-13707) Make max parallelism configurable
[ https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908043#comment-16908043 ]

xuekang commented on FLINK-13707:
---------------------------------

[~StephanEwen] Yes, we provide a real-time computing platform for the whole company. Users do not need to know much about what max parallelism means if we set a default value for them.
[jira] [Commented] (FLINK-13707) Make max parallelism configurable
[ https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907887#comment-16907887 ]

Stephan Ewen commented on FLINK-13707:
--------------------------------------

Just to double check: You can set the max parallelism explicitly in the execution environment. Are you looking for a way to define a cluster-wide default?
[jira] [Commented] (FLINK-13707) Make max parallelism configurable
[ https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907752#comment-16907752 ]

xuekang commented on FLINK-13707:
---------------------------------

Hi [~jark], thanks for the reply. Yes, that is how the max parallelism is computed, and the range is large enough: [128, 32768]. The largest operator parallelism I have ever seen is 2000+, so 4096 may be a suitable default value in flink-conf.yaml.
[jira] [Commented] (FLINK-13707) Make max parallelism configurable
[ https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907001#comment-16907001 ]

Jark Wu commented on FLINK-13707:
---------------------------------

Hi [~stukid], I just want to make sure we agree on the logic of the default max parallelism. I found that the default max parallelism is computed [in this way|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/KeyGroupRangeAssignment.java#L127]:

{code:java}
Math.min(
    Math.max(
        MathUtils.roundUpToPowerOfTwo(operatorParallelism + (operatorParallelism / 2)),
        DEFAULT_LOWER_BOUND_MAX_PARALLELISM),
    UPPER_BOUND_MAX_PARALLELISM)
{code}

The max parallelism is {{roundUpToPowerOfTwo(1.5 * operatorParallelism)}}, so we have a safe range to increase the parallelism.

Anyway, I think it makes sense to have a global default max parallelism configuration.
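Editorial illustration of the derivation quoted in the comment above: a minimal, standalone Java sketch that reimplements the expression without Flink on the classpath. The class name is hypothetical, `roundUpToPowerOfTwo` is a local stand-in for `MathUtils.roundUpToPowerOfTwo`, and the two bounds are taken from the range [128, 32768] mentioned elsewhere in this thread.

```java
/** Hypothetical standalone sketch of the default max-parallelism derivation. */
final class DefaultMaxParallelismSketch {

    // Bounds assumed from the range [128, 32768] stated in the thread.
    static final int DEFAULT_LOWER_BOUND_MAX_PARALLELISM = 1 << 7;  // 128
    static final int UPPER_BOUND_MAX_PARALLELISM = 1 << 15;         // 32768

    /** Smallest power of two that is >= x, for x >= 1 (local stand-in). */
    static int roundUpToPowerOfTwo(int x) {
        int highest = Integer.highestOneBit(x);
        return highest == x ? x : highest << 1;
    }

    /** Same shape as the quoted expression: round up 1.5 * p, then clamp. */
    static int computeDefaultMaxParallelism(int operatorParallelism) {
        if (operatorParallelism < 1 || operatorParallelism > UPPER_BOUND_MAX_PARALLELISM) {
            throw new IllegalArgumentException(
                    "parallelism out of range: " + operatorParallelism);
        }
        return Math.min(
                Math.max(
                        roundUpToPowerOfTwo(operatorParallelism + (operatorParallelism / 2)),
                        DEFAULT_LOWER_BOUND_MAX_PARALLELISM),
                UPPER_BOUND_MAX_PARALLELISM);
    }

    public static void main(String[] args) {
        // Print the derived max parallelism for a few sample parallelisms.
        for (int p : new int[] {10, 100, 200, 400, 2000, 30000}) {
            System.out.println(p + " -> " + computeDefaultMaxParallelism(p));
        }
    }
}
```

Under these assumptions, a parallelism of 200 yields 512 and a parallelism of 400 yields 1024, so the derived value depends on the parallelism in effect when it is computed, whereas a configured default would stay fixed.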