[
https://issues.apache.org/jira/browse/FLINK-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800436#comment-16800436
]
vinoyang commented on FLINK-10506:
----------------------------------
Hi [~till.rohrmann] I find some classes have introduced {{maxParallelism}} for
dynamic scaling. I list these classes and methods about the parallelism below:
{code:java}
JobGraph#getMaximumParallelism : maximum parallelism of all operators in job
graph;
JobVertex#getParallelism/setParallelism : one task’s parallelism;
//(if not set (-1),it will be calculate in ExecutionJobVertex by
computeDefaultMaxParallelism with JobVertex#getParallelism)
JobVertex#getMaxParallelism/setMaxParallelism : one task’s max parallelism in
runtime;
StreamNode#getParallelism/setParallelism : stream node’s parallelism
StreamNode#getMaxParallelism/setMaxParallelism : stream node’s max parallelism
StreamGraph#setParallelism : for single StreamNode
StreamGraph#setMaxParallelism : for single StreamNode
StreamTransformation#getParallelism/setParallelism :
StreamTransformation#getMaxParallelism/setMaxParallelism :
SingleOutputStreamOperator#setParallelism :
SingleOutputStreamOperator#setMaxParallelism :
StreamExecutionEnvironment#getParallelism : all operations default parallelism
StreamExecutionEnvironment#getMaxParallelism :
ExecutionConfig#getParallelism/setParallelism
ExecutionConfig#getMaxParallelism/setMaxParallelism
{code}
I think there are two questions need to confirm:
* Should current maxParallelism list above used as maximum parallelism for
declared resource management directly?
* How to compatible with getParallelism/setParallelism in these classes in the
future?
* introduce three independent API for minimum/target/maximum or introduce a
POJO class encapsulate them and provide a single API?
* For now, shall we introduce APIs go through Operator -> JobGraph or just for
JobGraph (calculate the value of min/target/max by exists parallelism)?
some thoughts:
* for the first question: I think it's different, unless we change the
calculate logic about the max parallelism in ExecutionJobVertex.
* for the second question: currently, we use setParallelism and calculate the
min/target/max, in the future, we can introduce special API for them and
discard the setParallelism in many classes.
* for the third question: it seems three single APIs are more intuitional
* for the fourth question: it seems you chose the latter?
> Introduce minimum, target and maximum parallelism to JobGraph
> -------------------------------------------------------------
>
> Key: FLINK-10506
> URL: https://issues.apache.org/jira/browse/FLINK-10506
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Affects Versions: 1.7.0
> Reporter: Till Rohrmann
> Assignee: vinoyang
> Priority: Major
>
> In order to run a job with a variable parallelism, one needs to be able to
> define the minimum and maximum parallelism for an operator as well as the
> current target value. In the first implementation, minimum could be 1 and
> maximum the max parallelism of the operator if no explicit parallelism has
> been specified for an operator. If a parallelism p has been specified (via
> setParallelism(p)), then minimum = maximum = p. The target value could be the
> command line parameter -p or the default parallelism.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)