[ 
https://issues.apache.org/jira/browse/FLINK-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan updated FLINK-7019:
------------------------------
    Description: 
Flink job parallelism is set with {{ExecutionConfig#setParallelism}} or with 
{{-p}} on the command line. The Gelly algorithms {{JaccardIndex}}, 
{{AdamicAdar}}, {{TriangleListing}}, and {{ClusteringCoefficient}} have 
intermediate operators which generate output quadratic in the size of the input. 
These algorithms may need to be run with a high parallelism, but doing so for 
all operations is wasteful. "Little parallelism" was introduced for this reason.

This can be simplified by moving the parallelism parameter to the new common 
base class, with the rule of thumb that the algorithm parallelism is used for 
all normal (small output) operators. The asymptotically large operators will 
default to the job parallelism, as will the default algorithm parallelism.

  was:
Flink job parallelism is set with {{ExecutionConfig#setParallelism}} or when 
{{-p}} on the command-line. The Gelly algorithms {{JaccardIndex}}, 
{{AdamicAdar}}, {{TriangleListing}}, and {{ClusteringCoefficient}} have 
intermediate operators which generated output quadratic in the size of input. 
These algorithms may need to be run with a high parallelism but doing so for 
all operations is wasteful. Thus was introduced "little parallelism".

This can be simplified by moving the parallelism parameter to the new common 
base class and with the rule-of-thumb to use the algorithm parallelism for all 
normal (small output) operators. The asymptotically large operators will 
default to the job parallelism, as will the default algorithm parallelism.


> Rework parallelism in Gelly algorithms and examples
> ---------------------------------------------------
>
>                 Key: FLINK-7019
>                 URL: https://issues.apache.org/jira/browse/FLINK-7019
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Gelly
>    Affects Versions: 1.4.0
>            Reporter: Greg Hogan
>            Assignee: Greg Hogan
>            Priority: Minor
>             Fix For: 1.4.0
>
>
> Flink job parallelism is set with {{ExecutionConfig#setParallelism}} or with 
> {{-p}} on the command line. The Gelly algorithms {{JaccardIndex}}, 
> {{AdamicAdar}}, {{TriangleListing}}, and {{ClusteringCoefficient}} have 
> intermediate operators which generate output quadratic in the size of the input. 
> These algorithms may need to be run with a high parallelism, but doing so for 
> all operations is wasteful. "Little parallelism" was introduced for this reason.
> This can be simplified by moving the parallelism parameter to the new common 
> base class, with the rule of thumb that the algorithm parallelism is used for 
> all normal (small output) operators. The asymptotically large operators will 
> default to the job parallelism, as will the default algorithm parallelism.
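
The rule of thumb described in the issue can be sketched roughly as follows. This is only an illustration in plain Java with no Flink dependency. The names {{ParallelismSketch}}, {{AlgorithmBase}}, {{smallOperatorParallelism}}, and {{largeOperatorParallelism}} are hypothetical and are not taken from the Gelly code base. The {{-1}} sentinel for "not set" is likewise an assumption made for the sketch.

```java
public class ParallelismSketch {
    /** Assumed sentinel meaning "not set; fall back to the job parallelism". */
    static final int PARALLELISM_DEFAULT = -1;

    /** Hypothetical stand-in for the common base class mentioned in the issue. */
    static class AlgorithmBase {
        protected int parallelism = PARALLELISM_DEFAULT;

        /** Sets the algorithm parallelism used by normal (small output) operators. */
        AlgorithmBase setParallelism(int parallelism) {
            if (parallelism != PARALLELISM_DEFAULT && parallelism < 1) {
                throw new IllegalArgumentException("parallelism must be positive or the default");
            }
            this.parallelism = parallelism;
            return this;
        }

        /** Small-output operators use the algorithm parallelism, else the job parallelism. */
        int smallOperatorParallelism(int jobParallelism) {
            return parallelism == PARALLELISM_DEFAULT ? jobParallelism : parallelism;
        }

        /** Asymptotically large operators default to the job parallelism. */
        int largeOperatorParallelism(int jobParallelism) {
            return jobParallelism;
        }
    }

    public static void main(String[] args) {
        int jobParallelism = 64; // e.g. the value passed with -p

        AlgorithmBase configured = new AlgorithmBase().setParallelism(8);
        System.out.println("small operators: " + configured.smallOperatorParallelism(jobParallelism));
        System.out.println("large operators: " + configured.largeOperatorParallelism(jobParallelism));

        AlgorithmBase unset = new AlgorithmBase();
        System.out.println("small operators (unset): " + unset.smallOperatorParallelism(jobParallelism));
    }
}
```

In this sketch only the small-output operators are throttled by the algorithm parallelism. The quadratic-output operators keep the full job parallelism unless an implementation chooses to override them.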



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
