Aljoscha Krettek created FLINK-3336:
---------------------------------------

             Summary: Add Semi-Rebalance Data Shipping for DataStream
                 Key: FLINK-3336
                 URL: https://issues.apache.org/jira/browse/FLINK-3336
             Project: Flink
          Issue Type: Improvement
          Components: Streaming
            Reporter: Aljoscha Krettek
            Assignee: Aljoscha Krettek
             Fix For: 1.0.0


This feature has recently been requested on the ML: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Distribution-of-sinks-among-the-nodes-td4640.html

The new data shipping pattern would allow to rebalance data only to a subset of 
downstream operations.

The subset of downstream operations to which the upstream operation would send
elements depends on the degree of parallelism of both the upstream and 
downstream operation.
For example, if the upstream operation has parallelism 2 and the downstream 
operation
has parallelism 4, then one upstream operation would distribute elements to two
downstream operations while the other upstream operation would distribute to 
the other
two downstream operations. If, on the other hand, the downstream operation had 
parallelism
2 while the upstream operation has parallelism 4 then two upstream operations 
would
distribute to one downstream operation while the other two upstream operations 
would
distribute to the other downstream operations.

In cases where the different parallelisms are not multiples of each other one 
or several
downstream operations would have a differing number of inputs from upstream 
operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to