[
https://issues.apache.org/jira/browse/FLINK-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595725#comment-14595725
]
Stephan Ewen commented on FLINK-2193:
-------------------------------------
I think there is currently no partition function that can access the runtime
context.
For now, this seems a bit like a very special case. Can you simulate this by
computing the target partition in a {{RichMapFunction}} and then use an
identity-Partitioner to decide the target channel?
One issue that you will encounter is that the i-th receiver has no affinity to
the location of the i-th sender, because to the system it look like every
sender task talks with every receiver task, and it does not know that you
intend to send 95% of all data through one channel.
> Partial shuffling
> -----------------
>
> Key: FLINK-2193
> URL: https://issues.apache.org/jira/browse/FLINK-2193
> Project: Flink
> Issue Type: Improvement
> Reporter: Sebastian Kruse
> Priority: Minor
>
> In some cases, it would come in handy to shuffle only some specific elements
> of a dataset instead of all elements. This is currently not achievable with a
> custom partitioner.
> Use cases for such a feature are:
> * Load balancing: split up elements that require high processing load and
> distribute the splits among all task managers.
> * Evolutionary algorithms: A well-suited EA model for Map/Reduce-like
> platforms is the island model, where each worker maintains and evolves its
> own population. From time to time, individuals among the population need to
> be exchanged. Shuffling all the complete populations is not necessary, though.
> A presumably easy way to achieve this feature could be to provide the local
> partition number in deployed partitioners, similar to
> {{RichFunction#getRuntimeContext()#getIndexOfThisSubtask()}}.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)