Hi Sebastian,

I agree, shuffling only specific elements would be a very useful feature,
but unfortunately it's not supported (yet).
Would you like to open a JIRA for that?
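
Until then, one possible workaround is to decide the target partition
before the partitioner runs. This is only a rough, untested sketch in
plain Java without any Flink dependency. The class TaggedPartitioning, the
Tagged wrapper, and the length-based "large element" check are all made up
for illustration. The idea: tag every small element with the index of the
subtask that currently holds it, fan the large elements out as partial
tasks over all partitions, and let the partitioner simply route by that tag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TaggedPartitioning {

    /** Hypothetical wrapper: an element plus the partition index that should process it. */
    static final class Tagged<T> {
        final int target;
        final T value;

        Tagged(int target, T value) {
            this.target = target;
            this.value = value;
        }
    }

    /**
     * Tags the elements held by one subtask. Small elements keep the local index;
     * each large element is split into one partial task per partition.
     * "Large" is decided by string length here, which is only a stand-in
     * for a real cost estimate.
     */
    static List<Tagged<String>> tag(List<String> elements, int localIndex,
                                    int numPartitions, int largeThreshold) {
        List<Tagged<String>> out = new ArrayList<>();
        for (String e : elements) {
            if (e.length() < largeThreshold) {
                out.add(new Tagged<>(localIndex, e));
            } else {
                for (int p = 0; p < numPartitions; p++) {
                    out.add(new Tagged<>(p, e + "#part" + p));
                }
            }
        }
        return out;
    }

    /** Routes purely by the precomputed tag (key first, partition count second). */
    static int partition(int targetTag, int numPartitions) {
        return Math.floorMod(targetTag, numPartitions);
    }

    public static void main(String[] args) {
        // Pretend we are subtask 2 of 4 and treat strings of length >= 10 as "large".
        List<Tagged<String>> tagged =
                tag(Arrays.asList("a", "bb", "a-very-large-element"), 2, 4, 10);
        for (Tagged<String> t : tagged) {
            System.out.println(partition(t.target, 4) + " <- " + t.value);
        }
    }
}
```

In an actual job, the tagging would presumably go into a rich map function
that reads getRuntimeContext().getIndexOfThisSubtask(), and the routing
into a Partitioner passed to partitionCustom(). Note that this would only
keep the small elements on the same subtask index. As far as I can tell,
they would still pass through the regular shipping path, and I am not sure
that the same index always means the same task manager.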

Cheers, Fabian

2015-06-09 17:22 GMT+02:00 Kruse, Sebastian <sebastian.kr...@hpi.de>:

> Hi folks,
>
> I would like to do some load balancing within one of my Flink jobs to
> achieve good scalability. The rebalance() method is not applicable in my
> case, as the runtime is dominated by the processing of very few larger
> elements in my dataset. Hence, I need to distribute the processing work for
> these elements among the nodes in the cluster. To do so, I subdivide those
> elements into partial tasks and want to distribute these partial tasks to
> other nodes by employing a custom partitioner.
>
> Now, my question is the following: I do not actually need to shuffle the
> complete dataset, only a few elements. Is there a way to tell Flink from
> within the partitioner that an element should stay on its current task
> manager? Thanks!
>
> Cheers,
>
> Sebastian
>
