Hi Dario,

Just to close the loop on this, I answered my own question on SO.
Unfortunately, it seems the recommended solution is the same hack I used a while ago: generate (via trial and error) a key that gets assigned to the target slot.

I was hoping for something a bit more elegant :)

I think it’s likely I could make it work by implementing my own version of KeyGroupStreamPartitioner, but as I’d noted in my SO question, that would involve using some internal-only classes, so maybe not a win.

-- Ken

> On Mar 4, 2022, at 3:14 PM, Dario Heinisch <dario.heini...@gmail.com> wrote:
>
> Hi,
>
> I think you are looking for this answer from David:
> https://stackoverflow.com/questions/69799181/flink-streaming-do-the-events-get-distributed-to-each-task-slots-separately-acc
>
> I think you could then technically create your own partitioner, though it is
> a little bit cumbersome, by mapping your existing keys to new keys that are
> then routed to the desired group & slot.
>
> Hope this may help,
>
> Dario
>
> On 04.03.22 23:54, Ken Krugler wrote:
>> Hi all,
>>
>> I need to be able to control which slot a keyBy group goes to, in order to
>> compensate for a badly skewed dataset.
>>
>> Any recommended approach to use here?
>>
>> Previously (with a DataSet) I used groupBy followed by a withPartitioner,
>> and provided my own custom partitioner.
>>
>> I posted this same question to
>> https://stackoverflow.com/questions/71357833/equivalent-of-dataset-groupby-withpartitioner-for-datastream
>>
>> Thanks,
>>
>> -- Ken

--------------------------
Ken Krugler
http://www.scaleunlimited.com
Custom big data solutions
Flink, Pinot, Solr, Elasticsearch
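
P.S. For anyone finding this thread later, here is a minimal sketch of the trial-and-error key generation described above. It is a standalone approximation, not Flink code: the class and method names (SlotKeyFinder, subtaskFor, findKeys) are made up for illustration, the hash is my reading of what Flink's MathUtils.murmurHash does, and the values 128 and 4 for max parallelism and parallelism are assumptions. In a real job it is probably safer to call KeyGroupRangeAssignment.assignKeyToParallelOperator(key, maxParallelism, parallelism) from your own Flink version instead of the copied hash. Note also that "slot" here really means the parallel subtask index; which physical slot that subtask lands in is up to the scheduler.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: brute-force one integer key per target subtask. */
final class SlotKeyFinder {

    // Assumed equivalent of Flink's MathUtils.murmurHash(int); verify against your version.
    static int murmurHash(int code) {
        code *= 0xcc9e2d51;
        code = Integer.rotateLeft(code, 15);
        code *= 0x1b873593;
        code = Integer.rotateLeft(code, 13);
        code = code * 5 + 0xe6546b64;
        code ^= 4;
        // Final bit mixing.
        code ^= code >>> 16;
        code *= 0x85ebca6b;
        code ^= code >>> 13;
        code *= 0xc2b2ae35;
        code ^= code >>> 16;
        if (code >= 0) {
            return code;
        }
        return code != Integer.MIN_VALUE ? -code : 0;
    }

    // key -> key group -> index of the parallel subtask that owns that key group.
    static int subtaskFor(Object key, int maxParallelism, int parallelism) {
        int keyGroup = murmurHash(key.hashCode()) % maxParallelism;
        return keyGroup * parallelism / maxParallelism;
    }

    // Try candidate keys 0, 1, 2, ... until every subtask index has one key.
    static Map<Integer, Integer> findKeys(int maxParallelism, int parallelism) {
        if (parallelism < 1 || parallelism > maxParallelism) {
            throw new IllegalArgumentException("need 1 <= parallelism <= maxParallelism");
        }
        Map<Integer, Integer> keyForSubtask = new HashMap<>();
        for (int candidate = 0;
                candidate < Integer.MAX_VALUE && keyForSubtask.size() < parallelism;
                candidate++) {
            int subtask = subtaskFor(candidate, maxParallelism, parallelism);
            keyForSubtask.putIfAbsent(subtask, candidate);
        }
        if (keyForSubtask.size() < parallelism) {
            throw new IllegalStateException("no key found for some subtask");
        }
        return keyForSubtask;
    }

    public static void main(String[] args) {
        // Assumed settings: max parallelism 128, operator parallelism 4.
        Map<Integer, Integer> keys = findKeys(128, 4);
        keys.forEach((subtask, key) ->
                System.out.println("subtask " + subtask + " <- key " + key));
    }
}
```

The idea would then be to map each real (skewed) key to the generated key for whichever subtask you want it on, and keyBy the generated key. The table has to be regenerated whenever parallelism or max parallelism changes.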