Hey, The Flink scheduling mechanism has become quite a bit of a pain lately for us when trying to schedule IO heavy streaming jobs. And by IO heavy I mean it has a fairly large state that is being continuously updated/read.
The main problem is that the scheduled task slots are not evenly distributed among the different task managers but usually the first TM takes as much slots as possibles and the other TMs get much fewer. And since the job is RocksDB IO bound the uneven load causes a significant performance penalty. This is further accentuated during historical runs when we are trying to "fast-forward" the application. The difference can be quite substantial in a 3-4 node cluster: with even task distribution the history might run 3 times faster compared to an uneven one. I was wondering if there was a simple way to modify the scheduler so it allocates resources in a round-robin fashion. Probably someone has a lot of experience with this already :) (I'm running 1.0.3 for this job btw) Cheers, Gyula