Hi Till,

Thank you for your kind reply. I see your point, and I'm sorry I did not describe my issue clearly. I generated the data with the streaming benchmark's event generator, as in this link: https://github.com/dataArtisans/databricks-benchmark/blob/master/src/main/scala/com/databricks/benchmark/flink/EventGenerator.scala .
What I want to say is this: keep the parallelism the same, say 96, and change only the number of TaskManagers (TMs) and the number of slots per TM.

In the first test I configured 3 TMs with 32 slots each. No data skew occurred: the three machines received the same amount of data, and each partition processed approximately the same amount of data.

In the second test I configured 6 TMs with 16 slots each. Each partition again processed the same amount of data, but one machine processed more data than the other two machines.

I wonder whether the TaskManagers (JVMs) running on the same machine compete with each other? Also, how does the streaming benchmark deal with backpressure?

I am testing on a cluster of 4 nodes, one master and three workers. Each node has an Intel Xeon E5-2699 v4 @ 2.20GHz/3.60GHz, 256 GB of memory, 88 cores, and a 10 Gbps network. I could not find the bottleneck, and it confuses me.

Best Regards & Thanks,
Rui

-----
stay hungry, stay foolish.