Re: same parallelism with different taskmanager and slots, skew occurs

varuy322 Wed, 02 Jan 2019 17:38:17 -0800

Hi Till,
Thank you for your kind reply. I understand your point, and I'm sorry I did
not describe my issue clearly.
I generated the data with the streaming benchmark's event generator:
https://github.com/dataArtisans/databricks-benchmark/blob/master/src/main/scala/com/databricks/benchmark/flink/EventGenerator.scala


What I meant is this: keep the parallelism the same, say 96, and change only
the number of TaskManagers (TMs) and the slots per TM. In the first test,
with 3 TMs and 32 slots per TM, there was no data skew: the three machines
received the same amount of data, and each partition processed approximately
the same amount. In the second test, with 6 TMs and 16 slots per TM, each
partition still processed the same amount of data, but one machine processed
more data than the other two.
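
For reference, the two setups can be sketched as a flink-conf.yaml fragment.
This is only a minimal sketch, assuming a standalone cluster and that the
parallelism is set through parallelism.default rather than in the job or on
the command line. Only the slot count differs between the two tests; the
number of TaskManager processes is determined outside this file.

```yaml
# flink-conf.yaml (sketch; only the keys relevant to the two tests)

# Same in both tests: 96 parallel subtasks per operator (assumed to be set here)
parallelism.default: 96

# Test 1: 3 TaskManagers x 32 slots = 96 slots
taskmanager.numberOfTaskSlots: 32

# Test 2: 6 TaskManagers x 16 slots = 96 slots
# taskmanager.numberOfTaskSlots: 16
```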

I wonder whether TaskManagers (JVMs) running on the same machine compete
with each other. Also, how does the streaming benchmark handle backpressure?
I tested on a 4-node cluster, one master and three workers, each node with
an Intel Xeon E5-2699 v4 @ 2.20GHz/3.60GHz, 256G memory, 88 cores, and a
10Gbps network, and I could not find the bottleneck. It confuses me!

Best Regards & Thanks

Rui



-----
stay hungry, stay foolish.
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
