From the chart you pasted, I guess you have only one receiver, with a storage
level that keeps two copies, so most of your tasks are located on two
executors. You could use repartition to redistribute the data more evenly
across the executors. Adding more receivers is another solution.
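
A minimal sketch of both suggestions (several receivers plus a repartition), assuming the receiver-based Kafka API (KafkaUtils.createStream) that ships with Spark 1.1. The ZooKeeper address, group id, batch interval, and repartition count are hypothetical placeholders, not values from this thread; only the topic name and its partition count of 3 come from the thread.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object MultiReceiverSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("MultiReceiverSketch")
    // Batch interval of 10 seconds is an assumption for illustration.
    val ssc = new StreamingContext(conf, Seconds(10))

    // Hypothetical connection settings; replace with your own.
    val zkQuorum = "zk-host:2181"
    val groupId = "kyle_test_group"
    // Topic name -> number of consumer threads inside each receiver.
    val topics = Map("kyle_test_topic" -> 1)

    // Start one receiver per topic partition (the topic here has 3).
    val numReceivers = 3
    val streams = (1 to numReceivers).map { _ =>
      KafkaUtils.createStream(
        ssc, zkQuorum, groupId, topics, StorageLevel.MEMORY_AND_DISK_SER_2)
    }

    // Merge the receiver streams into one DStream, then shuffle the
    // received blocks across more partitions so tasks spread out.
    // The value 6 is an arbitrary example.
    val unified = ssc.union(streams).repartition(6)

    // Placeholder processing: count the message values in each batch.
    unified.map(_._2).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Each receiver occupies one core for as long as the job runs, so the application needs more cores in total than numReceivers, or no cores are left for processing.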
2015-04-30 14:38
It seems that the data size is only 2.9 MB, far less than the default RDD
size. How about putting more data into Kafka? And how many partitions does
the Kafka topic have?
Best regards,
Lin Hao XU
IBM Research China
Email: xulin...@cn.ibm.com
Hi all
My environment info
Hadoop release version: HDP 2.1
Kafka: 0.8.1.2.1.4.0
Spark: 1.1.0
My question:
I ran a Spark Streaming program on YARN. The program reads data from Kafka
and does some processing. However, I found that there is always only ONE
executor doing the processing. As
Hello Lin Hao
Thanks for your reply. I will try to produce more data into Kafka.
I run three Kafka brokers. The following is my topic info.
Topic:kyle_test_topic PartitionCount:3 ReplicationFactor:2 Configs:
Topic: kyle_test_topic Partition: 0 Leader: 3 Replicas: 3,4 Isr: 3,4
Topic: kyle_test_topic
Hi all
Producing more data into Kafka is not effective in my situation,
because the speed of reading from Kafka stays constant. I will adopt
Saisai's suggestion to add more receivers.
Kyle
2015-04-30 14:49 GMT+08:00 Saisai Shao sai.sai.s...@gmail.com:
From the chart you pasted, I guess you only