[ https://issues.apache.org/jira/browse/SPARK-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513976#comment-14513976 ]
Marius Soutier edited comment on SPARK-7167 at 4/27/15 12:09 PM:
-----------------------------------------------------------------

Maybe the slowdown is only incidental, though it's odd at a batch interval of 1 minute and 40-50 records per interval. In my case I have an actor system running on each worker node that receives data and forwards it to a registered actor receiver (ssc.actorStream(...)), so this results in additional network traffic, but that should not be a problem at 10 Gbit. (I'm also aware that actorStream is not really a production-ready feature.)

But in any case, from the documentation: "For example, a single Kafka input DStream receiving two topics of data can be split into two Kafka input streams, each receiving only one topic. This would run two receivers on two workers [...]"

So receivers should be distributed evenly across the cluster, and this appears to be a bug. I also noticed that the receivers get redistributed all the time.

> Receivers are not distributed efficiently when starting from checkpoint
> -----------------------------------------------------------------------
>
>                 Key: SPARK-7167
>                 URL: https://issues.apache.org/jira/browse/SPARK-7167
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 1.2.1, 1.2.2
>            Reporter: Marius Soutier
>            Priority: Minor
>
> Bug report: I'm seeing an issue where after starting a streaming application
> from a checkpoint, the network receivers are distributed such that not all
> nodes are used.
> For example, I have five nodes:
> node0 - 1 receiver
> node1 - 2 receivers
> node2 - 0 receivers
> node3 - 2 receivers
> node4 - 0 receivers
> This slows down the job, waiting batches pile up, and I have to kill and
> restart it, hoping that next time it will be distributed in a sensible
> fashion.
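
To make the imbalance in the reported placement concrete, here is a minimal Python sketch that compares it with an even spread. It is only an illustration: the `round_robin` and `summarize` helpers are hypothetical names introduced here, and the round-robin assignment is an assumed stand-in for "distributed equally", not Spark's actual receiver scheduling logic.

```python
# Receiver placement as reported in the issue (node -> number of receivers).
observed = {"node0": 1, "node1": 2, "node2": 0, "node3": 2, "node4": 0}


def round_robin(num_receivers, nodes):
    """Hypothetical even placement: hand out receivers to nodes in turn."""
    counts = {node: 0 for node in nodes}
    for i in range(num_receivers):
        counts[nodes[i % len(nodes)]] += 1
    return counts


def summarize(placement):
    """Report the busiest node's receiver count and the nodes left idle."""
    idle = sorted(node for node, count in placement.items() if count == 0)
    return {"max_per_node": max(placement.values()), "idle_nodes": idle}


nodes = sorted(observed)
total = sum(observed.values())
even = round_robin(total, nodes)

print("observed:", summarize(observed))
print("even:    ", summarize(even))
```

With the five receivers from the report, the observed placement leaves node2 and node4 without a receiver while two other nodes each run two, whereas the even spread puts exactly one receiver on every node.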