[
https://issues.apache.org/jira/browse/SPARK-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513976#comment-14513976
]
Marius Soutier edited comment on SPARK-7167 at 4/27/15 12:09 PM:
-----------------------------------------------------------------
Maybe the slowdown is only incidental, though it's odd at a batch interval of 1
minute and 40-50 records per interval.
In my case I have an actor system running on each worker node that receives
data and forwards it to a registered actor receiver (ssc.actorStream(...)), so
this results in additional network traffic, but that should not be a problem at
10Gbit. (I'm also aware that actorStream is not really a production-ready
feature.)
But in any case, from the documentation:
"For example, a single Kafka input DStream receiving two topics of data can be
split into two Kafka input streams, each receiving only one topic. This would
run two receivers on two workers [...]"
So receivers should be distributed equally on the cluster, and this appears to
be a bug.
I also noticed the receivers get redistributed all the time.
> Receivers are not distributed efficiently when starting from checkpoint
> -----------------------------------------------------------------------
>
> Key: SPARK-7167
> URL: https://issues.apache.org/jira/browse/SPARK-7167
> Project: Spark
> Issue Type: Bug
> Components: Streaming
> Affects Versions: 1.2.1, 1.2.2
> Reporter: Marius Soutier
> Priority: Minor
>
> Bug report: I'm seeing an issue where after starting a streaming application
> from a checkpoint, the network receivers are distributed such that not all
> nodes are used.
> For example, I have five nodes:
> node0 - 1 receiver
> node1 - 2 receivers
> node2 - 0 receivers
> node3 - 2 receivers
> node4 - 0 receivers
> This slows down the job, waiting batches pile up, and I have to kill and
> restart it, hoping that next time it will be distributed in a sensible
> fashion.
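
To make the imbalance in the description concrete, here is a minimal sketch in
plain Python (not Spark code) that takes the reported per-node receiver counts,
lists the nodes left idle, and compares them with a simple round-robin
placement. The helpers even_placement and idle_nodes are hypothetical names
introduced for illustration, and round-robin is only an assumed notion of a
"sensible" distribution, not how Spark actually schedules receivers.

```python
# Receiver placement reported in the issue (node name -> receiver count).
observed = {"node0": 1, "node1": 2, "node2": 0, "node3": 2, "node4": 0}


def even_placement(nodes, num_receivers):
    """Assign receivers to nodes round-robin.

    Hypothetical ideal for comparison only; this is NOT Spark's scheduler.
    """
    placement = {node: 0 for node in nodes}
    for i in range(num_receivers):
        placement[nodes[i % len(nodes)]] += 1
    return placement


def idle_nodes(placement):
    """Return the sorted names of nodes that run no receiver."""
    return sorted(node for node, count in placement.items() if count == 0)


total = sum(observed.values())                   # receivers in the job
ideal = even_placement(sorted(observed), total)  # one possible even spread

print("receivers:", total)
print("idle nodes (observed):", idle_nodes(observed))
print("idle nodes (round-robin):", idle_nodes(ideal))
print("round-robin placement:", ideal)
```

With five receivers on five nodes, a round-robin spread would leave no node
idle, whereas the observed placement leaves node2 and node4 without a receiver.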