This is a little confusing. Let's try to confirm the following first. In
the Spark application's web UI, can you find the stage (one of the first
few) that has only 1 task and is named XYZ at NetworkInputTracker? In that
stage, can you see where the single task is running? Is it on node-005, or
on some other node? That is the task that is supposed to start the
receiver, and based on the location preference it should run on node-005.
If it is not running on node-005, then the bind failure makes sense, and we
need to see why it is not using the location preference. If it is running
on node-005, then I am not sure why the bind is failing. Is anything else
bound to 4141?
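
If it helps, here is a quick standalone probe (plain java.net, nothing
Spark-specific) you can run on node-005 -- or on whichever node that single
task actually landed -- to check both things at once: whether node-005's
address is local on that machine, and whether anything else already holds
4141. The host and port below are just the ones from your command; adjust
as needed.

    import java.io.IOException
    import java.net.{InetAddress, InetSocketAddress, NetworkInterface, ServerSocket}

    // Standalone probe: run on the node where the receiver task actually landed.
    object BindProbe {
      def main(args: Array[String]): Unit = {
        val host = "node-005"  // host and port taken from the FlumeEventCount command
        val port = 4141
        try {
          val addr = InetAddress.getByName(host)
          // If node-005's address is not configured on this machine, a bind to
          // it is expected to fail here, just as it does in the receiver.
          val local = NetworkInterface.getByInetAddress(addr) != null
          println(s"$host resolves to ${addr.getHostAddress}, local interface here: $local")
          val socket = new ServerSocket()
          try {
            socket.bind(new InetSocketAddress(addr, port))
            println(s"Bind to $host:$port succeeded -- nothing else seems to hold the port.")
          } finally {
            socket.close()
          }
        } catch {
          case e: IOException =>
            println(s"Bind to $host:$port failed: ${e.getMessage}")
        }
      }
    }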



On Tue, Feb 18, 2014 at 11:15 PM, anoldbrain <anoldbr...@gmail.com> wrote:

> Both standalone mode and Mesos were tested, with the same outcome. After
> your suggestion, I tried again in standalone mode and specified <host>
> exactly as it appears in the log of a worker node. The problem remains.
>
> A bit more detail: the bind-failure error is reported on the driver node.
>
> Say the driver node is node-001 and the worker nodes are node-002 ~
> node-008. I run this command on node-001:
>
> > $ bin/run-example org.apache.spark.streaming.examples.FlumeEventCount
> > spark://node-001:7077 node-005 4141
>
> and on node-001, I see
>
> > ...
> > 14/02/19 15:01:04 INFO main HttpBroadcast: Broadcast server started at
> > http://192.168.101.11:60094
> > 14/02/19 15:01:04 DEBUG main MetadataCleaner: Starting metadata cleaner
> > for HTTP_BROADCAST with delay of 3600 seconds and period of 360 secs
> > 14/02/19 15:01:04 DEBUG main MetadataCleaner: Starting metadata cleaner
> > for MAP_OUTPUT_TRACKER with delay of 3600 seconds and period of 360 secs
> > ...
> > 14/02/19 15:01:06 INFO spark-akka.actor.default-dispatcher-13
> > SparkDeploySchedulerBackend: Registered executor:
> > Actor[akka.tcp://sparkExecutor@node-005:54959/user/Executor#-820307304]
> > with ID 0
> > 14/02/19 15:01:06 DEBUG spark-akka.actor.default-dispatcher-5
> > DAGScheduler: submitStage(Stage 0)
> > 14/02/19 15:01:06 DEBUG spark-akka.actor.default-dispatcher-13
> > TaskSchedulerImpl: parentName: , name: TaskSet_1, runningTasks: 24
> > 14/02/19 15:01:06 DEBUG spark-akka.actor.default-dispatcher-5
> > DAGScheduler: missing: List(Stage 1)
> > 14/02/19 15:01:06 DEBUG spark-akka.actor.default-dispatcher-5
> > DAGScheduler: submitStage(Stage 1)
> > 14/02/19 15:01:06 INFO spark-akka.actor.default-dispatcher-13
> > TaskSetManager: Starting task 1.0:24 as TID 24 on executor 0: node-005
> > (PROCESS_LOCAL)
> > ...
> > 14/02/19 15:01:16 DEBUG spark-akka.actor.default-dispatcher-16
> > TaskSchedulerImpl: parentName: , name: TaskSet_22, runningTasks: 1
> > 14/02/19 15:01:16 DEBUG spark-akka.actor.default-dispatcher-15
> > TaskSchedulerImpl: parentName: , name: TaskSet_22, runningTasks: 0
> > 14/02/19 15:01:16 INFO spark-akka.actor.default-dispatcher-16
> > DAGScheduler: Completed ResultTask(22, 0)
> > 14/02/19 15:01:16 ERROR spark-akka.actor.default-dispatcher-22
> > NetworkInputTracker: De-registered receiver for network stream 0 with
> > message org.jboss.netty.channel.ChannelException: Failed to bind to:
> > node-005/192.168.101.15:4141
> > ...
>
> And there are no ERROR-level logs in node-005's stdout/stderr. Is there
> any specific type of log entry I should look at?
>
> Note: when I added a logInfo line to getLocationPreference() in
> FlumeReceiver and ran the example with the new jar, that line was printed
> on the driver node (node-001).
>
> Thank you.
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-use-FlumeInputDStream-in-spark-cluster-tp1604p1742.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
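
For reference, the location-preference mechanism discussed above boils down
to a simple pattern: the receiver object is created on the driver, the
tracker asks it for a preferred host, and that host becomes the locality
preference of the single receiver task. That is also consistent with a
logInfo added to getLocationPreference() showing up on the driver, as in
the quoted note. Below is a minimal, Spark-free sketch of that pattern; all
class and method names here are illustrative stand-ins, not Spark's actual
classes.

    // Spark-free sketch of the receiver-placement pattern discussed in this
    // thread. All names below are illustrative stand-ins, not Spark classes.
    trait SketchReceiver {
      // Queried on the driver before the receiver task is scheduled; a
      // returned host becomes the task's locality preference.
      def getLocationPreference: Option[String] = None
      def onStart(): Unit
    }

    class FlumeLikeReceiver(host: String, port: Int) extends SketchReceiver {
      override def getLocationPreference: Option[String] = {
        // A log line here prints on the driver, matching the quoted note.
        println(s"getLocationPreference -> Some($host)")
        Some(host)
      }
      def onStart(): Unit =
        println(s"would bind an Avro source on $host:$port on the chosen node")
    }

    object PlacementSketch {
      def main(args: Array[String]): Unit = {
        val receivers = Seq(new FlumeLikeReceiver("node-005", 4141))
        // Tracker side: pair each receiver with its preferred host(s) so the
        // scheduler can try to place the single receiver task there.
        val placements = receivers.map(r => (r, r.getLocationPreference.toSeq))
        placements.foreach { case (r, hosts) =>
          println(s"receiver $r -> preferred hosts: ${hosts.mkString(", ")}")
        }
      }
    }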
