This is a little confusing. Let's try to confirm the following first. In the Spark application's web UI, can you find the stage (one of the first few) that has only 1 task and a name like "XYZ at NetworkInputTracker"? In that stage, can you see where the single task is running? Is it on node-005, or on some other node? That is the task that is supposed to start the receiver, and based on the location preference it should run on node-005. If it is not running on node-005, then the bind failure makes sense, and we need to see why the location preference is not being used. If it is running on node-005, then I am not sure why the bind is failing. Is anything else bound to 4141 on that node?
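One quick way to check both possibilities is a tiny bind test, run once on node-005 and once on whichever node the receiver task actually landed on. This is just a minimal sketch in plain Scala/JVM socket code (nothing Spark-specific, and the host/port defaults are only placeholders for your setup), but it tries to open a server socket on <host>:<port> roughly the way the Flume receiver's server would:

    // BindCheck.scala -- minimal sketch (plain JVM sockets, not Spark code).
    // "Address already in use" means something else already holds the port;
    // "Cannot assign requested address" means this machine does not own the
    // node-005 address, i.e. the receiver is running on the wrong node.
    import java.net.{InetAddress, InetSocketAddress, ServerSocket}

    object BindCheck {
      def main(args: Array[String]): Unit = {
        val host = if (args.length > 0) args(0) else "node-005"   // placeholder default
        val port = if (args.length > 1) args(1).toInt else 4141   // placeholder default
        val socket = new ServerSocket()
        try {
          socket.bind(new InetSocketAddress(InetAddress.getByName(host), port))
          println(s"OK: bound to $host:$port -- the port is free on this machine")
        } catch {
          case e: java.io.IOException =>
            println(s"FAILED to bind to $host:$port -- ${e.getMessage}")
        } finally {
          socket.close()
        }
      }
    }

If this succeeds on node-005 but the receiver task is scheduled on some other node, that would match the ChannelException you pasted, since only node-005 owns 192.168.101.15.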
On Tue, Feb 18, 2014 at 11:15 PM, anoldbrain <anoldbr...@gmail.com> wrote:
> Both standalone mode and mesos were tested, with the same outcome. After
> your suggestion, I tried again in standalone mode and specified the <host>
> with what was written in the log of a worker node. The problem remains.
>
> A bit more detail: the bind failed error is reported on the driver node.
>
> Say the driver node is node-001, and the worker nodes are node-002 ~ node-008.
> Run the command on node-001
>
> > $ bin/run-example org.apache.spark.streaming.examples.FlumeEventCount
> > spark://node-001:7077 node-005 4141
>
> and on node-001, I see
>
> > ...
> > 14/02/19 15:01:04 INFO main HttpBroadcast: Broadcast server started at
> > http://192.168.101.11:60094
> > 14/02/19 15:01:04 DEBUG main MetadataCleaner: Starting metadata cleaner
> > for HTTP_BROADCAST with delay of 3600 seconds and period of 360 secs
> > 14/02/19 15:01:04 DEBUG main MetadataCleaner: Starting metadata cleaner
> > for MAP_OUTPUT_TRACKER with delay of 3600 seconds and period of 360 secs
> > ...
> > 14/02/19 15:01:06 INFO spark-akka.actor.default-dispatcher-13
> > SparkDeploySchedulerBackend: Registered executor:
> > Actor[akka.tcp://sparkExecutor@node-005:54959/user/Executor#-820307304]
> > with ID 0
> > 14/02/19 15:01:06 DEBUG spark-akka.actor.default-dispatcher-5
> > DAGScheduler: submitStage(Stage 0)
> > 14/02/19 15:01:06 DEBUG spark-akka.actor.default-dispatcher-13
> > TaskSchedulerImpl: parentName: , name: TaskSet_1, runningTasks: 24
> > 14/02/19 15:01:06 DEBUG spark-akka.actor.default-dispatcher-5
> > DAGScheduler: missing: List(Stage 1)
> > 14/02/19 15:01:06 DEBUG spark-akka.actor.default-dispatcher-5
> > DAGScheduler: submitStage(Stage 1)
> > 14/02/19 15:01:06 INFO spark-akka.actor.default-dispatcher-13
> > TaskSetManager: Starting task 1.0:24 as TID 24 on executor 0: node-005
> > (PROCESS_LOCAL)
> > ...
> > 14/02/19 15:01:16 DEBUG spark-akka.actor.default-dispatcher-16
> > TaskSchedulerImpl: parentName: , name: TaskSet_22, runningTasks: 1
> > 14/02/19 15:01:16 DEBUG spark-akka.actor.default-dispatcher-15
> > TaskSchedulerImpl: parentName: , name: TaskSet_22, runningTasks: 0
> > 14/02/19 15:01:16 INFO spark-akka.actor.default-dispatcher-16
> > DAGScheduler: Completed ResultTask(22, 0)
> > 14/02/19 15:01:16 ERROR spark-akka.actor.default-dispatcher-22
> > NetworkInputTracker: De-registered receiver for network stream 0 with
> > message org.jboss.netty.channel.ChannelException: Failed to bind to:
> > node-005/192.168.101.15:4141
> > ...
>
> And there are no 'ERROR' level logs on node-005 stdout/stderr. Any specific
> type of log entries I should look at?
>
> Note: when I added a logInfo line in getLocationPreference() of FlumeReceiver
> and ran the example with the new jar, the line was displayed on the driver
> node (node-001).
>
> Thank you.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-use-FlumeInputDStream-in-spark-cluster-tp1604p1742.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
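For reference, the example you are running boils down to roughly the following; this is a sketch from memory of the 0.9-era API (check the FlumeEventCount source in your distribution for the exact code), but it shows where the <host> and <port> from your command line end up, i.e. the createStream call whose receiver should be placed on node-005:

    // Approximate sketch of org.apache.spark.streaming.examples.FlumeEventCount.
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Milliseconds, StreamingContext}
    import org.apache.spark.streaming.flume.FlumeUtils

    object FlumeEventCountSketch {
      def main(args: Array[String]): Unit = {
        val Array(master, host, port) = args   // e.g. spark://node-001:7077 node-005 4141
        val ssc = new StreamingContext(master, "FlumeEventCount", Milliseconds(2000))
        // The receiver created here is the single task we are looking for in the UI:
        // it should be scheduled on `host` (node-005) and bind its Flume/Avro source
        // to <host>:<port> on that machine.
        val stream = FlumeUtils.createStream(ssc, host, port.toInt, StorageLevel.MEMORY_ONLY_SER)
        stream.count().map(cnt => "Received " + cnt + " flume events.").print()
        ssc.start()
        ssc.awaitTermination()
      }
    }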