Re: All datanodes are bad

2015-12-17 Thread Stephan Ewen
I guess the problem is pretty much what the exception message says: the HDFS
output does not work because all datanodes are bad.

If you activate execution retries, the system will re-execute the job. If
the datanodes are healthy by then, the job should succeed.
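For reference, enabling execution retries can be sketched like this in the
job's Scala entry point. This is a minimal, hypothetical fragment (the object
name, retry count of 3, and the elided TeraSort pipeline are illustrative);
`setNumberOfExecutionRetries` is the per-job retry API in the Flink 0.10-era
`ExecutionEnvironment`:

```scala
import org.apache.flink.api.scala._

object TeraSortWithRetries {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Re-attempt the whole job up to 3 times if a task fails, e.g.
    // because the HDFS output pipeline hit a bad datanode.
    // The count of 3 is an arbitrary illustrative choice.
    env.setNumberOfExecutionRetries(3)

    // ... define the TeraSort pipeline here, then run it:
    // env.execute("TeraSort")
  }
}
```

Note that retries only help if the datanode failure is transient; if the
datanodes stay bad, every attempt will fail the same way.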

Greetings,
Stephan


On Wed, Dec 16, 2015 at 4:07 PM, Muhammad Ali Orakzai <m.orak...@gmail.com>
wrote:

> Hi,
>
> I am receiving the following exception while trying to run the terasort
> program on flink. My configuration is as follows:
>
> Hadoop: 2.6.2
>
> Flink: 0.10.1
>
>
> Server 1:
>
> Hadoop data and name node
>
> Flink job and task manager
>
>
> Server 2:
>
> Flink task manager
>
>
>
> org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed.
>     at org.apache.flink.client.program.Client.runBlocking(Client.java:370)
>     at org.apache.flink.client.program.Client.runBlocking(Client.java:348)
>     at org.apache.flink.client.program.Client.runBlocking(Client.java:315)
>     at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:70)
>     at org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:627)
>     at eastcircle.terasort.FlinkTeraSort$.main(FlinkTeraSort.scala:88)
>     at eastcircle.terasort.FlinkTeraSort.main(FlinkTeraSort.scala)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:497)
>     at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:395)
>     at org.apache.flink.client.program.Client.runBlocking(Client.java:252)
>     at org.apache.flink.client.CliFrontend.executeProgramBlocking(CliFrontend.java:676)
>     at org.apache.flink.client.CliFrontend.run(CliFrontend.java:326)
>     at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:978)
>     at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1028)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>     at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$5.apply$mcV$sp(JobManager.scala:563)
>     at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$5.apply(JobManager.scala:509)
>     at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$5.apply(JobManager.scala:509)
>     at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>     at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>     at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.io.IOException: All datanodes xxx.xxx.xx.xx:50010 are bad. Aborting...
>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1206)
>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1004)
>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:548)

