Ahmed
Are you starting the MR history server? Once the AM finishes, the client
requests are served by the history server.
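For reference, in later Hadoop 2.x releases the history server's RPC endpoint is set in mapred-site.xml; the property name below is from those releases and may differ in the MRv2 development snapshot you are running, and the hostname is a placeholder — treat this as a sketch:

```xml
<!-- mapred-site.xml: address the MapReduce JobHistory Server listens on.
     Property name is from Hadoop 2.x releases; hostname is a placeholder. -->
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>historyserver.example.com:10020</value>
</property>
```

In 2.x releases the daemon itself is started with something like `sbin/mr-jobhistory-daemon.sh start historyserver`; if it is not running, client calls made after the AM exits fail with connection retries like the ones in ERROR 1 below.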

Sharad

On Thu, Jul 14, 2011 at 2:14 AM, Ahmed Radwan <[email protected]> wrote:

> I am testing mr2 on a small real cluster, but I am seeing some
> flaky behavior in running jobs. The same exact job with the same
> configuration can sometimes run successfully or generate one of the
> following errors. It is random as far as I see (the job can give the error
> one time and then run normally the next, and so on).
>
> Has anyone seen this behavior before?
>
> ERROR 1:
> --------------
> 11/07/13 13:21:22 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
> 11/07/13 13:21:22 INFO mapred.ClientServiceDelegate: Connecting to 172.29.5.33:52675
> 11/07/13 13:21:22 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
> 11/07/13 13:21:23 INFO ipc.Client: Retrying connect to server: /172.29.5.33:52675. Already tried 0 time(s).
> 11/07/13 13:21:24 INFO ipc.Client: Retrying connect to server: /172.29.5.33:52675. Already tried 1 time(s).
> 11/07/13 13:21:25 INFO ipc.Client: Retrying connect to server: /172.29.5.33:52675. Already tried 2 time(s).
> java.lang.reflect.UndeclaredThrowableException
>     at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBClientImpl.java:161)
>     at org.apache.hadoop.mapred.ClientServiceDelegate.getTaskCompletionEvents(ClientServiceDelegate.java:285)
>     at org.apache.hadoop.mapred.YARNRunner.getTaskCompletionEvents(YARNRunner.java:522)
>     at org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:540)
>     at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1130)
>     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1084)
>     at org.apache.hadoop.examples.WordCount.main(WordCount.java:84)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
>     at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:192)
> Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call to /172.29.5.33:52675 failed on connection exception: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:96)
>     at $Proxy9.getTaskAttemptCompletionEvents(Unknown Source)
>     at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBClientImpl.java:154)
>     ... 18 more
>
> ERROR 2:
> --------------
> 11/07/13 13:32:30 INFO mapred.ClientServiceDelegate: Connecting to 172.29.5.34:41667
> 11/07/13 13:32:30 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
> 11/07/13 13:32:35 INFO mapreduce.Job: Task Id : attempt_1310587965851_0005_m_000000_0, Status : FAILED
> java.io.FileNotFoundException: File file:/tmp/nm-local-dir/usercache/ahmed/appcache/application_1310587965851_0005 does not exist.
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:412)
>     at org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:109)
>     at org.apache.hadoop.fs.DelegateToFileSystem.createInternal(DelegateToFileSystem.java:74)
>     at org.apache.hadoop.fs.ChecksumFs$ChecksumFSOutputSummer.<init>(ChecksumFs.java:332)
>     at org.apache.hadoop.fs.ChecksumFs.createInternal(ChecksumFs.java:367)
>     at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:551)
>     at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:630)
>     at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:627)
>     at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2278)
>     at org.apache.hadoop.fs.FileContext.create(FileContext.java:627)
>     at org.apache.hadoop.fs.FileContext$Util.copy(FileContext.java:2097)
>     at org.apache.hadoop.fs.FileContext$Util.copy(FileContext.java:2039)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:81)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:779)
>
>
> --
> Ahmed
>
