Ahmed, are you starting the MR history server? Once the AM finishes, client requests are served by the history server.
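For reference, something along these lines should bring up the history server and point clients at it. This is only a sketch: the script path and the property name assume the MR2/0.23-style tarball layout, and the host/port are hypothetical — adjust for your build.

```shell
# Sketch, assuming the MR2/0.23-style layout -- paths and property names
# may differ in your build.
#
# Advertise the history server address to clients in mapred-site.xml:
#
#   <property>
#     <name>mapreduce.jobhistory.address</name>
#     <value>historyhost:10020</value>   <!-- hypothetical host/port -->
#   </property>
#
# Then start the daemon on that host:
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
```

Once it is up, completed-job queries (counters, task completion events, etc.) should stop retrying the finished AM's address.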
Sharad

On Thu, Jul 14, 2011 at 2:14 AM, Ahmed Radwan <[email protected]> wrote:
> I am testing MR2 on a small real cluster, but I am seeing some flaky
> behavior in running jobs. The same job with the same configuration can
> sometimes run successfully or generate one of the following errors. It is
> random as far as I can see (the job can give the error one time and then
> run normally the next, and so on).
>
> Has anyone seen this behavior before?
>
> ERROR 1:
> --------------
> 11/07/13 13:21:22 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
> 11/07/13 13:21:22 INFO mapred.ClientServiceDelegate: Connecting to 172.29.5.33:52675
> 11/07/13 13:21:22 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
> 11/07/13 13:21:23 INFO ipc.Client: Retrying connect to server: /172.29.5.33:52675. Already tried 0 time(s).
> 11/07/13 13:21:24 INFO ipc.Client: Retrying connect to server: /172.29.5.33:52675. Already tried 1 time(s).
> 11/07/13 13:21:25 INFO ipc.Client: Retrying connect to server: /172.29.5.33:52675. Already tried 2 time(s).
> java.lang.reflect.UndeclaredThrowableException
>     at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBClientImpl.java:161)
>     at org.apache.hadoop.mapred.ClientServiceDelegate.getTaskCompletionEvents(ClientServiceDelegate.java:285)
>     at org.apache.hadoop.mapred.YARNRunner.getTaskCompletionEvents(YARNRunner.java:522)
>     at org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:540)
>     at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1130)
>     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1084)
>     at org.apache.hadoop.examples.WordCount.main(WordCount.java:84)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
>     at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:192)
> Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call to /172.29.5.33:52675 failed on connection exception: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:96)
>     at $Proxy9.getTaskAttemptCompletionEvents(Unknown Source)
>     at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBClientImpl.java:154)
>     ... 18 more
>
> ERROR 2:
> --------------
> 11/07/13 13:32:30 INFO mapred.ClientServiceDelegate: Connecting to 172.29.5.34:41667
> 11/07/13 13:32:30 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
> 11/07/13 13:32:35 INFO mapreduce.Job: Task Id : attempt_1310587965851_0005_m_000000_0, Status : FAILED
> java.io.FileNotFoundException: File file:/tmp/nm-local-dir/usercache/ahmed/appcache/application_1310587965851_0005 does not exist.
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:412)
>     at org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:109)
>     at org.apache.hadoop.fs.DelegateToFileSystem.createInternal(DelegateToFileSystem.java:74)
>     at org.apache.hadoop.fs.ChecksumFs$ChecksumFSOutputSummer.<init>(ChecksumFs.java:332)
>     at org.apache.hadoop.fs.ChecksumFs.createInternal(ChecksumFs.java:367)
>     at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:551)
>     at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:630)
>     at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:627)
>     at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2278)
>     at org.apache.hadoop.fs.FileContext.create(FileContext.java:627)
>     at org.apache.hadoop.fs.FileContext$Util.copy(FileContext.java:2097)
>     at org.apache.hadoop.fs.FileContext$Util.copy(FileContext.java:2039)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:81)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:779)
>
> --
> Ahmed
