Daemeon,

Yes, deleting the older staging files should help. You may also have to restart the history server.
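As a rough sketch of that cleanup (not a tested procedure), the snippet below pulls the staging directory out of the "Previous history file is at" line in the AM log and prints the matching `hdfs dfs -rm -r` command for review. The helper names `staging_dirs` and `cleanup_commands` are made up for illustration, and it assumes the job that owns the directory is no longer running.

```python
import re
from posixpath import dirname

# Matches the MRAppMaster log line that names the previous attempt's history file.
HISTORY_LINE = re.compile(r"Previous history file is at\s+(hdfs://\S+\.jhist)")


def staging_dirs(log_text):
    """Return the unique job staging directories named in the AM log, in order."""
    dirs = []
    for match in HISTORY_LINE.finditer(log_text):
        job_dir = dirname(match.group(1))  # parent directory of the .jhist file
        if job_dir not in dirs:
            dirs.append(job_dir)
    return dirs


def cleanup_commands(log_text):
    """Build (but do not run) one 'hdfs dfs -rm -r' command per staging directory."""
    return ["hdfs dfs -rm -r " + d for d in staging_dirs(log_text)]


if __name__ == "__main__":
    sample = (
        "2015-02-15 07:51:08,642 INFO [main] "
        "org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Previous history file is at "
        "hdfs://mycluster.com:8020/user/cloudera/.staging/job_1424003606313_0001/"
        "job_1424003606313_0001_1.jhist"
    )
    for cmd in cleanup_commands(sample):
        print(cmd)
```

The commands are only printed, so they can be checked against `hdfs dfs -ls /user/cloudera/.staging` before anything is deleted.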
BR,
Alex

> On 19 Feb 2015, at 15:12, roland.depratti <[email protected]> wrote:
>
> Alex,
>
> That sounds like a very likely situation.
>
> I read in the first jira that tokens are now used in nonsecure setups, which
> explains my earlier ssl question.
>
> Is the solution simply to delete those staging files from the cluster?
>
> - rd
>
> -------- Original message --------
> From: Alexander Alten-Lorenz <[email protected]>
> Date: 02/19/2015 7:43 AM (GMT-05:00)
> To: [email protected]
> Subject: Re: Yarn AM is abending job when submitting a remote job to cluster
>
> Hi,
>
> https://issues.apache.org/jira/browse/YARN-1116
> <https://issues.apache.org/jira/browse/YARN-1058>
>
> It looks like the history server had an unclean shutdown, or a previous
> job didn’t finish or wasn’t cleaned up after finishing
> (2015-02-15 07:51:07,241 INFO [main]
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN,
> Service: , Ident:
> (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@33be1aa0) ….
> Previous history file is at
> hdfs://mycluster.com:8020/user/cloudera/.staging/job_1424003606313_0001/job_1424003606313_0001_1.jhist).
>
> BR,
> Alex
>
> > On 19 Feb 2015, at 13:27, Roland DePratti <[email protected]> wrote:
> >
> > Daemeon,
> >
> > Thanks for the reply. I have about 6 months' exposure to Hadoop and am new to
> > SSL, so I did some digging after reading your message.
> >
> > In the HDFS config, I have hadoop.ssl.enabled using the default, which is
> > ‘false’ (which I understand sets it for all Hadoop daemons).
> >
> > I assumed this meant that it is not in use and not a factor in job
> > submission (ssl certs not needed).
> >
> > Do I misunderstand, and are you saying that it needs to be set to ‘true’
> > with valid certs and store setup for me to submit a remote job (this is a
> > POC setup without exposure outside my environment)?
> >
> > - rd
> >
> > From: daemeon reiydelle [mailto:[email protected]]
> > Sent: Wednesday, February 18, 2015 10:22 PM
> > To: [email protected]
> > Subject: Re: Yarn AM is abending job when submitting a remote job to cluster
> >
> > I would guess you do not have your ssl certs set up, client or server,
> > based on the error.
> >
> > Daemeon C.M. Reiydelle
> >
> > On Wed, Feb 18, 2015 at 5:19 PM, Roland DePratti <[email protected]> wrote:
> > I have been searching for a handle on a problem with very few clues.
> > Any help pointing me in the right direction will be huge.
> > I have not received any input from the Cloudera google groups. Perhaps this
> > is more Yarn based and I am hoping I have more luck here.
> > Any help is greatly appreciated.
> >
> > I am running a Hadoop cluster using CDH5.3. I also have a client machine
> > with a standalone one node setup (VM).
> >
> > All environments are running CentOS 6.6.
> >
> > I have submitted some Java mapreduce jobs locally on both the cluster and
> > the standalone environment with successful completions.
> >
> > I can submit a remote HDFS job from client to cluster using -conf
> > hadoop-cluster.xml (see below) and get data back from the cluster with no
> > problem.
> >
> > When I submit the mapreduce jobs remotely, I get an AM error.
> >
> > The AM fails the job with the error:
> >
> > SecretManager$InvalidToken: appattempt_1424003606313_0001_000002
> > not found in AMRMTokenSecretManager
> >
> > I searched /var/log/secure on the client and cluster and found no unusual
> > messages.
> >
> > Here are the contents of hadoop-cluster.xml:
> >
> > <?xml version="1.0" encoding="UTF-8"?>
> >
> > <!--generated by Roland-->
> > <configuration>
> >   <property>
> >     <name>fs.defaultFS</name>
> >     <value>hdfs://mycluser:8020</value>
> >   </property>
> >   <property>
> >     <name>mapreduce.jobtracker.address</name>
> >     <value>hdfs://mycluster:8032</value>
> >   </property>
> >   <property>
> >     <name>yarn.resourcemanager.address</name>
> >     <value>hdfs://mycluster:8032</value>
> >   </property>
> >
> > Here is the output from the job log on the cluster:
> >
> > 2015-02-15 07:51:06,544 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for
> > application appattempt_1424003606313_0001_000002
> > 2015-02-15 07:51:06,949 WARN [main] org.apache.hadoop.conf.Configuration:
> > job.xml:an attempt to override final parameter:
> > hadoop.ssl.require.client.cert; Ignoring.
> > 2015-02-15 07:51:06,952 WARN [main] org.apache.hadoop.conf.Configuration:
> > job.xml:an attempt to override final parameter:
> > mapreduce.job.end-notification.max.retry.interval; Ignoring.
> > 2015-02-15 07:51:06,952 WARN [main] org.apache.hadoop.conf.Configuration:
> > job.xml:an attempt to override final parameter: hadoop.ssl.client.conf;
> > Ignoring.
> > 2015-02-15 07:51:06,954 WARN [main] org.apache.hadoop.conf.Configuration:
> > job.xml:an attempt to override final parameter:
> > hadoop.ssl.keystores.factory.class; Ignoring.
> > 2015-02-15 07:51:06,957 WARN [main] org.apache.hadoop.conf.Configuration:
> > job.xml:an attempt to override final parameter: hadoop.ssl.server.conf;
> > Ignoring.
> > 2015-02-15 07:51:06,973 WARN [main] org.apache.hadoop.conf.Configuration:
> > job.xml:an attempt to override final parameter:
> > mapreduce.job.end-notification.max.attempts; Ignoring.
> > 2015-02-15 07:51:07,241 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
> > 2015-02-15 07:51:07,241 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN,
> > Service: , Ident:
> > (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@33be1aa0)
> > 2015-02-15 07:51:07,332 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred
> > newApiCommitter.
> > 2015-02-15 07:51:07,627 WARN [main] org.apache.hadoop.conf.Configuration:
> > job.xml:an attempt to override final parameter:
> > hadoop.ssl.require.client.cert; Ignoring.
> > 2015-02-15 07:51:07,632 WARN [main] org.apache.hadoop.conf.Configuration:
> > job.xml:an attempt to override final parameter:
> > mapreduce.job.end-notification.max.retry.interval; Ignoring.
> > 2015-02-15 07:51:07,632 WARN [main] org.apache.hadoop.conf.Configuration:
> > job.xml:an attempt to override final parameter: hadoop.ssl.client.conf;
> > Ignoring.
> > 2015-02-15 07:51:07,639 WARN [main] org.apache.hadoop.conf.Configuration:
> > job.xml:an attempt to override final parameter:
> > hadoop.ssl.keystores.factory.class; Ignoring.
> > 2015-02-15 07:51:07,645 WARN [main] org.apache.hadoop.conf.Configuration:
> > job.xml:an attempt to override final parameter: hadoop.ssl.server.conf;
> > Ignoring.
> > 2015-02-15 07:51:07,663 WARN [main] org.apache.hadoop.conf.Configuration:
> > job.xml:an attempt to override final parameter:
> > mapreduce.job.end-notification.max.attempts; Ignoring.
> > 2015-02-15 07:51:08,237 WARN [main]
> > org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop
> > library for your platform...
> > using builtin-java classes where applicable
> > 2015-02-15 07:51:08,429 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in
> > config null
> > 2015-02-15 07:51:08,499 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is
> > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
> > 2015-02-15 07:51:08,526 INFO [main]
> > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
> > org.apache.hadoop.mapreduce.jobhistory.EventType for class
> > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
> > 2015-02-15 07:51:08,527 INFO [main]
> > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
> > org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
> > 2015-02-15 07:51:08,561 INFO [main]
> > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
> > org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
> > 2015-02-15 07:51:08,562 INFO [main]
> > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
> > org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
> > 2015-02-15 07:51:08,566 INFO [main]
> > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
> > org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class
> > org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
> > 2015-02-15 07:51:08,568 INFO [main]
> > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
> > org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
> > 2015-02-15 07:51:08,568 INFO [main]
> > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
> > org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for
> > class
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
> > 2015-02-15 07:51:08,570 INFO [main]
> > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
> > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for
> > class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
> > 2015-02-15 07:51:08,599 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Recovery is enabled. Will
> > try to recover from previous life on best effort basis.
> > 2015-02-15 07:51:08,642 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Previous history file is at
> > hdfs://mycluster.com:8020/user/cloudera/.staging/job_1424003606313_0001/job_1424003606313_0001_1.jhist
> > 2015-02-15 07:51:09,147 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster:
> > Read completed tasks from history 0
> > 2015-02-15 07:51:09,193 INFO [main]
> > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class
> > org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
> > 2015-02-15 07:51:09,222 INFO [main]
> > org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
> > hadoop-metrics2.properties
> > 2015-02-15 07:51:09,277 INFO [main]
> > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
> > period at 10 second(s).
> > 2015-02-15 07:51:09,277 INFO [main]
> > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics
> > system started
> > 2015-02-15 07:51:09,286 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for
> > job_1424003606313_0001 to jobTokenSecretManager
> > 2015-02-15 07:51:09,306 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing
> > job_1424003606313_0001 because: not enabled; too much RAM;
> > 2015-02-15 07:51:09,324 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job
> > job_1424003606313_0001 = 5343207. Number of splits = 5
> > 2015-02-15 07:51:09,325 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for
> > job job_1424003606313_0001 = 1
> > 2015-02-15 07:51:09,325 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl:
> > job_1424003606313_0001Job Transitioned from NEW to INITED
> > 2015-02-15 07:51:09,327 INFO [main]
> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching
> > normal, non-uberized, multi-container job job_1424003606313_0001.
> > 2015-02-15 07:51:09,387 INFO [main]
