The Job History Server runs its cleaner 30 seconds after a restart and then once every day, if the cleaner is enabled. That is why jobs older than 7 days would have been deleted. Regarding your second question: no, you cannot recover the deleted files.
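To make the timing concrete, here is a rough Python sketch of the schedule and age check described above. It is only an illustration, not Hadoop code: the function names and job ids are made up, and it treats each job's age as a single timestamp, which simplifies how the real cleaner decides what is old. The 7 day value corresponds to mapreduce.jobhistory.max-age-ms.

```python
from datetime import datetime, timedelta

# Values taken from this thread; the names are illustrative, not Hadoop APIs.
INITIAL_DELAY = timedelta(seconds=30)  # first cleaner run after the server starts
CLEANER_INTERVAL = timedelta(days=1)   # gap between later runs
MAX_AGE = timedelta(days=7)            # mapreduce.jobhistory.max-age-ms (7 days)


def cleaner_run_times(started_at, count):
    """Return the first `count` times the cleaner would fire after a start."""
    first = started_at + INITIAL_DELAY
    return [first + i * CLEANER_INTERVAL for i in range(count)]


def expired(job_timestamps, now, max_age=MAX_AGE):
    """Return the ids of jobs whose timestamp is older than `max_age` at `now`."""
    cutoff = now - max_age
    return sorted(job for job, ts in job_timestamps.items() if ts < cutoff)


# Hypothetical example: the server is restarted at noon with one old and one recent job.
restart = datetime(2015, 9, 21, 12, 0, 0)
jobs = {
    "job_old": restart - timedelta(days=30),
    "job_recent": restart - timedelta(days=2),
}
first_run = cleaner_run_times(restart, 1)[0]
print(expired(jobs, first_run))  # only the 30-day-old job is past the cutoff
```

In this sketch the old job is already eligible for deletion on the very first run, 30 seconds after the restart, which matches what you saw.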
Regards,
Varun Saxena

On Tue, Sep 22, 2015 at 7:08 AM, Boyu Zhang <boyuzhan...@gmail.com> wrote:

> Thanks a lot for the answer!
>
> If you don't mind helping more with this, here is what I am seeing.
>
> - The NameNode/DataNode and ResourceManager/NodeManager were running for 6
> months before I discovered that the job history server was not running.
> After bringing up the job history server, I saw 2k+ jobs showing up
> in the history server web UI. But then the job history server got
> restarted, and I don't see any jobs more than 7 days old showing up in the
> history web UI.
>
> - I've disabled the cleaner in the config file.
>
> My question is, is there a way to find/recover the job history files more
> than 7 days old? I read that the container logs are stored locally in the
> NodeManager user log dir, and there are files there (I have not dug through
> them yet). I am not sure whether the job history files deleted by the
> history cleaner can be recovered.
>
> Thanks in advance,
> Boyu
>
>
> On Mon, Sep 21, 2015 at 4:35 PM, Varun Saxena <vsaxena.va...@gmail.com>
> wrote:
>
>> MR jobs will write history files to the path given by the config
>> mapreduce.jobhistory.intermediate-done-dir.
>> The history server will then move them to the done dir, which is given by
>> the config mapreduce.jobhistory.done-dir.
>>
>> By default these config values are
>> ${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate
>> and ${yarn.app.mapreduce.am.staging-dir}/history/done respectively.
>>
>> The 7 days is also configurable (the config being
>> mapreduce.jobhistory.max-age-ms).
>> You can set this value according to your cluster.
>>
>> I hope this answers your question.
>>
>> Regards,
>> Varun Saxena.
>>
>> On Tue, Sep 22, 2015 at 1:39 AM, Boyu Zhang <boyuzhan...@gmail.com>
>> wrote:
>>
>>> Thanks a lot for the clarification!
>>>
>>> I tried to find the log and history information about finished jobs. But
>>> they are not in hfs://xxx/user/myusername/output/_SUCCESS (0B).
>>> Can you please give some pointers on where the statistical/job history
>>> files are located? The hfs://xxxx/history/done only stores history files
>>> up to 7 days.
>>>
>>> Thanks,
>>> Boyu
>>>
>>> On Mon, Sep 21, 2015 at 1:23 PM, Varun Saxena <vsaxena.va...@gmail.com>
>>> wrote:
>>>
>>>> No, you can't show them in the RM UI then.
>>>>
>>>> However, if you can start another daemon, you can consider using the YARN
>>>> Application History/Timeline Server or the MR Job History Server (only
>>>> for MR jobs) to see information about completed jobs.
>>>> You can look up the Hadoop documentation to learn more about them and how
>>>> to configure them.
>>>>
>>>> Just to clarify though, the apps themselves are not lost, as in, the
>>>> output is not lost. It's just the information about them which is no
>>>> longer present on RM restart.
>>>>
>>>> Regards,
>>>> Varun Saxena.
>>>>
>>>> On Mon, Sep 21, 2015 at 10:31 PM, Boyu Zhang <boyuzhan...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks for the answer Varun.
>>>>>
>>>>> It is the case that yarn.resourcemanager.recovery.enabled is set to
>>>>> false. Is there a way to show the jobs that were submitted before the
>>>>> restart? We don't want to lose that data.
>>>>>
>>>>> Thanks,
>>>>> Boyu
>>>>>
>>>>>
>>>>> On Mon, Sep 21, 2015 at 12:53 PM, Varun Saxena <
>>>>> vsaxena.va...@gmail.com> wrote:
>>>>>
>>>>>> Hi Boyu,
>>>>>>
>>>>>> RM stores apps in the state store if recovery is enabled. Only then
>>>>>> will they be available on restart.
>>>>>> Otherwise they are kept in memory and hence lost on restart.
>>>>>>
>>>>>> You may not have it enabled. Check the value of the config below. By
>>>>>> default it's false.
>>>>>> yarn.resourcemanager.recovery.enabled
>>>>>>
>>>>>> Regards,
>>>>>> Varun.
>>>>>>
>>>>>> On Mon, Sep 21, 2015 at 10:01 PM, Boyu Zhang <boyuzhan...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello Everyone,
>>>>>>>
>>>>>>> I have a strange error regarding the ResourceManager web UI (
>>>>>>> http://xx.xx:8088).
>>>>>>>
>>>>>>> Someone before me set up the Hadoop + YARN cluster using Pivotal HD,
>>>>>>> and it was running fine. Then today, the resource manager and node
>>>>>>> manager disappeared; the logs did not record this. I restarted them
>>>>>>> and they are up and running, but the resource manager web UI does not
>>>>>>> show any jobs. We have had 700+ jobs in the past, and they were
>>>>>>> showing before.
>>>>>>>
>>>>>>> If I submit MapReduce jobs, the newly submitted ones show up. But they
>>>>>>> disappear again after restarting the resource manager and node manager.
>>>>>>>
>>>>>>> Can anyone give any hint on where to look?
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>> Boyu
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>