The Job History Server runs its cleaner 30 seconds after a restart and then
once every day, if the cleaner is enabled. That is why jobs older than 7 days
were deleted.
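To make the timing concrete, here is a rough sketch of the schedule described above. It only models the arithmetic (30-second initial delay, daily interval, 7-day maximum age). The function names and the timestamps are made up for illustration, and the real cleaner may key off a different timestamp than the one assumed here.

```python
from datetime import datetime, timedelta

# Values taken from the explanation above; names are hypothetical.
INITIAL_DELAY = timedelta(seconds=30)   # first run after a restart
CLEAN_INTERVAL = timedelta(days=1)      # subsequent runs
MAX_AGE = timedelta(days=7)             # history older than this is removed


def cleaner_run_time(restart_time, run_index=0):
    """Time of the n-th cleaner run (0-based) after a history server restart."""
    return restart_time + INITIAL_DELAY + run_index * CLEAN_INTERVAL


def is_expired(job_time, now):
    """True if a job's history is older than MAX_AGE at time `now`."""
    return now - job_time > MAX_AGE


restart = datetime(2015, 9, 21, 12, 0, 0)   # example restart time, assumed
first_run = cleaner_run_time(restart)
print(first_run)
print(is_expired(datetime(2015, 9, 10), first_run))  # older than 7 days
print(is_expired(datetime(2015, 9, 20), first_run))  # within 7 days
```

So a job that finished months before the history server was started would already be past the maximum age at the very first cleaner run.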
Regarding your second question.
No, you cannot recover deleted files.
Regards,
Varun Saxena
On Tue, Sep 22, 2015 at 7:08 AM, Boyu Zhang
Thanks a lot for the answer!
If you don't mind helping a bit more with this, here is what I am seeing.
- The NameNode/DataNode and ResourceManager/NodeManager were running for 6
months before I discovered that the job history server was not running.
After bringing up the job history server, I saw like 2k+
Hi Boyu,
The RM stores applications in the state store only if recovery is enabled.
Only then will they be available after a restart.
Otherwise they are kept in memory and hence lost on restart.
You may not have it enabled. Check the value of the config below. By
default it is false.
yarn.resourcemanager.recovery.enabled
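If it helps, here is a minimal sketch of how one might check that value from a script. It parses Hadoop-style configuration XML and looks up the property. The helper name get_hadoop_property and the inline sample are made up for illustration. In practice you would feed it the contents of your own yarn-site.xml, and it ignores XML includes and final-property handling.

```python
import xml.etree.ElementTree as ET


def get_hadoop_property(xml_text, name, default=None):
    """Return the value of property `name` from Hadoop-style config XML.

    If the property appears more than once, the last occurrence is returned.
    Falls back to `default` when the property is absent.
    """
    root = ET.fromstring(xml_text)
    result = default
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            value = prop.findtext("value")
            if value is not None:
                result = value.strip()
    return result


# Hypothetical sample; replace with the text of your yarn-site.xml.
sample = """<configuration>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
</configuration>"""

# "false" is passed as the fallback because that is the default noted above.
print(get_hadoop_property(sample, "yarn.resourcemanager.recovery.enabled", "false"))
```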
Thanks for the answer Varun.
It is the case that yarn.resourcemanager.recovery.enabled is set to
false. Is there a way to show the jobs that were submitted before the
restart? We don't want to lose that data.
Thanks,
Boyu
On Mon, Sep 21, 2015 at 12:53 PM, Varun Saxena
No, you can't show them in the RM UI then.
However, if you can start another daemon, you can consider using the YARN
Application History/Timeline Server or the MR Job History Server (only for MR
jobs) to see information about completed jobs.
You can look up Hadoop documentation to learn more about them and how
Thanks a lot for the clarification!
I tried to find the log and history information about finished jobs, but
it is not in hdfs://xxx/user/myusername/output/_SUCCESS (0B). Can you
please give some pointers on where the statistical/job history files are
located? The hdfs:///history/done only
MR jobs write their history files to the path given by the config
mapreduce.jobhistory.intermediate-done-dir.
The history server then moves them to the done dir, which is given by the config
mapreduce.jobhistory.done-dir.
By default these config values
are