Re: Yarn ResourceManager web UI does not show job
Job History runs the cleaner 30 seconds after a restart and then once every day, if the cleaner is enabled. That is why jobs older than 7 days were deleted.

Regarding your second question: no, you cannot recover deleted files.

Regards,
Varun Saxena

On Tue, Sep 22, 2015 at 7:08 AM, Boyu Zhang wrote:
> Thanks a lot for the answer!
>
> If you don't mind helping more with this, here is what I am seeing.
>
> - The NameNode/DataNode and ResourceManager/NodeManager had been running
>   for 6 months before I discovered that the job history server was not
>   running. After bringing up the job history server, I saw 2k+ jobs
>   showing up in the history server web UI. But then the job history
>   server got restarted, and I no longer see any jobs more than 7 days old
>   in the history web UI.
>
> - I've disabled the cleaner in the config file.
>
> My question is: is there a way to find/recover the job history files more
> than 7 days old? I read that the container logs are stored locally in the
> NodeManager user log dir, and there are files there (I have not dug
> through them yet). I am not sure whether the job history files deleted by
> the history cleaner can be recovered.
>
> Thanks in advance,
> Boyu
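For readers following the thread, the timing described in this reply can be sketched roughly as below. This is an illustration only, not the history server's actual code: the helper names and job labels are hypothetical, and the 30-second delay, one-day interval, and 7-day maximum age are simply the values mentioned in the thread, so check them against your own cluster's configuration.

```python
from datetime import datetime, timedelta

# Values as described in the thread; treat them as assumptions about your cluster.
INITIAL_DELAY = timedelta(seconds=30)  # first cleaner run after the history server starts
INTERVAL = timedelta(days=1)           # gap between later runs
MAX_AGE = timedelta(days=7)            # retention discussed in the thread


def cleaner_runs(server_start, count):
    """Return the first `count` times a cleaner would run (hypothetical helper)."""
    return [server_start + INITIAL_DELAY + i * INTERVAL for i in range(count)]


def expired_jobs(job_finish_times, now, max_age=MAX_AGE):
    """Return the labels of jobs whose finish time is older than max_age at `now`."""
    return sorted(
        job_id
        for job_id, finished in job_finish_times.items()
        if now - finished > max_age
    )


# Made-up example: one job finished 30 days before the restart, one 6 days before.
start = datetime(2015, 9, 21, 12, 0, 0)
jobs = {
    "job_a": start - timedelta(days=30),
    "job_b": start - timedelta(days=6),
}
first_run = cleaner_runs(start, 2)[0]
print("first cleaner run:", first_run)
print("would be removed:", expired_jobs(jobs, first_run))
```

The point of the sketch is that the first run happens almost immediately after a restart, so history files older than the maximum age are removed well before anyone has a chance to copy them elsewhere.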
Re: Yarn ResourceManager web UI does not show job
Thanks a lot for the answer!

If you don't mind helping more with this, here is what I am seeing.

- The NameNode/DataNode and ResourceManager/NodeManager had been running for 6 months before I discovered that the job history server was not running. After bringing up the job history server, I saw 2k+ jobs showing up in the history server web UI. But then the job history server got restarted, and I no longer see any jobs more than 7 days old in the history web UI.

- I've disabled the cleaner in the config file.

My question is: is there a way to find/recover the job history files more than 7 days old? I read that the container logs are stored locally in the NodeManager user log dir, and there are files there (I have not dug through them yet). I am not sure whether the job history files deleted by the history cleaner can be recovered.

Thanks in advance,
Boyu

On Mon, Sep 21, 2015 at 4:35 PM, Varun Saxena wrote:
> MR jobs write history files to the path given by the config
> mapreduce.jobhistory.intermediate-done-dir.
> The history server then moves them to the done dir, which is given by the
> config mapreduce.jobhistory.done-dir.
>
> By default these config values are
> ${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate
> and ${yarn.app.mapreduce.am.staging-dir}/history/done respectively.
>
> The 7 days is also configurable (the config is
> mapreduce.jobhistory.max-age-ms). You can set this value according to
> your cluster.
>
> I hope this answers your question.
>
> Regards,
> Varun Saxena.
Re: Yarn ResourceManager web UI does not show job
Hi Boyu,

The RM stores apps in the state store if recovery is enabled. Only then will they be available on restart. Otherwise they are kept in memory and hence lost on restart.

You may not have it enabled. Check the value of the config below. By default it's false.

yarn.resourcemanager.recovery.enabled

Regards,
Varun.

On Mon, Sep 21, 2015 at 10:01 PM, Boyu Zhang wrote:
> Hello Everyone,
>
> I have a strange error regarding the ResourceManager web UI
> (http://xx.xx:8088).
>
> Someone before me set up the hadoop + yarn cluster using Pivotal HD, and
> it was running fine. Then today, the resource manager and node manager
> disappeared, and the logs did not record this. I restarted them and they
> are up and running, but the resource manager web UI does not show any
> jobs. We had 700+ jobs in the past, and they were showing before.
>
> If I submit MapReduce jobs, the newly submitted ones show up. But they
> disappear again after restarting the resource manager and node manager.
>
> Can anyone give any hint on where to look?
>
> Thanks in advance,
> Boyu
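One way to check the setting mentioned in this reply is to read it out of the cluster's yarn-site.xml. The sketch below is a minimal illustration in Python using only the standard library; the helper names and the embedded sample file are hypothetical, and it assumes the usual Hadoop layout of `<property>` elements with `<name>` and `<value>` children. It treats a missing property as false, following the default stated in the reply.

```python
import xml.etree.ElementTree as ET

RECOVERY_KEY = "yarn.resourcemanager.recovery.enabled"

# Made-up sample standing in for the contents of a real yarn-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>false</value>
  </property>
</configuration>"""


def get_property(xml_text, name, default=None):
    """Return the value of a Hadoop-style property, or `default` if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return default


def recovery_enabled(xml_text):
    """True only if the property is explicitly set to 'true' (default assumed false)."""
    return get_property(xml_text, RECOVERY_KEY, "false").lower() == "true"


print(RECOVERY_KEY, "=", recovery_enabled(SAMPLE))
```

To use it against a real cluster, read the file with `open(path).read()` and pass the text in; the path to yarn-site.xml depends on the distribution and is not given in the thread.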
Re: Yarn ResourceManager web UI does not show job
Thanks for the answer, Varun.

It is the case that yarn.resourcemanager.recovery.enabled is set to false. Is there a way to show the jobs that were submitted before the restart? We don't want to lose that data.

Thanks,
Boyu

On Mon, Sep 21, 2015 at 12:53 PM, Varun Saxena wrote:
> Hi Boyu,
>
> The RM stores apps in the state store if recovery is enabled. Only then
> will they be available on restart.
> Otherwise they are kept in memory and hence lost on restart.
>
> You may not have it enabled. Check the value of the config below. By
> default it's false.
> yarn.resourcemanager.recovery.enabled
>
> Regards,
> Varun.
Re: Yarn ResourceManager web UI does not show job
No, you can't show them in the RM UI then.

However, if you can start another daemon, you can consider using the YARN Application History/Timeline Server or the MR Job History Server (only for MR jobs) to see information about completed jobs. You can look up the Hadoop documentation to learn more about them and how to configure them.

Just to clarify though, the apps themselves are not lost, as in, the output is not lost. It's just the information about them that is no longer present after an RM restart.

Regards,
Varun Saxena.

On Mon, Sep 21, 2015 at 10:31 PM, Boyu Zhang wrote:
> Thanks for the answer, Varun.
>
> It is the case that yarn.resourcemanager.recovery.enabled is set to
> false. Is there a way to show the jobs that were submitted before the
> restart? We don't want to lose that data.
>
> Thanks,
> Boyu
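If the MR Job History Server is running, completed MapReduce jobs can also be listed through its REST interface rather than the web UI. The sketch below is illustrative only: the host name, the helper names, and the sample response are made up, and it assumes the history server's web address uses the commonly documented default port 19888. Nothing in it contacts a server unless `fetch_jobs` is called explicitly.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def jobs_url(host, port=19888, **params):
    """Build the job-list URL; the port is an assumed default, adjust for your cluster."""
    base = f"http://{host}:{port}/ws/v1/history/mapreduce/jobs"
    return f"{base}?{urlencode(sorted(params.items()))}" if params else base


def parse_jobs(payload):
    """Extract (id, state) pairs from a job-list response body."""
    jobs = (json.loads(payload).get("jobs") or {}).get("job", [])
    return [(job["id"], job["state"]) for job in jobs]


def fetch_jobs(host, **params):
    """Query a live history server (not called here; requires network access)."""
    request = Request(jobs_url(host, **params), headers={"Accept": "application/json"})
    with urlopen(request, timeout=10) as response:
        return parse_jobs(response.read().decode("utf-8"))


# Made-up response body, shaped like the documented job-list output.
SAMPLE_RESPONSE = '{"jobs": {"job": [{"id": "job_0001", "state": "SUCCEEDED"}]}}'

print(jobs_url("historyhost", state="SUCCEEDED"))
print(parse_jobs(SAMPLE_RESPONSE))
```

Note that this only returns jobs whose history files still exist in the done directory; it cannot bring back entries that were never recorded or have already been cleaned up.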
Re: Yarn ResourceManager web UI does not show job
Thanks a lot for the clarification!

I tried to find the log and history information about finished jobs, but they are not in hdfs://xxx/user/myusername/output/_SUCCESS (0B). Can you please give some pointers on where the statistical/job history files are located? hdfs:///history/done only stores history files for up to 7 days.

Thanks,
Boyu

On Mon, Sep 21, 2015 at 1:23 PM, Varun Saxena wrote:
> No, you can't show them in the RM UI then.
>
> However, if you can start another daemon, you can consider using the YARN
> Application History/Timeline Server or the MR Job History Server (only
> for MR jobs) to see information about completed jobs.
> You can look up the Hadoop documentation to learn more about them and how
> to configure them.
>
> Just to clarify though, the apps themselves are not lost, as in, the
> output is not lost. It's just the information about them that is no
> longer present after an RM restart.
>
> Regards,
> Varun Saxena.
Re: Yarn ResourceManager web UI does not show job
MR jobs write history files to the path given by the config mapreduce.jobhistory.intermediate-done-dir. The history server then moves them to the done dir, which is given by the config mapreduce.jobhistory.done-dir.

By default these config values are ${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate and ${yarn.app.mapreduce.am.staging-dir}/history/done respectively.

The 7 days is also configurable (the config is mapreduce.jobhistory.max-age-ms). You can set this value according to your cluster.

I hope this answers your question.

Regards,
Varun Saxena.

On Tue, Sep 22, 2015 at 1:39 AM, Boyu Zhang wrote:
> Thanks a lot for the clarification!
>
> I tried to find the log and history information about finished jobs, but
> they are not in hdfs://xxx/user/myusername/output/_SUCCESS (0B). Can you
> please give some pointers on where the statistical/job history files are
> located? hdfs:///history/done only stores history files for up to 7 days.
>
> Thanks,
> Boyu
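To make the defaults in this reply concrete, the sketch below expands the `${...}` placeholders and converts a retention period in days into the millisecond value that mapreduce.jobhistory.max-age-ms expects. It is a rough illustration: the helper names are hypothetical, and the staging directory value `/tmp/hadoop-yarn/staging` is an assumption about a stock configuration, so substitute whatever your cluster actually sets for yarn.app.mapreduce.am.staging-dir.

```python
import re

# Defaults as quoted in the reply above.
DEFAULTS = {
    "mapreduce.jobhistory.intermediate-done-dir":
        "${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate",
    "mapreduce.jobhistory.done-dir":
        "${yarn.app.mapreduce.am.staging-dir}/history/done",
}

# Assumed value for illustration; check your own configuration.
conf = {"yarn.app.mapreduce.am.staging-dir": "/tmp/hadoop-yarn/staging"}


def resolve(value, settings):
    """Replace each ${key} with its configured value; unknown keys are left as-is."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda match: settings.get(match.group(1), match.group(0)),
        value,
    )


def days_to_ms(days):
    """Convert whole days to milliseconds for mapreduce.jobhistory.max-age-ms."""
    return days * 24 * 60 * 60 * 1000


for key, template in DEFAULTS.items():
    print(key, "->", resolve(template, conf))
print("7 days in ms:", days_to_ms(7))
```

For example, keeping roughly six months of history would mean setting the max-age value to `days_to_ms(180)`, keeping in mind that a larger value only protects files that have not already been removed.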