Hi Rohith,
In our application, around 362,738 containers ran successfully before we
encountered this issue. So under userLogs/applicationId/ we had 362,738
directories, each holding the container’s stdout and stderr files. We are not
looking to rotate these stdout and stderr files as mentioned in JIRA YARN-2443.
These logs are of no use after a certain time; we may need them for about a
week in case we have to troubleshoot why a container failed.
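For illustration, an external cleanup of this kind could be sketched as below. This is only a sketch and not a YARN feature: the function name delete_old_container_logs, the seven-day default and the check that directory names start with "container_" are assumptions, and the application log directory is supplied by the caller. It looks at the newest modification time of the directory and the files directly inside it, so a container that is still writing to stdout or stderr is left alone.

```python
import os
import shutil
import time


def delete_old_container_logs(app_log_dir, max_age_days=7, now=None, dry_run=False):
    """Remove container log directories not modified within max_age_days.

    app_log_dir is the per-application directory (userLogs/applicationId in
    this thread). Returns the sorted list of directories selected for removal.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 3600
    selected = []
    for entry in os.scandir(app_log_dir):
        # Assumption: container log directories are named container_*.
        if not entry.is_dir(follow_symlinks=False):
            continue
        if not entry.name.startswith("container_"):
            continue
        # Take the newest mtime of the directory and of its direct children,
        # so a container that is still logging is not treated as old.
        newest = entry.stat(follow_symlinks=False).st_mtime
        for child in os.scandir(entry.path):
            newest = max(newest, child.stat(follow_symlinks=False).st_mtime)
        if newest < cutoff:
            if not dry_run:
                shutil.rmtree(entry.path)
            selected.append(entry.path)
    return sorted(selected)
```

Running it first with dry_run=True only lists what would be removed, which seems a safer way to try it out on a live node.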
Thanks,
Smita
From: Rohith Sharma K S [mailto:[email protected]]
Sent: Monday, April 20, 2015 11:02 AM
To: [email protected]
Subject: RE: how to delete logs automatically from hadoop yarn
That’s an interesting use case!
>>>> let’s say I want to delete container logs which are older than a week or
>>>> so. Is there any configuration to do that?
I don’t think any such configuration exists in YARN currently. I think it
should be possible to handle this through the log4j properties.
Enabling log aggregation, though, can overcome the disk-filling issue. I think
handling of long-running services on YARN is done in Hadoop 2.6 or later (yet
to be released) under JIRA https://issues.apache.org/jira/i#browse/YARN-2443 .
>>> Because of these continuous logs, we are running out of the Linux file
>>> limit, and thereafter containers are not launched because of an exception
>>> while creating the log directory inside the application ID directory
I could not follow how continuous logs cause the Linux resource limit to be
exceeded. How many containers are running in the cluster and per machine? As I
understand it, each container holds one resource for logging.
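To put a number on that, the container directories kept on a node could be counted per application with a small script like the one below. It is a rough sketch: count_container_dirs is a made-up helper name, and log_root stands for whichever directory yarn.nodemanager.log-dirs points to on that machine.

```python
import os


def count_container_dirs(log_root):
    """Map each application directory under log_root to its subdirectory count."""
    counts = {}
    for app in os.scandir(log_root):
        if not app.is_dir(follow_symlinks=False):
            continue
        # Every subdirectory here is assumed to be one container's log directory.
        counts[app.name] = sum(
            1 for c in os.scandir(app.path) if c.is_dir(follow_symlinks=False)
        )
    return counts
```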
Thanks & Regards
Rohith Sharma K S
From: Smita Deshpande [mailto:[email protected]]
Sent: 20 April 2015 10:23
To: [email protected]<mailto:[email protected]>
Subject: RE: how to delete logs automatically from hadoop yarn
Hi Rohith,
Thanks for your solution. The actual problem we are looking at is this: we have
an application that runs indefinitely, so configurations that delete logs
right after the application finishes will not help us.
Because of these continuous logs, we are running out of the Linux file limit,
and thereafter containers are not launched because of an exception while
creating the log directory inside the application ID directory.
During the job execution itself, let’s say I want to delete container logs
which are older than a week or so. Is there any configuration to do that?
Thanks,
Smita
From: Rohith Sharma K S [mailto:[email protected]]
Sent: Monday, April 20, 2015 10:09 AM
To: [email protected]<mailto:[email protected]>
Subject: RE: how to delete logs automatically from hadoop yarn
Hi
With the configuration below, log deletion should be triggered. You can see
from the NM log that deletion has been scheduled with the configured delay
(3600 sec in your case); the line looks like the example below. Maybe you can
check the NM logs for this line, which gives debug information.
“INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler:
Scheduling Log Deletion for application: application_1428298081702_0008, with
delay of 10800 seconds”
But another configuration also affects the deletion task:
“yarn.nodemanager.delete.debug-delay-sec”, whose default value is zero. That
means deletion is triggered immediately. Can you check whether this is
configured?
<property>
<description>
Number of seconds after an application finishes before the nodemanager's
DeletionService will delete the application's localized file directory
and log directory.
To diagnose Yarn application problems, set this property's value large
enough (for example, to 600 = 10 minutes) to permit examination of these
directories. After changing the property's value, you must restart the
nodemanager in order for it to have an effect.
The roots of Yarn applications' work directories is configurable with
the yarn.nodemanager.local-dirs property (see below), and the roots
of the Yarn applications' log directories is configurable with the
yarn.nodemanager.log-dirs property (see also below).
</description>
<name>yarn.nodemanager.delete.debug-delay-sec</name>
<value>0</value>
</property>
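As a quick way to see which of these settings are actually present in a site file, something like the following could be used. It is a minimal sketch: read_properties is a hypothetical helper, it only inspects the XML text it is given (it does not ask the NodeManager for its effective configuration), and a value of None simply means the property is not set in that file, so the default applies.

```python
import xml.etree.ElementTree as ET

# The three properties discussed in this thread.
PROPERTIES = (
    "yarn.log-aggregation-enable",
    "yarn.nodemanager.log.retain-seconds",
    "yarn.nodemanager.delete.debug-delay-sec",
)


def read_properties(xml_text, names=PROPERTIES):
    """Return {name: value or None} for the given property names."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in names:
            found[name] = (prop.findtext("value") or "").strip()
    # None marks a property that is absent from this file.
    return {name: found.get(name) for name in names}
```

The text would typically come from reading yarn-site.xml on the NodeManager host; the location of that file depends on the installation.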
Thanks & Regards
Rohith Sharma K S
From: Sunil Garg [mailto:[email protected]]
Sent: 20 April 2015 09:52
To: [email protected]<mailto:[email protected]>
Subject: how to delete logs automatically from hadoop yarn
How can logs be deleted from Hadoop YARN automatically? I have tried the
following settings, but they are not working.
Is there any other way to do this, or am I doing something wrong?
<property>
<name>yarn.log-aggregation-enable</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.log.retain-seconds</name>
<value>3600</value>
</property>
Thanks
Sunil Garg