Hi Rohith,
In our application, around 362,738 containers ran successfully before we
encountered this issue. So under userLogs/applicationId/ we had 362,738
directories, each holding a container’s stdout and stderr files. We are not
expecting these stdout and stderr files to be rotated as described in JIRA
YARN-2443. The logs are of no use after a certain time; we may need them for
about a week, in case we have to troubleshoot why a container failed.
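
A minimal sketch of one possible workaround, not an existing YARN feature: a
periodic job on each NodeManager host could remove container log directories
that nothing has written to for a week. The function name
prune_container_logs, its parameters and the 7-day default are hypothetical.
It assumes the layout described above (one container_* directory per container
under the application's log directory) and judges age by the newest
modification time found in each directory, so a container that is still
writing stdout or stderr is left alone.

```python
import shutil
import time
from pathlib import Path


def newest_mtime(directory: Path) -> float:
    """Most recent modification time of the directory or anything inside it."""
    times = [directory.stat().st_mtime]
    times.extend(p.stat().st_mtime for p in directory.rglob("*"))
    return max(times)


def prune_container_logs(app_log_dir, max_age_days=7, dry_run=True, now=None):
    """Select (and optionally delete) container log directories older than max_age_days.

    Hypothetical helper: app_log_dir is the per-application log directory,
    e.g. the userLogs/applicationId/ directory mentioned in this thread.
    Returns the list of directories that were selected.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 3600
    selected = []
    for entry in sorted(Path(app_log_dir).iterdir()):
        # Only per-container directories are candidates; skip anything else.
        if not entry.is_dir() or not entry.name.startswith("container_"):
            continue
        # A directory is stale only if nothing in it changed after the cutoff.
        if newest_mtime(entry) < cutoff:
            selected.append(entry)
            if not dry_run:
                shutil.rmtree(entry)
    return selected
```

With dry_run=True (the default) the function only reports what it would
remove, which is a safer first run than deleting straight away.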

Thanks,
Smita

From: Rohith Sharma K S [mailto:[email protected]]
Sent: Monday, April 20, 2015 11:02 AM
To: [email protected]
Subject: RE: how to delete logs automatically from hadoop yarn

That’s an interesting use case!

>>>> let’s say I want to delete container logs which are older than week or so. 
>>>> So is there any configuration to do that?
I don’t think any such configuration exists in YARN currently. I think it
should be possible to handle this through log4j properties.

But by enabling log aggregation, the disk-filling issue can be overcome. I
think handling of logs for long-running services on YARN is done in Hadoop 2.6
or later (yet to be released) under JIRA
https://issues.apache.org/jira/i#browse/YARN-2443 .

>>> Because of these continuous logs, we are running out of Linux file limit 
>>> and thereafter containers are not launched because of exception while 
>>> creating log directory inside application ID directory
I could not understand how continuous logs cause the Linux resource limit to be
exceeded. How many containers are running in the cluster and per machine? As I
understand it, each container holds one resource for logging.


Thanks & Regards
Rohith Sharma K S

From: Smita Deshpande [mailto:[email protected]]
Sent: 20 April 2015 10:23
To: [email protected]<mailto:[email protected]>
Subject: RE: how to delete logs automatically from hadoop yarn

Hi Rohith,
Thanks for your solution. The actual problem we are looking at is this: we have
a lifelong running application, so configurations that delete logs right after
the application finishes will not help us.
Because of these continuous logs, we are running out of the Linux file limit,
and thereafter containers are not launched because of an exception while
creating the log directory inside the application ID directory.
During the job execution itself, let’s say I want to delete container logs
that are older than a week or so. Is there any configuration to do that?

Thanks,
Smita


From: Rohith Sharma K S [mailto:[email protected]]
Sent: Monday, April 20, 2015 10:09 AM
To: [email protected]<mailto:[email protected]>
Subject: RE: how to delete logs automatically from hadoop yarn

Hi

With the configuration below, log deletion should be triggered. You can confirm
from the NM log whether the deletion delay has been set to 3600 sec. Maybe you
can check the NM logs for a line like the one below, which gives debug
information (this sample shows a delay of 10800 seconds).
“INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler:
 Scheduling Log Deletion for application: application_1428298081702_0008, with 
delay of 10800 seconds”

But another configuration that affects the deletion task is
“yarn.nodemanager.delete.debug-delay-sec”, whose default value is zero, meaning
deletion is triggered immediately. Can you check whether this is configured?
  <property>
    <description>
      Number of seconds after an application finishes before the nodemanager's
      DeletionService will delete the application's localized file directory
      and log directory.

      To diagnose Yarn application problems, set this property's value large
      enough (for example, to 600 = 10 minutes) to permit examination of these
      directories. After changing the property's value, you must restart the
      nodemanager in order for it to have an effect.

      The roots of Yarn applications' work directories is configurable with
      the yarn.nodemanager.local-dirs property (see below), and the roots
      of the Yarn applications' log directories is configurable with the
      yarn.nodemanager.log-dirs property (see also below).
    </description>
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>0</value>
  </property>
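
To check which of these settings a given yarn-site.xml actually sets, a small
sketch like the following may help. The helper read_yarn_properties is
hypothetical, and the SAMPLE text simply reuses the two settings quoted in this
thread. It only reads the XML it is given and does not query the running
NodeManager, so a property it does not find falls back to whatever default the
NodeManager uses.

```python
import xml.etree.ElementTree as ET


def read_yarn_properties(xml_text, names):
    """Return {name: value} for the requested properties found in the XML text."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in names:
            found[name] = (prop.findtext("value") or "").strip()
    return found


# Sample input: the two settings quoted in this thread, wrapped in the
# <configuration> root element that Hadoop site files use.
SAMPLE = """<configuration>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.log.retain-seconds</name>
    <value>3600</value>
  </property>
</configuration>"""

# Properties discussed in this thread that influence local log deletion.
WANTED = {
    "yarn.log-aggregation-enable",
    "yarn.nodemanager.log.retain-seconds",
    "yarn.nodemanager.delete.debug-delay-sec",
}

if __name__ == "__main__":
    settings = read_yarn_properties(SAMPLE, WANTED)
    for name in sorted(WANTED):
        print(name, "=", settings.get(name, "(not set, default applies)"))
```

To inspect a real file, pass its contents instead of SAMPLE, for example the
text read from the yarn-site.xml on the NodeManager host.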


Thanks & Regards
Rohith Sharma K S
From: Sunil Garg [mailto:[email protected]]
Sent: 20 April 2015 09:52
To: [email protected]<mailto:[email protected]>
Subject: how to delete logs automatically from hadoop yarn


How can I delete logs from Hadoop YARN automatically? I have tried the
following settings, but they are not working.
Is there any other way to do this, or am I doing something wrong?

<property>
<name>yarn.log-aggregation-enable</name>
<value>false</value>
</property>

<property>
<name>yarn.nodemanager.log.retain-seconds</name>
<value>3600</value>
</property>

Thanks
Sunil Garg
