[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhang Wei updated MAPREDUCE-6283:
---------------------------------
    Assignee: Varun Saxena  (was: Zhang Wei)

> MRHistoryServer log files management optimization
> -------------------------------------------------
>
>                 Key: MAPREDUCE-6283
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6283
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobhistoryserver
>            Reporter: Zhang Wei
>            Assignee: Varun Saxena
>            Priority: Minor
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> In some compute-heavy clusters, users may continually submit large numbers
> of jobs; in our scenario, there are 240k jobs per day. On average, 5 nodes
> participate in running a job. All of these jobs' log files are aggregated on
> HDFS, which is a heavy load for the NameNode. The total number of log files
> generated in the default cleaning period (1 week) can be calculated as follows:
> AM logs per week: 7 days * 240,000 jobs/day * 2 files/job = 3,360,000 files
> App logs per week: 7 days * 240,000 jobs/day * 5 nodes/job * 1 file/node =
> 8,400,000 files
> In total, more than 10 million log files (11,760,000) are generated in one
> week. Even worse, some environments have to keep the logs longer in order to
> track potential issues. In general, these small log files occupy about 12 GB
> of NameNode heap and slow down the NameNode's responses.
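
The weekly estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the description; the function and parameter names are illustrative and do not correspond to any Hadoop API.

```python
def weekly_log_files(days=7, jobs_per_day=240_000,
                     am_files_per_job=2, nodes_per_job=5,
                     app_files_per_node=1):
    """Return (am_files, app_files, total) for one cleaning period.

    Defaults are the figures quoted in the issue description; the
    parameter names themselves are hypothetical.
    """
    # AM logs: a fixed number of files per job.
    am_files = days * jobs_per_day * am_files_per_job
    # App logs: one aggregated file per participating node.
    app_files = days * jobs_per_day * nodes_per_job * app_files_per_node
    return am_files, app_files, am_files + app_files


if __name__ == "__main__":
    am, app, total = weekly_log_files()
    print(f"AM logs per week:  {am:,}")
    print(f"App logs per week: {app:,}")
    print(f"Total per week:    {total:,}")
```

Changing `days` shows how quickly the count grows for clusters that keep logs longer than the default week.
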
> To optimize log management in the history server, the main goals are:
> 1)    Reduce the total number of files in HDFS.
> 2)    Remain compatible with existing history server operations.
> Based on the goals above, the detailed requirements are:
> 1)    Periodically merge log files into larger ones in HDFS.
> 2)    The optimized design should inherit from the original architecture so
> that merged logs can be browsed transparently.
> 3)    Merged logs should be aged out periodically, just like ordinary logs.
> The whole life cycle of the AM logs:
> 1. Created by the Application Master in intermediate-done-dir.
> 2. Moved to done-dir after the job is done.
> 3. Archived to archived-dir periodically.
> 4. Cleaned when all the logs in the harball are expired.
> The whole life cycle of the App logs:
> 1. Created by applications in local-dirs.
> 2. Aggregated to remote-app-log-dir after the job is done.
> 3. Archived to archived-dir periodically.
> 4. Cleaned when all the logs in the harball are expired.
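
Step 4 of both life cycles implies that an archive can only be removed once every log inside it has passed the retention period. A minimal Python sketch of that rule follows; the `ArchivedLog` and `Harball` types, the `is_cleanable` helper, and the use of job finish time as the expiry reference are all assumptions made for illustration, not part of the proposal or of any Hadoop API.

```python
from dataclasses import dataclass
from typing import List

ONE_WEEK_SECONDS = 7 * 24 * 3600  # default cleaning period from the description


@dataclass
class ArchivedLog:
    name: str
    finish_time: float  # job finish time, seconds since epoch (assumed reference)


@dataclass
class Harball:
    path: str
    logs: List[ArchivedLog]


def is_cleanable(harball: Harball, now: float,
                 retention_seconds: float = ONE_WEEK_SECONDS) -> bool:
    """True only if every log in the archive is older than the retention period.

    Note: an archive with no logs is reported as cleanable, because
    all() over an empty sequence is True.
    """
    return all(now - log.finish_time > retention_seconds
               for log in harball.logs)
```

One consequence of this rule is that a single recent log keeps the whole archive alive, so how logs are grouped into archives (for example, by time window) determines how promptly space is reclaimed.
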



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
