[
https://issues.apache.org/jira/browse/MAPREDUCE-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhang Wei updated MAPREDUCE-6283:
---------------------------------
Description:
In some heavy computation clusters there can be an HDFS small-files problem: the continually
submitted MR jobs create millions of log files (including the Application Master logs and the
application logs that the NodeManagers aggregate to HDFS). This optimization design reduces the
number of log files by merging them into bigger ones.
MR-HistoryServer Optimization
1. Background
Running one MR job outputs two types of logs: the Application Master (AM) logs
and the Application logs.
AM Logs
AM (Application Master) logs are generated by the MR Application Master and written
directly to HDFS. They record the start time and running time of the job, and the
start time, running time and Counter values of all its tasks. These logs are managed
and analyzed by the MR History Server, and shown in the History Server job list page
and job details page.
Each job generates three log files in the intermediate-done-dir; a timed task later
moves the “.jhist” and “.xml” files to the final done-dir and deletes the “.summary”
file. So the total number of AM log files in HDFS is twice the number of jobs.
Application Logs
Application logs are generated by the applications that run in NodeManager containers.
By default, Application logs are stored only on local disks. With YARN log aggregation
enabled, these logs are aggregated to the remote-app-log-dir in HDFS and deleted from
the local disks.
During aggregation, all the logs on the same host are merged into one log file named
after the host name and NodeManager port. So the total number of Application log files
in HDFS equals the number of NodeManagers that participated in running the job.
2. Problem Scenario
Take a heavy computation cluster as an example: in a cluster with hundreds of computing
nodes, more than 240,000 MR jobs are submitted and run per day, and on average 5 nodes
participate in running each job. The total number of log files generated within the
default cleaning period (1 week) can be calculated as follows:
AM logs per week: 7 days * 240,000 jobs/day * 2 files/job = 3,360,000 files
App logs per week: 7 days * 240,000 jobs/day * 5 nodes/job * 1 file/node = 8,400,000 files
So more than 10 million log files are generated in one week. Even worse, some environments
have to keep the logs for a longer time to track potential issues. Overall, these small log
files occupy about 12 GB of NameNode heap and degrade NameNode response time.
3. Design Goals & Demands
For optimizing the log management of the History Server, the main goals are:
1) Reduce the total count of files in HDFS.
2) Make the fewest source code changes needed to reach goal 1.
3) Stay compatible with existing History Server operation.
From the goals above, we can derive the detailed requirements as follows:
1) Merge log files into bigger ones in HDFS periodically.
2) The optimized design should inherit the original architecture so that merged logs
can be browsed transparently.
3) Merged logs should be aged out periodically, just like ordinary logs.
4. Overall Design
Both AM logs and aggregated App logs are stored in HDFS, and the logs are read many
times but never edited. As a distributed file system, HDFS is not intended for storing
a huge number of small files. Hadoop Archives is one solution to the small-files problem:
periodically archiving the logs into large files and deleting the original small log
files rapidly reduces the file count. For the problem scenario above, if the archive
task runs every 24 hours, fewer than one thousand new files (blocks) are added per day,
because each archive contributes only its index and part files to the NameNode,
regardless of how many small logs it contains.
Archive log files
The archive task is triggered by a Timer, every 24h by default (configurable). If there
are more than 1000 (configurable) jobs in the aggregated dir, the archive task archives
them into large files and then deletes the original files.
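For illustration, here is a minimal sketch of how such an archive task might drive the
Hadoop Archives tool (org.apache.hadoop.tools.HadoopArchives) programmatically. The
archive name and the source/destination paths are made-up placeholders; a real
implementation would derive them from the configured directories.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.tools.HadoopArchives;
import org.apache.hadoop.util.ToolRunner;

// Sketch only: hand a batch of log files to the existing Hadoop Archives tool.
// All paths and the archive name below are placeholders.
public class LogArchiveRunner {
  public static void archiveBatch(Configuration conf) throws Exception {
    String[] args = {
        "-archiveName", "history-logs-001.har",   // placeholder archive name
        "-p", "/mr-history/done",                  // parent of the files to archive
        "2015/03/20",                              // source path, relative to the parent
        "/mr-history/archived"                     // destination dir for the .har
    };
    // Runs the archiving MR job; a non-zero return code means it failed.
    int rc = ToolRunner.run(conf, new HadoopArchives(conf), args);
    if (rc != 0) {
      throw new RuntimeException("Archiving MR job failed, rc=" + rc);
    }
  }
}
{code}
This corresponds to the CLI form "hadoop archive -archiveName history-logs-001.har
-p /mr-history/done 2015/03/20 /mr-history/archived".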
Browse archived logs
Files in Hadoop Archive format can be accessed transparently through the HDFS API, so
the History Server can read archived log files compatibly with only a small amount of
core code updated. Users can browse archived logs in the front-end pages as before.
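A minimal sketch of such transparent access, assuming a made-up archive and job file
name; only the URI scheme differs from a plain HDFS read.
{code:java}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: read a .jhist file that lives inside a Hadoop Archive.
// The archive and job names below are illustrative placeholders.
public class HarReadExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path jhist = new Path(
        "har:///mr-history/archived/history-logs-001.har/job_1426000000000_0001.jhist");
    FileSystem fs = jhist.getFileSystem(conf);    // resolves to the har file system
    try (BufferedReader in =
             new BufferedReader(new InputStreamReader(fs.open(jhist)))) {
      System.out.println(in.readLine());          // first line of the archived file
    }
  }
}
{code}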
Clean archived logs
Archived logs are deleted by the History Server cleaning task. The cleaning task is
triggered by a Timer, every 24h by default (configurable). If all of the log files in
an archive package have expired (the expiration time is configurable), the task deletes
that package immediately.
5. Detailed Design
5.1 AM logs
5.1.1 Archive AM logs
1. The archiving thread is periodically invoked by a Scheduler as per the
configuration of $mapreduce.jobhistory.archive.interval-ms.
2. The archiving thread checks the $mapreduce.jobhistory.done-dir to count
the jobs that have been done.
3. Check the scope of any existing archiving thread to prevent duplicate archiving.
4. If the count of jobs is less than $mapreduce.jobhistory.archive.jobs.minimum,
the archiving thread exits. Else go to step 5.
5. The archiving thread starts an archiving MR job to archive the logs into
$mapreduce.jobhistory.archive.archive-dir.
6. If the archiving MR job finished successfully, the archiving thread will
a) Update the buffers of the History File Manager (Job List Cache and Serial
Number Index) for the archived logs.
b) Delete the logs that have been archived from $mapreduce.jobhistory.done-dir.
7. Otherwise, try to delete the har files that may have been created (a sketch of
this flow follows the list).
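A minimal sketch of steps 2 to 7, assuming the proposed property names above;
countDoneJobs(), runArchiveJob(), updateHistoryFileManager(), deleteArchivedLogs()
and cleanupPartialHar() are hypothetical helpers, not existing History Server code.
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

// Sketch only: main loop of the AM-log archiving thread (steps 2-7).
// Property names come from this proposal; the helpers are hypothetical.
public class AmLogArchiveTask implements Runnable {
  private final Configuration conf;

  public AmLogArchiveTask(Configuration conf) { this.conf = conf; }

  @Override
  public void run() {
    try {
      Path doneDir = new Path(conf.get("mapreduce.jobhistory.done-dir"));
      Path archiveDir = new Path(conf.get("mapreduce.jobhistory.archive.archive-dir"));
      int minimum = conf.getInt("mapreduce.jobhistory.archive.jobs.minimum", 1000);

      int doneJobs = countDoneJobs(doneDir);      // step 2: count finished jobs
      if (doneJobs < minimum) {                   // step 4: below the threshold, exit
        return;
      }
      if (runArchiveJob(doneDir, archiveDir)) {   // step 5: run the archiving MR job
        updateHistoryFileManager(archiveDir);     // step 6a: refresh caches and indexes
        deleteArchivedLogs(doneDir);              // step 6b: remove the original logs
      } else {
        cleanupPartialHar(archiveDir);            // step 7: drop partial .har output
      }
    } catch (IOException e) {
      // Log and give up; the Scheduler will invoke the task again next interval.
    }
  }

  private int countDoneJobs(Path doneDir) throws IOException { return 0; }
  private boolean runArchiveJob(Path src, Path dst) throws IOException { return true; }
  private void updateHistoryFileManager(Path archiveDir) { }
  private void deleteArchivedLogs(Path doneDir) throws IOException { }
  private void cleanupPartialHar(Path archiveDir) throws IOException { }
}
{code}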
5.1.2 Browse AM logs
The browsing module involves two parts:
initExisting: Fill the buffers
① Index the History-Logs Done Dir with Serial Numbers. This is the original
business logic.
② Fill the Job List Cache from the History-Logs Done Dir. This is the original
business logic.
③ If the Serial Number Index holds fewer entries than maxSize (200,000 by default),
continue to index the History-Logs Archived Dir with Serial Numbers.
④ If the Job List Cache is not yet full (20,000 entries by default), continue to
fill the cache from the History-Logs Archived Dir.
getFileInfo: Get job summary info (a sketch follows this list)
a) Get the job info from the JobListCache. This is the original business logic.
b) If the specified job id is not hit in the cache, search the Serial Number Index
for its path and get the job info from the History-Logs-Done-Dir.
c) If the file is still not hit in the History-Logs-Done-Dir, scan the
History-Logs-Archived-Dir.
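A minimal sketch of the getFileInfo lookup order, written inside a hypothetical
HistoryFileManager-like class; the types and helper methods are stand-ins for the
History Server's own.
{code:java}
// Sketch only: lookup order for getFileInfo, using hypothetical stand-ins
// (jobListCache, scanDoneDirForJob, scanArchivedDirForJob) for the real code.
public HistoryFileInfo getFileInfo(JobId jobId) {
  HistoryFileInfo info = jobListCache.get(jobId);   // (a) Job List Cache
  if (info != null) {
    return info;
  }
  info = scanDoneDirForJob(jobId);                   // (b) Serial Number Index + done-dir
  if (info != null) {
    return info;
  }
  return scanArchivedDirForJob(jobId);               // (c) fall back to the archived dir
}
{code}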
5.1.3 Clean AM logs
1. After the cleaning task has finished cleaning the $mapreduce.jobhistory.done-dir,
it continues to scan the $mapreduce.jobhistory.archive.archive-dir to find the
expired logs.
2. If even one log in a har file has not expired, keep that har file for browsing
and continue scanning the other har files.
3. If all the logs in a har file have expired, try to delete that har file (a sketch
of the deletion loop follows this list).
4. If deleting failed:
a) If the retry count is less than the maximum, sleep for some milliseconds and
try again.
b) If the retry count is greater than the maximum, break the loop and continue
scanning the other har files.
5. If deleting succeeded, update the deleted-job info buffer and continue scanning
the other har files.
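A minimal sketch of the deletion retry loop for one expired har file (steps 3 to 5);
the retry limits and deleteJobInfoFromBuffer() are assumptions, not existing code.
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: delete one fully-expired har file with a bounded retry loop.
public class HarCleaner {
  private final int maxRetries = 3;          // assumed maximum number of attempts
  private final long retrySleepMs = 1000L;   // assumed back-off between attempts

  void cleanExpiredHar(FileSystem fs, Path harPath) throws InterruptedException {
    for (int attempt = 0; attempt < maxRetries; attempt++) {
      try {
        if (fs.delete(harPath, true)) {      // recursive delete of the .har directory
          deleteJobInfoFromBuffer(harPath);  // step 5: update the deleted-job buffer
          return;
        }
      } catch (IOException e) {
        // Deletion failed; fall through to the back-off below (step 4a).
      }
      Thread.sleep(retrySleepMs);
    }
    // Step 4b: retries exhausted, skip this har file; the caller scans the next one.
  }

  private void deleteJobInfoFromBuffer(Path harPath) { /* hypothetical helper */ }
}
{code}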
5.2 App logs
5.2.1 Archive App logs
1) The archiving thread is periodically invoked by a Timer as per the configuration
of $yarn.log-aggregation.archive.interval-ms (a sketch of the scheduling is given
after the list).
2) The archiving thread checks the $yarn.nodemanager.remote-app-log-dir to count
the logs that have been aggregated.
3) Check the scope of any existing archiving thread to prevent duplicate archiving.
4) If the count of logs is less than $yarn.log-aggregation.archive.jobs.minimum,
the archiving thread exits. Else go to step 5.
5) The archiving thread starts an archiving MR job to archive the logs into
$yarn.log-aggregation.archive.archive-dir.
6) If the archiving MR job finished successfully, the archiving thread deletes the
original logs in $yarn.nodemanager.remote-app-log-dir; otherwise it tries to delete
the har files that may have been created.
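A minimal sketch of the Timer-based scheduling for the App-log archiver, using the
interval key proposed above; the archive task itself (steps 2 to 6) is passed in as a
Runnable and is not shown here.
{code:java}
import java.util.Timer;
import java.util.TimerTask;
import org.apache.hadoop.conf.Configuration;

// Sketch only: schedule the App-log archiving task on a daemon Timer thread.
public class AppLogArchiveScheduler {
  public static Timer start(Configuration conf, final Runnable archiveTask) {
    long intervalMs = conf.getLong(
        "yarn.log-aggregation.archive.interval-ms",   // key proposed in this design
        24L * 60 * 60 * 1000);                        // default: every 24 hours
    Timer timer = new Timer("app-log-archiver", true);
    timer.scheduleAtFixedRate(new TimerTask() {
      @Override
      public void run() {
        archiveTask.run();   // steps 2-6: count aggregated logs, archive, clean up
      }
    }, intervalMs, intervalMs);
    return timer;
  }
}
{code}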
5.2.2 Browse App logs
The web page requirement is handled as follows:
① Build the log file path from the job id. If the path exists, follow the original
business logic; else, go to ②.
② Scan the archived log dir for the job id and build the path with the “har://”
prefix, then follow the original business logic (a sketch of this fallback follows).
The CLI requirement is handled in the same way as the web page requirement.
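A minimal sketch of this path fallback; the findHarForApp() lookup and the directory
layout used below are assumptions for illustration only.
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: resolve an application's aggregated-log path, falling back to a
// "har://" path when the plain path no longer exists.
public class AppLogPathResolver {
  public static Path resolve(Configuration conf, String user, String appId)
      throws IOException {
    Path remoteLogDir = new Path(
        conf.get("yarn.nodemanager.remote-app-log-dir", "/tmp/logs"));
    Path plain = new Path(remoteLogDir, user + "/logs/" + appId);   // step 1: plain path
    FileSystem fs = plain.getFileSystem(conf);
    if (fs.exists(plain)) {
      return plain;                     // original business logic applies unchanged
    }
    // Step 2: locate the archive that holds this application and rebuild the
    // path with the "har://" prefix so the same read code works unchanged.
    Path har = findHarForApp(conf, appId);
    return new Path("har://" + har.toUri().getPath() + "/" + appId);
  }

  private static Path findHarForApp(Configuration conf, String appId) {
    return new Path("/app-logs/archived/app-logs-001.har");   // hypothetical lookup
  }
}
{code}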
5.2.3 Clean App logs
1. After the cleaning task has finished cleaning the $yarn.nodemanager.remote-app-log-dir,
it continues to scan the $yarn.log-aggregation.archive.archive-dir to find the
expired logs.
2. If even one log in a har file has not expired, keep that har file for browsing
and continue scanning the other har files.
3. If all the logs in a har file have expired, try to delete that har file.
4. If deleting failed:
a) If the retry count is less than the maximum, sleep for some milliseconds and
try again.
b) If the retry count is greater than the maximum, break the loop and continue
scanning the other har files.
5. If deleting succeeded, continue scanning the other har files until all the har
files have been scanned.
was:
In some heavy computation clusters there can be an HDFS small-files problem: the continually
submitted MR jobs create millions of log files (including the Application Master logs and the
application logs that the NodeManagers aggregate to HDFS). This optimization design reduces the
number of log files by merging them into bigger ones.
Details are in the design doc, which I will upload later.
> MRHistoryServer log files management optimization
> -------------------------------------------------
>
> Key: MAPREDUCE-6283
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6283
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: jobhistoryserver
> Reporter: Zhang Wei
> Assignee: Zhang Wei
> Priority: Minor
> Original Estimate: 2,016h
> Remaining Estimate: 2,016h
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)