[
https://issues.apache.org/jira/browse/MAPREDUCE-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinod K V updated MAPREDUCE-1100:
---------------------------------
Attachment: MAPREDUCE-1100-20091102.txt
Attaching a first patch.
Introducing the following configuration items:
- Job Configuration:
-- {{JobContext.MAP_USERLOG_LIMIT}} : Per-task limit on how large each log
file can grow. Used by {{killRunningTasksOverLimit()}} for killing tasks
that log excessively.
-- {{JobContext.REDUCE_USERLOG_LIMIT}} : Same as above for reduces.
-- {{JobContext.MAP_USERLOG_RETAIN_SIZE}} : Per-task configuration of how
much of the tail of each log file is to be retained. Each task-log file is
truncated to this amount after the task finishes. Used by
{{truncateLogsOfFinishedTasks()}}.
-- {{JobContext.REDUCE_USERLOG_RETAIN_SIZE}} : Same as above for reduces.
- TT configuration
-- {{TTConfig.TT_USERLOG_RETAIN_HOURS}} : TT configuration of how long the logs
of each finished task have to be retained on this TT. Used by
{{retireOldLogs()}} to clean up very old logs.
-- {{TTConfig.TT_USERLOG_CUMULATIVE_LIMIT}} : TT configuration limiting the
total usage of log files across all tasks. If the total usage grows beyond this
limit, {{removeOldFilesToControlCumulativeUsage()}} removes old log files
irrespective of their age w.r.t {{TTConfig.TT_USERLOG_RETAIN_HOURS}}.
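To make the two per-task settings concrete, here is a minimal sketch of the checks they imply. The class and method names ({{UserLogLimits}}, {{isOverLimit}}, {{tailStartOffset}}) are hypothetical and not taken from the patch. Treating a non-positive limit as "no limit" and a negative retain size as "keep everything" are assumptions made only for this illustration.

```java
// Hypothetical helper for illustration only; not part of the attached patch.
final class UserLogLimits {
    private UserLogLimits() {}

    /**
     * True when a running task's log file has grown past the per-file limit.
     * Assumption: a non-positive limit means the limit is disabled.
     */
    static boolean isOverLimit(long fileSizeBytes, long limitBytes) {
        return limitBytes > 0 && fileSizeBytes > limitBytes;
    }

    /**
     * Byte offset from which the tail of a finished task's log would be kept.
     * Everything before this offset would be dropped by truncation.
     * Assumption: a negative retain size means the whole file is kept.
     */
    static long tailStartOffset(long fileSizeBytes, long retainBytes) {
        if (retainBytes < 0) {
            return 0L;
        }
        // A file smaller than the retain size is kept in full (offset 0).
        return Math.max(0L, fileSizeBytes - retainBytes);
    }
}
```

For example, under these assumptions a 5000-byte log with a 1000-byte retain size would keep the bytes from offset 4000 onwards, while a 500-byte log would be left untouched.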
Moved clean-up of task-logs from child into TaskLogsMonitor which does the
following:
{code}
while (true) {
  // remove very old logs
  retireOldLogs();
  // truncate finished tasks' logs; also set non-writable permissions
  truncateLogsOfFinishedTasks();
  // kill tasks going over the per-task per-file limit
  killRunningTasksOverLimit();
  // remove very old logs if total usage is alarming, irrespective of retain.hours
  removeOldFilesToControlCumulativeUsage();
}
{code}
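The selection logic behind the first and last steps of the loop can be sketched on an in-memory model as below. This is only an illustration, not the code in the patch: {{LogCleanupSketch}}, {{LogFile}} and both method names are hypothetical. Using the last-modified time to judge age, and removing oldest files first under the cumulative limit, are also assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory model for illustration; not part of the attached patch.
final class LogCleanupSketch {
    private LogCleanupSketch() {}

    /** Minimal stand-in for one finished task's log file. */
    static final class LogFile {
        final String name;
        final long lastModifiedMillis;
        final long sizeBytes;

        LogFile(String name, long lastModifiedMillis, long sizeBytes) {
            this.name = name;
            this.lastModifiedMillis = lastModifiedMillis;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Logs last modified more than retainHours before nowMillis. */
    static List<LogFile> selectExpired(List<LogFile> logs, long nowMillis, long retainHours) {
        long cutoff = nowMillis - retainHours * 60L * 60L * 1000L;
        List<LogFile> expired = new ArrayList<>();
        for (LogFile f : logs) {
            if (f.lastModifiedMillis < cutoff) {
                expired.add(f);
            }
        }
        return expired;
    }

    /**
     * Logs to remove, oldest first, until the remaining total size is at or
     * below limitBytes. Age relative to the retain-hours setting is ignored.
     */
    static List<LogFile> selectForCumulativeLimit(List<LogFile> logs, long limitBytes) {
        long total = 0L;
        for (LogFile f : logs) {
            total += f.sizeBytes;
        }
        List<LogFile> oldestFirst = new ArrayList<>(logs);
        oldestFirst.sort(Comparator.comparingLong((LogFile f) -> f.lastModifiedMillis));

        List<LogFile> toRemove = new ArrayList<>();
        for (LogFile f : oldestFirst) {
            if (total <= limitBytes) {
                break;
            }
            toRemove.add(f);
            total -= f.sizeBytes;
        }
        return toRemove;
    }
}
```

Both methods only decide which files are candidates. Actually deleting them, and the truncation and task-killing steps, are left out of this sketch.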
> User's task-logs filling up local disks on the TaskTrackers
> -----------------------------------------------------------
>
> Key: MAPREDUCE-1100
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1100
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: tasktracker
> Affects Versions: 0.21.0
> Reporter: Vinod K V
> Assignee: Vinod K V
> Attachments: MAPREDUCE-1100-20091102.txt
>
>
> Some users' jobs are filling up TT disks with excessive logging.
> mapreduce.task.userlog.limit.kb is not enabled on the cluster. Disks are
> getting filled up before task-log cleanup via
> mapred.task.userlog.retain.hours can kick in.