[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855720#action_12855720
 ] 

Guilin Sun commented on MAPREDUCE-1648:
---------------------------------------

Thanks for the comment!

bq. From the first look of the patch, it seems to me that this behaviour is 
completely gone and the jvm's output and error are now being redirected to 
/dev/null

The default stdout/stderr are replaced at line 13: we use Log4jOutputStream 
instead of the default streams, so "/dev/null" receives nothing unless log4j 
initialization fails. I put "/dev/null" there to prevent the child JVM from 
writing anything to the real stdout/stderr. (log4j/contrib/ provides a 
LoggingOutputStream implementation that does the same thing.)
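
For illustration, the redirection described above might look roughly like the 
sketch below. This is not the code from syslog.patch: LineSinkOutputStream and 
redirectStdStreams are hypothetical names, and a plain Consumer<String> stands 
in for the log4j logger so that the example runs with only the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.function.Consumer;

/**
 * Hypothetical stand-in for a Log4jOutputStream-style class: buffers bytes and
 * hands each completed line to a sink (with log4j, the sink would be a logger call).
 */
class LineSinkOutputStream extends OutputStream {
    private final Consumer<String> sink;
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

    LineSinkOutputStream(Consumer<String> sink) {
        this.sink = sink;
    }

    @Override
    public synchronized void write(int b) {
        if (b == '\n') {
            emit();                 // a full line is available: forward it now
        } else if (b != '\r') {
            buf.write(b);           // keep collecting the current line
        }
    }

    @Override
    public synchronized void close() {
        if (buf.size() > 0) {
            emit();                 // forward a trailing line without a newline
        }
    }

    private void emit() {
        // Decodes with the platform default charset, matching PrintStream's default.
        sink.accept(buf.toString());
        buf.reset();
    }

    /** Replaces System.out and System.err so nothing reaches the real streams. */
    static void redirectStdStreams(Consumer<String> outSink, Consumer<String> errSink) {
        System.setOut(new PrintStream(new LineSinkOutputStream(outSink), true));
        System.setErr(new PrintStream(new LineSinkOutputStream(errSink), true));
    }
}
```

Once the streams are replaced this way, System.out.println(...) in task code 
ends up in the sink line by line, without any change to the calling code.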

bq. Even if we fix the above, I am not really sure this will solve the overall 
feature of limiting logs simply because simply switching/rotating log files in 
the jvm process will not do the same for the streaming/pipes tasks

Streaming uses PipeMapper/PipeReducer, and their stderr is captured by the 
child JVM and then written to the child JVM's stderr. Because we replaced the 
child JVM's default stdout and stderr, this works well with streaming. Pipes 
has not been tested yet.
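
A minimal sketch of why this carries over to streaming, assuming the child JVM 
simply copies the external process's stderr to its own System.err. The class 
StderrPumpSketch and the method pump are hypothetical and are not taken from 
the streaming source.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.UncheckedIOException;

class StderrPumpSketch {
    /** Copies every line read from the external process's stderr to the target stream. */
    static void pump(InputStream childStderr, PrintStream target) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(childStderr))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // If target is the replaced System.err, the line goes to the log file.
                target.println(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a process started through ProcessBuilder, this could be called as 
pump(process.getErrorStream(), System.err); whatever System.err points to at 
that moment receives the output.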

This issue targets the problems with 'tail -c' and the old "TaskLogAppender" 
(in effect another version of "tail -c"). The main benefits of this patch are:
# No delay, so logs are not lost when the child exits abnormally.
# Tasks are stopped from producing overly large logs while they run, rather 
than having their logs truncated only after the task finishes.
# Because of log4j, we can change the log directory/file when starting a new 
task (for JVM reuse), so it is easy to control log size per task.
# stdout/stderr are redirected by the child JVM, so no client source code 
needs to change (including streaming).
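
To make the size-limiting point concrete, here is a rough JDK-only sketch of 
the kind of rollover that RollingFileAppender's MaxFileSize and MaxBackupIndex 
settings provide. It is a simplified stand-in, not log4j's implementation; the 
class RollingLogSketch and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

class RollingLogSketch {
    private final Path file;
    private final long maxFileSize;    // roll once the file reaches this many bytes
    private final int maxBackupIndex;  // number of rolled files to keep

    RollingLogSketch(Path file, long maxFileSize, int maxBackupIndex) {
        this.file = file;
        this.maxFileSize = maxFileSize;
        this.maxBackupIndex = maxBackupIndex;
    }

    /** Writes the line to disk immediately, then rolls if the cap was reached. */
    void append(String line) {
        try {
            Files.write(file, (line + "\n").getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            if (Files.size(file) >= maxFileSize) {
                rollOver();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private void rollOver() throws IOException {
        if (maxBackupIndex <= 0) {
            Files.delete(file);        // no backups kept: start over with an empty file
            return;
        }
        // Shift file.(N-1) -> file.N, ..., file.1 -> file.2; the oldest is overwritten.
        for (int i = maxBackupIndex - 1; i >= 1; i--) {
            Path src = backup(i);
            if (Files.exists(src)) {
                Files.move(src, backup(i + 1), StandardCopyOption.REPLACE_EXISTING);
            }
        }
        Files.move(file, backup(1), StandardCopyOption.REPLACE_EXISTING);
    }

    private Path backup(int index) {
        return file.resolveSibling(file.getFileName() + "." + index);
    }
}
```

In this sketch each line is on disk as soon as append returns, and at most 
maxBackupIndex + 1 files exist at any time, each about maxFileSize bytes plus 
one line, so the total stays bounded while the task is still running.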





> Use RollingFileAppender to limit tasklogs
> -----------------------------------------
>
>                 Key: MAPREDUCE-1648
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1648
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>            Reporter: Guilin Sun
>            Priority: Minor
>         Attachments: syslog.patch
>
>
> There are at least two types of task logs: syslog and stdlog.
> The task JVM writes syslog through log4j with TaskLogAppender. TaskLogAppender 
> behaves much like "tail -c": it keeps the last N bytes/lines of log output in 
> memory (via a queue) and only writes them out once all logs are committed and 
> the appender is about to close.
> The common problem with TaskLogAppender and 'tail -c' is that everything is 
> kept in memory and the user can't see any log output while the task is in 
> progress.
> So I'm going to try RollingFileAppender instead of TaskLogAppender, using 
> MaxFileSize and MaxBackupIndex to limit the log file size.
> RollingFileAppender is also suitable for stdout/stderr: just redirect 
> stdout/stderr to log4j via LoggingOutputStream. No client code has to be 
> changed, and RollingFileAppender seems better than 'tail -c' too.
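
As a rough illustration of the buffering behaviour described in the issue 
text, the sketch below keeps only the last N events in an in-memory queue and 
writes them out on close. It is a simplified model, not the actual 
TaskLogAppender source; TailBufferSketch and its methods are hypothetical 
names, and a list stands in for the log file.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TailBufferSketch {
    private final int maxEvents;
    private final Deque<String> tail = new ArrayDeque<>();
    private final List<String> output = new ArrayList<>(); // stands in for the log file

    TailBufferSketch(int maxEvents) {
        this.maxEvents = maxEvents;
    }

    /** Buffers the event in memory; nothing is written out yet. */
    void append(String event) {
        if (maxEvents <= 0) {
            return;                    // nothing is retained with a zero-sized tail
        }
        if (tail.size() >= maxEvents) {
            tail.removeFirst();        // drop the oldest event to make room
        }
        tail.addLast(event);
    }

    /** Only at close time do the retained events reach the output. */
    void close() {
        output.addAll(tail);
        tail.clear();
    }

    List<String> visibleOutput() {
        return output;
    }
}
```

Until close is called, visibleOutput() stays empty, which models both problems 
named above: the user sees nothing while the task runs, and if the process 
dies before close, the buffered events are gone.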

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
