[ https://issues.apache.org/jira/browse/MAPREDUCE-6778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Balitsky updated MAPREDUCE-6778:
------------------------------------------
    Description: 
We can run a job that writes a huge amount of stdout/stderr, causing undesired 
consequences such as filling the local disks that hold the container logs.

A possible solution is to redirect stdout and stderr to log4j in the 
YarnChild.java main method.
In this case the System.out and System.err streams would be redirected to log4j 
loggers whose appenders write to the stdout and stderr files with the required 
size limit. This lets us cap log sizes on the fly while keeping one rolled-over 
backup file (thanks to ContainerRollingLogAppender).
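As a minimal sketch of the mechanism (this is not the attached patch: the 
StreamRedirector and LoggingOutputStream names are illustrative, and a stock 
log4j 1.x RollingFileAppender stands in for ContainerRollingLogAppender):

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;
import org.apache.log4j.RollingFileAppender;

public class StreamRedirector {

  /** Bridges an OutputStream to a log4j Logger, emitting one event per line. */
  static class LoggingOutputStream extends OutputStream {
    private final Logger logger;
    private final Level level;
    private final ByteArrayOutputStream line = new ByteArrayOutputStream();

    LoggingOutputStream(Logger logger, Level level) {
      this.logger = logger;
      this.level = level;
    }

    @Override
    public synchronized void write(int b) throws IOException {
      if (b == '\n') {
        logger.log(level, line.toString("UTF-8"));
        line.reset();
      } else {
        line.write(b);
      }
    }
  }

  /** Builds a logger whose single appender rolls at limitKb with one backup. */
  static Logger streamLogger(String name, String file, long limitKb)
      throws IOException {
    RollingFileAppender appender =
        new RollingFileAppender(new PatternLayout("%m%n"), file, true);
    appender.setMaximumFileSize(limitKb * 1024);
    appender.setMaxBackupIndex(1);   // one backup rolling file, as described above
    Logger logger = Logger.getLogger(name);
    logger.setAdditivity(false);     // keep task output out of the root logger
    logger.addAppender(appender);
    return logger;
  }

  /** Would be called early in YarnChild.main(), before any user code runs. */
  static void redirect(String logDir, long stdoutKb, long stderrKb)
      throws IOException {
    Logger out = streamLogger("task.stdout", logDir + "/stdout", stdoutKb);
    Logger err = streamLogger("task.stderr", logDir + "/stderr", stderrKb);
    System.setOut(new PrintStream(new LoggingOutputStream(out, Level.INFO), true));
    System.setErr(new PrintStream(new LoggingOutputStream(err, Level.ERROR), true));
  }
}
{code}

Because the PrintStreams are created with autoFlush enabled, each line a task 
prints reaches the file immediately, which is what keeps the logs visible while 
the job is still running.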

One of the existing approaches to limiting the syslog's size works the same way.

So, we can set the limits via new properties in mapred-site.xml:
mapreduce.task.userlog.stderr.limit.kb
mapreduce.task.userlog.stdout.limit.kb
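
For example, in mapred-site.xml (the 1024 KB values are illustrative; the 
property names are the ones proposed here, not yet in any release):

{code:xml}
<property>
  <name>mapreduce.task.userlog.stdout.limit.kb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.task.userlog.stderr.limit.kb</name>
  <value>1024</value>
</property>
{code}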

Advantages of this solution:
- it lets us restrict file sizes during job execution.
- we can see the logs while the job is still running.

Disadvantages:
- it will work only for MapReduce jobs.

Is this an appropriate solution to the problem, or is there something better?




> Provide way to limit MRJob's stdout/stderr size
> -----------------------------------------------
>
>                 Key: MAPREDUCE-6778
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6778
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.7.0
>            Reporter: Aleksandr Balitsky
>            Priority: Minor
>         Attachments: MAPREDUCE-6778.v1.001.patch
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
