[ 
https://issues.apache.org/jira/browse/TEZ-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222576#comment-14222576
 ] 

Rajesh Balamohan commented on TEZ-1775:
---------------------------------------

[~sseth], task-specific cmd options might help here.

{code}
test.properties
===============
hadoop.root.logger=DEBUG,CLA

# Define the root logger to the system property "tez.root.logger".
log4j.rootLogger=${tez.root.logger}, EventCounter
log4j.threshold=ALL
log4j.appender.CLA=org.apache.tez.common.TezContainerLogAppender
log4j.appender.CLA.containerLogDir=${yarn.app.container.log.dir}
log4j.appender.CLA.layout=org.apache.log4j.PatternLayout
log4j.appender.CLA.layout.ConversionPattern=%d{ISO8601} %p [%t] %c{2}: %m%n
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter

# User specific DEBUG levels
log4j.logger.org.apache.tez.runtime.task.TezChild=DEBUG

{code}


In Hive, add the statement "add FILE test.properties" at the beginning of the 
query.  This places "test.properties" in the distributed cache.
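
If the query file should stay untouched, one option is to generate a copy with 
the statement prepended. This is only a sketch: the helper name 
`with_added_file` and the output file name are hypothetical, and the only Hive 
syntax it relies on is the "add FILE" statement mentioned above.

```shell
# Hypothetical helper (not part of Hive or Tez): prints the query stored in
# file $2 with an "add FILE $1;" statement prepended.
with_added_file() {
  printf 'add FILE %s;\n' "$1"
  cat "$2"
}

# Example usage (file names are illustrative):
#   with_added_file test.properties q27.sql > q27_debug.sql
```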

Now run Hive with the task-specific Tez options, as follows:

{code}
hive --database mydb \
  --hiveconf tez.task-specific.launch.cmd-opts.list="Map 2[1]" \
  --hiveconf tez.task-specific.launch.cmd-opts="-Dlog4j.configuration=test.properties" \
  -f /home/rajesh/q27.sql
{code}

Here, the user enables DEBUG logging (i.e. with test.properties) for task #1 
of the "Map 2" phase.  To enable it for the entire "Map 2" phase, specify 
"Map 2[]".  To enable it for multiple phases, specify 
"Map 2[],Map 8[],Reducer 1[]".
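
The list value is a comma-separated string of "vertex[tasks]" entries, so it 
can be assembled in the shell before calling hive. A minimal sketch, assuming a 
POSIX-style shell; the helper `build_task_list` and the variable `TASK_LIST` 
are hypothetical names, not part of Tez or Hive:

```shell
# Hypothetical helper (not part of Tez or Hive): joins its arguments with
# commas to form a tez.task-specific.launch.cmd-opts.list value.
build_task_list() {
  local IFS=,
  printf '%s\n' "$*"
}

# All tasks of "Map 2", "Map 8" and "Reducer 1".
TASK_LIST="$(build_task_list "Map 2[]" "Map 8[]" "Reducer 1[]")"
echo "$TASK_LIST"
```

The result can then be passed as 
--hiveconf tez.task-specific.launch.cmd-opts.list="$TASK_LIST".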

This should enable DEBUG logs specifically for 
"org.apache.tez.runtime.task.TezChild", which can be verified by grepping the 
logs for this application:
{code}
yarn logs -applicationId <app_id>
{code}
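
A possible filter for that check is sketched below. It assumes the 
ConversionPattern from test.properties above, where %c{2} shortens the logger 
name to "task.TezChild"; the helper name `filter_tezchild_debug` is 
hypothetical.

```shell
# Hypothetical helper: keeps only DEBUG lines emitted by the TezChild logger.
# Assumes the layout "%d{ISO8601} %p [%t] %c{2}: %m%n", under which the logger
# org.apache.tez.runtime.task.TezChild is printed as "task.TezChild".
filter_tezchild_debug() {
  grep -E ' DEBUG \[[^]]*\] task\.TezChild: '
}

# Example usage (needs a YARN client; <app_id> is the application's id):
#   yarn logs -applicationId <app_id> | filter_tezchild_debug
```

If the filter prints nothing, either the properties file was not picked up or 
the selected task did not log at DEBUG level.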

I tried the above approach on the test cluster and it works fine.

I hope other components like Pig have a mechanism to upload files to the 
distributed cache.  As long as "test.properties" is available in the 
distributed cache, the above approach will work.

Caveat:
======
1. For DAGAppMaster, "-Dlog4j.configuration=tez-container-log4j.properties" is 
added to the launch command after "tez.am.launch.cmd-opts".  
Because of this ordering, specifying --hiveconf 
tez.am.launch.cmd-opts="-Dlog4j.configuration=test.properties" to debug 
DAGAppMaster would not work.  We might need to change the order in 
TezClientUtils.createApplicationSubmissionContext() so that 
TEZ_AM_LAUNCH_CMD_OPTS is loaded at the end.  If this is done, the same 
approach should ideally work for DAGAppMaster as well (I haven't tried this 
for DAGAppMaster).

> Simpler task / AM logger configuration
> --------------------------------------
>
>                 Key: TEZ-1775
>                 URL: https://issues.apache.org/jira/browse/TEZ-1775
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>
> Currently, it's fairly difficult to configure logging beyond a generic log 
> level. It'll be useful to have some control over which components need to be 
> logged at a level / should be avoided. The IPC layer, for example, generates 
> a lot of (multi-line) noise - which isn't useful when looking for Tez logs 
> only.



