[
https://issues.apache.org/jira/browse/TEZ-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222576#comment-14222576
]
Rajesh Balamohan commented on TEZ-1775:
---------------------------------------
[~sseth], task specific cmd options might be of help here.
{code}
test.properties
===============
hadoop.root.logger=DEBUG,CLA
# Define the root logger to the system property "tez.root.logger".
log4j.rootLogger=${tez.root.logger}, EventCounter
log4j.threshold=ALL
log4j.appender.CLA=org.apache.tez.common.TezContainerLogAppender
log4j.appender.CLA.containerLogDir=${yarn.app.container.log.dir}
log4j.appender.CLA.layout=org.apache.log4j.PatternLayout
log4j.appender.CLA.layout.ConversionPattern=%d{ISO8601} %p [%t] %c{2}: %m%n
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter
# User specific DEBUG levels
log4j.logger.org.apache.tez.runtime.task.TezChild=DEBUG
{code}
In Hive, put "add FILE test.properties" at the beginning of the query. This
places "test.properties" in the distributed cache.
Now run hive with the task-specific Tez options as follows:
{code}
hive --database mydb \
  --hiveconf tez.task-specific.launch.cmd-opts.list="Map 2[1]" \
  --hiveconf tez.task-specific.launch.cmd-opts="-Dlog4j.configuration=test.properties" \
  -f /home/rajesh/q27.sql
{code}
Here, the user enables DEBUG logging (via test.properties) for task #1 of the
"Map 2" phase only. To enable it for all of "Map 2", specify "Map 2[]". To
enable it for multiple phases, specify "Map 2[],Map 8[],Reducer 1[]".
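As a rough illustration of the list syntax above, the value for several whole phases could be assembled with a small shell helper. The function name all_tasks_list is hypothetical and not part of Tez or Hive; it only joins the given names into the "Name[]" form described above.

```shell
# Hypothetical helper (not part of Tez or Hive): join vertex names into a
# value for tez.task-specific.launch.cmd-opts.list, selecting every task
# of each named vertex via the empty "[]" suffix.
all_tasks_list() {
  out=""
  for vertex in "$@"; do
    # Prepend a comma only when something is already in the list.
    out="${out:+$out,}${vertex}[]"
  done
  printf '%s\n' "$out"
}

# Intended use (shown as a comment so the snippet does not invoke hive):
#   hive --hiveconf tez.task-specific.launch.cmd-opts.list="$(all_tasks_list "Map 2" "Map 8")" ...
```

Quoting each name matters because names such as "Map 2" contain a space.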
This should enable DEBUG logs specifically for
"org.apache.tez.runtime.task.TezChild". This can be verified by grepping the
logs for this application:
{code}
yarn logs -applicationId <app_id>
{code}
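One possible way to do the grepping is sketched below. The function name tezchild_debug_lines is hypothetical, and the pattern assumes the ConversionPattern from test.properties ("%d{ISO8601} %p [%t] %c{2}: %m%n"), under which "%c{2}" renders the logger as "task.TezChild". A different layout would need a different pattern.

```shell
# Hypothetical filter: keep only DEBUG lines logged by TezChild.
# Assumes lines shaped like "<timestamp> DEBUG [<thread>] task.TezChild: <msg>",
# which follows from the ConversionPattern in test.properties.
tezchild_debug_lines() {
  grep -E ' DEBUG \[[^]]*\] task\.TezChild: '
}

# Intended use (shown as a comment; <app_id> is a placeholder):
#   yarn logs -applicationId <app_id> | tezchild_debug_lines
```

If the filter prints nothing for the selected task, the custom log4j configuration was probably not picked up.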
I tried the above approach on the test cluster and it works fine.
I hope other components like Pig have a mechanism to upload files to the
distributed cache. As long as "test.properties" is available in the distributed
cache, the above approach would work.
Caveat:
======
1. For DAGAppMaster, Tez adds
"-Dlog4j.configuration=tez-container-log4j.properties" after
"tez.am.launch.cmd-opts", so the user-supplied value is overridden.
Specifying --hiveconf
tez.am.launch.cmd-opts="-Dlog4j.configuration=test.properties" to debug
DAGAppMaster would therefore not work. We might need to change the order in
TezClientUtils.createApplicationSubmissionContext() so that
TEZ_AM_LAUNCH_CMD_OPTS is loaded at the end. If this is done, the same approach
should ideally work for DAGAppMaster as well (I haven't tried this for
DAGAppMaster).
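The ordering problem in the caveat can be illustrated with a small sketch. It assumes that when the same -D property is given more than once, the last occurrence takes effect, which is what the behaviour described above implies. The function name effective_log4j_config is hypothetical and only simulates that rule; it does not inspect a real JVM.

```shell
# Hypothetical simulation: report which -Dlog4j.configuration value would be
# effective, assuming the last occurrence on the command line wins.
effective_log4j_config() {
  value=""
  for opt in "$@"; do
    case "$opt" in
      -Dlog4j.configuration=*) value="${opt#-Dlog4j.configuration=}" ;;
    esac
  done
  printf '%s\n' "$value"
}

# Current order: user opts first, Tez default afterwards.
effective_log4j_config \
  -Dlog4j.configuration=test.properties \
  -Dlog4j.configuration=tez-container-log4j.properties

# Proposed order: Tez default first, user opts at the end.
effective_log4j_config \
  -Dlog4j.configuration=tez-container-log4j.properties \
  -Dlog4j.configuration=test.properties
```

Under that assumption the first call reports the Tez default and the second reports test.properties, which is why moving TEZ_AM_LAUNCH_CMD_OPTS to the end should help.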
> Simpler task / AM logger configuration
> --------------------------------------
>
> Key: TEZ-1775
> URL: https://issues.apache.org/jira/browse/TEZ-1775
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Siddharth Seth
>
> Currently, it's fairly difficult to configure logging beyond a generic log
> level. It'll be useful to have some control over which components need to be
> logged at a level / should be avoided. The IPC layer, for example, generates
> a lot of (multi-line) noise - which isn't useful when looking for Tez logs
> only.