[
https://issues.apache.org/jira/browse/TEZ-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rajesh Balamohan updated TEZ-1777:
----------------------------------
Attachment: map_task_hive_query_92.png
reduce_task_hive_query_92.png
TEZ-1777.1.patch
HTrace can be useful in such scenarios. Trace data can be written to a local
file, a DB, or a KV store (Cassandra, HBase, etc.), and the Zipkin viewer can be
used to view the trace details graphically.
Attaching the initial patch with HTrace for early comments, along with the
map and reduce task snapshots from Zipkin.
The following config can be used to enable HTrace:
{code}
<property>
<name>tez.htrace.enabled</name>
<value>true</value>
</property>
<property>
<name>tez.htrace.spanreceiver.classes</name>
<!-- Additional span receivers can be added as a comma-separated list of
class names -->
<value>org.htrace.impl.ZipkinSpanReceiver</value>
</property>
<property>
<name>tez.htrace.zipkin.collector-hostname</name>
<value>mymachine</value>
</property>
{code}
To write HTrace data to a local file in text format instead:
{code}
<property>
<name>tez.htrace.enabled</name>
<value>true</value>
</property>
<property>
<name>tez.htrace.spanreceiver.classes</name>
<value>org.htrace.impl.LocalFileSpanReceiver</value>
</property>
<property>
<name>tez.htrace.local-file-span-receiver.path</name>
<value>/tmp/htrace.out</value>
</property>
{code}
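For anyone switching between the two receivers while testing, here is a rough sketch that renders either set of properties as a Hadoop-style XML fragment and reads it back. The property names and values are copied from the snippets above and may change as the patch evolves. The helpers `render_properties` and `parse_properties` are hypothetical and not part of the patch, and wrapping the output in a `<configuration>` root is an assumption about where the properties would be placed.

```python
import xml.etree.ElementTree as ET

# Property names and values copied from the comment above; they come from the
# initial patch and may change in later revisions.
ZIPKIN = {
    "tez.htrace.enabled": "true",
    "tez.htrace.spanreceiver.classes": "org.htrace.impl.ZipkinSpanReceiver",
    "tez.htrace.zipkin.collector-hostname": "mymachine",
}
LOCAL_FILE = {
    "tez.htrace.enabled": "true",
    "tez.htrace.spanreceiver.classes": "org.htrace.impl.LocalFileSpanReceiver",
    "tez.htrace.local-file-span-receiver.path": "/tmp/htrace.out",
}


def render_properties(settings):
    """Render a dict as <property> elements (hypothetical helper)."""
    # Assumption: the properties live under a <configuration> root element.
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def parse_properties(xml_text):
    """Read name/value pairs back out of a rendered fragment."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


if __name__ == "__main__":
    # Print the Zipkin variant; swap in LOCAL_FILE for the text-file receiver.
    print(render_properties(ZIPKIN))
```

Keeping both variants as plain dicts makes it easy to see that only the receiver class and its receiver-specific property differ between the two setups.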
> Explore distributed tracing of Tez tasks
> -----------------------------------------
>
> Key: TEZ-1777
> URL: https://issues.apache.org/jira/browse/TEZ-1777
> Project: Apache Tez
> Issue Type: Wish
> Reporter: Gopal V
> Assignee: Rajesh Balamohan
> Attachments: TEZ-1777.1.patch, map_task_hive_query_92.png,
> reduce_task_hive_query_92.png
>
>
> Debugging Tez latencies using Swimlanes does not give enough insight into
> latencies produced within a task.
> Explore a distributed tracing mode to track this on large clusters (for
> debug/profiling purposes)