[ 
https://issues.apache.org/jira/browse/HTRACE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693033#comment-14693033
 ] 

Wangda Tan commented on HTRACE-69:
----------------------------------

Hi [~cmccabe]/[~iwasakims]/[~eclark]

Thanks for all your comments. I took a look at HTRACE-214. IIUC, it should give 
us module-specific tracers: we can define a YARN-RM tracer and a 
YARN-RM-Scheduler tracer, use different tracers in the implementation, and 
enable/disable them via configuration. Correct?

Regarding the "keep all spans" vs. "keep partial spans" topic, I think 
performance is the most important thing. One interesting reference: Twitter 
folks did a large-scale performance test for YARN timeline server v2 
(YARN-2928). It used synthetically generated MapReduce jobs and wrote metrics 
information to HBase. Note that the number of records for a test MR job may be 
much smaller than the number of trace records for an MR job. If you are 
interested, please refer to the report: 
https://issues.apache.org/jira/secure/attachment/12737364/TimelineServiceStoragePerformanceTestSummaryYARN-2928.pdf
I agree with getting/saving as much information as possible, as long as we can 
keep good performance.

I have also been thinking about YARN and upper app-level tracing information 
these days. I think per-application (app start/end time), per-container 
(container start/end time), and workflow (an Oozie job, a Hive job, etc.) 
information can be retrieved from the Timeline Server (TS). We should already 
have lots of information stored in TS. For other information, especially for 
monitoring RPC call performance, such as
- The delay of writing data from RM to ATS
- The delay of AM<->RM heartbeat
- The delay of RM<->NM heartbeat
- The delay of HDFS operations for a single job
- The time spent in the scheduler loop
HTrace is more useful for troubleshooting these issues.

Regarding the smallest granularity, I think a loop, a method, or an RPC can be 
the smallest granularity. IMHO, if HTrace can support per-module sampling, we 
can define arbitrary granularity: we can use different trace loggers/samplers 
to get what we want, just like Log4j, where we can define a per-class log level 
that can be changed at runtime.
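
To make the Log4j analogy concrete, here is a minimal sketch of what per-module 
sampling could look like. Everything in it is hypothetical and not part of 
HTrace: the ModuleSamplerRegistry class, its Sampler interface, and the dotted 
module names ("yarn.rm", "yarn.rm.scheduler") are assumptions made only for 
illustration. A module without its own sampler inherits from its parent, and 
samplers can be replaced at runtime.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-module sampler registry (illustration only, not an HTrace API). */
final class ModuleSamplerRegistry {
    /** Minimal stand-in for a sampler: returns true if the next span should be traced. */
    interface Sampler {
        boolean next();
    }

    static final Sampler ALWAYS = () -> true;
    static final Sampler NEVER = () -> false;

    // Concurrent map so samplers can be swapped while other threads are tracing.
    private final Map<String, Sampler> samplers = new ConcurrentHashMap<>();

    /** Sets or replaces the sampler for a module at runtime. */
    void setSampler(String module, Sampler sampler) {
        samplers.put(module, sampler);
    }

    /**
     * Looks up the sampler for a module. If none is set, walks up the dotted
     * name ("yarn.rm.scheduler" -> "yarn.rm" -> "yarn"), similar to how Log4j
     * resolves logger levels. Falls back to NEVER if nothing matches.
     */
    Sampler samplerFor(String module) {
        String name = module;
        while (true) {
            Sampler s = samplers.get(name);
            if (s != null) {
                return s;
            }
            int dot = name.lastIndexOf('.');
            if (dot < 0) {
                return NEVER;
            }
            name = name.substring(0, dot);
        }
    }

    /** Convenience: asks the module's effective sampler whether to trace. */
    boolean shouldTrace(String module) {
        return samplerFor(module).next();
    }
}
```

With this shape, setting ALWAYS on "yarn.rm" would enable tracing for the whole 
RM, while a later setSampler("yarn.rm.scheduler", NEVER) would switch off only 
the scheduler loop.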

Thoughts?

> Filtering child spans by sampler
> --------------------------------
>
>                 Key: HTRACE-69
>                 URL: https://issues.apache.org/jira/browse/HTRACE-69
>             Project: HTrace
>          Issue Type: New Feature
>            Reporter: Masatake Iwasaki
>            Assignee: Masatake Iwasaki
>
> Trace#startSpan respects the sampler given as an argument only when there is 
> no ongoing span (i.e. when creating a new root span).
> {code}
>   public static TraceScope startSpan(String description, Sampler<TraceInfo> 
> s, TraceInfo tinfo) {
>     Span span = null;
>     if (isTracing() || s.next(tinfo)) {
> {code}
> Adding an API that starts a span only if {{(isTracing() && s.next(tinfo))}} 
> would enable filtering of child spans.
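
The difference between the two conditions in the description above can be 
sketched as plain boolean logic. The SpanStartDecision class and its method 
names are hypothetical and do not exist in HTrace; the two booleans stand in 
for isTracing() and s.next(tinfo).

```java
/** Hypothetical helper contrasting the two conditions (not an HTrace API). */
final class SpanStartDecision {
    /**
     * Condition quoted in the issue: isTracing() || s.next(tinfo).
     * Once a span is ongoing, the sampler's answer no longer matters.
     */
    static boolean existingCondition(boolean isTracing, boolean samplerAccepts) {
        return isTracing || samplerAccepts;
    }

    /**
     * Condition proposed in the issue: isTracing() && s.next(tinfo).
     * A child span starts only if a span is ongoing and the sampler accepts,
     * so the sampler can filter child spans.
     */
    static boolean proposedCondition(boolean isTracing, boolean samplerAccepts) {
        return isTracing && samplerAccepts;
    }
}
```

The two conditions disagree whenever exactly one input is true. With an ongoing 
span and a rejecting sampler, the existing condition still starts a child span 
while the proposed one drops it. With no ongoing span, the proposed condition 
never starts a span, which is presumably why the description suggests adding it 
as a new API rather than changing startSpan itself.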



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
