[
https://issues.apache.org/jira/browse/HTRACE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693033#comment-14693033
]
Wangda Tan commented on HTRACE-69:
----------------------------------
Hi [~cmccabe]/[~iwasakims]/[~eclark]
Thanks for all your comments. I took a look at HTRACE-214. IIUC, it should be a
module-specific tracer: we can define a YARN-RM tracer and a YARN-RM-Scheduler
tracer, use different tracers in the implementation, and enable/disable them
via configuration. Correct?
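To make the question concrete, here is a minimal sketch of what I mean by
enabling/disabling per-module tracers via configuration. All names here
(ModuleTracer, ModuleTracers, the "tracer.<module>.enabled" key layout) are
hypothetical and only for illustration; they are not taken from HTrace or from
the HTRACE-214 patch.

```java
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these names come from HTrace or HTRACE-214.
final class ModuleTracer {
    private final String name;
    private final boolean enabled;

    ModuleTracer(String name, boolean enabled) {
        this.name = name;
        this.enabled = enabled;
    }

    String name() { return name; }

    boolean isEnabled() { return enabled; }
}

final class ModuleTracers {
    // Assumed key layout: "tracer.<module>.enabled" = "true" | "false".
    private static final String PREFIX = "tracer.";
    private static final String SUFFIX = ".enabled";

    private final Properties conf;
    private final Map<String, ModuleTracer> tracers = new ConcurrentHashMap<>();

    ModuleTracers(Properties conf) {
        this.conf = conf;
    }

    // Returns the tracer for a module, creating it on first use.
    // Modules with no configuration entry are disabled by default.
    ModuleTracer get(String module) {
        return tracers.computeIfAbsent(module, m -> {
            String value = conf.getProperty(PREFIX + m + SUFFIX, "false");
            return new ModuleTracer(m, Boolean.parseBoolean(value));
        });
    }
}
```

With this shape, the RM code would ask for the "YARN-RM" tracer and the
scheduler code for the "YARN-RM-Scheduler" tracer, and each could be switched
on or off independently in the configuration.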
Regarding the "keep all the spans" vs. "keep partial spans" topic, I think
performance is the most important thing. One interesting reference: Twitter
folks did a large-scale performance test for YARN timeline server v2
(YARN-2928). It uses synthetically generated MapReduce jobs and writes metrics
information to HBase. Note that the number of records for a test MR job may be
much smaller than the number of trace records for an MR job. If you are
interested, please refer to the report:
https://issues.apache.org/jira/secure/attachment/12737364/TimelineServiceStoragePerformanceTestSummaryYARN-2928.pdf
I agree with getting/saving as much information as possible, as long as we can
keep good performance.
I have also been thinking about YARN and upper app-level tracing information
these days. I think per-application (start/end time for an app), per-container
(start/end time for a container), and workflow (an Oozie job, a Hive job, etc.)
information can be obtained from the Timeline Server; we should already have
lots of information stored in TS. For other information, especially for
monitoring RPC call performance, such as
- The delay of writing data from RM to ATS
- The delay of AM<->RM heartbeat
- The delay of RM<->NM heartbeat
- The delay of HDFS operations for a single job
- The time spent in the scheduler loop
HTrace is more useful for troubleshooting these issues.
Regarding the smallest granularity, I think a loop, a method, or an RPC can be
the smallest unit. IMHO, if HTrace can support per-module sampling, we can
define arbitrary granularity: we can use different trace loggers/samplers to
get what we want, just like Log4j, where a per-class log level can be defined
and changed at runtime.
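As a rough illustration of the Log4j analogy, below is a sketch of a sampler
registry keyed by dotted module names, where a module without its own sampler
inherits from its nearest configured parent and samplers can be swapped at
runtime. ModuleSampler, SamplerRegistry, and the module names are all made up
for this example; this is not an existing HTrace API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch; these types are not part of the HTrace API.
interface ModuleSampler {
    boolean next();

    ModuleSampler ALWAYS = () -> true;
    ModuleSampler NEVER = () -> false;

    // Samples roughly the given fraction of calls.
    static ModuleSampler probability(double fraction) {
        return () -> ThreadLocalRandom.current().nextDouble() < fraction;
    }
}

final class SamplerRegistry {
    private final Map<String, ModuleSampler> samplers = new ConcurrentHashMap<>();
    private volatile ModuleSampler root = ModuleSampler.NEVER;

    // May be called at any time (e.g. from an admin command) to change
    // sampling for one module without restarting the process.
    void setSampler(String module, ModuleSampler sampler) {
        samplers.put(module, sampler);
    }

    void setRootSampler(ModuleSampler sampler) {
        root = sampler;
    }

    // Looks up "a.b.c", then "a.b", then "a", then the root sampler,
    // similar to how Log4j resolves levels for hierarchical logger names.
    ModuleSampler samplerFor(String module) {
        String name = module;
        while (true) {
            ModuleSampler s = samplers.get(name);
            if (s != null) {
                return s;
            }
            int dot = name.lastIndexOf('.');
            if (dot < 0) {
                return root;
            }
            name = name.substring(0, dot);
        }
    }

    boolean shouldTrace(String module) {
        return samplerFor(module).next();
    }
}
```

For example, setting ALWAYS on "yarn.rm" would trace everything under the RM,
while a later setSampler("yarn.rm.scheduler", ModuleSampler.probability(0.01))
would reduce only the scheduler loop to about 1% sampling.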
Thoughts?
> Filtering child spans by sampler
> --------------------------------
>
> Key: HTRACE-69
> URL: https://issues.apache.org/jira/browse/HTRACE-69
> Project: HTrace
> Issue Type: New Feature
> Reporter: Masatake Iwasaki
> Assignee: Masatake Iwasaki
>
> Trace#startSpan respects the sampler given as an argument only when there is
> no ongoing span (i.e. when creating a new root span).
> {code}
> public static TraceScope startSpan(String description,
>     Sampler<TraceInfo> s, TraceInfo tinfo) {
>   Span span = null;
>   if (isTracing() || s.next(tinfo)) {
> {code}
> Adding an API that starts a span only if {{(isTracing() && s.next(tinfo))}}
> would enable filtering of child spans.