[
https://issues.apache.org/jira/browse/HBASE-26366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438454#comment-17438454
]
Tak-Lon (Stephen) Wu edited comment on HBASE-26366 at 11/4/21, 2:23 AM:
------------------------------------------------------------------------
I found a likely cause of these single-span traces: the static class ZKUtil
calls {{zkw.getRecoverableZooKeeper().create}} whenever a znode, or a znode
together with its parent (normally /hbase), needs to be created, e.g.
{{ZKUtil.createAndWatch}}, {{ZKUtil.createWithParents}}, {{ZKUtil.getData}}
If we want to group all these spans, we need to create a parent span at the
level of their callers. I found a few such places: 1/ ClientZKSyncer,
2/ Procedure / ProcedureMember, 3/ HMaster#startActiveMasterManager (this is a
private method), among others.
If this is the direction, the scope of this JIRA is large and we may need to
create more sub-tasks.
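To illustrate the grouping idea, here is a minimal, self-contained sketch. It is a toy model and not the OpenTelemetry or HBase API: {{ToyTracer}}, {{Span}}, {{createZnode}} and {{startup}} are hypothetical names, and the znode paths are only illustrative. It assumes the tracer keeps the current span per thread, so a traced call with no current span becomes a root (its own trace), while the same call made inside a caller-level span becomes a child.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of span parenting. This is NOT the OpenTelemetry API. */
public class SpanGroupingSketch {

  /** A span knows its name and parent; a null parent means it is a root (its own trace). */
  static final class Span {
    final String name;
    final Span parent;

    Span(String name, Span parent) {
      this.name = name;
      this.parent = parent;
    }

    boolean isRoot() {
      return parent == null;
    }
  }

  /** Minimal single-threaded tracer that tracks the current span in a ThreadLocal. */
  static final class ToyTracer {
    private final ThreadLocal<Span> current = new ThreadLocal<>();
    final List<Span> finished = new ArrayList<>();

    /** Runs body in a new span whose parent is the span current on this thread, if any. */
    void inSpan(String name, Runnable body) {
      Span parent = current.get();
      Span span = new Span(name, parent);
      current.set(span);
      try {
        body.run();
      } finally {
        current.set(parent); // restore the previous context
        finished.add(span);
      }
    }

    long rootCount() {
      return finished.stream().filter(Span::isRoot).count();
    }
  }

  /** Hypothetical stand-in for a ZKUtil-style static helper that traces each create call. */
  static void createZnode(ToyTracer tracer, String path) {
    tracer.inSpan("RecoverableZooKeeper.create " + path, () -> {
      // the real ZooKeeper call would go here
    });
  }

  /** Hypothetical startup sequence: several znode creations in a row. */
  static void startup(ToyTracer tracer) {
    createZnode(tracer, "/hbase");
    createZnode(tracer, "/hbase/master");
    createZnode(tracer, "/hbase/backup-masters");
  }

  public static void main(String[] args) {
    ToyTracer ungrouped = new ToyTracer();
    startup(ungrouped); // no caller-level span: every create is its own root
    System.out.println("roots without parent span: " + ungrouped.rootCount()); // 3

    ToyTracer grouped = new ToyTracer();
    grouped.inSpan("HMaster.startActiveMasterManager", () -> startup(grouped));
    System.out.println("roots with parent span: " + grouped.rootCount()); // 1
  }
}
```

In this model the helper itself does not change; only the caller opens a span, which is why each call site listed above would need its own wrapping.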
Moreover, based on a few experiments I did to group spans for
HMaster#startActiveMasterManager (see [example result as
attached|https://issues.apache.org/jira/secure/attachment/13035667/regroup-spans-for-recoverablezookeeper.create.png]
), IMO we should replace some of the current code that uses {{Span span =
TraceUtil.getGlobalTracer()}} or {{TraceUtil.trace()}} with the [{{@WithSpan}}
annotation|https://github.com/open-telemetry/opentelemetry-java/blob/main/extensions/annotations/src/main/java/io/opentelemetry/extension/annotations/WithSpan.java]
(see the [code
example|https://github.com/open-telemetry/opentelemetry-java/blob/main/extensions/annotations/src/test/java/io/opentelemetry/extension/annotations/WithSpanUsageExamples.java]
) rather than hacking too much code into TraceUtil.
> ZK interaction during Master startup produces loads of root-less spans
> ----------------------------------------------------------------------
>
> Key: HBASE-26366
> URL: https://issues.apache.org/jira/browse/HBASE-26366
> Project: HBase
> Issue Type: Bug
> Components: tracing
> Affects Versions: 2.5.0, 3.0.0-alpha-2
> Reporter: Nick Dimiduk
> Priority: Major
> Attachments: image.png,
> regroup-spans-for-recoverablezookeeper.create.png
>
>
> Enable tracing with {{-Dotel.traces.sampler=always_on}}, launch the
> standalone master, and check the trace store. We see lots of single-span
> traces, each a call to a different {{RecoverableZooKeeper}} method, such as
> {{create}} or {{exists}}. Whichever master startup thread this is should
> define its own root span.