[
https://issues.apache.org/jira/browse/HDFS-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507643#comment-14507643
]
Colin Patrick McCabe commented on HDFS-8213:
--------------------------------------------
Thanks again for kicking the tires on htrace, [~billie.rinaldi]. Let me see if
I can get to the bottom of this.
bq. As documented, each process must configure its own span receivers if it
wants to use tracing. If I set hadoop.htrace.span.receiver.classes to the empty
string, then the NameNode and DataNode will not do any tracing.
You are right that you need to set {{hadoop.htrace.span.receiver.classes}} in
the NameNode and DataNode configuration. However, you need to avoid setting it
in the Accumulo configuration. Instead, use whatever configuration key Accumulo
itself uses to set this value. This means that, currently, you can't use the
same config file for the NN and DN as for the DFSClient.
bq. If span receiver initialization in DFSClient is important to the use of the
hadoop.htrace.sampler configuration property, perhaps a compromise would be to
perform SpanReceiverHost.getInstance only when the sampler is set to something
other than NeverSampler.
Keep in mind that {{hadoop.htrace.sampler}} is a completely different
configuration key from {{hadoop.htrace.span.receiver.classes}}. If you are
sampling at the level of Accumulo operations, I would not recommend setting
{{hadoop.htrace.sampler}} in any config file on the cluster. You want all of
the sampling to happen inside Accumulo.
bq. I think Billie Rinaldi is correct here; the client should not instantiate
it's own SpanReceiverHost, but instead depend on the process in which it
resides to provide. This is how HBase client works as well.
HBase is exactly the same. In the case of HBase, you do not want to set
{{hadoop.htrace.span.receiver.classes}} in the HBase config files. Instead,
you would set {{hbase.htrace.span.receiver.classes}}. Then HBase would create
a span receiver, and DFSClient would not.
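As a rough sketch of why the separate prefixes avoid the conflict: each component looks up only the key under its own prefix, so setting the HBase key leaves the DFSClient lookup empty. This is plain Java with a Map standing in for a shared config file. {{NamespacedTraceConfig}} and {{receiverClasses}} are hypothetical names, not Hadoop or HTrace API, and the receiver class name is a placeholder value.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a Map stands in for one config file shared by every component.
class NamespacedTraceConfig {
    static final String RECEIVER_SUFFIX = "span.receiver.classes";

    // Hypothetical helper: returns the receiver class names configured under one prefix.
    static List<String> receiverClasses(Map<String, String> conf, String prefix) {
        String value = conf.getOrDefault(prefix + RECEIVER_SUFFIX, "").trim();
        if (value.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(value.split("\\s*,\\s*"));
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        // Only the HBase-prefixed key is set; the value is a placeholder class name.
        shared.put("hbase.htrace.span.receiver.classes", "HTracedSpanReceiver");

        // HBase sees its receiver; the Hadoop-prefixed lookup finds nothing.
        System.out.println("hbase:  " + receiverClasses(shared, "hbase.htrace."));
        System.out.println("hadoop: " + receiverClasses(shared, "hadoop.htrace."));
    }
}
```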
It seems like there is a hidden assumption here that you want to use the same
config file for everything. But we really don't support that right now.
Getting rid of the SpanReceiverHost in DFSClient is not an option since some
people want to just trace HDFS without tracing any other system. Plus, it just
kicks the problem up to a higher level. If my FooProcess wants to use both
HTrace and Accumulo, FooProcess could easily make the same argument that
"Accumulo should not instantiate SpanReceiverHost" since FooProcess is already
doing that. And since FooProcess uses the Accumulo client, it would conflict
with whatever Accumulo was configuring, if the same config file were used for
everything.
One thing we could do to make this a little less painful is to deduplicate span
receivers inside the library. So if both DFSClient and Accumulo requested an
HTracedSpanReceiver, we could simply create one instance of that. This would
allow us to use the same config file for everything.
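To make the deduplication idea concrete, here is a minimal sketch of a process-wide registry that hands back the same receiver instance to every caller that asks for the same class name. All names here ({{ReceiverRegistry}}, {{Receiver}}, {{getOrCreate}}) are hypothetical and do not come from the HTrace API. The {{Receiver}} class is a stand-in that only records its class name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of in-library deduplication; not the HTrace API.
class ReceiverRegistry {
    // Stand-in for a span receiver; it only remembers which class name it was created for.
    static final class Receiver {
        final String className;

        Receiver(String className) {
            this.className = className;
        }
    }

    private final Map<String, Receiver> byClassName = new LinkedHashMap<>();

    // Returns the shared receiver for className, creating it on the first request only.
    synchronized Receiver getOrCreate(String className) {
        return byClassName.computeIfAbsent(className, Receiver::new);
    }

    // Number of distinct receivers created so far.
    synchronized int size() {
        return byClassName.size();
    }

    public static void main(String[] args) {
        ReceiverRegistry registry = new ReceiverRegistry();
        // Two components in the same process request the same receiver class.
        Receiver fromDfsClient = registry.getOrCreate("HTracedSpanReceiver");
        Receiver fromAccumulo = registry.getOrCreate("HTracedSpanReceiver");

        System.out.println("same instance: " + (fromDfsClient == fromAccumulo));
        System.out.println("receivers created: " + registry.size());
    }
}
```

Keying on the class name alone assumes both callers want the receiver configured the same way. A real implementation would probably need to compare the receiver configuration as well.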
As a side note, [~billie.rinaldi], can you explain how you configure which
sampler and span receiver Accumulo uses? In HBase we set the receiver with
{{hbase.htrace.span.receiver.classes}}, etc. I would recommend something like
{{accumulo.htrace.span.receiver.classes}} for consistency. This also allows
you to use the same config file for everything, since it doesn't conflict with
the keys which Hadoop uses to set these values. That is the reason why we set
up the "hbase.htrace" "namespace" as separate from the "hadoop.htrace"
"namespace", if you see what I'm saying.
> DFSClient should not instantiate SpanReceiverHost
> -------------------------------------------------
>
> Key: HDFS-8213
> URL: https://issues.apache.org/jira/browse/HDFS-8213
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.7.0
> Reporter: Billie Rinaldi
> Assignee: Brahma Reddy Battula
> Priority: Critical
>
> DFSClient initializing SpanReceivers is a problem for Accumulo, which manages
> SpanReceivers through its own configuration. This results in the same
> receivers being registered multiple times and spans being delivered more than
> once. The documentation says SpanReceiverHost.getInstance should be issued
> once per process, so there is no expectation that DFSClient should do this.