[
https://issues.apache.org/jira/browse/HDFS-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507771#comment-14507771
]
Billie Rinaldi commented on HDFS-8213:
--------------------------------------
The hadoop.htrace.span.receiver.classes property is not set in Accumulo
configuration files, but it is set in Hadoop configuration files. Accumulo uses
Hadoop configuration files to connect to HDFS, so every DFSClient it creates
picks up Hadoop's hadoop.htrace.span.receiver.classes setting. HBase does
something similar, I believe.
bq. Plus, it just kicks the problem up to a higher level. If my FooProcess
wants to use both HTrace and Accumulo, FooProcess could easily make the same
argument that "Accumulo should not instantiate SpanReceiverHost" since
FooProcess is already doing that. And since FooProcess uses the accumulo
client, it would conflict with whatever accumulo was configuring, if the same
config file was used for everything.
No. The way it works (or did work, until this change was introduced in
DFSClient) is that server processes instantiate SpanReceiverHost. If an app
wants tracing, it also has to instantiate SpanReceiverHost. The Accumulo client
does not instantiate SpanReceiverHost itself, and neither should DFSClient.
bq. One thing we could do to make this a little less painful is to deduplicate
span receivers inside the library. So if both DFSClient and Accumulo requested
an HTracedSpanReceiver, we could simply create one instance of that. This would
allow us to use the same config file for everything.
The change in DFSClient changes how apps are supposed to use tracing. It seems
like this would be mitigated by deduping SpanReceivers in htrace, but if we go
that route I would like the DFSClient change to be reverted until HDFS moves to
a version of htrace with deduping. Otherwise, Accumulo and HBase will have to
leave HDFS tracing disabled, or change how they're configuring HDFS, if they
wish to avoid double delivery of spans.
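To make the deduplication idea concrete, here is a minimal sketch of what
"one instance per receiver class" could look like. It is only an illustration:
Receiver and ReceiverRegistry are hypothetical stand-ins, not HTrace classes,
and keying on the class name alone is an assumption (a real implementation
would probably have to consider the receiver's configuration as well).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a span receiver; not the real HTrace interface.
interface Receiver {
    void receive(String span);
}

// Hypothetical registry that keeps at most one receiver per class name.
final class ReceiverRegistry {
    private final Map<String, Receiver> byClassName = new LinkedHashMap<>();

    // Returns the existing receiver for className, creating it only on the
    // first request. Later requests for the same name do not call the factory.
    synchronized Receiver register(String className, Supplier<Receiver> factory) {
        return byClassName.computeIfAbsent(className, k -> factory.get());
    }

    // Delivers a span exactly once to each distinct receiver.
    synchronized void deliver(String span) {
        for (Receiver r : byClassName.values()) {
            r.receive(span);
        }
    }

    synchronized int size() {
        return byClassName.size();
    }
}
```

With something like this, DFSClient and Accumulo could both request the same
receiver class from a shared config file and each span would still be
delivered only once.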
> DFSClient should not instantiate SpanReceiverHost
> -------------------------------------------------
>
> Key: HDFS-8213
> URL: https://issues.apache.org/jira/browse/HDFS-8213
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.7.0
> Reporter: Billie Rinaldi
> Assignee: Brahma Reddy Battula
> Priority: Critical
>
> DFSClient initializing SpanReceivers is a problem for Accumulo, which manages
> SpanReceivers through its own configuration. This results in the same
> receivers being registered multiple times and spans being delivered more than
> once. The documentation says SpanReceiverHost.getInstance should be issued
> once per process, so there is no expectation that DFSClient should do this.