[ 
https://issues.apache.org/jira/browse/HDFS-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507771#comment-14507771
 ] 

Billie Rinaldi commented on HDFS-8213:
--------------------------------------

The hadoop.htrace.span.receiver.classes property is not set in Accumulo's 
configuration files, but it is set in Hadoop's.  Accumulo uses the Hadoop 
configuration files to connect to HDFS, so every DFSClient it creates picks up 
Hadoop's hadoop.htrace.span.receiver.classes setting.  HBase does something 
similar, I believe.

bq. Plus, it just kicks the problem up to a higher level. If my FooProcess 
wants to use both HTrace and Accumulo, FooProcess could easily make the same 
argument that "Accumulo should not instantiate SpanReceiverHost" since 
FooProcess is already doing that. And since FooProcess uses the accumulo 
client, it would conflict with whatever accumulo was configuring, if the same 
config file was used for everything.

No.  The way it works (or did, until this change was introduced in DFSClient) 
is that server processes instantiate SpanReceiverHost.  If an app wants 
tracing, it also has to instantiate SpanReceiverHost.  The Accumulo client does 
not instantiate SpanReceiverHost itself, and neither should DFSClient.

bq. One thing we could do to make this a little less painful is to deduplicate 
span receivers inside the library. So if both DFSClient and Accumulo requested 
an HTracedSpanReceiver, we could simply create one instance of that. This would 
allow us to use the same config file for everything.

The change in DFSClient changes how apps are supposed to use tracing.  It seems 
like this would be mitigated by deduping SpanReceivers in htrace, but if we go 
that route I would like the DFSClient change to be reverted until HDFS moves to 
a version of htrace with deduping.  Otherwise, Accumulo and HBase will have to 
leave HDFS tracing disabled, or change how they're configuring HDFS, if they 
wish to avoid double delivery of spans.
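
To make the deduping idea concrete, here is a minimal sketch of a process-wide 
registry that creates at most one receiver per class name, so a second request 
for the same receiver returns the existing instance instead of registering a 
duplicate.  Everything in it is hypothetical: Receiver, 
DedupingReceiverRegistry and DedupDemo are stand-ins written for illustration 
and are not the HTrace API, and keying on the class name alone is an assumption 
(two requests with different settings for the same class would also collapse 
into one).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for a span receiver; not the HTrace interface. */
interface Receiver {
    void receive(String span);
}

/** Hypothetical process-wide registry holding one receiver per class name. */
final class DedupingReceiverRegistry {
    private final Map<String, Receiver> byClassName = new LinkedHashMap<>();

    /** Returns the existing receiver for className, creating it only on the first request. */
    synchronized Receiver getOrCreate(String className, Function<String, Receiver> factory) {
        return byClassName.computeIfAbsent(className, factory);
    }

    /** Delivers a span exactly once to each distinct receiver. */
    synchronized void deliver(String span) {
        for (Receiver r : byClassName.values()) {
            r.receive(span);
        }
    }

    synchronized int size() {
        return byClassName.size();
    }
}

public class DedupDemo {
    public static void main(String[] args) {
        DedupingReceiverRegistry registry = new DedupingReceiverRegistry();
        List<String> delivered = new ArrayList<>();
        // The factory records each delivery so duplicates would be visible.
        Function<String, Receiver> factory = name -> span -> delivered.add(name + ":" + span);

        // Both the application and the HDFS client ask for the same receiver class.
        Receiver fromApp = registry.getOrCreate("HTracedSpanReceiver", factory);
        Receiver fromDfs = registry.getOrCreate("HTracedSpanReceiver", factory);

        registry.deliver("span-1");
        System.out.println(fromApp == fromDfs); // true: one shared instance
        System.out.println(delivered);          // [HTracedSpanReceiver:span-1]
    }
}
```

With a registry like this the same config file could be read by both layers 
without double delivery, but it only helps once every layer goes through the 
shared registry, which is why the ordering of the revert and the htrace upgrade 
matters.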

> DFSClient should not instantiate SpanReceiverHost
> -------------------------------------------------
>
>                 Key: HDFS-8213
>                 URL: https://issues.apache.org/jira/browse/HDFS-8213
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.0
>            Reporter: Billie Rinaldi
>            Assignee: Brahma Reddy Battula
>            Priority: Critical
>
> DFSClient initializing SpanReceivers is a problem for Accumulo, which manages 
> SpanReceivers through its own configuration.  This results in the same 
> receivers being registered multiple times and spans being delivered more than 
> once.  The documentation says SpanReceiverHost.getInstance should be issued 
> once per process, so there is no expectation that DFSClient should do this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
