David Smiley created SOLR-15283:
-----------------------------------
Summary: Remove Solr trace sampling; let Tracer configuration/impl
decide
Key: SOLR-15283
URL: https://issues.apache.org/jira/browse/SOLR-15283
Project: Solr
Issue Type: Task
Security Level: Public (Default Security Level. Issues are Public)
Reporter: David Smiley
Assignee: David Smiley
GlobalTracer should always have the Tracer produced by the
TracerConfiguratorPlugin. Solr should not intervene by substituting a no-op
version sometimes, and thus needn't have its ThreadLocal tracking either (which
doesn't work well). The special {{samplePercentage}} cluster property should
be removed.
Background: When someone configures tracing (by supplying a TracerConfigurator
plugin), Solr itself "samples" any incoming request that carries no tracing
information. The default rate is 10%, and it is configurable only via the
{{samplePercentage}} cluster property. With the default, a request in the other
90% gets a no-op Tracer and therefore no trace IDs. This is confusing and
annoying because Tracers have their own notion of sampling, in which "sampled"
means "reported" (sent) to a tracing server where the trace can be
stored/analyzed/visualized. The point of a non-sampled trace is to propagate IDs
for logging (trace ID in MDC), which is very light-weight. Zipkin and Jaeger
(and others?) have their own samplers.
When Solr receives a request that already carries a trace ID, the request (in
Zipkin's case) also carries the binary sampling decision in a separate
header. The expectation is that if the
trace says to sample, then this sampling decision is propagated downstream and
thus the whole call tree is fully sampled (reported to a server).
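To illustrate the expected propagation, here is a minimal sketch, assuming Zipkin's B3 multi-header format ({{X-B3-TraceId}} and {{X-B3-Sampled}}). The class {{B3PropagationSketch}} and its method are hypothetical names made up for this example; this is not Solr or Zipkin code, and it leaves out span ID handling entirely. It only shows that the trace ID is always forwarded and that the upstream sampling decision is copied rather than re-decided locally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: hypothetical helper, not part of Solr or Zipkin. */
public class B3PropagationSketch {
  // In B3 multi-header propagation the trace ID and the sampling
  // decision travel in separate headers.
  static final String TRACE_ID = "X-B3-TraceId";
  static final String SAMPLED = "X-B3-Sampled";

  /**
   * Builds the tracing headers for a downstream request.
   * The trace ID is always forwarded so it stays available for logging;
   * the sampling decision is copied as-is, never re-decided here.
   * Assumption: header names arrive in exactly this casing.
   */
  public static Map<String, String> downstreamHeaders(Map<String, String> incoming) {
    Map<String, String> out = new LinkedHashMap<>();
    String traceId = incoming.get(TRACE_ID);
    if (traceId == null) {
      // No upstream trace: the configured Tracer would start a new trace
      // and apply its own sampler. That part is not modeled here.
      return out;
    }
    out.put(TRACE_ID, traceId);
    String sampled = incoming.get(SAMPLED);
    if (sampled != null) {
      // "1" = report the trace to the tracing server, "0" = do not report.
      out.put(SAMPLED, sampled);
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, String> incoming = new LinkedHashMap<>();
    incoming.put(TRACE_ID, "463ac35c9f6413ad");
    incoming.put(SAMPLED, "0");
    System.out.println(downstreamHeaders(incoming));
  }
}
```

Even with {{X-B3-Sampled}} set to 0 the trace ID still flows downstream, which is the light-weight "IDs for logging" case described above. Substituting a no-op Tracer instead would drop the ID altogether.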