+1 for replacing current tracing with Opentracing for 5.2 release and not
making any changes to 5.1 release line.

IMHO, any further patch release on 5.1 release line should only focus on
stability, bug fixes and compatible features.
Thanks for your inputs and detailed explanation, Istvan!


On Fri, Oct 20, 2023 at 2:24 AM Istvan Toth <st...@apache.org> wrote:

> Sorry for the wall of text, if you don't have the time to read all of it,
> please just read and respond to the last part: *How to handle the
> migration.*
>
> Hi!
>
> I've been investigating the migration to OpenTelemetry for some time.
>
> If nothing else, I have managed to gain appreciation for the API of
> OpenTelemetry and the separation between the instrumentation and
> implementation APIs that it provides.
>
>
>
> *The present state*
>
> The current implementation is a super tightly coupled mess.
> (Apologies to the authors, I'm sure there was no better option available at
> the time)
> AFAICT the intended operation is:
>
>    - HTrace3 api is used for instrumentation, Spans are created. (There
>    seems to be a lot of manual context wrangling going on, I can see why
>    HTRace3 was abandoned)
>    - TraceSpanReceiver is registered (from Java code) to HTracing, and gets
>    tracing events, and pushes them into a queue.
>    - TraceWriter is started and starts polling the queue, and periodically
>    dumps them into the SYSTEM.TRACING_STATS table.
>    - This is also not happening by default, as the SYSTEM.TRACING_STATS
>    table never gets created by default (see below)
>
> To make things even more confusing, the metrics system is also tied into
> this.
>
>    - We register PhoenixMetricsSink as a Metrics2 sink in the properties
>    file
>       - This is where SYSTEM.TRACING_STATS gets created
>       - the properties file references *PhoenixTableMetricsWriter ,* but
>       that does not exist, writing is done by PhoenixMetricsSink directly
>    - Metrics are pushed to PhoenixMetricsSink which collects them and
>    periodically flushes them into SYSTEM.TRACING_STATS
>    - This is not happening by default, probably the properties files are
>    not put on the classpath
>
> I can kind of see the point, if I assume that no external distributed trace
> and metrics handling services are available, then this is a good way to
> collect all traces from the cluster, though using the same table for traces
> and metrics is still iffy.
> I didn't check if Htrace context is propagated over RPC. Older HBase
> clients probably propagate the HTrace contect automatically (though I'm not
> sure if htrace 3 and 4 are compatible in this regard), HBase-2.5 and later
> certainly doesn't, as it expects an OpenTelemetry context.
>
>
> *OpenTelemetry tracing and its challenges*
> In 2023, I don't think that collecting traces and metrics in an SQL table
> adds much value.
> There are several projects that specialize in collecting processing and
> visualizing distributed traces and/or metrics, and provide all kinds of
> value added features to it (I assume dumping them in an SQL table would
> also be possible)
>
> Opentelemetry splits the tracing functionality into two parts:
> - opentelemetry-api includes everything needed to create spans, add data to
> them, and (auto-) propagate them, but omits everything else, including the
> ability to turn tracing on/off.
> - opentelemetry-sdk includes the full implementation, the code for
> collecting/processing metrics, and most importantly the ability to define
> samplers.
>
> The preferred way to enable sampling seems to be adding the
> opentelemetry-agent as a java agent (which includes opentelemetry-sdk), and
> configuring its behaviour via environment variables or system properties.
> opentelemetry-agent also includes instrumentation to several libraries,
> which do not support it natively.
>
> HBase uses opentelemetry-agent, and does not add any additional features
> over what is available there.
> It is the application's responsibility to provide custom samplers if it
> wants to influence sampling decision (i.e. turn it on or off)
>
> The first (head) sampling decision happens when we open the first Span.
> Opentracing has the mechanisms to propagate the sampling decision (sampled
> flag)
> and includes a default sampler that honors the sampled flag from the
> context (set in the parent).
>
> It is possible to set up opentelemetry-agent for the HBase services so that
> they honor the sampling bit set by the client, which means that we only
> really have to worry about
> sampling for the first Span which is opened on the client.
> Unless the client application has already opened a Span, for Phoenix this
> typically happens when an SQL statement is executed (though we can add
> tracing elsewhere if we want to)
>
> This is not ideal for HBase, and even less so for Phoenix, which has SQL
> statements to turn tracing on and off, and to set tracing probability.
> We need to somehow add custom samplers so that we can turn tracing on/off
> and influence tracing frequency (probability)
>
> I can see two solutions:
> - Find an existing configurable sampler that we can add
> - Write one.
> It's only a few lines to add the functionality that we need to the default
> samplers, and it is possible to register them as a serveice so that they
> can be enabled via the standard mechanism.
>
> The problem is that writing a Sampler means depending on the full
> opentelemetry-sdk artifact instead of opentelemetry-api.
> I think we can work around this by making a new maven module/artifact for
> the sampler, so that we can avoid adding the full -sdk dependency to
> phoenix-core.
> We can then add this to phoenix-client. Tracing requires adding
> opentelemetry-agent to the classpath anyway, so it never gets called unless
> the SDK classes are present.
>
>
> *How to handle the migration*
> Supporting htrace and opentelemtry in the same repo would be a huge pain.
> It's probably doable, but we'd have to shim everything and push all tracing
> code to compatibility modules, and I'd prefer to avoid that.
>
> My suggestion is that we rip out the current tracing code, including the
> metrics sink and SYSTEM.TRACING_STATS support in 5.2 totally, and replace
> it with opentracing, while doing nothing in 5.1.
> This would possibly break tracing propagation for HBase-2.4, but I am not
> sure anyone will notice that.
> At CLDR, we have completely removed HTrace from Phoenix about a year ago
> for CVE reasons, and I don't remember this causing any problems for anyone.
>
> Another alternative is to simply backport all changes to 5.1, but that
> would remove existing functionality in a patch release.
> That would also be fine by me, but it would break some expectations
> regarding versioning.
>
> Looking forward to your input.
>
> Istvan
>

Reply via email to