[jira] [Comment Edited] (HBASE-28201) Add Endpoint and Method Name to COPROC_EXEC Spans

Istvan Toth (Jira) Tue, 14 Nov 2023 09:45:25 -0800


    [ 
https://issues.apache.org/jira/browse/HBASE-28201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786003#comment-17786003
 ]


Istvan Toth edited comment on HBASE-28201 at 11/14/23 5:44 PM:
---------------------------------------------------------------

{quote}I lean toward having two separate attributes, one for class and one for 
method. There's no sense in making a downstreamer parse our arbitrary strings.
{quote}
It really depends on what we optimize for. For humans, a compound value is 
easier to read. For filters, separate values may be easier to work with. 
Either way is fine by me.
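
To make the trade-off concrete, here is a minimal sketch that models span attributes as a plain map. The two separate keys are the ones proposed in the issue description; the compound key {{db.hbase.endpoint.call}}, the "." separator, and the class and method names are hypothetical and only there for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class EndpointAttributes {
  // Keys proposed in the issue description.
  static final String ENDPOINT_NAME = "db.hbase.endpoint.name";
  static final String ENDPOINT_METHOD = "db.hbase.endpoint.method";
  // Hypothetical key for the compound form; not proposed anywhere in the issue.
  static final String ENDPOINT_CALL = "db.hbase.endpoint.call";

  /** Two attributes: one for the endpoint, one for the method. */
  static Map<String, String> separate(String endpoint, String method) {
    Map<String, String> attrs = new LinkedHashMap<>();
    attrs.put(ENDPOINT_NAME, endpoint);
    attrs.put(ENDPOINT_METHOD, method);
    return attrs;
  }

  /** One attribute holding "endpoint.method" (assumed separator). */
  static Map<String, String> compound(String endpoint, String method) {
    Map<String, String> attrs = new LinkedHashMap<>();
    attrs.put(ENDPOINT_CALL, endpoint + "." + method);
    return attrs;
  }

  /** With separate attributes, a filter is an exact match on one key. */
  static boolean matchesSeparate(Map<String, String> attrs, String method) {
    return method.equals(attrs.get(ENDPOINT_METHOD));
  }

  /** With a compound value, the consumer has to parse our string first. */
  static boolean matchesCompound(Map<String, String> attrs, String method) {
    String call = attrs.get(ENDPOINT_CALL);
    if (call == null) {
      return false;
    }
    int dot = call.lastIndexOf('.');
    return dot >= 0 && call.substring(dot + 1).equals(method);
  }

  public static void main(String[] args) {
    // "ExampleService" and "exampleMethod" are placeholder values.
    System.out.println(separate("ExampleService", "exampleMethod"));
    System.out.println(compound("ExampleService", "exampleMethod"));
  }
}
```

The compound form is a single readable string in a trace viewer, while the separate form avoids the parsing step in {{matchesCompound}}.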

{quote}Is there a case for adding up to some number of parameters, each 
truncated to a maximum width? Some serialized values would include useful 
information incrementally, such as a compound rowkey. Something like a 
serialized java object, much less so.
{quote}

I think that OpenTelemetry enforces some inherent limits on both the number 
and the size of attributes.
The problem is that I cannot really think of a good way to filter the 
parameters without adding custom code to each call.
Maybe playing around with this will give some ideas. 
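
As a starting point for that experimentation, a rough sketch of the "up to N parameters, each truncated to a maximum width" idea, assuming the parameters are already available as strings. The class name, the "..." marker and the limits are all hypothetical, and this does nothing to solve the filtering problem above: it only caps how much gets attached.

```java
import java.util.ArrayList;
import java.util.List;

final class ParamTruncation {
  /**
   * Keeps at most maxParams parameters and cuts each one to at most
   * maxWidth characters, appending "..." to mark a cut value.
   */
  static List<String> truncate(List<String> params, int maxParams, int maxWidth) {
    if (maxWidth < 0) {
      throw new IllegalArgumentException("maxWidth must not be negative");
    }
    List<String> out = new ArrayList<>();
    for (String p : params) {
      if (out.size() >= maxParams) {
        break; // parameter count limit reached
      }
      out.add(p.length() <= maxWidth ? p : p.substring(0, maxWidth) + "...");
    }
    return out;
  }

  public static void main(String[] args) {
    // Placeholder values: a compound rowkey keeps a useful prefix when cut.
    List<String> params = List.of("tenant1|2023-11-14|order-0001", "x", "y");
    System.out.println(truncate(params, 2, 7));
  }
}
```

A compound rowkey still carries useful information after the cut, since its leading components survive; a serialized Java object cut this way would mostly be noise, which matches the concern in the quoted comment.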

{quote}I'm not sure that adding so much information to the span instances is a 
good idea – these extra allocations aren't free. Is it better to give an 
operator this information through their tracing infrastructure vs. letting them 
enable TRACE logging at the RPC layer to see these details?
{quote}

The advantage of tracing is that it can be more targeted.
Enabling TRACE logging at the RPC level is not granular: at the very least you 
have to enable it for the whole client JVM, and you cannot easily correlate it 
with the server executions.
You can enable tracing on a statement-by-statement basis in Phoenix, or by 
starting an explicit sampled span in your HBase Java client app, and you get 
true end-to-end traces with microsecond time resolution.
(Though OpenTelemetry doesn't make this particularly easy, as you have to add 
your own sampler for this.) 
It is also possible to use probabilistic sampling, where you sample just a 
very small fraction of calls.
OpenTelemetry has an API that tells you whether the span is sampled/recorded, 
so you can skip the more expensive operations.
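
A sketch of what skipping the expensive work could look like. In the OpenTelemetry Java API the check is {{Span.isRecording()}}; to keep the example runnable without that dependency it uses a small stand-in interface ({{SpanLike}}) with the same two method names. The helper, the in-memory span and the attribute key {{db.hbase.endpoint.params}} are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class RecordingGuard {
  /** Stand-in for the two span methods this sketch needs. */
  interface SpanLike {
    boolean isRecording();
    void setAttribute(String key, String value);
  }

  /** Computes the attribute value only when the span is being recorded. */
  static void setIfRecording(SpanLike span, String key, Supplier<String> expensiveValue) {
    if (span.isRecording()) {
      span.setAttribute(key, expensiveValue.get());
    }
  }

  /** Simple in-memory span used only to exercise the helper. */
  static final class MapSpan implements SpanLike {
    final boolean recording;
    final Map<String, String> attributes = new HashMap<>();

    MapSpan(boolean recording) {
      this.recording = recording;
    }

    @Override
    public boolean isRecording() {
      return recording;
    }

    @Override
    public void setAttribute(String key, String value) {
      attributes.put(key, value);
    }
  }

  public static void main(String[] args) {
    MapSpan unsampled = new MapSpan(false);
    // The supplier is never invoked for a span that is not recorded.
    setIfRecording(unsampled, "db.hbase.endpoint.params", () -> "serialized parameters");
    System.out.println(unsampled.attributes.isEmpty());
  }
}
```

With probabilistic sampling most spans are not recorded, so the cost of building the extra attributes is only paid for the small sampled fraction.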

I plan to play around with this some more, and see what is helpful and what is 
not.

BTW I am currently using the OpenTelemetry agent with Jaeger to visualize the 
traces.
What did you use during development?



> Add Endpoint and Method Name to COPROC_EXEC Spans
> -------------------------------------------------
>
>                 Key: HBASE-28201
>                 URL: https://issues.apache.org/jira/browse/HBASE-28201
>             Project: HBase
>          Issue Type: Improvement
>          Components: tracing
>            Reporter: Istvan Toth
>            Assignee: Istvan Toth
>            Priority: Major
>
> If we assume parentBased=on, then it's enough to add this information on the 
> client side.
> However, we may want to also add this on the server side for stochastic 
> tracing.
> We could call these:
> _db.hbase.endpoint.name_
> _db.hbase.endpoint.method_
> or 
> _db.hbase.coprocessor.name_
> _db.hbase.coprocessor.method_



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
