[ 
https://issues.apache.org/jira/browse/HBASE-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562731#comment-17562731
 ] 

Nick Dimiduk commented on HBASE-27155:
--------------------------------------

[~apurtell]

I haven't performed a 1-to-1 audit of these metrics vs. the data i'm attaching 
to span events on HBASE-27153, but a quick look makes me think that we could 
collect all this data via tracing. It may be challenging to extract the span 
event data from the span repository -- I haven't tried interacting with the 
likes of Jaeger or X-Ray at that API level. That would be the next step for 
replicating the above.

Let me take a closer look at the metrics you expose in your patch and see what 
I can do with otel.

> Improvements to low level scanner tracing
> -----------------------------------------
>
>                 Key: HBASE-27155
>                 URL: https://issues.apache.org/jira/browse/HBASE-27155
>             Project: HBase
>          Issue Type: Improvement
>          Components: Scanners, tracing
>            Reporter: Andrew Kyle Purtell
>            Priority: Major
>
> Related to HBASE-27153, consider tracer semantic attributes for low level 
> scanner details.
> Consider 
> https://issues.apache.org/jira/secure/attachment/13006571/W-7665966-Instrument-low-level-scan-details-branch-2.2.patch
>  (from HBASE-24637).  This was used to collect detailed metrics of the 
> decisions of ScanQueryMatcher and related classes.
> {noformat}
> metrics: [ "block_read_keys": 477 "block_read_ns": 3427040
>  "block_reads": 13 "block_seek_ns": 1606370 "block_seeks": 169
>  "block_unpack_ns": 10256 "block_unpacks": 13 "cells_matched": 165
>  "cells_matched__hbase:meta,,1.1588230740__info": 165
>  "column_hint_include": 148 "memstore_next": 72
>  "memstore_next_ns": 136671 "memstore_seek": 2
>  "memstore_seek_ns": 631629 "reseeks": 36 "sqm_hint_done": 17
>  "sqm_hint_include": 74 "sqm_hint_seek_next_col": 74
>  "store_next": 276
>  "store_next__1c930a35ff8041368a05817adbdcce97": 40
>  "store_next__2644194fdf794815abdc940c183dab88": 40
>  "store_next__32ce31753fb244668f788fb94ab02dff": 40
>  "store_next__61c8423b9d8846c99a61cd2996b5b621": 116
>  "store_next__f4f7878c9fcf40d9902416d5c7a4097a": 40
>  "store_next_ns": 1891634
>  "store_next_ns__1c930a35ff8041368a05817adbdcce97": 269383
>  "store_next_ns__2644194fdf794815abdc940c183dab88": 299936
>  "store_next_ns__32ce31753fb244668f788fb94ab02dff": 288594
>  "store_next_ns__61c8423b9d8846c99a61cd2996b5b621": 594313
>  "store_next_ns__f4f7878c9fcf40d9902416d5c7a4097a": 439408
>  "store_reseek": 164
>  "store_reseek__1c930a35ff8041368a05817adbdcce97": 32
>  "store_reseek__2644194fdf794815abdc940c183dab88": 32
>  "store_reseek__32ce31753fb244668f788fb94ab02dff": 32
>  "store_reseek__61c8423b9d8846c99a61cd2996b5b621": 36
>  "store_reseek__f4f7878c9fcf40d9902416d5c7a4097a": 32
>  "store_reseek_ns": 2969978
>  "store_reseek_ns__1c930a35ff8041368a05817adbdcce97": 359489
>  "store_reseek_ns__2644194fdf794815abdc940c183dab88": 595115
>  "store_reseek_ns__32ce31753fb244668f788fb94ab02dff": 474642
>  "store_reseek_ns__61c8423b9d8846c99a61cd2996b5b621": 1013188
>  "store_reseek_ns__f4f7878c9fcf40d9902416d5c7a4097a": 527544
>  "store_seek": 5
>  "store_seek__1c930a35ff8041368a05817adbdcce97": 1
>  "store_seek__2644194fdf794815abdc940c183dab88": 1
>  "store_seek__32ce31753fb244668f788fb94ab02dff": 1
>  "store_seek__61c8423b9d8846c99a61cd2996b5b621": 1
>  "store_seek__f4f7878c9fcf40d9902416d5c7a4097a": 1
>  "store_seek_ns": 8862786
>  "store_seek_ns__1c930a35ff8041368a05817adbdcce97": 830421
>  "store_seek_ns__2644194fdf794815abdc940c183dab88": 585899
>  "store_seek_ns__32ce31753fb244668f788fb94ab02dff": 483605
>  "store_seek_ns__61c8423b9d8846c99a61cd2996b5b621": 5958072
>  "store_seek_ns__f4f7878c9fcf40d9902416d5c7a4097a": 1004789
>  "versions_hint_include": 74 "versions_hint_seek_next_col": 74 ]
> {noformat}
> We can see the differences between seek time and reseek time and we get the 
> counts for same, so we can analyze if SQM is making optimal choices (or less 
> optimal choices) or not, or if behavior has changed; and we can identify 
> particular store file(s) that might be outliers for some reason when hunting 
> for sources of regression. We get the time required to unpack blocks (on 
> average). We get a count of hints supplied by base SQM functionality or 
> filters. We get the relative contributions of query processing time 
> separately from memstore and store files. 
> Perhaps this can be done conditionally for scans that are selected for 
> tracing. Of course there is a performance concern, so it must be done such 
> that the overheads really are conditional on if the path is being actively 
> traced, and measured carefully to decide if it should be committed or not. 
> WDYT [~ndimiduk] [~zhangduo]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to