Andrew Kyle Purtell created HBASE-27155:
-------------------------------------------
Summary: Improvements to low level scanner tracing
Key: HBASE-27155
URL: https://issues.apache.org/jira/browse/HBASE-27155
Project: HBase
Issue Type: Improvement
Components: Scanners, tracing
Reporter: Andrew Kyle Purtell
Related to HBASE-27153, introduce tracer semantic attributes for low level
scanner details.
Consider
https://issues.apache.org/jira/secure/attachment/13006571/W-7665966-Instrument-low-level-scan-details-branch-2.2.patch
(from HBASE-24637). This was used to collect detailed metrics of the
decisions of ScanQueryMatcher and related classes.
{noformat}
metrics: [ "block_read_keys": 477 "block_read_ns": 3427040
"block_reads": 13 "block_seek_ns": 1606370 "block_seeks": 169
"block_unpack_ns": 10256 "block_unpacks": 13 "cells_matched": 165
"cells_matched__hbase:meta,,1.1588230740__info": 165
"column_hint_include": 148 "memstore_next": 72
"memstore_next_ns": 136671 "memstore_seek": 2
"memstore_seek_ns": 631629 "reseeks": 36 "sqm_hint_done": 17
"sqm_hint_include": 74 "sqm_hint_seek_next_col": 74
"store_next": 276
"store_next__1c930a35ff8041368a05817adbdcce97": 40
"store_next__2644194fdf794815abdc940c183dab88": 40
"store_next__32ce31753fb244668f788fb94ab02dff": 40
"store_next__61c8423b9d8846c99a61cd2996b5b621": 116
"store_next__f4f7878c9fcf40d9902416d5c7a4097a": 40
"store_next_ns": 1891634
"store_next_ns__1c930a35ff8041368a05817adbdcce97": 269383
"store_next_ns__2644194fdf794815abdc940c183dab88": 299936
"store_next_ns__32ce31753fb244668f788fb94ab02dff": 288594
"store_next_ns__61c8423b9d8846c99a61cd2996b5b621": 594313
"store_next_ns__f4f7878c9fcf40d9902416d5c7a4097a": 439408
"store_reseek": 164
"store_reseek__1c930a35ff8041368a05817adbdcce97": 32
"store_reseek__2644194fdf794815abdc940c183dab88": 32
"store_reseek__32ce31753fb244668f788fb94ab02dff": 32
"store_reseek__61c8423b9d8846c99a61cd2996b5b621": 36
"store_reseek__f4f7878c9fcf40d9902416d5c7a4097a": 32
"store_reseek_ns": 2969978
"store_reseek_ns__1c930a35ff8041368a05817adbdcce97": 359489
"store_reseek_ns__2644194fdf794815abdc940c183dab88": 595115
"store_reseek_ns__32ce31753fb244668f788fb94ab02dff": 474642
"store_reseek_ns__61c8423b9d8846c99a61cd2996b5b621": 1013188
"store_reseek_ns__f4f7878c9fcf40d9902416d5c7a4097a": 527544
"store_seek": 5
"store_seek__1c930a35ff8041368a05817adbdcce97": 1
"store_seek__2644194fdf794815abdc940c183dab88": 1
"store_seek__32ce31753fb244668f788fb94ab02dff": 1
"store_seek__61c8423b9d8846c99a61cd2996b5b621": 1
"store_seek__f4f7878c9fcf40d9902416d5c7a4097a": 1
"store_seek_ns": 8862786
"store_seek_ns__1c930a35ff8041368a05817adbdcce97": 830421
"store_seek_ns__2644194fdf794815abdc940c183dab88": 585899
"store_seek_ns__32ce31753fb244668f788fb94ab02dff": 483605
"store_seek_ns__61c8423b9d8846c99a61cd2996b5b621": 5958072
"store_seek_ns__f4f7878c9fcf40d9902416d5c7a4097a": 1004789
"versions_hint_include": 74 "versions_hint_seek_next_col": 74 ]
{noformat}
We can see the differences between seek time and reseek time and we get the
counts for same, so we can analyze if SQM is making optimal choices (or less
optimal choices) or not, or if behavior has changed; and we can identify
particular store file(s) that might be outliers for some reason when hunting
for sources of regression. We get the time required to unpack blocks (on
average). We get a count of hints supplied by base SQM functionality or
filters. We get the relative contributions of query processing time separately
from memstore and store files.
Perhaps this can be done conditionally for scans that are selected for tracing.
Of course there is a performance concern, so it must be done such that the
overheads really are conditional on if the path is being actively traced, and
measured carefully to decide if it should be committed or not.
WDYT [~ndimiduk] [~zhangduo]
--
This message was sent by Atlassian Jira
(v8.20.7#820007)