[
https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232965#comment-17232965
]
ramkrishna.s.vasudevan commented on HBASE-24637:
------------------------------------------------
[~apurtell]
bq.Why does branch-2 do this and not branch-1?
Yes, branch-1 does not reseek here because of the optimization added at the
SQM layer, as [~larsh] pointed out. There we don't do any reseek when the
tracker says SEEK_COL but the filter says SKIP. This only happens with
addColumns.
But in the case where the filter says INCLUDE and the tracker says SEEK_COL
(again with addColumns), there won't be any regression between branch-1.3 and
branch-2. Even the number of comparisons and reseeks should be the same,
except that branch-2 might suffer from some Comparator-related perf overhead,
which might not have much impact, as the tests here show.
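To make the two cases concrete, here is a toy model of how the filter hint and the tracker hint could be combined. It is only a sketch of the behaviour described above, not the actual ScanQueryMatcher code: the enums are simplified stand-ins, and the skipSuppressesSeek flag is a hypothetical switch that models the branch-1 optimization being on or off.

```java
// Toy model only: simplified stand-ins, not the real HBase classes.
final class HintMergeSketch {
    enum FilterHint { INCLUDE, SKIP }
    enum TrackerHint { INCLUDE, SEEK_NEXT_COL }
    enum MatchCode { INCLUDE, INCLUDE_AND_SEEK_NEXT_COL, SKIP, SEEK_NEXT_COL }

    /**
     * Combine the two hints for one cell.
     * skipSuppressesSeek is a hypothetical flag: true models the behaviour
     * described for branch-1 (filter SKIP wins, so no reseek), false models
     * the behaviour described for branch-2 (the tracker's seek still happens).
     */
    static MatchCode merge(FilterHint filter, TrackerHint tracker,
                           boolean skipSuppressesSeek) {
        if (filter == FilterHint.SKIP) {
            if (tracker == TrackerHint.SEEK_NEXT_COL && !skipSuppressesSeek) {
                return MatchCode.SEEK_NEXT_COL; // leads to a reseek
            }
            return MatchCode.SKIP; // just move to the next cell
        }
        // Filter says INCLUDE: the result is the same in both models.
        return tracker == TrackerHint.SEEK_NEXT_COL
                ? MatchCode.INCLUDE_AND_SEEK_NEXT_COL
                : MatchCode.INCLUDE;
    }
}
```

In this model only the SKIP plus SEEK_NEXT_COL combination differs between the two settings; the INCLUDE plus SEEK_NEXT_COL combination yields the same code either way, which matches the "no regression" case above.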
bq.Is there a way to avoid the reseek per block?
I have not changed the SQM logic that was added as part of
https://issues.apache.org/jira/browse/HBASE-17125. The approach I have tried
is the one in the attached PR: for the first few blocks, we check whether a
reseek really lands on a new block other than the one we would have reached
by just doing a next(). If it does, we continue the current way; if not, we
switch over to next(), for that scan query only.
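The idea can be sketched roughly as below. This is an illustration under my own assumptions, not the code in the PR: the class name, the probeLimit parameter, and the use of block offsets to identify blocks are all hypothetical.

```java
// Illustrative sketch only; names and thresholds are hypothetical,
// not taken from the PR.
final class ReseekProbeSketch {
    private final int probeLimit;   // how many reseeks to observe
    private int probes;             // reseeks observed so far
    private int usefulReseeks;      // reseeks that reached a different block

    ReseekProbeSketch(int probeLimit) {
        if (probeLimit <= 0) {
            throw new IllegalArgumentException("probeLimit must be positive");
        }
        this.probeLimit = probeLimit;
    }

    /**
     * Record one reseek during the probing phase.
     * nextBlockOffset:   block that a plain next() would have reached
     * reseekBlockOffset: block where the reseek actually landed
     */
    void recordReseek(long nextBlockOffset, long reseekBlockOffset) {
        if (probes >= probeLimit) {
            return; // probing phase is over, the decision is fixed
        }
        probes++;
        if (reseekBlockOffset != nextBlockOffset) {
            usefulReseeks++; // the reseek skipped to a block next() would not reach
        }
    }

    /** True once probing is done and no reseek reached a different block. */
    boolean preferNext() {
        return probes >= probeLimit && usefulReseeks == 0;
    }
}
```

One instance would live for the duration of a single scan, so the decision to fall back to next() applies only to that scan query.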
> Reseek regression related to filter SKIP hinting
> ------------------------------------------------
>
> Key: HBASE-24637
> URL: https://issues.apache.org/jira/browse/HBASE-24637
> Project: HBase
> Issue Type: Bug
> Components: Filters, Performance, Scanners
> Affects Versions: 2.2.5
> Reporter: Andrew Kyle Purtell
> Priority: Major
> Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf,
> W-7665966-Instrument-low-level-scan-details-branch-1.patch,
> W-7665966-Instrument-low-level-scan-details-branch-2.2.patch,
> parse_call_trace.pl
>
>
> I have been looking into reported performance regressions in HBase 2 relative
> to HBase 1. Depending on the test scenario, HBase 2 can demonstrate
> significantly better microbenchmarks in a number of cases, and usually shows
> improvement in whole cluster benchmarks like YCSB.
> To assist in debugging I added methods to RpcServer for updating per-call
> metrics that leverage the fact it puts a reference to the current Call into a
> thread local and that all activity for a given RPC is processed by a single
> thread context. I then instrumented ScanQueryMatcher (in branch-1) and its
> various friends (in branch-2.2), StoreScanner, HFileReaderV2 and
> HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock,
> and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables
> with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per
> row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6
> and 2.2 versions under test operated on identical data files in HDFS. For
> tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to
> ensure only the server side differed.
> The results for pe --filterAll were revealing. See attached.
> It appears a refactor to ScanQueryMatcher and friends has disabled the
> ability of filters to provide meaningful SKIP hints, which disables an
> optimization that avoids reseeking, leading to a serious and proportional
> regression in reseek activity and time spent in that code path. So for
> queries that use filters, there can be a substantial regression.
> Other test cases that did not use filters did not show this regression. If
> filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was
> almost identical, as measured by counts of the hint types returned, whether
> or not column or version trackers are called, and counts of store seeks or
> reseeks. Regarding micro-timings, there was a 10% variance in my testing and
> results generally fell within this range, except for the filter all case of
> course.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)