This is a good finding, nice work! I added a comment on HBASE-24742 that mentions HBASE-24637 on the off chance they are related, although I suspect more changes are implicated in the 2.x regression.
On Tue, Jul 14, 2020 at 5:53 PM Bharath Vissapragada <[email protected]> wrote:

> FYI, we filed this today: https://issues.apache.org/jira/browse/HBASE-24742.
> We ran into a similar regression when upgrading from a 1.3-based branch to a
> 1.6-based branch. After some profiling and code analysis we narrowed down the
> code paths.
>
> On Tue, Jul 14, 2020 at 11:38 AM Josh Elser <[email protected]> wrote:
>
>> Wow. Great stuff, Andrew!
>>
>> Thank you for compiling and posting it all here. I can only imagine how
>> time-consuming this was.
>>
>> On 6/26/20 1:57 PM, Andrew Purtell wrote:
>>
>>> Hey Anoop, I opened https://issues.apache.org/jira/browse/HBASE-24637 and
>>> attached the patches and script used to make the comparison.
>>>
>>> On Fri, Jun 26, 2020 at 2:33 AM Anoop John <[email protected]> wrote:
>>>
>>>> Great investigation Andy. Do you know any Jiras which made changes in SQM?
>>>> Would be great if you can attach your patch which tracks the scan flow. If
>>>> we have a Jira for this issue, can you pls attach?
>>>>
>>>> Anoop
>>>>
>>>> On Fri, Jun 26, 2020 at 1:56 AM Andrew Purtell <[email protected]> wrote:
>>>>
>>>>> Related, I think I found a bug in branch-1 where we don't heartbeat in the
>>>>> filter all case until we switch store files, so scanning a very large store
>>>>> file might time out with client defaults. Remarking on this here so I don't
>>>>> forget to follow up.
>>>>>
>>>>> On Jun 25, 2020, at 12:27 PM, Andrew Purtell <[email protected]> wrote:
>>>>>
>>>>>> I repeated this test with pe --filterAll and the results were revealing, at
>>>>>> least for this case. I also patched in a thread-local hash map of atomic
>>>>>> counters that I could update from code paths in SQM, StoreScanner,
>>>>>> HFileReader*, and HFileBlock. Because an RPC is processed by a single
>>>>>> handler thread I could update counters and accumulate micro-timings via
>>>>>> System#nanoTime() per RPC and dump them out of CallRunner in some new
>>>>>> trace logging. I spent a couple of days making sure the instrumentation
>>>>>> was placed equivalently in both the 1.6 and 2.2 code bases and was
>>>>>> producing consistent results. I can provide these patches upon request.
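As an illustration of the per-RPC instrumentation described above, a minimal
sketch might look like the following. The class, method, and counter names are
hypothetical; this is not the actual patch attached to HBASE-24637, just the
general shape of a thread-local counter map that hot scan paths can bump and
CallRunner can dump per call.

  import java.util.HashMap;
  import java.util.Map;
  import java.util.concurrent.atomic.AtomicLong;

  // Hypothetical sketch: each handler thread owns its own counter map, so
  // instrumented code paths (SQM, StoreScanner, HFileReader*, HFileBlock) can
  // bump counters without contention, and the per-RPC totals can be dumped
  // and reset when the call finishes.
  public final class RpcScanCounters {
    private static final ThreadLocal<Map<String, AtomicLong>> COUNTERS =
        ThreadLocal.withInitial(HashMap::new);

    private RpcScanCounters() {}

    // Increment a named counter from an instrumented code path.
    public static void increment(String name, long delta) {
      COUNTERS.get().computeIfAbsent(name, k -> new AtomicLong()).addAndGet(delta);
    }

    // Accumulate elapsed nanoseconds measured with System#nanoTime().
    public static void addElapsedNanos(String name, long startNanos) {
      increment(name + "_ns", System.nanoTime() - startNanos);
    }

    // Called once per RPC (e.g. from CallRunner trace logging) to take the
    // per-call totals and clear the thread-local state.
    public static Map<String, AtomicLong> drainAndReset() {
      Map<String, AtomicLong> snapshot = COUNTERS.get();
      COUNTERS.remove();
      return snapshot;
    }
  }

An instrumented call site would record System.nanoTime() before the hot call
and report the elapsed time afterwards, which is roughly how per-call counters
and timings like store_reseek and store_reseek_ms in the tables below can be
produced.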
>>>>>> Again, test tables with one family and 1, 5, 10, 20, 50, and 100 distinct
>>>>>> column-qualifiers per row. After loading the table I made a snapshot and
>>>>>> cloned the snapshot for testing, for both 1.6 and 2.2, so both versions
>>>>>> were tested using the exact same data files on HDFS. I also used the 1.6
>>>>>> version of PE for both, so the only change is on the server (1.6 vs 2.2
>>>>>> masters and regionservers).
>>>>>>
>>>>>> It appears a refactor to ScanQueryMatcher and friends has disabled the
>>>>>> ability of filters to provide SKIP hints, which prevents us from bypassing
>>>>>> version checking (so some extra cost in SQM), and appears to disable an
>>>>>> optimization that avoids reseeking, leading to a serious and proportional
>>>>>> regression in reseek activity and time spent in that code path. So for
>>>>>> queries that use filters, there can be a substantial regression.
>>>>>>
>>>>>> Other test cases that did not use filters did not show a regression.
>>>>>>
>>>>>> A test case where I used ROW_INDEX_V1 encoding showed an expected modest
>>>>>> proportional regression in seeking time, due to the fact it is optimized
>>>>>> for point queries and not optimized for the full table scan case.
>>>>>>
>>>>>> I will come back here when I understand this better.
>>>>>>
>>>>>> Here are the results for the pe --filterAll case. Each cell is the 1.6.0
>>>>>> value / the 2.2.5 value, with 2.2.5 given as a percentage of 1.6.0; cN
>>>>>> means N columns per row, and "!" marks cells where the 2.2.5 value is
>>>>>> orders of magnitude larger than the 1.6.0 value.
>>>>>>
>>>>>> Counts
>>>>>>
>>>>>> rpcs: c1 1 / 2 (200%), c5 2 / 6 (300%), c10 2 / 10 (500%), c20 3 / 17 (567%), c50 4 / 37 (925%), c100 8 / 72 (900%) (better heartbeating)
>>>>>> block_reads: c1 11507 / 11508 (100%), c5 57255 / 57257 (100%), c10 114471 / 114474 (100%), c20 230372 / 230377 (100%), c50 578292 / 578298 (100%), c100 1157955 / 1157963 (100%)
>>>>>> block_unpacks: c1 11507 / 11508 (100%), c5 57255 / 57257 (100%), c10 114471 / 114474 (100%), c20 230372 / 230377 (100%), c50 578292 / 578298 (100%), c100 1157955 / 1157963 (100%)
>>>>>> seeker_next: c1 10000000 / 10000000 (100%), c5 50000000 / 50000000 (100%), c10 100000000 / 100000000 (100%), c20 200000000 / 200000000 (100%), c50 500000000 / 500000000 (100%), c100 1000000000 / 1000000000 (100%)
>>>>>> store_next: c1 10000000 / 9988268 (100%), c5 50000000 / 49940082 (100%), c10 100000000 / 99879401 (100%), c20 200000000 / 199766539 (100%), c50 500000000 / 499414653 (100%), c100 1000000000 / 998836518 (100%)
>>>>>> store_reseek: c1 1 / 11733 (!), c5 2 / 59924 (!), c10 8 / 120607 (!), c20 6 / 233467 (!), c50 10 / 585357 (!), c100 8 / 1163490 (!)
>>>>>> cells_matched: c1 20000000 / 20000000 (100%), c5 60000000 / 60000000 (100%), c10 110000000 / 110000000 (100%), c20 210000000 / 210000000 (100%), c50 510000000 / 510000000 (100%), c100 1010000000 / 1010000000 (100%)
>>>>>> column_hint_include: c1 10000000 / 10000000 (100%), c5 50000000 / 50000000 (100%), c10 100000000 / 100000000 (100%), c20 200000000 / 200000000 (100%), c50 500000000 / 500000000 (100%), c100 1000000000 / 1000000000 (100%)
>>>>>> filter_hint_skip: c1 10000000 / 10000000 (100%), c5 50000000 / 50000000 (100%), c10 100000000 / 100000000 (100%), c20 200000000 / 200000000 (100%), c50 500000000 / 500000000 (100%), c100 1000000000 / 1000000000 (100%)
>>>>>> sqm_hint_done: c1 9999999 / 9999999 (100%), c5 9999998 / 9999998 (100%), c10 9999998 / 9999998 (100%), c20 9999997 / 9999997 (100%), c50 9999996 / 9999996 (100%), c100 9999992 / 9999992 (100%)
>>>>>> sqm_hint_seek_next_col: c1 1 / 0 (0%), c5 2 / 40000002 (!), c10 2 / 90000002 (!), c20 3 / 190000003 (!), c50 4 / 490000004 (!), c100 8 / 990000008 (!)
>>>>>> sqm_hint_seek_next_row: c1 0 / 10000000 (!), c5 2 / 10000000 (!), c10 0 / 10000000 (!), c20 0 / 10000000 (!), c50 0 / 10000000 (!), c100 0 / 10000000 (!)
>>>>>> sqm_hint_skip: c1 10000000 / 0 (0%), c5 50000000 / 0 (0%), c10 100000000 / 0 (0%), c20 200000000 / 0 (0%), c50 500000000 / 0 (0%), c100 1000000000 / 0 (0%)
>>>>>> versions_hint_include_and_seek_next_col: c1 0 / 0 (0%), c5 0 / 40000000 (!), c10 0 / 90000000 (!), c20 0 / 190000000 (!), c50 0 / 490000000 (!), c100 0 / 990000000 (!)
>>>>>> versions_hint_include_and_seek_next_row: c1 0 / 0 (0%), c5 0 / 10000000 (!), c10 0 / 10000000 (!), c20 0 / 10000000 (!), c50 0 / 10000000 (!), c100 0 / 10000000 (!)
>>>>>> Times
>>>>>>
>>>>>> block_read_ms: c1 174.12 / 215.96 (124%), c5 683.78 / 883.02 (129%), c10 1724.66 / 1937.87 (112%), c20 3199.93 / 3470.92 (108%), c50 7582.96 / 7941.45 (105%), c100 14701.52 / 16022.57 (109%)
>>>>>> block_unpack_ms: c1 0.63 / 0.62 (99%), c5 2.35 / 2.35 (100%), c10 4.36 / 4.56 (105%), c20 9.10 / 8.30 (91%), c50 21.59 / 22.01 (102%), c100 40.97 / 40.93 (100%)
>>>>>> seeker_next_ms: c1 1073.45 / 1053.51 (98%), c5 4452.52 / 4728.96 (106%), c10 9359.61 / 9706.06 (104%), c20 18879.28 / 18330.93 (97%), c50 47664.95 / 46404.26 (97%), c100 96511.83 / 92121.30 (95%)
>>>>>> store_next_ms: c1 2651.11 / 2423.27 (91%), c5 12048.64 / 11398.63 (95%), c10 24833.68 / 23586.20 (95%), c20 48493.26 / 46574.38 (96%), c50 120786.00 / 115603.00 (96%), c100 240627.00 / 227945.00 (95%)
>>>>>> store_reseek_ms: c1 2.71 / 62.17 (2297%), c5 10.93 / 233.67 (2139%), c10 10.88 / 401.88 (3693%), c20 13.98 / 783.18 (5601%), c50 22.26 / 1939.28 (8714%), c100 24.32 / 4065.25 (16714%)
>>>>>>
>>>>>> Rows=10000000
>>>>>> ValueSize=20
>>>>>> BlockEncoding=FAST_DIFF
>>>>>> Compress=NONE
>>>>>> Filter=ALL
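For context on the hint counters above: pe --filterAll scans with a filter
that excludes every cell, and a minimal filter of that shape, written against
the branch-1 Filter API, would look roughly like the sketch below (illustrative
only, not the actual PE filter). In the 1.6.0 numbers those per-cell SKIPs
surface as sqm_hint_skip, while in 2.2.5 sqm_hint_skip drops to zero and the
seek hints (sqm_hint_seek_next_col/row) take their place, which lines up with
the blow-up in store_reseek and store_reseek_ms.

  import org.apache.hadoop.hbase.Cell;
  import org.apache.hadoop.hbase.filter.Filter.ReturnCode;
  import org.apache.hadoop.hbase.filter.FilterBase;

  // Illustrative "filter everything" filter: every cell gets a SKIP hint, so
  // the scan reads all the data but returns nothing to the client. A real
  // filter would also need toByteArray()/parseFrom() so it can be shipped to
  // the regionservers.
  public class SkipEverythingFilter extends FilterBase {
    @Override
    public ReturnCode filterKeyValue(Cell cell) {
      return ReturnCode.SKIP;
    }
  }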
>>>>>> On Thu, Jun 11, 2020 at 5:08 PM Andrew Purtell <[email protected]> wrote:
>>>>>>
>>>>>>> I used PE to generate 10M row tables with one family with either 1, 10,
>>>>>>> 20, 50, or 100 values per row (unique column-qualifiers). An increase in
>>>>>>> wall clock time was noticeable, for example:
>>>>>>>
>>>>>>> 1.6.0
>>>>>>>
>>>>>>> time ./bin/hbase pe --rows=5000000 --table=TestTable_f1_c20 --columns=20 --nomapred scan 2
>>>>>>> real 1m20.179s
>>>>>>> user 0m45.470s
>>>>>>> sys 0m16.572s
>>>>>>>
>>>>>>> 2.2.5
>>>>>>>
>>>>>>> time ./bin/hbase pe --rows=5000000 --table=TestTable_f1_c20 --columns=20 --nomapred scan 2
>>>>>>> real 1m31.016s
>>>>>>> user 0m48.658s
>>>>>>> sys 0m16.965s
>>>>>>>
>>>>>>> It didn't really make a difference if I used 1 thread or 4 or 10; the
>>>>>>> delta was about the same, proportionally. I picked two threads in the end
>>>>>>> so I'd have enough time to launch async-profiler twice in another shell,
>>>>>>> to capture flamegraph and call tree output, respectively. async-profiler
>>>>>>> captured 10 seconds at steady state per test case. Upon first inspection
>>>>>>> what jumps out is an increasing proportion of CPU time spent in GC in
>>>>>>> 2.2.5 vs 1.6.0. The difference increases as the number of column families
>>>>>>> increases. There is little apparent difference at 1 column family, but a
>>>>>>> 2x or more difference at 20 columns, and a 10x or more difference at 100
>>>>>>> columns, eyeballing the charts, flipping back and forth between browser
>>>>>>> windows. This seems more than coincidental but obviously calls for capture
>>>>>>> and analysis of GC trace, with JFR. Will do that next.
>>>>>>>
>>>>>>> JVM: openjdk version "1.8.0_232" OpenJDK Runtime Environment (Zulu
>>>>>>> 8.42.0.21-CA-macosx) (build 1.8.0_232-b18) OpenJDK 64-Bit Server VM (Zulu
>>>>>>> 8.42.0.21-CA-macosx) (build 25.232-b18, mixed mode)
>>>>>>>
>>>>>>> Regionserver JVM flags: -Xms10g -Xmx10g -XX:+UseG1GC -XX:+AlwaysPreTouch
>>>>>>> -XX:+UseNUMA -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
>>>>>>>
>>>>>>> On Thu, Jun 11, 2020 at 7:06 AM Jan Van Besien <[email protected]> wrote:
>>>>>>>
>>>>>>>> This is promising, thanks a lot. Testing with hbase 2.2.5 shows an
>>>>>>>> improvement, but we're not there yet.
>>>>>>>>
>>>>>>>> As reported earlier, hbase 2.1.0 was about 60% slower than hbase 1.2.0 in
>>>>>>>> a test that simply scans all the regions in parallel without any filter.
>>>>>>>> A test with hbase 2.2.5 shows it to be about 40% slower than 1.2.0. So
>>>>>>>> that is better than 2.1.0, but still substantially slower than what hbase
>>>>>>>> 1.2.0 was.
>>>>>>>>
>>>>>>>> As before, I tested this both on a 3 node cluster as well as with a unit
>>>>>>>> test using HBaseTestingUtility. Both tests show very similar relative
>>>>>>>> differences.
>>>>>>>>
>>>>>>>> Jan
>>>>>>>>
>>>>>>>> On Thu, Jun 11, 2020 at 2:16 PM Anoop John <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> In another mail thread Zheng Hu brought up an important Jira fix:
>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-21657
>>>>>>>>> Can u pls check with this once?
>>>>>>>>>
>>>>>>>>> Anoop
>>>>>>>>>
>>>>>>>>> On Tue, Jun 9, 2020 at 8:08 PM Jan Van Besien <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> On Sun, Jun 7, 2020 at 7:49 AM Anoop John <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> As per the above configs, it looks like Bucket Cache is not being used.
>>>>>>>>>>> Only the on-heap LRU cache is in use.
>>>>>>>>>>
>>>>>>>>>> True (but it is large enough to hold everything, so I don't think it
>>>>>>>>>> matters).
>>>>>>>>>>
>>>>>>>>>>> @Jan - Is it possible for you to test with off heap Bucket Cache?
>>>>>>>>>>> Configure bucket cache in off-heap mode with size ~7.5 GB.
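The off-heap bucket cache setup suggested here comes down to roughly the
following two properties (normally set in hbase-site.xml on the regionservers;
shown programmatically only for illustration, with the ~7.5 GB figure taken
from the suggestion above):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;

  // Rough sketch of an off-heap L2 bucket cache configuration.
  public final class OffheapBucketCacheConfig {
    private OffheapBucketCacheConfig() {}

    public static Configuration create() {
      Configuration conf = HBaseConfiguration.create();
      // Put the L2 bucket cache off heap.
      conf.set("hbase.bucketcache.ioengine", "offheap");
      // Bucket cache capacity; values >= 1 are read as megabytes (~7.5 GB here).
      conf.set("hbase.bucketcache.size", "7680");
      // The regionserver JVM also needs enough direct memory reserved for this,
      // e.g. via HBASE_OFFHEAPSIZE / -XX:MaxDirectMemorySize in hbase-env.sh.
      return conf;
    }
  }

Whether this helps depends on the workload; as Jan reports next, with the whole
dataset already fitting in the on-heap LruBlockCache it made no measurable
difference here.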
>>>>>>>>>> I did a quick test but it seems not to make a measurable difference. If
>>>>>>>>>> anything, it is actually slightly slower even. I see 100% hit ratio in
>>>>>>>>>> the L1 LruBlockCache and effectively also 100% in the L2 BucketCache
>>>>>>>>>> (hit ratio is not yet at 100%, but hits increase with every test and
>>>>>>>>>> misses do not).
>>>>>>>>>>
>>>>>>>>>> Given that the LruBlockCache was already large enough to cache all the
>>>>>>>>>> data anyway, I did not expect this to help either, to be honest.
>>>>>>>>>>
>>>>>>>>>>> Do you have any DataBlockEncoding enabled on the CF?
>>>>>>>>>>
>>>>>>>>>> Yes, FAST_DIFF. But this is of course true in both the tests with hbase2
>>>>>>>>>> and hbase1.
>>>>>>>>>>
>>>>>>>>>> Jan
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Andrew
>>>>>>>
>>>>>>> Words like orphans lost among the crosstalk, meaning torn from truth's
>>>>>>> decrepit hands
>>>>>>>    - A23, Crosstalk
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Andrew
>>>>>>
>>>>>> Words like orphans lost among the crosstalk, meaning torn from truth's
>>>>>> decrepit hands
>>>>>>    - A23, Crosstalk

--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's decrepit hands
   - A23, Crosstalk
