Bram Schuur created HBASE-28902:
-----------------------------------

             Summary: Performance regression from 2.5.8 to 2.6.0 when seeking 
with SEEK_NEXT_USING_HINT to the next column family.
                 Key: HBASE-28902
                 URL: https://issues.apache.org/jira/browse/HBASE-28902
             Project: HBase
          Issue Type: Bug
          Components: Scanners
    Affects Versions: 2.6.0
            Reporter: Bram Schuur


We have a custom HBase filter that seeks (SEEK_NEXT_USING_HINT) to the next 
column family (called "cf" in our case) based on data in a cell in a prior 
column family (called "bf_slicing"). After we upgraded from HBase 2.5.8 to 
2.6.0, the change made in https://issues.apache.org/jira/browse/HBASE-27788 
caused a significant performance degradation: instead of seeking to the next 
family instantly, the scan now traverses the entire bf_slicing family.

We traced the cause to the following:

When the families are compared in the code linked below, the 'cf' family is 
ordered lower than 'bf_slicing' because of its shorter length, so the first 
column family ("bf_slicing") is fully traversed. The offending code is here: 
[https://github.com/apache/hbase/pull/5171/files#diff-1ec9654ed8e00f46e11430fc726f8351db59597723efa0bf1e268196f00244c6R54]
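
To illustrate the ordering problem described above, here is a minimal, 
standalone sketch. It is not the HBase code: the class FamilyOrderSketch and 
its methods are hypothetical names, and the length-first comparison is only an 
assumption about the kind of ordering that would produce the behaviour we 
observe. It shows that "cf" sorts after "bf_slicing" under plain lexicographic 
byte order, but before it when length is compared first.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in HBase.
final class FamilyOrderSketch {

    // Plain lexicographic order over unsigned bytes.
    static int lexicographic(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    // Assumed length-first order: a shorter array always sorts lower,
    // and the bytes are only compared when the lengths are equal.
    static int lengthFirst(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return Integer.compare(a.length, b.length);
        }
        return Arrays.compareUnsigned(a, b);
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] current = utf8("bf_slicing"); // family being scanned
        byte[] hinted = utf8("cf");          // family named in the seek hint

        // Positive: the hint lies after the current family, so a seek is useful.
        System.out.println("lexicographic: "
            + Integer.signum(lexicographic(hinted, current)));

        // Negative: the hint appears to lie before the current family.
        System.out.println("length-first:  "
            + Integer.signum(lengthFirst(hinted, current)));
    }
}
```

If a seek hint is judged to lie before the current position, a scanner has no 
reason to jump forward, which would be consistent with the full traversal of 
bf_slicing that we see.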

The original story (HBASE-27788) states that no seeking should be done outside 
a column family, but our use case seems legitimate in the data model, so we 
think this is a bug.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
