[ 
https://issues.apache.org/jira/browse/HBASE-28204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihir Monani updated HBASE-28204:
---------------------------------
    Release Note: 
Previously, Canary used a Scan for the first region of a table and a Get for 
the remaining regions. RAW Scan was only enabled for the first region of any 
table. If a region had a high number of deleted rows at the start of its 
key-space, the Get could take a very long time to finish. 

With this change, the region canary uses a Scan to validate that every region 
is accessible, and it uses a RAW Scan when the user has enabled that option.
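
The effect of leading delete markers can be illustrated with a toy model. The 
sketch below is not HBase code and does not use the HBase client API: the class 
CanaryReadModel and its methods are hypothetical names, and a region is reduced 
to a sorted array of flags saying whether each cell is a delete marker. It 
assumes that a non-raw read must skip every delete marker before it can return 
a live cell, while a raw read may return the first cell it finds, delete marker 
or not.

```java
import java.util.Arrays;

/** Toy model only (hypothetical names); it does not call any HBase API. */
final class CanaryReadModel {
    private CanaryReadModel() {}

    /** Builds a region whose first cells are delete markers, followed by one live cell. */
    static boolean[] regionWithLeadingDeletes(int deleteMarkers) {
        boolean[] isDeleteMarker = new boolean[deleteMarkers + 1];
        Arrays.fill(isDeleteMarker, 0, deleteMarkers, true);
        return isDeleteMarker; // the last entry stays false: the live cell
    }

    /** Counts the cells a read examines before it can return its first result. */
    static int cellsExamined(boolean[] isDeleteMarker, boolean raw) {
        int examined = 0;
        for (boolean marker : isDeleteMarker) {
            examined++;
            // A raw read accepts any cell; a non-raw read accepts only live cells.
            if (raw || !marker) {
                return examined;
            }
        }
        return examined; // no acceptable cell: the whole region was examined
    }

    public static void main(String[] args) {
        boolean[] region = regionWithLeadingDeletes(100_000);
        System.out.println("non-raw read examined " + cellsExamined(region, false) + " cells");
        System.out.println("raw read examined " + cellsExamined(region, true) + " cells");
    }
}
```

In this model the cost of a non-raw read grows with the number of leading 
delete markers, while the raw read stays constant. The real read path is more 
involved, so the numbers are only indicative of the trend.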

> Canary can take lot more time If any region (except the first region) starts 
> with delete markers
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-28204
>                 URL: https://issues.apache.org/jira/browse/HBASE-28204
>             Project: HBase
>          Issue Type: Bug
>          Components: canary
>            Reporter: Mihir Monani
>            Assignee: Mihir Monani
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.4.18, 2.7.0, 2.5.8, 3.0.0-beta-2
>
>
> In CanaryTool.java, Canary reads only the first row of each region. It uses a 
> [Get|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L520C33-L520C33]
>  for any region of the table that has a non-empty start key. Canary uses a 
> [Scan with FirstKeyOnlyFilter|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L530]
>  if the region has an empty start key (this only happens when the region is 
> the first region of a table).
> With -[HBASE-16091|https://issues.apache.org/jira/browse/HBASE-16091]-, 
> RawScan was 
> [implemented|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L519-L534]
>  to improve performance for regions that can have a high number of delete 
> markers. In the current implementation, [RawScan is only 
> enabled|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L519]
>  if the region has an empty start key (that is, if it is the first region of 
> the table).
> RawScan is therefore not used for any region of the table other than the 
> first. Also, if all or most of the rows in a region have delete markers, the 
> Get operation can take a long time. This can cause timeouts for CanaryTool.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
