[ 
https://issues.apache.org/jira/browse/HBASE-25709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554336#comment-17554336
 ] 

Viraj Jasani commented on HBASE-25709:
--------------------------------------

[~Xiaolin Ha] Thanks for providing further resolution. I am quite occupied this 
week; if it is still not reviewed by then, I will take a look next week. Thanks!

[~bbeaudreault] At a high level, we can say that if the rows are quite large 
and a row also has delete markers, those markers are also returned by the scan. 
The patch I added in my previous comment would help in understanding this at a 
low level, but that patch applies to the test that has since been reverted with 
[this commit|https://github.com/apache/hbase/commit/5e34cdf1ef914b7c5d60df0edebd2f32ba543d02].
Basically, the repro can be done by reducing 
HBASE_CELLS_SCANNED_PER_HEARTBEAT_CHECK in the test.

> Close region may stuck when region is compacting and skipped most cells read
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-25709
>                 URL: https://issues.apache.org/jira/browse/HBASE-25709
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>    Affects Versions: 1.7.1, 3.0.0-alpha-2, 2.4.10
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>             Fix For: 2.5.0, 2.6.0, 2.4.11, 3.0.0-alpha-4
>
>         Attachments: Master-UI-RIT.png, RS-region-state.png
>
>
> We found a region stuck while closing in our cluster. The region is compacting, 
> and its store files have many TTL-expired cells. The close-region state 
> marker (HRegion#writestate.writesEnabled) is not checked during compaction 
> because most cells were skipped. 
> !RS-region-state.png|width=698,height=310!
>  
> !Master-UI-RIT.png|width=693,height=157!
>  
> HBASE-23968 encountered a similar problem, but its solution sits outside 
> the method
> InternalScanner#next(List<Cell> result, ScannerContext scannerContext), which 
> does not return when many cells are skipped under the current compaction 
> scanner context. As a result, we need to return from the next method in time 
> and then check the stop marker.
>  
>  
>  
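
To illustrate the description above, here is a minimal sketch of the idea of returning from next() after a bounded number of skipped cells so that the caller can re-check the stop marker. It is not the HBase implementation or the actual patch: Cell, Scanner, checkInterval and compact are hypothetical names made up for this example, and the AtomicBoolean only stands in for HRegion#writestate.writesEnabled.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class SkipAwareScanSketch {

    /** Hypothetical stand-in for a cell; expired cells are skipped by the scanner. */
    static final class Cell {
        final String row;
        final boolean expired;

        Cell(String row, boolean expired) {
            this.row = row;
            this.expired = expired;
        }
    }

    /** Hypothetical scanner; this is NOT the HBase InternalScanner API. */
    static final class Scanner {
        private final Iterator<Cell> cells;
        private final int checkInterval; // assumed knob: max skipped cells per next() call

        Scanner(List<Cell> cells, int checkInterval) {
            this.cells = cells.iterator();
            this.checkInterval = checkInterval;
        }

        /**
         * Adds at most one live cell to result and returns whether more cells remain.
         * After checkInterval skipped cells it returns early with an empty result,
         * so the caller regains control even when nearly everything is expired.
         */
        boolean next(List<Cell> result) {
            int skipped = 0;
            while (cells.hasNext()) {
                Cell cell = cells.next();
                if (!cell.expired) {
                    result.add(cell);
                    return cells.hasNext();
                }
                if (++skipped >= checkInterval) {
                    return cells.hasNext();
                }
            }
            return false;
        }
    }

    /**
     * Compaction-style loop: the stop marker is checked between next() calls.
     * Returns the number of next() calls that were made.
     */
    static int compact(Scanner scanner, AtomicBoolean writesEnabled, List<Cell> out) {
        int calls = 0;
        boolean more = true;
        while (more && writesEnabled.get()) {
            List<Cell> batch = new ArrayList<>();
            more = scanner.next(batch);
            out.addAll(batch);
            calls++;
        }
        return calls;
    }
}
```

In this sketch, a scanner without the checkInterval bound would only return from next() once it found a live cell, so a long run of expired cells would keep compact() from ever looking at writesEnabled; with the bound, the marker is re-checked at least once per checkInterval cells.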



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
