[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990068#comment-16990068
 ] 

Viraj Jasani edited comment on HBASE-23349 at 12/7/19 3:10 PM:
---------------------------------------------------------------

Thanks [~ram_krish]

What you are suggesting is that the discharger thread itself notifies scanners 
to reset (scanner reset used to happen only before HBASE-13082, right?). Also, 
as of now notifyChangedReadersObservers() is called only during store flush, so 
if we start calling it from the discharger thread, the thread should try to 
reset the heap for all scanners and not for a specific one, right? The thread 
might not have the context of a specific scanner. And if so, can we directly 
reset refCount to 0 for all compacted away store files for a given store? Once 
we reset heap and lastTop, maybe we don't need to worry about refCount?

To sum up: by default, if store files still can't be archived because 
refCount > 0 after 2 min of discharger thread runs, the thread should 
immediately call notifyChangedReadersObservers(), which should reset the heap 
for all existing scanners. Please let me know if my understanding is not correct.

In a way this could take care of open scanners gracefully.
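
A rough sketch of the flow summarized above, only to make the idea concrete. 
Every name here (CompactedFile, ResettableScanner, DischargerSketch, 
MAX_BLOCKED_MS) is hypothetical and is not the HBase API. It assumes that a 
scanner releases its reference on compacted away files when its heap is reset, 
and it uses the 2 min threshold mentioned above:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a compacted away store file and its reader refCount. */
final class CompactedFile {
    final String name;
    int refCount;
    long firstBlockedAtMs = -1; // first pass in which archival was skipped due to refCount > 0

    CompactedFile(String name, int refCount) {
        this.name = name;
        this.refCount = refCount;
    }
}

/** Hypothetical stand-in for an open scanner that can rebuild its heap. */
interface ResettableScanner {
    void resetHeap();
}

final class DischargerSketch {
    // Assumed threshold: 2 min, taken from the discussion above.
    static final long MAX_BLOCKED_MS = 2 * 60 * 1000L;

    /** One discharger pass. Returns the names of the files archived in this pass. */
    static List<String> runOnce(List<CompactedFile> files,
                                List<ResettableScanner> scanners,
                                long nowMs) {
        boolean notifyScanners = false;
        for (CompactedFile f : files) {
            if (f.refCount > 0) {
                if (f.firstBlockedAtMs < 0) {
                    f.firstBlockedAtMs = nowMs; // remember when this file first blocked archival
                }
                if (nowMs - f.firstBlockedAtMs >= MAX_BLOCKED_MS) {
                    notifyScanners = true;
                }
            }
        }
        if (notifyScanners) {
            // No context of a specific scanner here, so every open scanner is reset.
            for (ResettableScanner s : scanners) {
                s.resetHeap();
            }
        }
        // Archive (here: just remove from the list) every file that is no longer referenced.
        List<String> archived = new ArrayList<>();
        for (Iterator<CompactedFile> it = files.iterator(); it.hasNext(); ) {
            CompactedFile f = it.next();
            if (f.refCount == 0) {
                archived.add(f.name);
                it.remove();
            }
        }
        return archived;
    }
}
```

In this sketch the files list must be mutable, and a file blocked by an open 
scanner is archived in the first pass that runs at least MAX_BLOCKED_MS after 
it was first skipped.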

 

Edit: If we notify all open scanners to reset themselves, even scanners reading 
non-compacted store files would be impacted, right? At that moment, all open 
scanners would take a little longer than usual, regardless of whether they are 
scanning compacted away files. Would this be fine?

Also, I was looking at the notify code, and it seems difficult to know which 
StoreScanner currently holds the lock on the impacted (compacted away) store 
files. If we knew this, we could probably have a better implementation that 
resets the heap only in the StoreScanners that are using compacted away store 
files, rather than in all StoreScanners.
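
One way this bookkeeping could look, as a minimal sketch only. 
ScannerFileRegistry and its methods are hypothetical names, not existing HBase 
code, and the sketch assumes each scanner can report the store files it opens 
and closes:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical registry mapping each store file to the scanners currently reading it. */
final class ScannerFileRegistry {
    // store file name -> ids of the scanners that hold a reference on it
    private final Map<String, Set<String>> scannersByFile = new HashMap<>();

    /** Called (by assumption) when a scanner opens its store files. */
    synchronized void register(String scannerId, String... fileNames) {
        for (String file : fileNames) {
            scannersByFile.computeIfAbsent(file, k -> new HashSet<>()).add(scannerId);
        }
    }

    /** Called (by assumption) when a scanner closes or has reset its heap. */
    synchronized void unregister(String scannerId) {
        scannersByFile.values().forEach(ids -> ids.remove(scannerId));
        scannersByFile.values().removeIf(Set::isEmpty);
    }

    /** Returns only the scanners that read at least one of the compacted away files. */
    synchronized Set<String> scannersToReset(String... compactedAwayFiles) {
        Set<String> result = new TreeSet<>();
        for (String file : compactedAwayFiles) {
            Set<String> ids = scannersByFile.get(file);
            if (ids != null) {
                result.addAll(ids);
            }
        }
        return result;
    }
}
```

With something like this, the discharger thread could ask for 
scannersToReset(...) and leave scanners on non-compacted files untouched.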


> Reader lock on compacted store files preventing archival of compacted files
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-23349
>                 URL: https://issues.apache.org/jira/browse/HBASE-23349
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0, 2.3.0, 1.6.0
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>             Fix For: 3.0.0, 2.3.0, 1.6.0
>
>         Attachments: HBASE-23349.master.000.patch, 
> HBASE-23349.master.001.patch, HBASE-23349.master.002.patch
>
>
> refCounts on compacted away store files as low as 1 can also prevent archival.
> {code:java}
> regionserver.HStore - Can't archive compacted file 
> hdfs://{{root-dir}}/hbase/data/default/t1/12a9e1112e0371955b3db8d3ebb2d298/cf1/73b72f5ddfce4a34a9e01afe7b83c1f9
>  because of either isCompactedAway=true or file has reference, 
> isReferencedInReads=true, refCount=1, skipping for now.
> {code}
> We should either come up with core code that blocks the reader lock if a 
> client or coprocessor has held it for a significantly long time 
> (configurable, mostly the same as the discharger thread interval), or 
> gracefully resolve the reader lock issue.


