[ https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813330#comment-13813330 ]
Ted Yu commented on HBASE-8942:
-------------------------------
From the following call stack (trunk), I don't see where readLock is grabbed:
{code}
HStore.getScanner(Scan, NavigableSet<byte[]>, long) line: 1683
HRegion$RegionScannerImpl.<init>(Scan, List<KeyValueScanner>, HRegion) line: 3427
HRegion.instantiateRegionScanner(Scan, List<KeyValueScanner>) line: 1746
HRegion.getScanner(Scan, List<KeyValueScanner>) line: 1738
HRegion.getScanner(Scan) line: 1715
TestHRegionBusyWait(TestHRegion).testWritesWhileScanning() line: 2914
{code}
> DFS errors during a read operation (get/scan), may cause write outliers
> -----------------------------------------------------------------------
>
> Key: HBASE-8942
> URL: https://issues.apache.org/jira/browse/HBASE-8942
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.89-fb, 0.95.2
> Reporter: Amitanand Aiyer
> Assignee: Amitanand Aiyer
> Priority: Minor
> Fix For: 0.89-fb, 0.98.0, 0.96.1, 0.94.14
>
> Attachments: 8942.096.txt, HBase-8942.txt
>
>
> This is similar to the issue discussed in HBASE-8228.
> 1) A scanner holds the Store.ReadLock() while opening the store files ...
> encounters errors. Thus, it takes a long time to finish.
> 2) A flush completes in the meantime. It needs the write lock to
> commit() and update scanners, so it ends up waiting.
> 3+) All Puts (and also Gets) to the CF, which need a read lock, have
> to wait for 1) and 2) to complete. This blocks updates to the system
> for the duration of the DFS timeout.
> Fix:
> Open store files outside the read lock. getScanners() already tries to do
> this optimisation. However, Store.getScanner(), which calls this function
> through the StoreScanner constructor, redundantly tries to grab the readLock.
> This causes the readLock to be held while the storeFiles are being opened
> and seeked.
> We should get rid of the readLock() in Store.getScanner(); it is not
> required. The constructor for StoreScanner calls getScanners(xxx, xxx, xxx),
> which already has the required locking.
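
To make the locking pattern in the description concrete, here is a minimal, self-contained sketch. It is not HBase code: StoreLockSketch, openFiles(), getScannerWithOuterLock(), getScannerWithoutOuterLock() and holdCountDuringOpen are all hypothetical names, and the "store files" are plain strings. It only illustrates that an outer read lock around getScanners() stays held during the slow open step, while dropping it leaves the open step lock-free.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a store; NOT the real HBase Store/HStore class.
class StoreLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> storeFiles =
            new ArrayList<>(Arrays.asList("file-1", "file-2"));

    // Read-lock hold count of the calling thread, recorded during the open step.
    int holdCountDuringOpen = -1;

    // Simulates the slow part (opening and seeking store files on DFS).
    private List<String> openFiles(List<String> files) {
        holdCountDuringOpen = lock.getReadHoldCount();
        List<String> scanners = new ArrayList<>();
        for (String f : files) {
            scanners.add("scanner:" + f);
        }
        return scanners;
    }

    // Assumed shape of getScanners(): snapshot the file list under the
    // read lock, then open the files after releasing it.
    List<String> getScanners() {
        List<String> snapshot;
        lock.readLock().lock();
        try {
            snapshot = new ArrayList<>(storeFiles);
        } finally {
            lock.readLock().unlock();
        }
        return openFiles(snapshot);
    }

    // "Before": a redundant outer read lock is held across the open step.
    List<String> getScannerWithOuterLock() {
        lock.readLock().lock();
        try {
            return getScanners();
        } finally {
            lock.readLock().unlock();
        }
    }

    // "After": rely on the locking already done inside getScanners().
    List<String> getScannerWithoutOuterLock() {
        return getScanners();
    }
}
```

In this sketch, a flush waiting for the write lock would have to wait out the whole open step in the "before" variant, but only the short snapshot in the "after" variant.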