[ 
https://issues.apache.org/jira/browse/HBASE-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365086#comment-16365086
 ] 

Francis Liu commented on HBASE-19468:
-------------------------------------

cherry picked this to 1.3.2

> FNFE during scans and flushes
> -----------------------------
>
>                 Key: HBASE-19468
>                 URL: https://issues.apache.org/jira/browse/HBASE-19468
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver, Scanners
>    Affects Versions: 1.3.1
>            Reporter: Thiruvel Thirumoolan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 2.0.0, 1.3.2, 1.4.1, 1.5.0
>
>         Attachments: HBASE-19468-poc.patch, HBASE-19468_1.4.patch, 
> HBASE-19468_master.patch
>
>
> We see FNFE exceptions on our 1.3 clusters when scans and flushes happen at 
> the same time. This causes regionserver to throw a UnknownScannerException 
> and client retries.
> This happens during the following sequence:
> 1. Scanner open, client fetched some rows from regionserver and working on it
> 2. Flush happens and storeScanner is updated with flushed files 
> (StoreScanner.updateReaders())
> 3. Compaction happens on the region while scanner is still open
> 4. compaction discharger runs and cleans up the newly flushed file as we 
> don't have new scanners on it yet.
> 5. Client issues scan.next and during StoreScanner.resetScannerStack(), we 
> get a FNFE. RegionServer throws a UnknownScannerThe client retries in 1.3. 
> With branch-1.4, the scan fails with a DoNotRetryIOException.
> [~ram_krish], My proposal is to increment the reader count during 
> updateReaders() and decrement it during resetScannerStack(), so discharger 
> doesn't clean it up. Scan lease expiries also have to be taken care of. Am I 
> missing anything? Is there a better approach?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to