[jira] [Commented] (HBASE-19468) FNFE during scans and flushes

ramkrishna.s.vasudevan (JIRA) Tue, 12 Dec 2017 22:41:42 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16288788#comment-16288788
 ]


ramkrishna.s.vasudevan commented on HBASE-19468:
------------------------------------------------

Thanks for the review. 
bq. Opening the reader in updateReaders will make the flusher open all the 
storefiles for all scanners. Does it impact performance?
Before writing this patch I first checked the code. One thing is certain: as 
soon as a flush is done, we commit the flushed file:
{code}
private HStoreFile commitFile(Path path, long logCacheFlushId, MonitoredTask status)
    throws IOException {
{code}
HStore#commitFile opens the reader. So the actual Store#getScanner() call 
internally just creates the lightweight store file scanner; the reader is not 
opened by this method. So no extra resource is being held. Please correct me 
if I am missing something here. [~chia7712]?
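
To illustrate the point above, here is a minimal sketch, not HBase code. All 
class and method names in it (SketchReader, SketchStoreFile, SketchScanner, 
commitFile) are hypothetical stand-ins. It only shows the shape of the 
argument: the costly reader is opened once when the flushed file is committed, 
and every scanner created afterwards reuses it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins for illustration only; these are not the HBase classes.
class SharedReaderSketch {
    /** Counts how many times a (costly) reader was opened. */
    static final AtomicInteger READER_OPENS = new AtomicInteger();

    /** Stand-in for a store file reader: constructing it is the costly step. */
    static final class SketchReader {
        final String path;

        SketchReader(String path) {
            this.path = path;
            READER_OPENS.incrementAndGet();
        }
    }

    /** Stand-in for a lightweight scanner that borrows an already-open reader. */
    static final class SketchScanner {
        final SketchReader reader;

        SketchScanner(SketchReader reader) {
            this.reader = reader;
        }
    }

    /** Stand-in for a committed store file that owns exactly one open reader. */
    static final class SketchStoreFile {
        private final SketchReader reader;

        SketchStoreFile(String path) {
            this.reader = new SketchReader(path); // opened once, at commit time
        }

        /** Creates a scanner over the existing reader; nothing is reopened. */
        SketchScanner newScanner() {
            return new SketchScanner(reader);
        }
    }

    /** Assumed analogue of committing a flushed file: the reader opens here. */
    static SketchStoreFile commitFile(String path) {
        return new SketchStoreFile(path);
    }

    public static void main(String[] args) {
        SketchStoreFile flushed = commitFile("/tmp/flushed-file");
        for (int i = 0; i < 3; i++) {
            flushed.newScanner();
        }
        System.out.println("reader opens: " + READER_OPENS.get());
    }
}
```

Under these assumptions, creating more scanners does not increase the number 
of open readers, which is why no extra resource should be held.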

Coming to this
bq. May be it was a get op and no other next() calls might happen? Even on Scan.
A get is generally a single RPC; I believe there are no multi-RPC gets. So 
even if the memstore scanner was switched to a file scanner by a flush, it 
will be closed by the close call.
For scans, suppose next() is never called again, say because of a lease 
expiry. Even then we close this scanner as part of close().

bq.That is why I was wondering whether we can have a similar way of update 
readers after compaction(like the flush) and clear these new files from list.. 
Oh ya we should have ways of notify
I thought of this first as well, but felt it may be tricky to implement. For 
that, compaction would again have to call updateReaders(), as it did before. 
Hence I went with this simple approach.
It would mean that on every compaction we again have to instruct the scanner 
to update its scanner list, and avoiding that was the actual aim of the ref 
counting feature. Correct me if I am wrong here. Anyway, I will still see if 
there is a better way.
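
For readers following the ref counting discussion, here is a minimal 
single-threaded sketch of the idea, not the attached patch. All names in it 
(SketchStoreFile, SketchScanner, discharge) are hypothetical. It assumes a 
scanner pins flushed files when its readers are updated, releases them in 
close() (which also covers the lease expiry path), and a discharger-like 
chore that deletes only compacted files with no references.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins for illustration only; these are not the HBase classes.
class RefCountSketch {
    /** Stand-in for a store file carrying a reference count. */
    static final class SketchStoreFile {
        final String name;
        final AtomicInteger refCount = new AtomicInteger();
        boolean compactedAway; // set once a compaction has replaced this file
        boolean deleted;       // set once the discharger has removed it

        SketchStoreFile(String name) {
            this.name = name;
        }
    }

    /** Stand-in for a scanner that pins the files it may still need to read. */
    static final class SketchScanner implements AutoCloseable {
        private final List<SketchStoreFile> held = new ArrayList<>();
        private boolean closed;

        /** Called after a flush: pin the new files so cleanup skips them. */
        void updateReaders(List<SketchStoreFile> flushedFiles) {
            for (SketchStoreFile f : flushedFiles) {
                f.refCount.incrementAndGet();
                held.add(f);
            }
        }

        /** Releases every pinned file; safe to call more than once. */
        @Override
        public void close() {
            if (closed) {
                return;
            }
            closed = true;
            for (SketchStoreFile f : held) {
                f.refCount.decrementAndGet();
            }
            held.clear();
        }
    }

    /** Stand-in for the discharger: deletes only unreferenced compacted files. */
    static int discharge(List<SketchStoreFile> files) {
        int removed = 0;
        for (SketchStoreFile f : files) {
            if (f.compactedAway && !f.deleted && f.refCount.get() == 0) {
                f.deleted = true;
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        SketchStoreFile flushed = new SketchStoreFile("flushed");
        List<SketchStoreFile> files = new ArrayList<>();
        files.add(flushed);

        SketchScanner scanner = new SketchScanner();
        scanner.updateReaders(files);  // flush happens while the scanner is open
        flushed.compactedAway = true;  // compaction replaces the flushed file
        System.out.println("removed while scanner open: " + discharge(files));

        scanner.close();               // normal close or lease expiry
        System.out.println("removed after close: " + discharge(files));
    }
}
```

With this shape, compaction never has to notify open scanners: the file 
simply stays on disk until the last scanner holding it has closed.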



> FNFE during scans and flushes
> -----------------------------
>
>                 Key: HBASE-19468
>                 URL: https://issues.apache.org/jira/browse/HBASE-19468
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver, Scanners
>    Affects Versions: 1.3.1
>            Reporter: Thiruvel Thirumoolan
>            Priority: Critical
>             Fix For: 2.0.0, 1.4.1, 1.5.0, 1.3.3
>
>         Attachments: HBASE-19468-poc.patch, HBASE-19468_1.4.patch
>
>
> We see FNFE exceptions on our 1.3 clusters when scans and flushes happen at 
> the same time. This causes the regionserver to throw an 
> UnknownScannerException, and the client retries.
> This happens during the following sequence:
> 1. Scanner open, client fetched some rows from regionserver and working on it
> 2. Flush happens and storeScanner is updated with flushed files 
> (StoreScanner.updateReaders())
> 3. Compaction happens on the region while scanner is still open
> 4. compaction discharger runs and cleans up the newly flushed file as we 
> don't have new scanners on it yet.
> 5. Client issues scan.next and during StoreScanner.resetScannerStack(), we 
> get a FNFE. RegionServer throws an UnknownScannerException. The client 
> retries in 1.3. With branch-1.4, the scan fails with a DoNotRetryIOException.
> [~ram_krish], My proposal is to increment the reader count during 
> updateReaders() and decrement it during resetScannerStack(), so discharger 
> doesn't clean it up. Scan lease expiries also have to be taken care of. Am I 
> missing anything? Is there a better approach?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
