[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

Duo Zhang (JIRA) Wed, 01 Mar 2017 02:28:54 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15889918#comment-15889918
 ]


Duo Zhang commented on HBASE-17712:
-----------------------------------

{quote}
Want to give an illustration of what in particular is driving you crazy Duo 
Zhang?
{quote}
In HBASE-17633, I want to update the lowestUnflushedSequenceId in 
internalFlushCacheAndCommit using the memstore's minSequenceId. Then I found 
that we may modify the memstore contents in refreshStoreFiles, which is not 
part of flush processing. After reading the code related to region replicas, 
I found that case easy to handle, since a secondary replica does not handle 
writes and the replay is single threaded, so there is no race condition. But 
in the end I found that we even call dropMemstoreContents in doDelta! This is 
a total mess. I cannot find a safe way to update the lowestUnflushedSequenceId 
if the minSequenceId changes because we drop some contents from the memstore. 
What happens if a flush is ongoing at the same time?
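
To make the concern concrete, here is a minimal toy sketch of one possible interleaving. It is not HBase code: the class, the TreeMap standing in for the memstore, and the methods write, dropUpTo and publish are all hypothetical names invented for illustration. It only shows that a minSequenceId read by a flush can stop matching the memstore if contents are dropped before the flush publishes it.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: none of these names are HBase APIs.
class LowestUnflushedSketch {
    static final long NO_SEQ_ID = -1L;

    // Hypothetical stand-in for a memstore: sequence id -> edit.
    private final TreeMap<Long, String> memstore = new TreeMap<>();
    // Value a flush would publish as the lowest sequence id not yet flushed.
    private long lowestUnflushedSequenceId = NO_SEQ_ID;

    void write(long seqId, String edit) {
        memstore.put(seqId, edit);
    }

    // Smallest sequence id still held in the toy memstore, or NO_SEQ_ID if empty.
    long minSequenceId() {
        Map.Entry<Long, String> first = memstore.firstEntry();
        return first == null ? NO_SEQ_ID : first.getKey();
    }

    // Stand-in for dropping memstore contents up to and including seqId.
    void dropUpTo(long seqId) {
        memstore.headMap(seqId, true).clear();
    }

    void publish(long seqId) {
        lowestUnflushedSequenceId = seqId;
    }

    long published() {
        return lowestUnflushedSequenceId;
    }

    // Replays one interleaving sequentially: the flush reads the minimum,
    // a drop runs, and only then does the flush publish what it read.
    static LowestUnflushedSketch replayInterleaving() {
        LowestUnflushedSketch s = new LowestUnflushedSketch();
        s.write(5L, "a");
        s.write(6L, "b");
        s.write(9L, "c");
        long seenByFlush = s.minSequenceId(); // flush path reads 5
        s.dropUpTo(6L);                       // another path drops 5 and 6
        s.publish(seenByFlush);               // flush publishes the value it read earlier
        return s;
    }

    public static void main(String[] args) {
        LowestUnflushedSketch s = replayInterleaving();
        // prints "published=5 actualMin=9"
        System.out.println("published=" + s.published() + " actualMin=" + s.minSequenceId());
    }
}
```

In this ordering the published value (5) is lower than what the memstore actually holds (9). Other orderings would give other mismatches, and which of them are safe is exactly the open question.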

{quote}
Do we have tests that prove the latter assertion?
{quote}
I could try to add one.

Thanks.

> Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound
> -----------------------------------------------------------------
>
>                 Key: HBASE-17712
>                 URL: https://issues.apache.org/jira/browse/HBASE-17712
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0, 1.4.0
>            Reporter: Duo Zhang
>             Fix For: 2.0.0, 1.4.0
>
>
> It was introduced in HBASE-13651, and the logic became much more complicated 
> after HBASE-16304 due to a deadlock issue. It is really tough because sequence 
> ids are involved, and the method we call was originally written to serve 
> secondary replicas, which do not handle writes.
> In fact, in the 1.x releases, the problem described in HBASE-13651 is gone. We 
> now write a compaction marker to the WAL before deleting the compacted files. 
> An RS can only be considered dead after its WAL files are all closed, so if 
> the region has already been reassigned, the compaction will fail because we 
> cannot write out the compaction marker.
> So theoretically, if we still hit a FileNotFound exception, it should be a 
> critical bug, which means we may lose data. I do not think it is a good idea 
> to just eat the exception and refresh the store files. Or, even if we want to 
> do this, we can just refresh the store files without dropping the memstore 
> contents. This will also simplify the logic a lot.
> Suggestions are welcome.
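
The ordering described in the issue description (write the compaction marker to the WAL first, and delete the compacted files only if that write succeeds) can be sketched roughly as follows. This is a toy model under stated assumptions, not HBase code: the Wal class, finishCompaction, and the string file names are hypothetical and exist only to show why a closed WAL stops the deletion.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these names are HBase APIs.
class CompactionMarkerSketch {

    // Hypothetical WAL that rejects appends once it has been closed.
    static class Wal {
        private boolean closed;
        private final List<String> entries = new ArrayList<>();

        void close() {
            closed = true;
        }

        void append(String entry) throws IOException {
            if (closed) {
                throw new IOException("WAL is closed");
            }
            entries.add(entry);
        }
    }

    // Returns true only if the marker was written and the compacted files were removed.
    static boolean finishCompaction(Wal wal, List<String> storeFiles, List<String> compacted) {
        try {
            // Step 1: record the compaction in the WAL.
            wal.append("COMPACTION " + compacted);
        } catch (IOException e) {
            // The marker could not be written, so the compaction fails
            // and the old files are left in place.
            return false;
        }
        // Step 2: only now remove the compacted files.
        storeFiles.removeAll(compacted);
        return true;
    }
}
```

Under this model, a server whose WAL has already been closed cannot reach step 2, which is why a remaining FileNotFound exception would point to a real bug and not to an expected race.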



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
