[
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15889990#comment-15889990
]
stack commented on HBASE-17712:
-------------------------------
bq. What happens if there is a flush ongoing at the same time?
I see. Looks like cruft built on top of cruft. It's been a while since I was in here.
Replacement of current set of hfiles was always a little awkward. We didn't
want every access going across a synchronization just to check for the
extremely rare case of a change in the store file Set. I'd have to do some
archeology to see if retry of FNFE was a compromise so we could do w/o a sync
check. Would be coolio if we could purge having to handle FNFE.
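To make the trade-off concrete, here is a minimal sketch, not the actual HBase code. Readers take an unsynchronized snapshot of the current file list and retry on FileNotFoundException if the list was swapped underneath them. The class StoreFileSnapshotSketch, the FileOpener interface and the file names are all hypothetical and exist only for illustration.

```java
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; none of these names exist in HBase.
class StoreFileSnapshotSketch {

    /** Hypothetical stand-in for opening a store file by path. */
    interface FileOpener {
        String open(String path) throws FileNotFoundException;
    }

    // Readers never lock; they just read whatever list is current.
    private final AtomicReference<List<String>> storeFiles;

    StoreFileSnapshotSketch(List<String> initial) {
        storeFiles = new AtomicReference<>(
            Collections.unmodifiableList(new ArrayList<>(initial)));
    }

    /** Called by a compaction or flush to swap in the new file list. */
    void replaceStoreFiles(List<String> replacement) {
        storeFiles.set(Collections.unmodifiableList(new ArrayList<>(replacement)));
    }

    /**
     * Opens the first file of the current snapshot. If the file has been
     * removed in the meantime, re-reads the list and tries again, up to
     * maxRetries additional attempts.
     */
    String readFirst(FileOpener opener, int maxRetries) throws FileNotFoundException {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        FileNotFoundException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            List<String> snapshot = storeFiles.get(); // no synchronization
            try {
                return opener.open(snapshot.get(0));
            } catch (FileNotFoundException e) {
                last = e; // the list may have changed; loop and re-read it
            }
        }
        throw last;
    }
}
```

The common path costs one volatile read and no lock. The price is that every reader has to be prepared for the FNFE, which is the handling we would like to purge.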
I don't follow the comment on why the call to dropMemstoreContents was added to
doDelta by:
{code}
tree 11b5d28bb22d95bd5c6276346f3055412b2d6902
parent dda8f67b2cc9f6ef4ab434beea2a47d461a20a1f
author tedyu <[email protected]> Wed Aug 24 09:04:47 2016 -0700
committer tedyu <[email protected]> Wed Aug 24 09:04:47 2016 -0700
HBASE-16304 HRegion#RegionScannerImpl#handleFileNotFoundException may lead to
deadlock when trying to obtain write lock on updatesLock
{code}
Looking at my review of HBASE-16304, my last remark was: "I'm not sure I follow
the dropMemstoreContents(); bits. Some more commentary on interrelation might
help" ... to which the response was that there was explanation (I don't see
it...). Ram asks what it is about later also.... It doesn't look like he got a
straight response.
Can you help here [~tedyu]?
> Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound
> -----------------------------------------------------------------
>
> Key: HBASE-17712
> URL: https://issues.apache.org/jira/browse/HBASE-17712
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0, 1.4.0
> Reporter: Duo Zhang
> Fix For: 2.0.0, 1.4.0
>
>
> It was introduced in HBASE-13651, and the logic became much more complicated
> after HBASE-16304 due to a deadlock issue. It is really tough because sequence
> ids are involved, and the method we call was originally used to serve secondary
> replicas, which do not handle writes.
> In fact, in the 1.x releases, the problem described in HBASE-13651 is gone. Now
> we write a compaction marker to the WAL before deleting the compacted files. We
> can only consider an RS dead after its WAL files are all closed, so if the
> region has already been reassigned, the compaction will fail because we cannot
> write out the compaction marker.
> So theoretically, if we still hit a FileNotFound exception, it should be a
> critical bug, which means we may lose data. I do not think it is a good idea
> to just eat the exception and refresh store files. Or even if we want to do
> this, we can just refresh store files without dropping memstore contents.
> This will also simplify the logic a lot.
> Suggestions are welcome.