[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15889894#comment-15889894
 ] 

stack commented on HBASE-17712:
-------------------------------

bq.  I think sequence id accounting is your favorite part in HBase.

That's funny.

I keep promising a write-up on the life of a sequenceid but its processing is in 
eternal flux. It is an afterthought on our base type, the KeyValue/Cell. It is 
not always present: compaction clears it as an optimization after a 
near-arbitrary amount of time has elapsed, hence the reluctance to lean on it 
in logic. This lack of clarity around the fate of the sequenceid is probably 
the root cause of why it is sometimes treated with kid gloves while at other 
times it is used without locking.... (if we had hybrid logical clocks, 
sequenceid would be inherent to the timestamp, it would always be 'present', 
and it would be integral to Cell ...... TODO).

Want to give an illustration of what in particular is driving you crazy 
[~Apache9]?

bq.  I do not think it is a good idea to just eat the exception and refresh 
store files. 

Agreed... especially given "... in 1.x release, the problem described in 
HBASE-13651 is gone." as Matteo says up in HBASE-13651.

Do we have tests that prove the latter assertion?

Thanks [~Apache9]






> Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound
> -----------------------------------------------------------------
>
>                 Key: HBASE-17712
>                 URL: https://issues.apache.org/jira/browse/HBASE-17712
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0, 1.4.0
>            Reporter: Duo Zhang
>             Fix For: 2.0.0, 1.4.0
>
>
> It was introduced in HBASE-13651 and the logic became much more complicated 
> after HBASE-16304 due to a deadlock issue. It is really tough because the 
> sequence id is involved and the method we call was originally meant to serve 
> secondary replicas, which do not handle writes.
> In fact, in 1.x release, the problem described in HBASE-13651 is gone. Now we 
> will write a compaction marker to WAL before deleting the compacted files. We 
> can only consider a RS as dead after its WAL files are all closed so if the 
> region has already been reassigned the compaction will fail as we can not 
> write out the compaction marker.
> So theoretically, if we still hit a FileNotFound exception, it should be a 
> critical bug which means we may lose data. I do not think it is a good idea 
> to just eat the exception and refresh store files. Or even if we want to do 
> this, we can just refresh store files without dropping memstore contents. 
> This will also simplify the logic a lot.
> Suggestions are welcome.
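
To make the ordering argument in the description concrete, here is a minimal toy sketch of "write the compaction marker to the WAL before deleting the compacted files". It is not HBase code: ToyWal, ToyStore and the file names are hypothetical stand-ins, and the only assumption modeled is the one stated above, namely that a closed WAL rejects the marker and so the compaction fails before any file is removed.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: ToyWal and ToyStore are hypothetical names, not HBase classes.
class CompactionMarkerSketch {

    /** Stand-in for a WAL: once closed, every append fails. */
    static class ToyWal {
        private final List<String> entries = new ArrayList<>();
        private boolean closed = false;

        void append(String entry) throws IOException {
            if (closed) {
                throw new IOException("WAL is closed");
            }
            entries.add(entry);
        }

        void close() {
            closed = true;
        }
    }

    /** Stand-in for a store: a list of file names plus the WAL it logs to. */
    static class ToyStore {
        final List<String> files;
        private final ToyWal wal;

        ToyStore(ToyWal wal, List<String> initialFiles) {
            this.wal = wal;
            this.files = new ArrayList<>(initialFiles);
        }

        void compact(List<String> inputs, String output) throws IOException {
            // Marker first: if this throws, no input file has been deleted yet.
            wal.append("COMPACTION " + inputs + " -> " + output);
            // Only after the marker is logged are the inputs replaced.
            files.removeAll(inputs);
            files.add(output);
        }
    }

    /** Runs one compaction and returns the store's files afterwards. */
    static List<String> filesAfterCompaction(boolean walClosedFirst) {
        ToyWal wal = new ToyWal();
        List<String> inputs = Arrays.asList("hfile-1", "hfile-2");
        ToyStore store = new ToyStore(wal, inputs);
        if (walClosedFirst) {
            wal.close(); // models the WAL being closed before reassignment
        }
        try {
            store.compact(inputs, "hfile-3");
        } catch (IOException e) {
            // The compaction is abandoned; the input files are untouched.
        }
        return store.files;
    }

    public static void main(String[] args) {
        System.out.println(filesAfterCompaction(false)); // [hfile-3]
        System.out.println(filesAfterCompaction(true));  // [hfile-1, hfile-2]
    }
}
```

In this model a reader can only see a file vanish after a marker was successfully logged, which is why a FileNotFound hit under that ordering would point at a real bug rather than at a stale server.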



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
