[
https://issues.apache.org/jira/browse/HBASE-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546264#comment-14546264
]
Hudson commented on HBASE-13651:
--------------------------------
SUCCESS: Integrated in HBase-0.98 #989 (See
[https://builds.apache.org/job/HBase-0.98/989/])
HBASE-13651 Handle StoreFileScanner FileNotFoundException (matteo.bertozzi: rev
42e3e37ee3b3d3fc3d348e3888f07237b680e594)
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCorruptedRegionStoreFile.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
> Handle StoreFileScanner FileNotFoundException
> ---------------------------------------------
>
> Key: HBASE-13651
> URL: https://issues.apache.org/jira/browse/HBASE-13651
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.27, 0.98.10.1
> Reporter: Matteo Bertozzi
> Assignee: Matteo Bertozzi
> Priority: Minor
> Fix For: 2.0.0, 0.94.28, 0.98.13, 1.2.0
>
> Attachments: HBASE-13651-0.94-draft.patch, HBASE-13651-draft.patch,
> HBASE-13651-v0-0.94.patch, HBASE-13651-v0-0.98.patch,
> HBASE-13651-v0-branch-1.patch, HBASE-13651-v0.patch
>
>
> Example:
> * Machine-1 is serving Region-X and starts a compaction
> * Machine-1 goes into a GC pause
> * Region-X gets reassigned to Machine-2
> * Machine-1 exits the GC pause
> * Machine-1 (re)moves the compacted files
> * Machine-1's lease expires and it shuts down
> Machine-2 now gets tons of FileNotFoundExceptions on scan. If we reassign the
> region, everything is OK, because we pick up the files compacted by Machine-1.
> This problem doesn't happen in the newer code, 1.0+ (I think, but I haven't
> checked; it may be 1.1), where we write the compaction event to the WAL before
> (re)moving the files.
> A workaround is to handle the FileNotFoundException and refresh the store
> files, or to shut down the region and reassign it. The first is easy in 1.0+;
> the second requires more work because at the moment we don't have the code to
> notify the master that the RS is closing the region. Alternatively, we can shut
> down the entire RS (not a good solution, but the case is rare enough).
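
The first workaround in the description (catch the FileNotFoundException, refresh the store files, retry) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the committed patch: the class StoreFileRefreshSketch, its in-memory "filesystem", and all of its method names are hypothetical and are not part of the HBase Store or StoreFileScanner API.

```java
import java.io.FileNotFoundException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a store; NOT the HBase Store/StoreFileScanner API.
final class StoreFileRefreshSketch {
    // Files that actually exist on the simulated filesystem.
    private final Set<String> filesOnDisk = new TreeSet<>();
    // Files this server believes belong to the store; may be stale.
    private List<String> knownFiles;
    private int refreshCount = 0;

    StoreFileRefreshSketch(List<String> initialFiles) {
        filesOnDisk.addAll(initialFiles);
        knownFiles = new ArrayList<>(initialFiles);
    }

    // Simulates another server (Machine-1 in the example) finishing a
    // compaction and (re)moving the input files behind our back.
    void compactExternally(List<String> inputs, String output) {
        filesOnDisk.removeAll(inputs);
        filesOnDisk.add(output);
    }

    // Re-lists the store directory, replacing the stale file list.
    void refreshStoreFiles() {
        knownFiles = new ArrayList<>(filesOnDisk);
        refreshCount++;
    }

    int refreshCount() {
        return refreshCount;
    }

    // "Opens" every known file; fails if any of them has been removed.
    private List<String> scanOnce() throws FileNotFoundException {
        List<String> opened = new ArrayList<>();
        for (String file : knownFiles) {
            if (!filesOnDisk.contains(file)) {
                throw new FileNotFoundException(file);
            }
            opened.add(file);
        }
        return opened;
    }

    // Scans; on FileNotFoundException refreshes the file list and retries once.
    List<String> scanWithRefresh() {
        try {
            return scanOnce();
        } catch (FileNotFoundException first) {
            refreshStoreFiles();
            try {
                return scanOnce();
            } catch (FileNotFoundException second) {
                // Still missing after a refresh: give up and surface the error.
                throw new UncheckedIOException(second);
            }
        }
    }
}
```

In this sketch a scan that starts with a stale list of {hfile-a, hfile-b} fails on the first attempt after an external compaction, refreshes once, and then reads the compacted file. The retry is limited to a single refresh so that a file that is genuinely lost still surfaces as an error instead of looping.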
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)