[
https://issues.apache.org/jira/browse/HBASE-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538421#comment-14538421
]
Andrew Purtell commented on HBASE-13651:
----------------------------------------
I see fix versions for 0.94 and 0.98 and a patch for 0.94 and master.
We don't have RSRpcServices in 0.98. You'll find that code in HRegionServer.
> Handle StoreFileScanner FileNotFoundException
> ---------------------------------------------
>
> Key: HBASE-13651
> URL: https://issues.apache.org/jira/browse/HBASE-13651
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.27, 0.98.10.1
> Reporter: Matteo Bertozzi
> Assignee: Matteo Bertozzi
> Priority: Minor
> Attachments: HBASE-13651-0.94-draft.patch, HBASE-13651-draft.patch
>
>
> Example:
> * Machine-1 is serving Region-X and starts a compaction
> * Machine-1 goes into a GC pause
> * Region-X gets reassigned to Machine-2
> * Machine-1 exits from the GC pause
> * Machine-1 (re)moves the compacted files
> * Machine-1's lease expires and it shuts down
> Machine-2 now gets tons of FileNotFoundExceptions on scan. If we reassign the
> region everything is ok, because we pick up the files compacted by Machine-1.
> This problem doesn't happen in the new code, 1.0+ (I think, but I haven't
> checked; it may be 1.1), where we write the compaction event to the WAL before
> (re)moving the files.
> A workaround is to handle the FileNotFoundException and refresh the store
> files, or to shut down the region and reassign it. The first one is easy in
> 1.0+; the second one requires more work because at the moment we don't have
> the code to notify the master that the RS is closing the region.
> Alternatively, we can shut down the entire RS (it is not a good solution, but
> the case is rare enough).
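
The first workaround in the description (catch the FileNotFoundException, refresh the store files, retry) can be sketched roughly as below. This is only an illustration of the idea, not the attached patch and not HBase code: StoreView, StaleStore, scanWithRefresh and refreshStoreFiles are hypothetical names made up for this sketch, and the "store" is a toy object that simulates Machine-2 holding a stale file list.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class StoreFileRefreshSketch {

    /** Hypothetical stand-in for a store; NOT the real HBase Store API. */
    interface StoreView {
        /** Reads all rows; throws FileNotFoundException if a listed file is gone. */
        List<String> scan() throws IOException;

        /** Re-lists the store directory so the compacted file is picked up. */
        void refreshStoreFiles() throws IOException;
    }

    /** Scans once; on FileNotFoundException, refreshes the file list and retries once. */
    static List<String> scanWithRefresh(StoreView store) throws IOException {
        try {
            return store.scan();
        } catch (FileNotFoundException e) {
            store.refreshStoreFiles();
            // If this second attempt also fails, the exception propagates and the
            // caller would have to fall back to closing the region or aborting the RS.
            return store.scan();
        }
    }

    /** Toy store simulating Machine-2's stale view after Machine-1 moved the files. */
    static class StaleStore implements StoreView {
        private boolean stale = true;
        int refreshCount = 0;

        @Override
        public List<String> scan() throws IOException {
            if (stale) {
                throw new FileNotFoundException("hfile-a (removed by compaction)");
            }
            List<String> rows = new ArrayList<>();
            rows.add("row1");
            rows.add("row2");
            return rows;
        }

        @Override
        public void refreshStoreFiles() {
            stale = false;
            refreshCount++;
        }
    }

    public static void main(String[] args) throws IOException {
        StaleStore store = new StaleStore();
        System.out.println(scanWithRefresh(store)); // prints [row1, row2]
    }
}
```

The retry is limited to a single attempt on purpose: if the refresh does not help, something else is wrong and the second workaround (close and reassign the region, or shut down the RS) is the remaining option.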