[ 
https://issues.apache.org/jira/browse/HBASE-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538421#comment-14538421
 ] 

Andrew Purtell commented on HBASE-13651:
----------------------------------------

I see fix versions for 0.94 and 0.98 and a patch for 0.94 and master.

We don't have RSRpcServices in 0.98. You'll find that code in HRegionServer.

> Handle StoreFileScanner FileNotFoundException
> ---------------------------------------------
>
>                 Key: HBASE-13651
>                 URL: https://issues.apache.org/jira/browse/HBASE-13651
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.27, 0.98.10.1
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>            Priority: Minor
>         Attachments: HBASE-13651-0.94-draft.patch, HBASE-13651-draft.patch
>
>
> Example:
>  * Machine-1 is serving Region-X and starts a compaction
>  * Machine-1 goes into a GC pause
>  * Region-X gets reassigned to Machine-2
>  * Machine-1 exits the GC pause
>  * Machine-1 (re)moves the compacted files
>  * Machine-1's lease expires and it shuts down
> Machine-2 now gets tons of FileNotFoundExceptions on scan. If we reassign the 
> region everything is OK, because we pick up the files compacted by Machine-1.
> This problem doesn't happen in the new code, 1.0+ (I think, but I haven't 
> checked; it may be 1.1), where we write the compaction event to the WAL before 
> (re)moving the files.
> A workaround is to handle the FileNotFoundException and refresh the store 
> files, or to shut down the region and reassign it. The first one is easy in 
> 1.0+; the second one requires more work, because at the moment we don't have 
> the code to notify the master that the RS is closing the region. 
> Alternatively, we can shut down the entire RS (not a good solution, but the 
> case is rare enough).
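
For illustration only, the first workaround in the description (catch the FileNotFoundException, refresh the store files, retry) could be sketched roughly as below. This is not taken from either attached patch and does not use real HBase classes: StoreFiles, scan(), refresh() and RefreshingScan are hypothetical names standing in for whatever the store and scanner code actually expose on each branch.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.List;

/** Hypothetical stand-in for a store and its files; NOT an HBase class. */
interface StoreFiles {
    /** Reads rows from the currently known files; may throw FileNotFoundException. */
    List<String> scan() throws IOException;

    /** Re-lists the files on the filesystem, dropping references to moved ones. */
    void refresh() throws IOException;
}

final class RefreshingScan {
    private RefreshingScan() {}

    /**
     * Scans once; if a file has disappeared underneath us, refreshes the
     * file list and retries exactly once. A second failure propagates to
     * the caller, which would then need the heavier fallback (closing the
     * region or stopping the server).
     */
    static List<String> scanWithRefresh(StoreFiles store) throws IOException {
        try {
            return store.scan();
        } catch (FileNotFoundException e) {
            // Assumption: the files were (re)moved by a compaction on another server.
            store.refresh();
            return store.scan();
        }
    }
}
```

The retry is limited to one attempt so that a file that is really missing still surfaces as an error instead of looping.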



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
