[ 
https://issues.apache.org/jira/browse/HBASE-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199483#comment-16199483
 ] 

Gary Helmling commented on HBASE-18786:
---------------------------------------

Seems fine to remove from 1.3.  

handleFileNotFound() was introduced by HBASE-13651 to handle the case where 
regionserver A is hosting a region and starts a compaction, enters a GC pause, 
and the region is reassigned; regionserver A then emerges from the pause and 
archives the compacted files before aborting.  If we really want to handle this 
situation, then we need to introduce fencing at the HDFS level during failed 
server processing.  The current behavior of handleFileNotFound() seems worse 
than the original problem, since it can hide other problems.

> FileNotFoundException should not be silently handled for primary region 
> replicas
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-18786
>                 URL: https://issues.apache.org/jira/browse/HBASE-18786
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver, Scanners
>            Reporter: Ashu Pachauri
>            Assignee: Andrew Purtell
>             Fix For: 2.0.0, 3.0.0, 1.4.0, 1.5.0
>
>         Attachments: HBASE-18786-branch-1.3.patch, 
> HBASE-18786-branch-1.patch, HBASE-18786-branch-1.patch, HBASE-18786.patch, 
> HBASE-18786.patch
>
>
> This is a follow-up to HBASE-18186.
> A FileNotFoundException (FNFE) while scanning from a primary region replica 
> can be indicative of a more severe problem. Handling these exceptions silently 
> can cause many underlying issues to go undetected. We should either
> 1. Hard fail the regionserver if there is a FNFE on a primary region replica, 
> OR
> 2. Report these exceptions as some region / server level metric so that these 
> can be proactively investigated.
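
The two options in the description can be sketched together as a small policy 
object. This is only an illustration, not the attached patch: the class 
FnfePolicy, the Action enum, and the counter are hypothetical names, and the 
sketch assumes that replica id 0 denotes the primary replica and that a 
non-primary replica can recover by refreshing its store file list.

```java
import java.io.FileNotFoundException;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: decide how to react to an FNFE seen during a scan. */
public class FnfePolicy {
    /** Hypothetical outcomes; not part of any HBase API. */
    public enum Action { ABORT_SERVER, REFRESH_STORE_FILES }

    // Assumption for this sketch: replica id 0 is the primary replica.
    public static final int PRIMARY_REPLICA_ID = 0;

    // Stand-in for a region- or server-level metric (option 2).
    private final AtomicLong primaryFnfeCount = new AtomicLong();

    public Action onFileNotFound(int replicaId, FileNotFoundException e) {
        if (replicaId == PRIMARY_REPLICA_ID) {
            // Always record the event so it can be investigated (option 2),
            // then signal that the caller should fail hard (option 1).
            primaryFnfeCount.incrementAndGet();
            return Action.ABORT_SERVER;
        }
        // Assumed behavior for non-primary replicas: refresh and retry.
        return Action.REFRESH_STORE_FILES;
    }

    public long primaryFnfeCount() {
        return primaryFnfeCount.get();
    }
}
```

A caller would act on the returned Action itself. The sketch only makes the 
decision and the count explicit, so that a missing file on a primary replica 
is never swallowed silently.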


