[ 
https://issues.apache.org/jira/browse/HBASE-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902110#action_12902110
 ] 

Nicolas Spiegelberg commented on HBASE-2643:
--------------------------------------------

We encountered this EOF problem in our test cluster this week.  Is there a 
case where an EOF could lead to data loss, rather than just indicating that the 
data was truncated by a connection failure?  HDFS throws a ChecksumException 
(an IOE) when on-disk data is corrupt, so an EOF should only indicate 
application-level corruption.  It seems like we should handle the EOF case 
differently from other IOEs and proceed even when 
'hbase.hlog.split.skip.errors' == false.
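
To make the proposal concrete, here is a minimal sketch of a split loop that 
treats EOF separately from other IOEs.  It is not the actual HLog splitting 
code: the class name, the method, and the length-prefixed record layout are 
assumptions for illustration, and the skipErrors parameter only stands in for 
'hbase.hlog.split.skip.errors'.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; the record layout (int length + payload) is assumed. */
final class EofTolerantSplit {

    /**
     * Reads length-prefixed records until the stream ends.
     * An EOFException (clean end or truncated trailing record) ends the read
     * and keeps the edits seen so far, regardless of skipErrors.
     * Any other IOException is rethrown unless skipErrors is true.
     */
    static List<byte[]> readEdits(DataInputStream in, boolean skipErrors)
            throws IOException {
        List<byte[]> edits = new ArrayList<>();
        while (true) {
            try {
                int len = in.readInt();      // EOF here: end of file or cut-off length field
                if (len < 0) {
                    throw new IOException("negative record length: " + len);
                }
                byte[] payload = new byte[len];
                in.readFully(payload);       // EOF here: record truncated mid-write
                edits.add(payload);
            } catch (EOFException eof) {
                // Treated as truncation: drop the partial record and proceed.
                return edits;
            } catch (IOException ioe) {
                if (!skipErrors) {
                    throw ioe;               // e.g. a checksum failure stays fatal
                }
                return edits;
            }
        }
    }
}
```

Note that this sketch cannot tell a truncated tail from a corrupted length 
field that points past EOF; both surface as the same EOFException, and a real 
implementation would also want to bound the allocation for len.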

> Figure how to deal with eof splitting logs
> ------------------------------------------
>
>                 Key: HBASE-2643
>                 URL: https://issues.apache.org/jira/browse/HBASE-2643
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.90.0
>
>
> When splitting the WAL and encountering EOF, it's not clear what to do. 
> Initial discussion of this started in http://review.hbase.org/r/74/ - 
> summarizing here for brevity:
> We can get an EOFException while splitting the WAL in the following cases:
> - The writer died after creating the file but before even writing the header 
> (or crashed halfway through writing the header)
> - The writer died in the middle of flushing some data - sync() guarantees 
> that we can see _at least_ the last edit, but we may see half of an edit that 
> was being written out when the RS crashed (especially for large rows)
> - The data was actually corrupted somehow (e.g. a length field got changed to 
> be too long and thus points past EOF)
> Ideally we would know when we see EOF whether it was really the last record, 
> and in that case, simply drop that record (it wasn't synced, so we don't 
> need to split it). Some open questions:
>   - Currently we ignore empty files. Is it ok to ignore an empty log file if 
> it's not the last one?
>   - Similarly, do we ignore an EOF mid-record if it's not the last log file?
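
The EOF cases listed in the description can be sketched as a small classifier.  
This is only an illustration under assumptions: the 4-byte MAGIC header, the 
int-length record layout, and all names below are hypothetical and do not 
reflect the real WAL file format.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

/** Hypothetical sketch of where an EOF can occur in a log file. */
final class EofClassifier {

    enum EofKind { NONE, EMPTY_FILE, TRUNCATED_HEADER, TRUNCATED_RECORD }

    static final int MAGIC = 0x57414C31; // assumed header, not a real format

    static EofKind classify(byte[] file) throws IOException {
        if (file.length == 0) {
            return EofKind.EMPTY_FILE;           // writer died before writing anything
        }
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
        try {
            if (in.readInt() != MAGIC) {
                throw new IOException("bad header");
            }
        } catch (EOFException e) {
            return EofKind.TRUNCATED_HEADER;     // writer died partway through the header
        }
        while (in.available() > 0) {
            int len;
            try {
                len = in.readInt();
            } catch (EOFException e) {
                return EofKind.TRUNCATED_RECORD; // length field itself was cut off
            }
            // A length past the end of the file is either a half-written edit
            // or a corrupted length field; this check cannot tell them apart.
            if (len < 0 || len > in.available()) {
                return EofKind.TRUNCATED_RECORD;
            }
            in.skipBytes(len);
        }
        return EofKind.NONE;                     // every record was complete
    }
}
```

The classifier says nothing about whether the file is the last log, so the 
open questions above about non-final logs remain a policy decision for the 
caller.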

