[
https://issues.apache.org/jira/browse/HBASE-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902110#action_12902110
]
Nicolas Spiegelberg commented on HBASE-2643:
--------------------------------------------
We hit this EOF problem in our test cluster this week. Is there a case where
an EOF could mean data loss, rather than just data truncated by a connection
failure? HDFS throws a ChecksumException (an IOException) when data on disk is
corrupt, so an EOF should only indicate application-level corruption. It seems
like we should handle EOF differently from other IOExceptions and proceed even
when 'hbase.hlog.split.skip.errors' == false.
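
To make the proposal concrete, here is a rough sketch of the policy, not actual
HBase code. EofAwareSplit, EditReader and readEdits are made-up names for
illustration, and skipErrors stands in for the value of
'hbase.hlog.split.skip.errors'. An EOFException ends the read but keeps the
edits already read. Any other IOException is rethrown unless skipErrors is set.

```java
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; none of these names are HBase APIs.
class EofAwareSplit {

    /** Stand-in for a log reader: returns the next edit, or null at a clean end of file. */
    interface EditReader {
        String next() throws IOException;
    }

    /**
     * Reads edits until the log ends. An EOFException is treated as truncation:
     * the edits read so far are kept, whatever skipErrors says. Any other
     * IOException is rethrown unless skipErrors is true.
     */
    static List<String> readEdits(EditReader reader, boolean skipErrors) throws IOException {
        List<String> edits = new ArrayList<>();
        try {
            String edit;
            while ((edit = reader.next()) != null) {
                edits.add(edit);
            }
        } catch (EOFException eof) {
            // Truncated tail: keep what was read and carry on with the split.
        } catch (IOException ioe) {
            // Everything else (e.g. a checksum failure) follows the existing switch.
            if (!skipErrors) {
                throw ioe;
            }
        }
        return edits;
    }
}
```

EOFException is a subclass of IOException, so the EOF catch has to come first.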
> Figure how to deal with eof splitting logs
> ------------------------------------------
>
> Key: HBASE-2643
> URL: https://issues.apache.org/jira/browse/HBASE-2643
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Priority: Blocker
> Fix For: 0.90.0
>
>
> When splitting the WAL and encountering EOF, it's not clear what to do.
> Initial discussion of this started in http://review.hbase.org/r/74/ -
> summarizing here for brevity:
> We can get an EOFException while splitting the WAL in the following cases:
> - The writer died after creating the file but before even writing the header
> (or crashed halfway through writing the header)
> - The writer died in the middle of flushing some data - sync() guarantees
> that we can see _at least_ the last edit, but we may see half of an edit that
> was being written out when the RS crashed (especially for large rows)
> - The data was actually corrupted somehow (e.g., a length field got changed to
> be too long and thus points past EOF)
> Ideally we would know when we see EOF whether it was really the last record,
> and in that case, simply drop that record (it wasn't synced, so we don't
> need to split it). Some open questions:
> - Currently we ignore empty files. Is it ok to ignore an empty log file if
> it's not the last one?
> - Similarly, do we ignore an EOF mid-record if it's not the last log file?
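
The EOF cases listed in the description can be told apart on a toy log format.
The sketch below assumes a 4-byte magic header followed by length-prefixed
records. That layout, the MAGIC value and every name in it are invented for
illustration and are not the real HLog format. It counts the complete records,
so a partial trailing record can be dropped. A length field that points past
EOF is reported the same way as a record cut short, which is exactly why
corruption and truncation are hard to distinguish.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical sketch on a toy format: [int MAGIC] then repeated [int length][payload].
class EofClassifier {
    enum Outcome { CLEAN_END, EMPTY_FILE, TRUNCATED_HEADER, TRUNCATED_RECORD }

    static final int MAGIC = 0x57414C31; // made-up header value

    static final class Result {
        final Outcome outcome;
        final int completeRecords;

        Result(Outcome outcome, int completeRecords) {
            this.outcome = outcome;
            this.completeRecords = completeRecords;
        }
    }

    static Result scan(byte[] file) throws IOException {
        // Case 1a: the writer died before writing anything.
        if (file.length == 0) {
            return new Result(Outcome.EMPTY_FILE, 0);
        }
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
        try {
            if (in.readInt() != MAGIC) {
                throw new IOException("unrecognized header");
            }
        } catch (EOFException e) {
            // Case 1b: the writer died halfway through the header.
            return new Result(Outcome.TRUNCATED_HEADER, 0);
        }
        int complete = 0;
        while (in.available() > 0) {
            try {
                int len = in.readInt();
                if (len < 0) {
                    throw new IOException("negative record length");
                }
                // Cases 2 and 3 look identical here: the record claims more
                // bytes than the file holds, whether torn or corrupted.
                if (len > in.available()) {
                    return new Result(Outcome.TRUNCATED_RECORD, complete);
                }
                in.readFully(new byte[len]);
                complete++;
            } catch (EOFException e) {
                // The length field itself was cut short.
                return new Result(Outcome.TRUNCATED_RECORD, complete);
            }
        }
        return new Result(Outcome.CLEAN_END, complete);
    }
}
```

Whether a TRUNCATED_RECORD result is safe to ignore still depends on the open
questions above, in particular whether the file is the last log.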