[ 
https://issues.apache.org/jira/browse/KUDU-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238177#comment-15238177
 ] 

Todd Lipcon commented on KUDU-1414:
-----------------------------------

BTW it's worth consulting Table 1 in 
http://pages.cs.wisc.edu/~samera/papers/alice-osdi14.pdf which lists the 
atomicity violations that are observed on various real-life file systems. For 
Kudu's case I think we care about xfs and ext4-ordered. Both of these seem to 
guarantee a multi-block prefix append property. In other words, since we're 
appending to a file without overwriting, we're guaranteed to see a correct prefix 
of the append (i.e. not some zeros followed by some real data).
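
To make the prefix append property concrete, here is a minimal Python sketch. 
The function name and the sample byte strings are made up for illustration; 
they are not taken from Kudu or from the paper. It only checks whether a file 
state observed after a crash is the old contents plus some prefix of the 
appended bytes.

```python
def is_valid_prefix_append(before: bytes, appended: bytes, observed: bytes) -> bool:
    """Return True if `observed` equals `before` plus some prefix of `appended`.

    Hypothetical helper: models the post-crash states that a file system with
    the prefix append property is allowed to expose.
    """
    if not observed.startswith(before):
        return False  # the pre-existing contents were disturbed
    tail = observed[len(before):]
    # The new bytes must be a (possibly empty, possibly complete) prefix.
    return appended.startswith(tail)


before = b"rec1"
appended = b"rec2rec3"

# A torn append that stops partway through is an allowed state.
print(is_valid_prefix_append(before, appended, b"rec1rec2re"))  # True

# Zeros followed by real data is not a prefix, so it is not allowed.
print(is_valid_prefix_append(before, appended, b"rec1\x00\x00\x00\x00rec3"))  # False
```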

Handling bit-swaps that happen on cold data later seems like it should be 
considered separately from the more common case of crashes, which are enumerated 
by the Alice paper.

> Corrupting multiple log entries at the end of a WAL file may go undetected
> --------------------------------------------------------------------------
>
>                 Key: KUDU-1414
>                 URL: https://issues.apache.org/jira/browse/KUDU-1414
>             Project: Kudu
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 0.8.0
>            Reporter: Mike Percy
>
> While looking at KUDU-1377, I investigated how we are handling WAL truncation 
> when corruption is detected. The way the code is written today, a trailing 
> series of corrupt log entries is truncated with only a log warning message. 
> I'll post a unit test demonstrating this behavior.
> One way to get around this is to ensure that we only accept zeros following a 
> truncated record, instead of just bad records, in order to consider it a 
> partially-written record that we can safely truncate. We would have to 
> maintain this invariant when preallocating space and truncating partial 
> records before continuing to write.
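
The "only zeros may follow a truncated record" check proposed in the issue 
description could look roughly like the Python sketch below. This is not 
Kudu's actual WAL format or code: the framing (a 4-byte little-endian length 
plus a CRC32 of the payload) and the names encode_record and 
classify_wal_tail are assumptions made up for illustration.

```python
import struct
import zlib

# Hypothetical framing, NOT Kudu's real entry header:
# 4-byte little-endian payload length, then 4-byte CRC32 of the payload.
HEADER = struct.Struct("<II")


def encode_record(payload: bytes) -> bytes:
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def _all_zero(buf: bytes) -> bool:
    return not any(buf)


def classify_wal_tail(data: bytes):
    """Return (status, offset of the end of the last valid record).

    status is "clean" (valid records, then only zeros or EOF),
    "truncatable" (a bad record followed by nothing but zeros or EOF),
    or "corrupt" (a bad record followed by non-zero data).
    """
    offset = 0
    while offset < len(data):
        if len(data) - offset >= HEADER.size:
            length, crc = HEADER.unpack_from(data, offset)
            end = offset + HEADER.size + length
            payload = data[offset + HEADER.size:end]
            # Empty records are rejected so preallocated zeros never parse as valid.
            if length > 0 and end <= len(data) and zlib.crc32(payload) == crc:
                offset = end
                continue
        else:
            end = len(data)  # partial header at end of file
        if _all_zero(data[offset:]):
            return "clean", offset
        # Accept the bad record as a torn write only if zeros (or EOF) follow it.
        # Caveat: a corrupted length that points past EOF also lands here.
        if _all_zero(data[end:]):
            return "truncatable", offset
        return "corrupt", offset
    return "clean", offset


good = encode_record(b"a") + encode_record(b"bb")
torn = good + encode_record(b"cccc")[:10] + bytes(16)
print(classify_wal_tail(torn))
```

With this rule, a trailing series of several bad entries is reported as 
corrupt rather than silently truncated, because non-zero bytes follow the 
first bad record. It only works if preallocated space and truncated partial 
records are kept zeroed, which is the invariant the description mentions.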



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
