[ 
https://issues.apache.org/jira/browse/KUDU-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516459#comment-16516459
 ] 

Will Berkeley commented on KUDU-2260:
-------------------------------------

One other important detail: the NULL byte guarantee holds when ext4 is 
mounted with the default data=ordered mode or stronger. If ext4 is mounted 
with data=writeback, the end of the file could contain anything. So we can 
(transparently) recover from this situation at startup under default ext4 
settings.
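
To illustrate the check this implies, here is a minimal Python sketch (not 
Kudu's actual code, which is C++) that tests whether everything from a given 
offset to the end of a file is NULL bytes. The function name, the chunk size, 
and the idea that the caller already knows the offset just past the last 
valid record are all assumptions made for illustration.

```python
def tail_is_all_nulls(path, offset, chunk_size=64 * 1024):
    """Return True if every byte from `offset` to end of file is 0x00.

    Hypothetical helper: `offset` is assumed to be the position just past
    the last record known to be valid. An empty tail (offset at or beyond
    end of file) also returns True.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # Reached end of file without seeing a non-zero byte.
                return True
            if any(chunk):
                # Iterating bytes yields ints; any non-zero byte is truthy.
                return False
```

If this returns True for the tail after the last good record, the 
data=ordered case described above applies. A False result means the tail 
holds something else, as could happen under data=writeback.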

> Log block manager should handle null bytes in metadata on crash
> ---------------------------------------------------------------
>
>                 Key: KUDU-2260
>                 URL: https://issues.apache.org/jira/browse/KUDU-2260
>             Project: Kudu
>          Issue Type: Bug
>          Components: fs
>            Reporter: Mike Percy
>            Priority: Major
>
> The log block manager may currently leave null bytes at the end of the 
> metadata log file if there is a system crash in the middle of a write. The 
> log block manager should detect null bytes at the end of a metadata entry on 
> startup and potentially truncate the entry or close the container.
> Currently, startup fails with a fatal error along the following lines:
> {code}
> F0111 09:30:27.327011 28843 tablet_server_main.cc:64] Check failed: _s.ok() 
> Bad status: Corruption: Failed to load FS layout: Could not read records from 
> container /data/3/kudu/data/f70391c7c6084e08bbae7448518e0b5e: Data length 
> checksum does not match: Incorrect checksum in file 
> /data/3/kudu/data/f70391c7c6084e08bbae7448518e0b5e.metadata at offset 372533: 
> Checksum does not match. Expected: 0. Actual: 1323915147
> {code}
> At the time of writing, the workaround for this issue is to truncate the 
> affected file at the start of the incomplete entry in the file. While this 
> may leave orphaned blocks, this should be safe because if the metadata entry 
> was never successfully written then it should not have been considered 
> durable, either.
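
The manual workaround described above amounts to a truncate at a known 
offset. A minimal Python sketch follows, assuming the operator has already 
determined the byte offset where the incomplete entry starts (this sketch 
does not parse the metadata format), and presumably with the server stopped. 
The function name and the backup copy are additions for illustration, not 
part of any Kudu tooling.

```python
import os
import shutil


def truncate_metadata(path, offset, backup_suffix=".bak"):
    """Truncate `path` to `offset` bytes, keeping a backup copy first.

    Hypothetical helper: `offset` must be supplied by the caller as the
    start of the incomplete entry. Returns the number of bytes removed.
    """
    size = os.path.getsize(path)
    if not 0 <= offset <= size:
        raise ValueError("offset %d is outside file of size %d" % (offset, size))

    # Keep the original around in case the offset turns out to be wrong.
    shutil.copy2(path, path + backup_suffix)

    with open(path, "r+b") as f:
        f.truncate(offset)
        f.flush()
        # Ask the OS to push the new file size to disk before returning.
        os.fsync(f.fileno())
    return size - offset
```

The backup is a precaution only: if the chosen offset cuts into a valid 
entry, the original file can be restored and the offset reconsidered.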



