[ https://issues.apache.org/jira/browse/KUDU-668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190425#comment-15190425 ]
Adar Dembo commented on KUDU-668:
---------------------------------
bq. We should be able to detect this case where the metadata file is empty
(just a header, no records) and the data file is non-existent, and just remove
the metadata file, right?
Yes, removing the metadata file in this situation should be harmless.
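For illustration only, the check described above could be sketched as follows. This is not Kudu code: the function name, the file paths, and the fixed ASSUMED_HEADER_LEN are hypothetical, and the real container header is defined in pb_util and may not be a fixed size.

```python
import os

# Hypothetical value. The real header layout lives in Kudu's pb_util code and
# is not described in this thread.
ASSUMED_HEADER_LEN = 8


def remove_if_orphaned_empty_metadata(metadata_path, data_path,
                                      header_len=ASSUMED_HEADER_LEN):
    """Remove metadata_path if it is header-only and data_path does not exist.

    Returns True if the metadata file was removed, False otherwise.
    """
    # A data file exists, so the container may hold blocks: leave it alone.
    if os.path.exists(data_path):
        return False
    # Anything other than exactly one header means there may be records
    # (or a partial header), which this sketch does not try to handle.
    if os.path.getsize(metadata_path) != header_len:
        return False
    os.remove(metadata_path)
    return True
```

The sketch deliberately refuses to delete anything when either condition is in doubt, since removal is only argued to be harmless for the header-only, no-data-file case.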
> Log block container metadata files should be more forgiving to truncation
> -------------------------------------------------------------------------
>
> Key: KUDU-668
> URL: https://issues.apache.org/jira/browse/KUDU-668
> Project: Kudu
> Issue Type: Sub-task
> Components: fs
> Affects Versions: M5
> Reporter: Adar Dembo
> Assignee: Adar Dembo
>
> Log block container metadata files are resilient to many different kinds of
> failures (see pb_util.h for details). However, they are also overly strict
> with respect to truncation. Ideally, a truncation in the middle of a log
> block record should result in the record being discarded and the container
> reused for additional writes. The only way to do this safely is to prove
> that, between the truncation and the end of the file, there do not exist any
> other valid log block records. The WAL segment reader code has the same
> problem, and it handles this by trying to decode a segment header at every
> byte position between the point of truncation and the end of the file. Log
> block container metadata files should do the same thing.
> Here's what needs to happen:
> # Containerized PB files should add a CRC32 checksum to the message header
> structure. Otherwise we can't tell if a particular read in the file comprises
> a "valid" message header.
> # In the event of truncation, they should do what the WAL segment reader does
> and scan ahead in the file looking for valid message headers. If one is
> found, this is not truncation but corruption, and is unrecoverable. If none
> are found (or if the remainder of the file is all zeroes), it's recoverable
> truncation.
> # If the truncation is recoverable, we should make sure to start writing new
> metadata at the point of truncation, not at the end of the file.
> Once this is done, containerized PB files will be almost identical to WAL
> segments, and we could consider merging the two. As far as I can tell, the
> only remaining major difference is that WAL segments allow one to write
> different kinds of PB messages, while containerized PB files are restricted
> to one type of PB message per file.
> For the time being, log block container metadata files don't use
> memory-mapped writing or preallocation, so the likelihood of extra zeroes in
> the file is low. Still, if the underlying filesystem or disk does truncate
> the file unexpectedly, we currently treat that truncation as fatal instead of
> recovering gracefully.
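The three steps in the issue description can be sketched roughly as below. This is a toy model under stated assumptions, not the Kudu or WAL implementation: the record layout (a 4-byte little-endian payload length followed by a CRC32 of those 4 length bytes) and all function names are hypothetical, and the file is modeled as an in-memory bytes object.

```python
import struct
import zlib

# Hypothetical header layout: payload length, then CRC32 of the length field.
HEADER = struct.Struct("<II")


def encode_record(payload):
    length = struct.pack("<I", len(payload))
    return length + struct.pack("<I", zlib.crc32(length)) + payload


def valid_header_at(buf, pos):
    """Step 1: a header is 'valid' only if its checksum matches."""
    if pos + HEADER.size > len(buf):
        return False
    _, crc = HEADER.unpack_from(buf, pos)
    return zlib.crc32(buf[pos:pos + 4]) == crc


def read_records(buf):
    """Return (payloads, offset of the first byte that could not be read)."""
    pos, payloads = 0, []
    while pos < len(buf):
        if not valid_header_at(buf, pos):
            break
        length, _ = HEADER.unpack_from(buf, pos)
        end = pos + HEADER.size + length
        if end > len(buf):
            break  # header is fine but the payload is cut short
        payloads.append(buf[pos + HEADER.size:end])
        pos = end
    return payloads, pos


def classify_tail(buf, bad_offset):
    """Step 2: try to decode a header at every byte position after the
    failure. Any hit means later records exist, so this is corruption."""
    for pos in range(bad_offset + 1, len(buf) - HEADER.size + 1):
        if valid_header_at(buf, pos):
            return "corruption"
    return "truncation"


def recover(buf):
    payloads, good_end = read_records(buf)
    if good_end == len(buf):
        return payloads, good_end, "clean"
    return payloads, good_end, classify_tail(buf, good_end)


def append_record(buf, good_end, payload):
    """Step 3: resume writing at the point of truncation, not at EOF."""
    return buf[:good_end] + encode_record(payload)
```

In this model a tail of all zeroes never validates, because the CRC32 of four zero bytes is nonzero, so it is classified as recoverable truncation. A header-only checksum can still match by chance inside payload bytes; a real format would presumably need to weigh that risk, which this sketch does not address.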