[ https://issues.apache.org/jira/browse/QPID-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832427#comment-13832427 ]
Kim van der Riet commented on QPID-5358:
----------------------------------------
In the legacy store (which uses a circular file journal), there is a possible
scenario in which bad data could be recovered without being detected. In this
scenario, a single record must span 3 or more journal files. If at the time of
broker failure the files containing the start and end (tail) of the enqueue
record have been written to disk, but some or all of the intermediate file(s)
remain unwritten (awaiting async i/o), then upon recovery, the intermediate
file data will be read as a part of the record without the write failure being
detected.
This scenario has never been observed in practice, as most modern disk
controllers tend to complete async i/o in the order it was issued. However,
the possibility of such a failure remains, and its probability is not easily
quantifiable.
In the case of linearstore (assuming the same scenario), it may be possible for
files to be placed into the linearstore file sequence from the Empty File Pool,
and yet remain unwritten because the write sequence of buffering async i/o
controllers is not guaranteed.
The linearstore has added a checksum field to the record tail for enqueued
records. A checksum algorithm which is inexpensive and yet sufficiently
effective at detecting this scenario is required. The checksum will include the
xid and data sections of the enqueue record, and the xid of dequeue and
transaction records. Upon recovery, the checksum of the recovered data and xid
is calculated and compared with the checksum recorded in the record tail. This
strategy ought to detect if any of the intermediate files in a large enqueue
record remain unwritten at broker failure.
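
As a rough illustration of the recovery check described above, here is a
minimal sketch. It uses Adler-32 purely as a stand-in for an "inexpensive"
checksum; the algorithm to be used by linearstore has not been chosen in this
comment. The function names (adler32Update, recordChecksum, recordIsIntact)
are hypothetical and are not part of the linearstore code.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Illustrative stand-in only: Adler-32 as a cheap 32-bit running checksum.
// 'state' carries the checksum so far (initial value 1), so that several
// sections of a record can be fed in one after another.
std::uint32_t adler32Update(std::uint32_t state, const void* buf, std::size_t len) {
    const std::uint32_t MOD = 65521; // largest prime below 2^16
    std::uint32_t a = state & 0xffff;
    std::uint32_t b = (state >> 16) & 0xffff;
    const unsigned char* p = static_cast<const unsigned char*>(buf);
    for (std::size_t i = 0; i < len; ++i) {
        a = (a + p[i]) % MOD;
        b = (b + a) % MOD;
    }
    return (b << 16) | a;
}

// Hypothetical helper: checksum over the xid section followed by the data
// section. For dequeue and transaction records, data would simply be empty.
std::uint32_t recordChecksum(const std::string& xid, const std::string& data) {
    std::uint32_t state = 1; // Adler-32 initial value
    state = adler32Update(state, xid.data(), xid.size());
    state = adler32Update(state, data.data(), data.size());
    return state;
}

// Hypothetical recovery-time check: recompute the checksum from the xid and
// data read back from the journal and compare it with the record tail value.
bool recordIsIntact(const std::string& xid, const std::string& data,
                    std::uint32_t tailChecksum) {
    return recordChecksum(xid, data) == tailChecksum;
}
```

With something like this, a record whose middle portion was never written
(and so reads back as whatever the file previously contained) would be
expected to fail recordIsIntact() on recovery, even though its head and tail
are present on disk.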
> Linearstore: Checksums not implemented in record tail
> -----------------------------------------------------
>
> Key: QPID-5358
> URL: https://issues.apache.org/jira/browse/QPID-5358
> Project: Qpid
> Issue Type: Bug
> Components: C++ Broker
> Reporter: Kim van der Riet
> Assignee: Kim van der Riet
> Labels: linearstore
>
> The linearstore record tail now includes a 32-bit checksum field to check
> the data integrity of the xid and data sections of the record. The checksum
> calculation has not yet been implemented (the value 0x0 is currently
> hardwired) and needs some development effort.
--
This message was sent by Atlassian JIRA
(v6.1#6144)