[ 
https://issues.apache.org/jira/browse/QPID-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832427#comment-13832427
 ] 

Kim van der Riet commented on QPID-5358:
----------------------------------------

In the legacy store (which uses a circular file journal), there is a possible 
scenario in which bad data could be recovered without being detected. In this 
scenario, a single record must span 3 or more journal files. If at the time of 
broker failure the files containing the start and end (tail) of the enqueue 
record have been written to disk, but some or all of the intermediate file(s) 
remain unwritten (awaiting async i/o), then upon recovery, the intermediate 
file data will be read as a part of the record without the write failure being 
detected.

This scenario has never been observed in practice, as most modern disk 
controllers tend to complete async i/o writes in order. However, the 
possibility of such a failure remains, and its probability is not easily 
quantifiable.

In the case of linearstore (assuming the same scenario), it may be possible for 
files to be placed into the linearstore file sequence from the Empty File Pool, 
and yet remain unwritten because the write sequence of buffering async i/o 
controllers is not guaranteed.

The linearstore has added a checksum field to the record tail for enqueued 
records. A checksum algorithm which is inexpensive and yet sufficiently 
effective at detecting this scenario is required. The checksum will include the 
xid and data sections of the enqueue record, and the xid of dequeue and 
transaction records. Upon recovery, the checksum of the recovered data and xid 
is calculated and compared with the checksum recorded in the record tail. This 
strategy ought to detect if any of the intermediate files in a large enqueue 
record remain unwritten at broker failure.
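
The write-time and recovery-time checksum steps described above can be 
sketched roughly as follows. This is only an illustration: the comment does not 
name an algorithm, so Adler-32 stands in here as one example of an inexpensive 
32-bit checksum, and the function names (adler32Update, recordChecksum, 
verifyOnRecovery) are hypothetical and not part of the linearstore code.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Illustrative stand-in: Adler-32, updated incrementally so that the xid and
// data sections can be fed in one after the other. The algorithm actually
// chosen for linearstore may differ.
inline std::uint32_t adler32Update(std::uint32_t state, const void* buf,
                                   std::size_t len) {
    const std::uint32_t MOD = 65521; // largest prime below 2^16
    std::uint32_t a = state & 0xffff;
    std::uint32_t b = (state >> 16) & 0xffff;
    const unsigned char* p = static_cast<const unsigned char*>(buf);
    for (std::size_t i = 0; i < len; ++i) {
        a = (a + p[i]) % MOD;
        b = (b + a) % MOD;
    }
    return (b << 16) | a;
}

// Hypothetical helper: checksum over the xid followed by the data section,
// as would be stored in the record tail when the record is written.
inline std::uint32_t recordChecksum(const std::string& xid,
                                    const std::string& data) {
    std::uint32_t s = 1; // Adler-32 initial state
    s = adler32Update(s, xid.data(), xid.size());
    s = adler32Update(s, data.data(), data.size());
    return s;
}

// Hypothetical helper: on recovery, recompute the checksum from the recovered
// xid and data and compare it with the value read from the record tail.
inline bool verifyOnRecovery(const std::string& xid, const std::string& data,
                             std::uint32_t tailChecksum) {
    return recordChecksum(xid, data) == tailChecksum;
}
```

If the middle portion of a large record is replaced by whatever happened to be 
in an unwritten intermediate file, the recomputed value would be expected to 
differ from the stored one, so verifyOnRecovery() returns false and the record 
can be rejected. No 32-bit checksum can rule out a collision entirely.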

> Linearstore: Checksums not implemented in record tail
> -----------------------------------------------------
>
>                 Key: QPID-5358
>                 URL: https://issues.apache.org/jira/browse/QPID-5358
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker
>            Reporter: Kim van der Riet
>            Assignee: Kim van der Riet
>              Labels: linearstore
>
> The linearstore now has a 32-bit checksum field in the record tail to check 
> data integrity of the xid and data sections of the record. The checksum 
> calculation has not been implemented (the value 0x0 is hardwired currently) 
> and needs some development effort.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
