Re: [sqlite] Why would batched write operations NOT be faster than individual ones

Markus Schaber Mon, 03 Mar 2014 03:59:23 -0800

Hi,

From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org]
> On 3 Mar 2014, at 8:18am, Markus Schaber <m.scha...@codesys.com> wrote:
> > Another way to bust your data is to rely on RAID 5 or 6 or similar, at
> > least if the software does not take special care.
> >
> > With those mechanisms, updating a block always results in at least 2
> > disk writes: the data block and the checksum block. There's a small time
> > window where only one of those blocks has physically reached the disk.
> > Now, when the power fails during said time window, and the third disk
> > fails, its content will be restored using the new data block and the
> > old checksum (or vice versa), leaving your data garbled.
> 
> What the heck ?  Is this a particular implementation of RAID or a conceptual
> problem with how RAID is designed to work ?  It sounds like a bug in one
> particular model rather than a general problem with how RAID works.


It is a conceptual problem of the RAID levels 5 and 6 and similar proprietary
mechanisms which are based on parity blocks.

RAID setups using only mirroring and striping, like RAID levels 0, 1 and 10,
are not affected. For the parity-based levels, the risk may be lowered by
using battery-backed RAID controllers.

Let's look at a simple RAID 5 with three disks. The blocks a and b are the two
data blocks which are covered by the parity block c. Let's say the database
code writes the block b. The RAID layer creates a corresponding write
for the parity block c. As the hard disks are not physically synchronized,
there is a small time slot where only one of the blocks b and c has been
written, but not the other one. Suppose the power fails during that time
slot, and during the reboot, the hard disk containing block a fails. During
the RAID rebuild, the contents of block a are recreated using the blocks b
and c, but as only one of those blocks was up to date, and the other contains
the old state, this leads to (more or less) complete garbage in block a.
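
To make the scenario above concrete, here is a minimal sketch in Python. It
assumes plain bytewise XOR parity and four-byte blocks; the block contents
and all names (xor_blocks, disk_b, disk_c, ...) are made up for illustration
and do not describe any particular RAID implementation.

```python
def xor_blocks(x: bytes, y: bytes) -> bytes:
    """Bytewise XOR of two equal-sized blocks (the parity operation assumed here)."""
    return bytes(p ^ q for p, q in zip(x, y))


# Consistent stripe: c is the parity of the data blocks a and b.
a = b"AAAA"
b_old = b"BBBB"
c_old = xor_blocks(a, b_old)

# The database overwrites b, so the RAID layer must also rewrite c.
b_new = b"bbbb"
c_new = xor_blocks(a, b_new)

# Power fails after b reached its disk, but before c did.
disk_b, disk_c = b_new, c_old

# The disk holding a fails; the rebuild recreates a from b and c.
rebuilt_a = xor_blocks(disk_b, disk_c)

# For comparison: the rebuild if both writes had completed.
rebuilt_ok = xor_blocks(b_new, c_new)

print("torn write, a recovered correctly:", rebuilt_a == a)
print("complete write, a recovered correctly:", rebuilt_ok == a)
```

Note that block a was never written by the application in this sketch, and
it is still the block that comes back wrong after the rebuild.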

So with RAID 5, you risk damaging data which is entirely unrelated to
the data that was actually being written when the machine crashed.

Battery-backed RAID controllers may lower the risk, as they either
hold a copy of the not-yet-written blocks in their RAM (or flash)
until the power is restored, or they supply power to the hard disks
until all the blocks are written.
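
The first variant (holding the unwritten blocks until power returns) could
be sketched roughly as follows. This is a toy model, not how any real
controller works internally: the CachedStripe class, its pending dict
standing in for the battery-backed cache, and the fail_after parameter used
to simulate the power failure are all hypothetical.

```python
def xor(x: bytes, y: bytes) -> bytes:
    """Bytewise XOR of two equal-sized blocks."""
    return bytes(p ^ q for p, q in zip(x, y))


class CachedStripe:
    """Toy model of one stripe (a, b, parity c) behind a persistent write cache."""

    def __init__(self, a: bytes, b: bytes):
        self.disks = {"a": a, "b": b, "c": xor(a, b)}
        self.pending = {}  # assumed to survive power loss (battery or flash)

    def write_b(self, new_b: bytes, fail_after=None):
        # Record the data write and the parity write before touching any disk.
        self.pending = {"b": new_b, "c": xor(self.disks["a"], new_b)}
        self.flush(fail_after)

    def flush(self, fail_after=None):
        # Write pending blocks to disk; stop early to simulate a power failure.
        for done, name in enumerate(list(self.pending)):
            if fail_after is not None and done >= fail_after:
                return  # remaining entries stay in the cache
            self.disks[name] = self.pending.pop(name)

    def parity_consistent(self) -> bool:
        d = self.disks
        return xor(d["a"], d["b"]) == d["c"]


stripe = CachedStripe(b"AAAA", b"BBBB")
stripe.write_b(b"bbbb", fail_after=1)    # power fails after b, before c
torn = not stripe.parity_consistent()    # the disks alone are inconsistent

stripe.flush()                           # power restored: replay the cache
repaired = stripe.parity_consistent()

print("inconsistent after power failure:", torn)
print("consistent after replay:", repaired)
```

The point of the sketch is only that the window still exists on the disks
themselves; the cache closes it by finishing the interrupted write before a
rebuild can read the stale parity.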

Similar things may happen with other parity / checksum based mechanisms,
like RAID 3, 6, or some (nowadays mostly extinct) proprietary solutions.


Best regards

Markus Schaber

CODESYS(r) a trademark of 3S-Smart Software Solutions GmbH

Inspiring Automation Solutions

3S-Smart Software Solutions GmbH
Dipl.-Inf. Markus Schaber | Product Development Core Technology
Memminger Str. 151 | 87439 Kempten | Germany
Tel. +49-831-54031-979 | Fax +49-831-54031-50

E-Mail: m.scha...@codesys.com | Web: http://www.codesys.com | CODESYS store: 
http://store.codesys.com
CODESYS forum: http://forum.codesys.com

Managing Directors: Dipl.Inf. Dieter Hess, Dipl.Inf. Manfred Werner | Trade 
register: Kempten HRB 6186 | Tax ID No.: DE 167014915

