I was randomly reading
http://tux3.org/pipermail/tux3/2008-September/000186.html
for pleasure and noticed a possible latent corruption bug in Daniel
Phillips's post of the atom-refcount-update procedure (below).

If an I/O error occurs while reading the block containing the upper 16
bits of the refcount, the procedure has nevertheless already updated the
low 16 bits and released that buffer dirty, and then returns -EIO. The
fix probably requires bread-ing both blocks into separate buffers before
modifying either (only one block if there is no carry into the upper 16
bits, of course).

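I don't know anything about tux3, so perhaps some higher-level mechanism
undoes the incorrect update of the low 16 bits after the EIO is
returned.  But if not, a flaky disk (or network link to a disk) might
result in the refcount being silently reduced by 65535.  For example,
incrementing a count of 65535 would wrap the low half to 0 without
touching the high half, leaving 0 where 65536 was intended.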

For reference, here is Daniel's code from the post (without the
endianness handling):

int use_atom(struct inode *inode, atom_t atom, int use)
{
        unsigned shift = inode->sb->blockbits - 1;
        unsigned block = inode->sb->atomref_base + 2 * (atom >> shift);
        unsigned offset = atom & ~(-1 << shift);
        struct buffer *buffer;

        if (!(buffer = bread(inode->map, block)))
                return -EIO;
        int low = ((u16 *)buffer->data)[offset] + use;
        ((u16 *)buffer->data)[offset] = low;
        if ((low & (-1 << 16))) {
                brelse_dirty(buffer);
                if (!(buffer = bread(inode->map, block + 1)))
                        return -EIO; // <********************** BUG ?
                ((u16 *)buffer->data)[offset] += low >> 16;
        }
        brelse_dirty(buffer);
        return 0;
}
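
To make the suggested fix concrete, here is a rough sketch of a variant
that reads every block it needs before modifying any of them. It is not
tux3 code: the tiny in-memory "disk", the single-argument bread(), the
brelse() stub, the fail_block switch, and the name use_atom_safe are all
hypothetical stand-ins so the example compiles on its own, and
atomref_base is taken to be 0.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory stand-ins for the buffer layer (not tux3's API). */
enum { BLOCKBITS = 4, NBLOCKS = 4 };    /* 16-byte blocks, 8 u16 slots each */

struct buffer { uint16_t *data; };

uint16_t disk[NBLOCKS][1 << (BLOCKBITS - 1)];
struct buffer bufs[NBLOCKS];
int fail_block = -1;                    /* block whose read fails; -1 = none */

struct buffer *bread(unsigned block)
{
        if (block >= NBLOCKS || (int)block == fail_block)
                return NULL;            /* simulated I/O error */
        bufs[block].data = disk[block];
        return &bufs[block];
}

void brelse(struct buffer *buffer) { (void)buffer; }        /* release clean */
void brelse_dirty(struct buffer *buffer) { (void)buffer; }  /* release dirty */

/* Read both halves (when needed) before writing either one. */
int use_atom_safe(unsigned atom, int use)
{
        unsigned shift = BLOCKBITS - 1;
        unsigned block = 2 * (atom >> shift);   /* atomref_base assumed 0 */
        unsigned offset = atom & ~(~0u << shift);
        struct buffer *lowbuf, *highbuf;

        if (!(lowbuf = bread(block)))
                return -EIO;
        int low = lowbuf->data[offset] + use;
        if (!(low & ~0xffff)) {                 /* no carry or borrow */
                lowbuf->data[offset] = low;
                brelse_dirty(lowbuf);
                return 0;
        }
        if (!(highbuf = bread(block + 1))) {
                brelse(lowbuf);                 /* low half left untouched */
                return -EIO;
        }
        lowbuf->data[offset] = low;             /* truncates to the low 16 bits */
        /* As in the original, this relies on >> of a negative int being an
           arithmetic shift, which is implementation-defined in C. */
        highbuf->data[offset] += low >> 16;
        brelse_dirty(lowbuf);
        brelse_dirty(highbuf);
        return 0;
}
```

With fail_block set to the high block, a carrying increment now returns
-EIO and leaves both halves as they were. The cost is holding two
buffers at once on the (rare) carry path.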

