On 2019/3/27 10:07 PM, Adam Borowski wrote:
> On Wed, Mar 27, 2019 at 05:46:50PM +0800, Qu Wenruo wrote:
>> This urgent patchset can be fetched from github:
>> https://github.com/adam900710/btrfs-progs/tree/flush_super
>> Which is based on v4.20.2.
>>
>> Before this patch, btrfs-progs writes to the fs have no barriers at all.
>> All metadata and superblock writes are just buffered writes, with no
>> barrier between superblock and metadata writes.
>>
>> No wonder even clearing the space cache can cause serious transid
>> corruption on an originally good fs.
>>
>> Please merge this fix as soon as possible as I really don't want to see
>> btrfs-progs corrupting any fs any more.
> 
> How often does this happen in practice?  I'm slightly incredulous about
> btrfs-progs crashing often, especially since pwrite() is buffered on the
> kernel side, so we'd need a _kernel_ crash (usually a power loss) to break
> consistency.  Obviously, a potential data loss bug is always something that
> needs fixing, I'm just wondering about severity.
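
For readers unfamiliar with what a "barrier" between metadata and superblock writes means, here is a minimal sketch, assuming an ordinary file descriptor stands in for the device. The function name, the arguments and the use of fsync() as the barrier are illustrative assumptions, not the code from the patchset.

```python
import os


def commit_with_barriers(fd, tree_blocks, super_bytes, super_offset):
    """Hypothetical commit helper: metadata first, superblock last.

    tree_blocks maps byte offsets to the bytes to write there.
    """
    # 1. Queue all metadata writes (buffered, like plain pwrite()).
    for offset, data in tree_blocks.items():
        os.pwrite(fd, data, offset)

    # 2. Barrier: ask the kernel to flush the metadata before any
    #    superblock can point at it.
    os.fsync(fd)

    # 3. Only now write the superblock that references the new metadata.
    os.pwrite(fd, super_bytes, super_offset)

    # 4. Flush again so the commit is durable before we report success.
    os.fsync(fd)
```

Without step 2, nothing stops the superblock from reaching the disk before the tree blocks it refers to.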

Here is a valid case where a crash could cause a transid error:

- transaction 1
  new eb at 16K (fs root, gen = 1)
  new eb at 32K (extent root, gen = 1)
  new eb at 48K (tree root, gen = 1)
  sb->fs root = gen 1
  sb->extent root = gen 1
  sb->tree root = gen 1

- transaction 2
  new eb at 64K (extent root, gen = 2)
  new eb at 80K (tree root, gen = 2)
  sb->fs root = gen 1 at 16K
  sb->extent root = gen 2
  sb->tree root = gen 2

- transaction 3, half-baked due to an error while committing the transaction
  new eb at 16K (tree root, gen = 3) submitted

In the above case, the newest eb at 16K is written to disk, but the sb on
disk is still the one from transaction 2.

The sb then expects to read a tree block with gen 1 at 16K, but gets one
with gen 3. Furthermore, even if we ignore the generation mismatch, the
content of the block at 16K is completely wrong: the super block of gen 2
expects fs root content at 16K, but the block now holds the tree root.
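
The sequence above can be replayed as a toy model. This is only a sketch: the dict-based "disk" and "sb" and the helper names are hypothetical and have nothing to do with the real btrfs-progs data structures.

```python
K = 1024


def replay():
    """Replay the three transactions; return (disk, sb) as found after the crash."""
    disk = {}

    # transaction 1: three new tree blocks, sb points at all of them
    disk[16 * K] = ("fs root", 1)
    disk[32 * K] = ("extent root", 1)
    disk[48 * K] = ("tree root", 1)
    sb = {"fs root": (16 * K, 1),
          "extent root": (32 * K, 1),
          "tree root": (48 * K, 1)}

    # transaction 2: extent root and tree root move, fs root stays at 16K
    disk[64 * K] = ("extent root", 2)
    disk[80 * K] = ("tree root", 2)
    sb = {"fs root": (16 * K, 1),
          "extent root": (64 * K, 2),
          "tree root": (80 * K, 2)}

    # transaction 3 (interrupted): the block at 16K is overwritten,
    # but the sb is never updated
    disk[16 * K] = ("tree root", 3)
    return disk, sb


def check(disk, sb):
    """List every root whose on-disk block does not match what the sb expects."""
    problems = []
    for owner, (offset, gen) in sb.items():
        found_owner, found_gen = disk[offset]
        if found_gen != gen or found_owner != owner:
            problems.append((owner, offset, gen, found_gen, found_owner))
    return problems
```

Calling check(*replay()) reports a single problem: the sb wants the fs root with gen 1 at 16K, but the block there is a tree root with gen 3.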

This should explain the severity much better.

Thanks,
Qu

> 
> Or do I understand this wrong?
> 
> Asking because Dimitri John Ledkov stepped down as Debian's maintainer of
> this package, and I'm taking up the mantle (with Nicholas D Steeves being
> around) -- modulo any updates other than important bug fixes being on hold
> because of Debian's freeze.  Thus, I wonder if this is important enough to
> ask for a freeze exception.
> 
> 
> Meow!
> 
