On Mon, Sep 19, 2016 at 07:50:07PM +0000, Alex Elsayed wrote:
> > That would be true if the problem were not already long solved in btrfs.
> > The 32-bit CRC tree stores 4 bytes per block separately and efficiently.
> > With minor changes it can store a 32-byte HMAC for each block.
> 
> I disagree that this "solves" it - in particular, the fact that the fsck 
> tool supports dropping/regenerating the extent tree is wildly unsafe in 
> the face of this.

Those fsck features should no longer work on the AEAD tree (or would
require the keys to work if there was enough filesystem left to salvage).
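
To make the "would require the keys" part concrete, here is a minimal
sketch of a detached per-block tag table.  It is not btrfs code: the names
block_tag, build_tag_tree and verify_block are hypothetical, and it assumes
HMAC-SHA256 (32 bytes per block) over the block number plus the block
contents.  Without the key, a tool can neither check nor regenerate the tags.

```python
import hashlib
import hmac

def block_tag(key: bytes, block_no: int, data: bytes) -> bytes:
    # Bind the tag to the block's position so blocks cannot be swapped.
    msg = block_no.to_bytes(8, "big") + data
    return hmac.new(key, msg, hashlib.sha256).digest()  # 32 bytes

def build_tag_tree(key: bytes, blocks: dict) -> dict:
    # Stand-in for a separate tree: block number -> 32-byte tag.
    return {no: block_tag(key, no, data) for no, data in blocks.items()}

def verify_block(key: bytes, tags: dict, block_no: int, data: bytes) -> bool:
    expected = tags.get(block_no)
    if expected is None:
        return False  # tag lost: the block cannot be authenticated
    return hmac.compare_digest(expected, block_tag(key, block_no, data))
```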

> For an AEAD that lacks nonce-misuse-resistance, it's "merely" downgrading 
> security from AEAD to simple encryption (GCM, for instance, becomes 
> exactly CTR). This would be almost okay (it's a fsck tool, after all), 
> but the fact that it's a fsck tool makes the next part worse.
> 
> In the case of nonce-misuse-resistant AEAD, it's much worse: Dropping the 
> checksum tree would permanently and irrevocably corrupt every single 
> extent, with no data recoverable at all. This is the _exact_ opposite of 
> _anything_ you would _ever_ want a fsck tool to do.

So...don't put those features in fsck?

In my experience, if you're dropping the checksum tree, or especially the
extent tree, your filesystem is already so badly damaged that you might as
well mkfs and restore from backup.  Reverifying the data at the application
level, or comparing it with the last backup, will take longer than that.

An AEAD tree would just be like that, except there's no point in even
offering the option.  It would just be "rebuilding the AEAD tree will
erase all your encrypted data, leaving only plaintext data on the
filesystem if you had any, are you very sure about this y/N"

> This is, fundamentally, the problem with treating an "auth tag" as a 
> separate thing: It's only separate at all in weaker systems, and the act 
> of separating the data induces incredibly nasty failure modes.
> 
> It gets even worse if you consider _why_ that option exists for the fsck 
> tool: Because of the possibility that the _structure_ of the checksum 
> tree becomes corrupted. As a result, two bit-flips (one for each 
> duplicate of the metadata) would be entirely capable of irrevocably 
> destroying _all encrypted data on the FS_.

That event already destroys a btrfs filesystem, even without encryption.
btrfs already includes much of the verification process of a Merkle tree,
with weak checksums and no auth.  Currently, if you lose both copies of an
interior tree node, it is only possible to recover the filesystem offline
by brute-force search of the metadata.  It's one of the reasons why it's
so important to have duplicate metadata even on a single disk.
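
As a rough illustration of Merkle-style verification with a weak checksum
and duplicate copies, here is a small sketch.  It is not the btrfs on-disk
format: the Node class, verify and pick_good_copy are hypothetical, and
zlib's CRC-32 stands in for the crc32c that btrfs actually uses.

```python
import zlib

def csum(data: bytes) -> bytes:
    # Weak, unkeyed checksum: detects accidents, not deliberate tampering.
    return zlib.crc32(data).to_bytes(4, "big")

class Node:
    def __init__(self, payload: bytes, children=()):
        self.payload = payload
        self.children = list(children)
        # The parent records a checksum of each child as it was written.
        self.child_csums = [csum(c.serialize()) for c in self.children]

    def serialize(self) -> bytes:
        return self.payload + b"".join(self.child_csums)

def verify(node: Node) -> bool:
    # Walk down from the root; every child must match what its parent holds.
    for child, expected in zip(node.children, node.child_csums):
        if csum(child.serialize()) != expected or not verify(child):
            return False
    return True

def pick_good_copy(copies, expected: bytes):
    # With duplicate metadata, use whichever copy still matches.
    for raw in copies:
        if csum(raw) == expected:
            return raw
    return None  # both copies bad: everything below this node is unreachable
```

When pick_good_copy returns None for an interior node, nothing below it can
be reached through the tree any more, which is the case described above.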

The only difference with encryption is that recovery would be
theoretically impossible instead of just practically infeasible.

> Separating the "auth tag" - simply considering an "auth tag" a separate 
> thing from the overall ciphertext - is a dangerous thing to do.
> 
> >> If you're _not_ using a nonce-misuse-resistant AEAD, it's even worse:
> >> keeping the tag out-of-band makes it far too easy to fail to verify it,
> >> or verify it only after decrypting the ciphertext to plaintext.
> >> Bluntly: that is an immediate security vulnerability.
> >> 
> >> tl;dr: Don't encrypt pages, encrypt extents. They grow a little for the
> >> auth tag, and that's fine.
> >> 
> >> Btrfs already handles needing to read the full extent in order to get a
> >> page out of it with compression, anyway.
> > 
> > It does, but compressed extents are limited to 128K.  Uncompressed
> > extents come in sizes up to 128M, far too large to read in their
> > entirety for many applications.
> 
> Er, yes, and? Just as compressed extents have a different cap for reasons 
> of practicality, so too can encrypted extents.

...which means very inefficient space usage for short extents.
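
One possible reading of that overhead, as back-of-the-envelope arithmetic:
assume the tag is appended to the extent itself and space is allocated in
whole blocks.  The 4096-byte block size and 16-byte tag are assumptions for
illustration, and inline_tag_overhead is a hypothetical helper.

```python
import math

BLOCK = 4096  # assumed block size in bytes
TAG = 16      # assumed AEAD tag size in bytes

def inline_tag_overhead(data_len: int, block: int = BLOCK, tag: int = TAG) -> float:
    """Extra on-disk space, as a fraction of the data's own blocks, when the
    tag is appended and the total is rounded up to whole blocks."""
    data_blocks = math.ceil(data_len / block)
    total_blocks = math.ceil((data_len + tag) / block)
    return (total_blocks - data_blocks) / data_blocks

for size in (4096, 128 * 1024, 128 * 1024 * 1024):
    print(size, inline_tag_overhead(size))
```

Under these assumptions a 4K extent doubles in size, while a 128K extent
grows by one block in 32 (about 3%).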
