Re: [PATCH v2 3/4] btrfs: send: fix invalid commands for inodes with changed rdev but same gen

Filipe Manana Tue, 02 Feb 2021 03:58:53 -0800

On Sun, Jan 31, 2021 at 3:52 PM Roman Anasal | BDSU
<roman.ana...@bdsu.de> wrote:
>
> On Mon, Jan 25, 2021 at 20:51 +0000 Filipe Manana wrote:
> > On Mon, Jan 25, 2021 at 7:51 PM Roman Anasal <roman.ana...@bdsu.de>
> > wrote:
> > > Second example:
> > >   # case 2: same ino at different path
> > >   btrfs subvolume create subvol1
> > >   btrfs subvolume create subvol2
> > >   mknod subvol1/a c 1 3
> > >   mknod subvol2/b c 1 5
> > >   btrfs property set subvol1 ro true
> > >   btrfs property set subvol2 ro true
> > >   btrfs send -p subvol1 subvol2 | btrfs receive --dump
> >
> > As I've told you before for the v1 patchset from a week or two ago,
> > this is not a supported scenario for incremental sends.
> > Incremental sends are meant to be used on RO snapshots of the same
> > subvolume, and those snapshots must never be changed after they were
> > created.
> >
> > Incremental sends were simply not designed for these cases, and can
> > never be guaranteed to work with such cases.
> >
> > The bug is not having incremental sends fail right away, with an
> > explicit error message, when the send and parent roots aren't RO
> > snapshots of the same subvolume.
>
> Since this should be fixed then I'd like to propose to add the
> following check:
>
> The inodes of the subvolumes' root directories (ino
> BTRFS_FIRST_FREE_OBJECTID = 256) must have the same generation.
>
> Since create_subvol() will always commit the transaction, i.e.
> increment the generation, no two _independently_ created subvolumes can
> be created within the same generation (are there race conditions
> possible here?).


That is currently true, but the ability to skip the transaction commit
when creating a subvolume has been discussed and proposed. Boris sent a
proposal patch for that a few months ago.

I don't think that should be assumed. Avoiding the transaction commit,
either by default or optionally, is something that makes sense.
Plus for a case like snapshots, we can actually batch the creation of
several ones in a single transaction.

> Taking a snapshot of a subvolume does not modify the generation of the
> root dir inode. Also it is not possible to change or delete/re-create
> the root directory of a subvolume since this would delete the subvolume
> itself.
>
>
> So having two subvolumes with root directories created with different
> generations means they were created independently and can not share a
> common ancestor. Doing an incremental send with them is unsafe and thus
> must return an error.
> With the root directories at the same generation though the subvolumes
> are based on a common ancestor which is a requirement for a safe
> incremental send.
>
> Are my assumptions and my understanding here correct? Then this check
> would catch most of the unsafe parents.
> If so I could have a shot at a patch for this if you'd like me to?

That is too complex and makes too many assumptions.

To check if two roots are snapshots of the same subvolume (the send
and parent roots), you can simply check if they have non-null uuids in
the "parent_uuid" field of their root items and that they match.
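
As a rough sketch of that check (standalone C, not actual kernel or
btrfs-progs code: struct root_info and the function names are made up
for illustration, and only the 16 byte parent_uuid field is meant to
mirror the root item):

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define UUID_SIZE 16	/* assumed to match BTRFS_UUID_SIZE */

/* Hypothetical, cut-down stand-in for the relevant part of a root item. */
struct root_info {
	uint8_t parent_uuid[UUID_SIZE];
};

/* A uuid made up entirely of zero bytes counts as "null". */
static bool uuid_is_null(const uint8_t *uuid)
{
	static const uint8_t zero[UUID_SIZE];

	return memcmp(uuid, zero, UUID_SIZE) == 0;
}

/*
 * Return true only if both roots have a non-null parent_uuid and the two
 * values match, i.e. both look like snapshots of the same subvolume.
 */
static bool same_snapshot_source(const struct root_info *send_root,
				 const struct root_info *parent_root)
{
	if (uuid_is_null(send_root->parent_uuid) ||
	    uuid_is_null(parent_root->parent_uuid))
		return false;

	return memcmp(send_root->parent_uuid, parent_root->parent_uuid,
		      UUID_SIZE) == 0;
}
```

For the example quoted at the top, subvol1 and subvol2 are created
independently rather than as snapshots, so I would expect both to have a
null parent_uuid and the check to reject the incremental send.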

While this is more straightforward to do in the kernel, I would prefer
to have it in btrfs-progs, because:

1) In btrfs-progs we can explicitly print an informative error message
to the user, while in the kernel you can only return an errno value
and log something to dmesg/syslog, which is much less user friendly;

2) The check would be on by default but could be skipped with some new
flag - this is just being conservative to avoid breaking any existing
workflows we might not be aware of.
    In particular I'm thinking about people using "btrfs send" with -c
    and omitting -p, in which case btrfs-progs selects one of the -c
    roots to be used as the parent root, but the selected root might
    not be a snapshot of the same subvolume as the send root.
    Then maybe one day that option to skip the check would be removed,
    after we are more sure no one is using or really needs such
    workflows.

>
>
> This check still does not solve the second edge case though, when
> snapshots are modified afterwards and diverge independently from one
> another. For this I still see no good solution besides a new on-disk
> flag whether a snapshot was *ever* set to ro=false. But with that I'm
> not sure how to (not) inherit that flag in a safe way ...

I'm afraid there's nothing, codewise, to do about that case.

Setting some flag on the root to make it unusable for send in case it
was ever RW would break send in at least one way:

During a receive we create the root as RW, apply the send stream and
then change the root to RO.
After such change, it would mean we could not send the received
snapshot anymore. There's no way to make sure that only btrfs-receive
can do that, since anyone can use the ioctl.

Perhaps all that needs to be done is to document this well in the man
pages and wiki in case it's not already there.

Thanks.


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”
