On 2021/2/10 9:05 AM, Marek Behun wrote:
On Wed, 10 Feb 2021 08:09:14 +0800
Qu Wenruo <quwenruo.bt...@gmx.com> wrote:

On 2021/2/10 1:33 AM, Marek Behún wrote:
When the btrfs_read_fs_root() function searches for a ROOT_ITEM with a
location key offset other than -1, it currently fails via BUG_ON.

The offset can have a value other than -1, though. This can happen, for
example, if a subvolume is renamed:

    $ btrfs subvolume create X && sync
    Create subvolume './X'
    $ btrfs inspect-internal dump-tree /dev/root | grep -B 2 'name: X$'
          location key (270 ROOT_ITEM 18446744073709551615) type DIR
          transid 283 data_len 0 name_len 1
          name: X
    $ mv X Y && sync
    $ btrfs inspect-internal dump-tree /dev/root | grep -B 2 'name: Y$'
          location key (270 ROOT_ITEM 0) type DIR
          transid 285 data_len 0 name_len 1
          name: Y

As can be seen, the offset changed from -1ULL to 0.
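
The two offsets in the dump-tree output above are the same 64-bit field printed as an unsigned decimal. The following is a minimal sketch of that arithmetic only; `SKETCH_OFFSET_MAX` and `sketch_format_offset()` are hypothetical names for illustration, not btrfs-progs or U-Boot identifiers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical name: -1 converted to an unsigned 64-bit value. */
#define SKETCH_OFFSET_MAX ((uint64_t)-1)

/*
 * Hypothetical helper: print a 64-bit key offset as an unsigned decimal,
 * which is the form seen in the dump-tree output quoted above.
 * Returns the number of characters written (excluding the NUL).
 */
static int sketch_format_offset(uint64_t offset, char *buf, size_t len)
{
	return snprintf(buf, len, "%llu", (unsigned long long)offset);
}
```

Formatting `SKETCH_OFFSET_MAX` with this helper yields the 20-digit string 18446744073709551615, the value shown for subvolume X before the rename.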


The offset of a subvolume ROOT_ITEM can have other values. For a
snapshot in particular, the offset is the transid at which it was created.

But the point is, if we call btrfs_read_fs_root() for a subvolume tree,
the offset of the key really doesn't matter; the only important thing is
the objectid.

Thus we use that BUG_ON() to catch careless callers.

Would you please provide a case where we wrongly call
btrfs_read_fs_root() with an incorrect offset inside btrfs-progs/U-Boot?

I believe fixing that caller would be the proper fix.

Qu,

this can be triggered in U-Boot when listing a directory containing a
subvolume that was renamed:
   - create a subvolume && sync
   - rename subvolume && sync
   - umount, reboot, list the directory containing the subvolume in
     u-boot
It will also break when you want to read a file that has a subvolume in
its path (e.g. `read mmc 0 0x10000000 /renamed-subvol/file`).

I found this btrfs-progs commit:
https://github.com/kdave/btrfs-progs/commit/10f1af0fe7de5a0310657993c7c21a1d78087e56
This commit ensures that while searching a directory recursively, when
a ROOT_ITEM is encountered, the offset of its location is changed to -1
before passing the location to btrfs_read_fs_root().
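
A minimal sketch of the normalization described above, assuming a simplified key layout. `struct loc_key` and `sketch_normalize_root_location()` are hypothetical names and not the code from the linked commit; the value 132 is the on-disk ROOT_ITEM key type.

```c
#include <stdint.h>

/* On-disk key type of a ROOT_ITEM (BTRFS_ROOT_ITEM_KEY is 132). */
#define SKETCH_ROOT_ITEM_KEY 132

/* Hypothetical, simplified stand-in for a btrfs key. */
struct loc_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/*
 * If a directory entry's location points at a subvolume root, force the
 * offset to (u64)-1 before the location is handed to the root lookup.
 * Keys of any other type are left untouched.
 */
static void sketch_normalize_root_location(struct loc_key *loc)
{
	if (loc->type == SKETCH_ROOT_ITEM_KEY)
		loc->offset = (uint64_t)-1;
}
```

With this in place, the location `(270 ROOT_ITEM 0)` of the renamed subvolume would reach the lookup as `(270 ROOT_ITEM -1)`, so the offset check would no longer trip.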

That's what I expect the code to do, but you're right: if the kernel is not doing it anymore, I prefer the kernel behavior.


So maybe we could do this in u-boot as well, but why do this? Linux'
btrfs driver does not check whether the offset is -1. So why do it here?

You're correct, the kernel is using a new scheme, btrfs_get_fs_root(), which only requires the root objectid and completely gets rid of the offset/type, so it is far less likely to be called with wrong parameters.

Now would be a good time to sync the code between the kernel and progs/U-Boot.
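
To illustrate why an objectid-only interface is harder to misuse, here is a minimal sketch of a lookup that takes no offset or type at all. It is not the kernel's btrfs_get_fs_root() implementation; `struct sketch_root`, the flat array and `sketch_get_fs_root()` are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory record of an already loaded subvolume root. */
struct sketch_root {
	uint64_t objectid;
	const char *name;
};

/*
 * Look up a root by objectid only. The caller cannot pass a wrong offset
 * or type because the interface has no such parameters.
 * Returns NULL if no root with that objectid is known.
 */
static const struct sketch_root *
sketch_get_fs_root(const struct sketch_root *roots, size_t count,
		   uint64_t objectid)
{
	for (size_t i = 0; i < count; i++) {
		if (roots[i].objectid == objectid)
			return &roots[i];
	}
	return NULL;
}
```

A caller resolving the renamed subvolume would pass only 270, regardless of what offset the directory entry's location key carries.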


BTW, Qu, I think we have to change the BUG_ON code in U-Boot's btrfs
driver. BUG_ON in U-Boot triggers a complete SoC reset. We can't break
the whole of U-Boot simply because a btrfs partition contains broken data.
U-Boot commands must fail in such a case, not reset the SoC.

Well, progs (and even the kernel) are a minefield of BUG_ON()s.

But at least the kernel is protected by tree-checker, which rejects invalid on-disk data before it reaches btrfs code. Thus most kernel BUG_ON()s are really hard to hit (a lot of them are even impossible to hit after the introduction of tree-checker), and they indicate real problems.

For now, the BUG_ON()s in U-Boot still indicate problems that we can't really solve or don't expect at all in the btrfs realm, e.g. the BUG_ON() you're hitting (a call-site problem).

I admit a full SoC reset is a pain in the ass, but I don't have any better alternative yet.

The mid- to long-term solution would be introducing tree-checker to U-Boot, so that the remaining BUG_ON()s really are code bugs.
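
As a rough illustration of the tree-checker idea, the sketch below validates one property of a leaf (keys strictly ascending by objectid, type, offset) and returns an error code instead of hitting an assertion, so a command could fail cleanly. It is not the kernel's tree-checker; `struct leaf_key`, `sketch_key_cmp()` and `sketch_check_leaf_keys()` are hypothetical, and returning -EINVAL is an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a btrfs key. */
struct leaf_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Compare keys by objectid, then type, then offset. */
static int sketch_key_cmp(const struct leaf_key *a, const struct leaf_key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/*
 * Reject a leaf whose keys are not strictly ascending. Returns 0 if the
 * keys are well ordered, or -EINVAL so the caller can fail the command
 * instead of resetting the SoC.
 */
static int sketch_check_leaf_keys(const struct leaf_key *keys, size_t count)
{
	for (size_t i = 1; i < count; i++) {
		if (sketch_key_cmp(&keys[i - 1], &keys[i]) >= 0)
			return -EINVAL;
	}
	return 0;
}
```

The point of running such checks once at read time is that later code may then treat a violated invariant as a genuine programming error rather than as bad on-disk data.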

Thanks,
Qu


Marek

