On 2019/9/4 下午6:53, Nikolay Borisov wrote:
>
>
> On 4.09.19 г. 11:30 ч., Qu Wenruo wrote:
>>
>>
>> On 2019/9/4 下午3:37, Nikolay Borisov wrote:
>>>
>>>
>>> On 3.09.19 г. 11:24 ч., Qu Wenruo wrote:
>>>> Before this patch, repair_imode_common() can only handle two types of
>>>> inodes:
>>>> - Free space cache inodes
>>>> - ROOT DIR inodes
>>>>
>>>> For inodes in subvolume trees, the problem is how to determine the
>>>> correct imode, thus it was not implemented.
>>>>
>>>> However there are more reports of incorrect imode in subvolume trees, we
>>>> need to support such fix.
>>>>
>>>> So this patch adds a new function, detect_imode(), to detect (or call it
>>>> educated guess) imode for inodes in subvolume trees.
>>>>
>>>> Signed-off-by: Qu Wenruo <w...@suse.com>
>>>> ---
>>>>  check/mode-common.c | 96 +++++++++++++++++++++++++++++++++++++++------
>>>>  1 file changed, 83 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/check/mode-common.c b/check/mode-common.c
>>>> index 195b6efaa7aa..807d7daf98a6 100644
>>>> --- a/check/mode-common.c
>>>> +++ b/check/mode-common.c
>>>> @@ -836,6 +836,80 @@ int reset_imode(struct btrfs_trans_handle *trans, 
>>>> struct btrfs_root *root,
>>>>    return ret;
>>>>  }
>>>>
>>>> +static int detect_imode(struct btrfs_root *root, struct btrfs_path *path,
>>>> +                  u32 *imode_ret)
>>>> +{
>>>> +  struct btrfs_key key;
>>>> +  struct btrfs_inode_item *iitem;
>>>> +  const u32 priv = 0700;
>>>> +  u64 ino;
>>>> +  u32 imode = S_IFREG;
>>>> +  int ret = 0;
>>>> +
>>>> +  btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
>>>> +  ino = key.objectid;
>>>> +  iitem = btrfs_item_ptr(path->nodes[0], path->slots[0],
>>>> +                         struct btrfs_inode_item);
>>>> +
>>>> +  /*
>>>> +   * Both CHR and BLK uses rdev, no way to distinguish them, so fall back
>>>> +   * to BLK.
>>>> +   */
>>>> +  if (btrfs_inode_rdev(path->nodes[0], iitem) != 0) {
>>>> +          imode = S_IFBLK;
>>>> +          goto out;
>>>> +  }
>>>> +
>>>> +  /* root inode */
>>>> +  if (ino == BTRFS_FIRST_FREE_OBJECTID) {
>>>> +          imode = S_IFDIR;
>>>> +          goto out;
>>>> +  }
>>>> +
>>>> +  while (1) {
>>>> +          ret = btrfs_next_item(root, path);
>>>> +          if (ret > 0) {
>>>> +                  /* No determining result found, falls back to REG */
>>>> +                  ret = 0;
>>>> +                  goto out;
>>>> +          }
>>>> +          if (ret < 0)
>>>> +                  goto out;
>>>> +          btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
>>>> +          if (key.objectid != ino)
>>>> +                  goto out;
>>>> +
>>>> +          /*
>>>> +           * We ignore some types to make life easier:
>>>> +           * - INODE_REF
>>>> +           *   We need to do a full tree search, which can fail for
>>>> +           *   corrupted fs, but not worthy compared to other easier
>>>> +           *   to determine types.
>>>> +           * - XATTR
>>>> +           *   Both REG and DIR can have xattr, so not useful
>>>> +           */
>>>> +          switch (key.type) {
>>>> +          case BTRFS_DIR_ITEM_KEY:
>>>> +          case BTRFS_DIR_INDEX_KEY:
>>>> +                  imode = S_IFDIR;
>>>> +                  goto out;
>>>> +          case BTRFS_EXTENT_DATA_KEY:
>>>> +                  /*
>>>> +                   * Both REG and LINK could have EXTENT_DATA.
>>>> +                   * We just fall back to REG as user can inspect the
>>>> +                   * content.
>>>> +                   */
>>>> +                  imode = S_IFREG;
>>>> +                  goto out;
>>>> +          }
>>>> +  }
>>>> +
>>>> +out:
>>>> +  /* Set default value even when something wrong happened */
>>>> +  *imode_ret = (imode | priv);
>>>> +  return ret;
>>>> +}
>>>> +
>>>>  /*
>>>>   * Reset the inode mode of the inode specified by @path.
>>>>   *
>>>> @@ -852,22 +926,18 @@ int repair_imode_common(struct btrfs_root *root, 
>>>> struct btrfs_path *path)
>>>>    u32 imode;
>>>>    int ret;
>>>>
>>>> -  if (root->root_key.objectid != BTRFS_ROOT_TREE_OBJECTID) {
>>>> -          error(
>>>> -          "repair inode mode outside of root tree is not supported yet");
>>>> -          return -ENOTTY;
>>>> -  }
>>>>    btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
>>>>    ASSERT(key.type == BTRFS_INODE_ITEM_KEY);
>>>> -  if (key.objectid != BTRFS_ROOT_TREE_DIR_OBJECTID &&
>>>> -      !is_fstree(key.objectid)) {
>>>> -          error("unsupported ino %llu", key.objectid);
>>>> -          return -ENOTTY;
>>>> +  if (root->objectid == BTRFS_ROOT_TREE_OBJECTID) {
>>>> +          /* In root tree we only have two possible imode */
>>>> +          if (key.objectid == BTRFS_ROOT_TREE_OBJECTID)
>>>> +                  imode = S_IFDIR | 0755;
>>>> +          else
>>>> +                  imode = S_IFREG | 0600;
>>>> +  } else {
>>>> +          detect_imode(root, path, &imode);
>>>> +          /* Ignore error returned, just use the default value returned */
>>>
>>> Is this safe enough though?
>>
>> It depends. But I'd say if we failed in detect_imode(), then it doesn't
>> matter whatever the type we're putting here.
>>
>>> What if due to an error a directory is
>>> corrected to be file?
>>
>> If a failure happens, it means btrfs-progs fails to read the next leaf.
>> Then it really doesn't make much sense whatever the type is.
>>
>> But I get your point, indeed we should error out without trying out to
>> fix the inode.
>>
>> I'll change this behavior in next version.
>
> IMO the goal of btrfs-progs should be to make a broken filesystem better
> and not replace one type of breakage with potentially another. In such
> cases we ought to break out gracefully.

Totally agreed.

BTW, I'll also implement the INODE_REF lookup method to provide the most
accurate way to recover.

The planned recovery would be:
- INODE_REF lookup
  If everything goes fine, that's it and call it a day.
  If it failed, try the rest.
- INODE rdev lookup
  For this case, either BLK or CHR, but it doesn't really matter nor
  would cause problem.
- DIR_INDEX/ITEM lookup
- EXTENT_DATA lookup
  Both REG and LNK would be possible but falls back to REG won't cause
  further problem.

Thanks,
Qu
>
>>
>> Thanks,
>> Qu
>>
>>> Let's not forget the context we are oprating here
>>> - a broken fs so it's possible (if not likely) that any of the search
>>> functions inside detect_imode could return negative error value. OTOH
>>> I'd expect the transaction commit to fail if that's the case e.g. faulty
>>> device.
>>>
>>>>    }
>>>> -  if (key.objectid == BTRFS_ROOT_TREE_DIR_OBJECTID)
>>>> -          imode = 040755;
>>>> -  else
>>>> -          imode = 0100600;
>>>>
>>>>    trans = btrfs_start_transaction(root, 1);
>>>>    if (IS_ERR(trans)) {
>>>>
>>

Reply via email to