Re: GRUB writing to grubenv outside of kernel fs code

2018-09-17 Thread Chris Murphy
On Mon, Sep 17, 2018 at 11:24 PM, Andrei Borzenkov  wrote:
> 18.09.2018 07:21, Chris Murphy пишет:
>> On Mon, Sep 17, 2018 at 9:44 PM, Chris Murphy  
>> wrote:
>>> https://btrfs.wiki.kernel.org/index.php/FAQ#Does_grub_support_btrfs.3F
>>>
>>> Does anyone know if this is still a problem on Btrfs if grubenv has
>>> xattr +C set? In which case it should be possible to overwrite and
>>> there's no csums that are invalidated.
>>>
>>> I kinda wonder if in 2018 it's specious for, effectively out of tree
>>> code, to be making modifications to the file system, outside of the
>>> file system.
>>
>> a. The bootloader code (pre-boot, not user space setup stuff) would
>> have to know how to read xattr and refuse to overwrite a grubenv
>> lacking xattr +C.
>> b. The bootloader code, would have to have sophisticated enough Btrfs
>> knowledge to know if the grubenv has been reflinked or snapshot,
>> because even if +C, it may not be valid to overwrite, and COW must
>> still happen, and there's no way the code in GRUB can do full blow COW
>> and update a bunch of metadata.
>>
>> So answering my own question, this isn't workable. And it seems the
>> same problem for dm-thin.
>>
>> There are a couple of reserve locations in Btrfs at the start and I
>> think after the first superblock, for bootloader embedding. Possibly
>> one or both of those areas could be used for this so it's outside the
>> file system. But other implementations are going to run into this
>> problem too.
>>
>
> That's what SUSE grub2 version does - it includes patches to redirect
> writes on btrfs to reserved area. I am not sure how it behaves in case
> of multi-device btrfs though.

The patches aren't upstream yet? Will they be?

They redirect writes to grubenv specifically? Or do they use the
reserved areas like a hidden and fixed location for what grubenv would
contain?

I guess the user space grub-editenv could write to grubenv, which even
if COW, GRUB can pick up that change. But GRUB itself writes its
changes to a reserved area.

Hmmm. Complicated.

-- 
Chris Murphy


Re: [PATCH v3 5/7] btrfs-progs: lowmem: do missing check of last item after check_inode_item()

2018-09-17 Thread Qu Wenruo



On 2018/9/17 下午9:24, Su Yue wrote:
> 
> 
> On 2018/9/17 8:53 PM, Qu Wenruo wrote:
>>
>>
>> On 2018/9/17 下午3:28, Su Yue wrote:
>>> After call of check_inode_item(), path may point to the last unchecked
>>> slot of the leaf. The outer walk_up_tree() always treats the position
>>> as checked slot then skips to the next. The last item will never be
>>> checked.
>>>
>>> While checking backrefs, path passed to walk_up_tree() always
>>> points to a checked slot.
>>> While checking fs trees, path passed to walk_up_tree() always
>>> points to an unchecked slot.
>>
>> Can we unify this behavior?
>> I has considered in three ways:
> 1) Change logical of the process_one_leaf. After it returns, path
> points to the next slot checked.
> To unify it, we can use saved key but will cost one search_slot time
> during serval nodes(>=1). Or call btrfs_previous_item() after every time
> check_inode_items() returns.
> 
> But why? why should we cost some time to swing the path. So I abandoned 1).
> 
> 2) Change logical of the check_leaf_items(). What does the function
> is just traverse all items then returns, which seems quite natural.
> So I abandoned it.

Well, this can also be interpreted as "it's a pretty good place to
change the behavior".

IMHO, since check_leaf_items() are just really simple hub functions, it
will just need a btrfs_next_item() in its out: tag.

By that we can unify the behavior of them to all points to the next
*unchecked* slot.
And no need for the extra parameter.

Thanks,
Qu

> 
> 
> 3) It's what the patch does. The extra argument may seems strange,
> I preferred to this way.
> 
> Maybe we can do something after check_leaf_items() returns, is it
> acceptable? I have no idea.
> 
> Thanks,
> Su
> 
>> E.g, always points to an unchecked slot.
>>
>> It would make things easier and no need for the extra parameter.
>>
>> Thanks,
>> Qu
>>
>>>
>>> Solution:
>>> Add an argument @is_checked to walk_up_tree() to decide whether
>>> to skip current slot.
>>>
>>> Fixes: 5e2dc770471b ("btrfs-progs: check: skip shared node or leaf
>>> check for low_memory mode")
>>> Signed-off-by: Su Yue 
>>> ---
>>>   check/mode-lowmem.c | 37 +
>>>   1 file changed, 33 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
>>> index db44456fd85b..612e5e28e45b 100644
>>> --- a/check/mode-lowmem.c
>>> +++ b/check/mode-lowmem.c
>>> @@ -4597,22 +4597,38 @@ static int walk_down_tree(struct btrfs_root
>>> *root, struct btrfs_path *path,
>>>   return err;
>>>   }
>>>   +/*
>>> + * Walk up throuh the path. Make path point to next slot to be checked.
>>> + * walk_down_tree() should be called after this function.
>>> + *
>>> + * @root:    root of the tree
>>> + * @path:    will point to next slot to check for walk_down_tree()
>>> + * @level:    returns with level of next node to be checked
>>> + * @is_checked:    means is the current node checked or not
>>> + *    if false, the slot is unchecked, do not increase the slot
>>> + *    if true, means increase the slot of the current node
>>> + *
>>> + * Returns 0 means success.
>>> + * Returns >0 means the whole loop of walk up/down should be broken.
>>> + */
>>>   static int walk_up_tree(struct btrfs_root *root, struct btrfs_path
>>> *path,
>>> -    int *level)
>>> +    int *level, bool is_checked)
>>>   {
>>>   int i;
>>>   struct extent_buffer *leaf;
>>> +    int skip_cur =s_checked ? 1 : 0;
>>>     for (i =level; i < BTRFS_MAX_LEVEL - 1 && path->nodes[i]; i++) {
>>>   leaf =ath->nodes[i];
>>> -    if (path->slots[i] + 1 < btrfs_header_nritems(leaf)) {
>>> -    path->slots[i]++;
>>> +    if (path->slots[i] + skip_cur < btrfs_header_nritems(leaf)) {
>>> +    path->slots[i] +=kip_cur;
>>>   *level =;
>>>   return 0;
>>>   }
>>>   free_extent_buffer(path->nodes[*level]);
>>>   path->nodes[*level] =ULL;
>>>   *level = + 1;
>>> +    skip_cur =;
>>>   }
>>>   return 1;
>>>   }
>>> @@ -4815,7 +4831,20 @@ static int check_btrfs_root(struct btrfs_root
>>> *root, int check_all)
>>>   break;
>>>   }
>>>   -    ret =alk_up_tree(root, , );
>>> +    /*
>>> + * The logical of walk trees are shared between backrefs
>>> + * check and fs trees check.
>>> + * In checking backrefs(check_all is true), after
>>> + * check_leaf_items() returns, path points to
>>> + * last checked item.
>>> + * In checking fs roots(check_all is false), after
>>> + * process_one_leaf() returns, path points to unchecked item.
>>> + *
>>> + * walk_up_tree() is reponsible to make path point to next
>>> + * slot to be checked, above case is handled in
>>> + * walk_up_tree().
>>> + */
>>> +    ret =alk_up_tree(root, , , check_all);
>>>   if (ret !=) {
>>>   /* Normal exit, reset ret 

Re: GRUB writing to grubenv outside of kernel fs code

2018-09-17 Thread Andrei Borzenkov
18.09.2018 07:21, Chris Murphy пишет:
> On Mon, Sep 17, 2018 at 9:44 PM, Chris Murphy  wrote:
>> https://btrfs.wiki.kernel.org/index.php/FAQ#Does_grub_support_btrfs.3F
>>
>> Does anyone know if this is still a problem on Btrfs if grubenv has
>> xattr +C set? In which case it should be possible to overwrite and
>> there's no csums that are invalidated.
>>
>> I kinda wonder if in 2018 it's specious for, effectively out of tree
>> code, to be making modifications to the file system, outside of the
>> file system.
> 
> a. The bootloader code (pre-boot, not user space setup stuff) would
> have to know how to read xattr and refuse to overwrite a grubenv
> lacking xattr +C.
> b. The bootloader code, would have to have sophisticated enough Btrfs
> knowledge to know if the grubenv has been reflinked or snapshot,
> because even if +C, it may not be valid to overwrite, and COW must
> still happen, and there's no way the code in GRUB can do full blow COW
> and update a bunch of metadata.
> 
> So answering my own question, this isn't workable. And it seems the
> same problem for dm-thin.
> 
> There are a couple of reserve locations in Btrfs at the start and I
> think after the first superblock, for bootloader embedding. Possibly
> one or both of those areas could be used for this so it's outside the
> file system. But other implementations are going to run into this
> problem too.
> 

That's what SUSE grub2 version does - it includes patches to redirect
writes on btrfs to reserved area. I am not sure how it behaves in case
of multi-device btrfs though.

> I'm not sure how else to describe state. If NVRAM is sufficiently wear
> resilient enough to have writes to it possibly every day, for every
> boot, to indicate boot success/fail.
> 



Re: btrfs panic problem

2018-09-17 Thread Duncan
sunny.s.zhang posted on Tue, 18 Sep 2018 08:28:14 +0800 as excerpted:

> My OS(4.1.12) panic in kmem_cache_alloc, which is called by
> btrfs_get_or_create_delayed_node.
> 
> I found that the freelist of the slub is wrong.

[Not a dev, just a btrfs list regular and user, myself.  But here's a 
general btrfs list recommendations reply...]

You appear to mean kernel 4.1.12 -- confirmed by the version reported in 
the posted dump:  4.1.12-112.14.13.el6uek.x86_64

OK, so from the perspective of this forward-development-focused list, 
kernel 4.1 is pretty ancient history, but you do have a number of options.

First let's consider the general situation.  Most people choose an 
enterprise distro for supported stability, and that's certainly a valid 
thing to want.  However, btrfs, while now reaching early maturity for the 
basics (single device in single or dup mode, and multi-device in single/
raid0/1/10 modes, note that raid56 mode is newer and less mature), 
remains under quite heavy development, and keeping reasonably current is 
recommended for that reason.

So you you chose an enterprise distro presumably to lock in supported 
stability for several years, but you chose a filesystem, btrfs, that's 
still under heavy development, with reasonably current kernels and 
userspace recommended as tending to have the known bugs fixed.  There's a 
bit of a conflict there, and the /general/ recommendation would thus be 
to consider whether one or the other of those choices are inappropriate 
for your use-case, because it's really quite likely that if you really 
want the stability of an enterprise distro and kernel, that btrfs isn't 
as stable a filesystem as you're likely to want to match with it.  
Alternatively, if you want something newer to match the still under heavy 
development btrfs, you very likely want a distro that's not focused on 
years-old stability just for the sake of it.  One or the other is likely 
to be a poor match for your needs, and choosing something else that's a 
better match is likely to be a much better experience for you.

But perhaps you do have reason to want to run the newer and not quite to 
traditional enterprise-distro level stability btrfs, on an otherwise 
older and very stable enterprise distro.  That's fine, provided you know 
what you're getting yourself into, and are prepared to deal with it.

In that case, for best support from the list, we'd recommend running one 
of the latest two kernels in either the current or mainline LTS tracks. 

For current track, With 4.18 being the latest kernel, that'd be 4.18 or 
4.17, as available on kernel.org (tho 4.17 is already EOL, no further 
releases, at 4.17.19).

For mainline-LTS track, 4.14 and 4.9 are the latest two LTS series 
kernels, tho IIRC 4.19 is scheduled to be this year's LTS (or was it 4.18 
and it's just not out of normal stable range yet so not yet marked LTS?), 
so it'll be coming up soon and 4.9 will then be dropping to third LTS 
series and thus out of our best recommended range.  4.4 was the previous 
LTS and while still in LTS support, is outside the two newest LTS series 
that this list recommends.

And of course 4.1 is older than 4.4, so as I said, in btrfs development 
terms, it's quite ancient indeed... quite out of practical support range 
here, tho of course we'll still try, but in many cases the first question 
when any problem's reported is going to be whether it's reproducible on 
something closer to current.

But... you ARE on an enterprise kernel, likely on an enterprise distro, 
and very possibly actually paying /them/ for support.  So you're not 
without options if you prefer to stay with your supported enterprise 
kernel.  If you're paying them for support, you might as well use it, and 
of course of the very many fixes since 4.1, they know what they've 
backported and what they haven't, so they're far better placed to provide 
that support in any case.

Or, given what you posted, you appear to be reasonably able to do at 
least limited kernel-dev-level analysis yourself.  Given that, you're 
already reasonably well placed to simply decide to stick with what you 
have and take the support you can get, diving into things yourself if 
necessary.


So those are your kernel options.  What about userspace btrfs-progs?

Generally speaking, while the filesystem's running, it's the kernel code 
doing most of the work.  If you have old userspace, it simply means you 
can't take advantage of some of the newer features as the old userspace 
doesn't know how to call for them.

But the situation changes as soon as you have problems and can't mount, 
because it's userspace code that runs to try to fix that sort of problem, 
or failing that, it's userspace code that btrfs restore runs to try to 
grab what files can be grabbed off of the unmountable filesystem.

So for routine operation, it's no big deal if userspace is a bit old, at 
least as long as it's new enough to have all the newer command formats, 
etc, that you 

Re: GRUB writing to grubenv outside of kernel fs code

2018-09-17 Thread Chris Murphy
On Mon, Sep 17, 2018 at 9:44 PM, Chris Murphy  wrote:
> https://btrfs.wiki.kernel.org/index.php/FAQ#Does_grub_support_btrfs.3F
>
> Does anyone know if this is still a problem on Btrfs if grubenv has
> xattr +C set? In which case it should be possible to overwrite and
> there's no csums that are invalidated.
>
> I kinda wonder if in 2018 it's specious for, effectively out of tree
> code, to be making modifications to the file system, outside of the
> file system.

a. The bootloader code (pre-boot, not user space setup stuff) would
have to know how to read xattr and refuse to overwrite a grubenv
lacking xattr +C.
b. The bootloader code, would have to have sophisticated enough Btrfs
knowledge to know if the grubenv has been reflinked or snapshot,
because even if +C, it may not be valid to overwrite, and COW must
still happen, and there's no way the code in GRUB can do full blow COW
and update a bunch of metadata.

So answering my own question, this isn't workable. And it seems the
same problem for dm-thin.

There are a couple of reserve locations in Btrfs at the start and I
think after the first superblock, for bootloader embedding. Possibly
one or both of those areas could be used for this so it's outside the
file system. But other implementations are going to run into this
problem too.

I'm not sure how else to describe state. If NVRAM is sufficiently wear
resilient enough to have writes to it possibly every day, for every
boot, to indicate boot success/fail.

-- 
Chris Murphy


GRUB writing to grubenv outside of kernel fs code

2018-09-17 Thread Chris Murphy
https://btrfs.wiki.kernel.org/index.php/FAQ#Does_grub_support_btrfs.3F

Does anyone know if this is still a problem on Btrfs if grubenv has
xattr +C set? In which case it should be possible to overwrite and
there's no csums that are invalidated.

I kinda wonder if in 2018 it's specious for, effectively out of tree
code, to be making modifications to the file system, outside of the
file system.


-- 
Chris Murphy


[PATCH v2] btrfs-progs: change filename limit to 255 when creating subvolume

2018-09-17 Thread Su Yanjun
Modify the file name length limit to meet the Linux naming convention.
In addition, the file name length is always bigger than 0, no need to
compare with 0 again.

Changelog:
v2:
 Fix the same problem in creating snapshot routine.

Issue: #145
Signed-off-by: Su Yanjun 
---

v2: Also fix the same problem in creating snapshot routine.

 cmds-subvolume.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index e7a884af1f5d..5a446c1ae2b4 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -155,7 +155,7 @@ static int cmd_subvol_create(int argc, char **argv)
}
 
len = strlen(newname);
-   if (len == 0 || len >= BTRFS_VOL_NAME_MAX) {
+   if (len > BTRFS_VOL_NAME_MAX) {
error("subvolume name too long: %s", newname);
goto out;
}
@@ -715,7 +715,7 @@ static int cmd_subvol_snapshot(int argc, char **argv)
}
 
len = strlen(newname);
-   if (len == 0 || len >= BTRFS_VOL_NAME_MAX) {
+   if (len > BTRFS_VOL_NAME_MAX) {
error("snapshot name too long '%s'", newname);
goto out;
}
-- 
2.18.0





Re: btrfs panic problem

2018-09-17 Thread sunny.s.zhang

Sorry, modify some errors:

Process A (btrfs_evict_inode)   Process B

call btrfs_remove_delayed_node   call 
btrfs_get_delayed_node

node = ACCESS_ONCE(btrfs_inode->delayed_node);

BTRFS_I(inode)->delayed_node = NULL;
btrfs_release_delayed_node(delayed_node);

if (node) {
atomic_inc(>refs);
return node;
}

..

btrfs_release_delayed_node(delayed_node);
在 2018年09月18日 08:28, sunny.s.zhang 写道:

Hi All,

My OS(4.1.12) panic in kmem_cache_alloc, which is called by 
btrfs_get_or_create_delayed_node.


I found that the freelist of the slub is wrong.

crash> struct kmem_cache_cpu 887e7d7a24b0

struct kmem_cache_cpu {
  freelist = 0x2026,   <<< the value is id of one inode
  tid = 29567861,
  page = 0xea0132168d00,
  partial = 0x0
}

And, I found there are two different btrfs inodes pointing 
delayed_node. It means that the same slub is used twice.


I think this slub is freed twice, and then the next pointer of this 
slub point itself. So we get the same slub twice.


When use this slub again, that break the freelist.

Folloing code will make the delayed node being freed twice. But I 
don't found what is the process.


Process A (btrfs_evict_inode) Process B

call btrfs_remove_delayed_node call  btrfs_get_delayed_node

node = ACCESS_ONCE(btrfs_inode->delayed_node);

BTRFS_I(inode)->delayed_node = NULL;
btrfs_release_delayed_node(delayed_node);

if (node) {
atomic_inc(>refs);
return node;
}

..

btrfs_release_delayed_node(delayed_node);


1313 void btrfs_remove_delayed_node(struct inode *inode)
1314 {
1315 struct btrfs_delayed_node *delayed_node;
1316
1317 delayed_node = ACCESS_ONCE(BTRFS_I(inode)->delayed_node);
1318 if (!delayed_node)
1319 return;
1320
1321 BTRFS_I(inode)->delayed_node = NULL;
1322 btrfs_release_delayed_node(delayed_node);
1323 }


  87 static struct btrfs_delayed_node *btrfs_get_delayed_node(struct 
inode *inode)

  88 {
  89 struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
  90 struct btrfs_root *root = btrfs_inode->root;
  91 u64 ino = btrfs_ino(inode);
  92 struct btrfs_delayed_node *node;
  93
  94 node = ACCESS_ONCE(btrfs_inode->delayed_node);
  95 if (node) {
  96 atomic_inc(>refs);
  97 return node;
  98 }


Thanks,

Sunny


PS:



panic informations

PID: 73638  TASK: 887deb586200  CPU: 38  COMMAND: "dockerd"
 #0 [88130404f940] machine_kexec at 8105ec10
 #1 [88130404f9b0] crash_kexec at 811145b8
 #2 [88130404fa80] oops_end at 8101a868
 #3 [88130404fab0] no_context at 8106ea91
 #4 [88130404fb00] __bad_area_nosemaphore at 8106ec8d
 #5 [88130404fb50] bad_area_nosemaphore at 8106eda3
 #6 [88130404fb60] __do_page_fault at 8106f328
 #7 [88130404fbd0] do_page_fault at 8106f637
 #8 [88130404fc10] page_fault at 816f6308
    [exception RIP: kmem_cache_alloc+121]
    RIP: 811ef019  RSP: 88130404fcc8  RFLAGS: 00010286
    RAX:   RBX:   RCX: 01c32b76
    RDX: 01c32b75  RSI:   RDI: 000224b0
    RBP: 88130404fd08   R8: 887e7d7a24b0   R9: 
    R10: 8802668b6618  R11: 0002  R12: 887e3e230a00
    R13: 2026  R14: 887e3e230a00  R15: a01abf49
    ORIG_RAX:   CS: 0010  SS: 0018
 #9 [88130404fd10] btrfs_get_or_create_delayed_node at 
a01abf49 [btrfs]
#10 [88130404fd60] btrfs_delayed_update_inode at a01aea12 
[btrfs]

#11 [88130404fdb0] btrfs_update_inode at a015b199 [btrfs]
#12 [88130404fdf0] btrfs_dirty_inode at a015cd11 [btrfs]
#13 [88130404fe20] btrfs_update_time at a015fa25 [btrfs]
#14 [88130404fe50] touch_atime at 812286d3
#15 [88130404fe90] iterate_dir at 81221929
#16 [88130404fee0] sys_getdents64 at 81221a19
#17 [88130404ff50] system_call_fastpath at 816f2594
    RIP: 006b68e4  RSP: 00c866259080  RFLAGS: 0246
    RAX: ffda  RBX: 00c828dbbe00  RCX: 006b68e4
    RDX: 1000  RSI: 00c83da14000  RDI: 0011
    RBP:    R8:    R9: 
    R10:   R11: 0246  R12: 00c7
    R13: 02174e74  R14: 0555  R15: 0038
    ORIG_RAX: 00d9  CS: 0033  SS: 002b


We also find the list double add informations, including n_list and 
p_list:


[8642921.110568] [ cut here ]
[8642921.167929] WARNING: CPU: 38 PID: 73638 at lib/list_debug.c:33 
__list_add+0xbe/0xd0()
[8642921.263780] list_add corruption. prev->next should be next 
(887e40fa5368), but was ff:ff884c85a36288. 

btrfs panic problem

2018-09-17 Thread sunny.s.zhang

Hi All,

My OS(4.1.12) panic in kmem_cache_alloc, which is called by 
btrfs_get_or_create_delayed_node.


I found that the freelist of the slub is wrong.

crash> struct kmem_cache_cpu 887e7d7a24b0

struct kmem_cache_cpu {
  freelist = 0x2026,   <<< the value is id of one inode
  tid = 29567861,
  page = 0xea0132168d00,
  partial = 0x0
}

And, I found there are two different btrfs inodes pointing delayed_node. 
It means that the same slub is used twice.


I think this slub is freed twice, and then the next pointer of this slub 
point itself. So we get the same slub twice.


When use this slub again, that break the freelist.

Folloing code will make the delayed node being freed twice. But I don't 
found what is the process.


Process A (btrfs_evict_inode) Process B

call btrfs_remove_delayed_node call  btrfs_get_delayed_node

node = ACCESS_ONCE(btrfs_inode->delayed_node);

BTRFS_I(inode)->delayed_node = NULL;
btrfs_release_delayed_node(delayed_node);

if (node) {
atomic_inc(>refs);
return node;
}

..

btrfs_release_delayed_node(delayed_node);


1313 void btrfs_remove_delayed_node(struct inode *inode)
1314 {
1315 struct btrfs_delayed_node *delayed_node;
1316
1317 delayed_node = ACCESS_ONCE(BTRFS_I(inode)->delayed_node);
1318 if (!delayed_node)
1319 return;
1320
1321 BTRFS_I(inode)->delayed_node = NULL;
1322 btrfs_release_delayed_node(delayed_node);
1323 }


  87 static struct btrfs_delayed_node *btrfs_get_delayed_node(struct 
inode *inode)

  88 {
  89 struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
  90 struct btrfs_root *root = btrfs_inode->root;
  91 u64 ino = btrfs_ino(inode);
  92 struct btrfs_delayed_node *node;
  93
  94 node = ACCESS_ONCE(btrfs_inode->delayed_node);
  95 if (node) {
  96 atomic_inc(>refs);
  97 return node;
  98 }


Thanks,

Sunny


PS:



panic informations

PID: 73638  TASK: 887deb586200  CPU: 38  COMMAND: "dockerd"
 #0 [88130404f940] machine_kexec at 8105ec10
 #1 [88130404f9b0] crash_kexec at 811145b8
 #2 [88130404fa80] oops_end at 8101a868
 #3 [88130404fab0] no_context at 8106ea91
 #4 [88130404fb00] __bad_area_nosemaphore at 8106ec8d
 #5 [88130404fb50] bad_area_nosemaphore at 8106eda3
 #6 [88130404fb60] __do_page_fault at 8106f328
 #7 [88130404fbd0] do_page_fault at 8106f637
 #8 [88130404fc10] page_fault at 816f6308
    [exception RIP: kmem_cache_alloc+121]
    RIP: 811ef019  RSP: 88130404fcc8  RFLAGS: 00010286
    RAX:   RBX:   RCX: 01c32b76
    RDX: 01c32b75  RSI:   RDI: 000224b0
    RBP: 88130404fd08   R8: 887e7d7a24b0   R9: 
    R10: 8802668b6618  R11: 0002  R12: 887e3e230a00
    R13: 2026  R14: 887e3e230a00  R15: a01abf49
    ORIG_RAX:   CS: 0010  SS: 0018
 #9 [88130404fd10] btrfs_get_or_create_delayed_node at 
a01abf49 [btrfs]
#10 [88130404fd60] btrfs_delayed_update_inode at a01aea12 
[btrfs]

#11 [88130404fdb0] btrfs_update_inode at a015b199 [btrfs]
#12 [88130404fdf0] btrfs_dirty_inode at a015cd11 [btrfs]
#13 [88130404fe20] btrfs_update_time at a015fa25 [btrfs]
#14 [88130404fe50] touch_atime at 812286d3
#15 [88130404fe90] iterate_dir at 81221929
#16 [88130404fee0] sys_getdents64 at 81221a19
#17 [88130404ff50] system_call_fastpath at 816f2594
    RIP: 006b68e4  RSP: 00c866259080  RFLAGS: 0246
    RAX: ffda  RBX: 00c828dbbe00  RCX: 006b68e4
    RDX: 1000  RSI: 00c83da14000  RDI: 0011
    RBP:    R8:    R9: 
    R10:   R11: 0246  R12: 00c7
    R13: 02174e74  R14: 0555  R15: 0038
    ORIG_RAX: 00d9  CS: 0033  SS: 002b


We also find the list double add informations, including n_list and p_list:

[8642921.110568] [ cut here ]
[8642921.167929] WARNING: CPU: 38 PID: 73638 at lib/list_debug.c:33 
__list_add+0xbe/0xd0()
[8642921.263780] list_add corruption. prev->next should be next 
(887e40fa5368), but was ff:ff884c85a36288. (prev=884c85a36288).
[8642921.405490] Modules linked in: ipt_MASQUERADE 
nf_nat_masquerade_ipv4 xt_conntrack iptable_filter arc4 ecb ppp_mppe 
ppp_async crc_ccitt ppp_generic slhc nfsv3 nfs_acl rpcsec_gss_krb5 
auth_rpcgss nfsv4 nfs fscache lockd sunrpc grace veth xt_nat xt_addrtype 
br_netfilter bridge tcp_diag inet_diag oracleacfs(POE) oracleadvm(POE) 
oracleoks(POE) oracleasm autofs4 dm_queue_length cpufreq_powersave 
be2iscsi iscsi_boot_sysfs 

Re: can anyone receive this message

2018-09-17 Thread sunny.s.zhang

thank you.


在 2018年09月18日 08:18, Hans van Kranenburg 写道:

On 09/18/2018 02:13 AM, sunny.s.zhang wrote:

Can anyone receive this message? I found i can receive message from the
mail list, but can't receive message that send by myself.

Yes.

You can also check public list archives to see if your message shows up,
e.g.:

https://urldefense.proofpoint.com/v2/url?u=https-3A__www.mail-2Darchive.com_linux-2Dbtrfs-40vger.kernel.org_msg80664.html=DwICaQ=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE=mcYQsljqnoxPHJVaWVFtwsEEDhXdP3ULRlrPW_9etWQ=b5XwlhQeXUvOaYTNqcoiJlDApD6scT_yCUR6HJKQ41k=1QOPjsg35Py77mekTtIcJZGajQsdc3-_xayvhlUUic4=

If it does, then the message reached the list, but your own incoming
mail server might be throwing away email with your own address in the
From, but forwarded via a mailing list?





Re: can anyone receive this message

2018-09-17 Thread Hans van Kranenburg
On 09/18/2018 02:13 AM, sunny.s.zhang wrote:
> 
> Can anyone receive this message? I found i can receive message from the
> mail list, but can't receive message that send by myself.

Yes.

You can also check public list archives to see if your message shows up,
e.g.:

https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg80664.html

If it does, then the message reached the list, but your own incoming
mail server might be throwing away email with your own address in the
From, but forwarded via a mailing list?

-- 
Hans van Kranenburg


can anyone receive this message

2018-09-17 Thread sunny.s.zhang

Hi All,

Can anyone receive this message? I found i can receive message from the 
mail list, but can't receive message that send by myself.


Thanks,

Sunny



Re: [PATCH 1/4 v2] btrfs: tests: add separate stub for find_lock_delalloc_range

2018-09-17 Thread Omar Sandoval
On Fri, Sep 14, 2018 at 06:38:44PM +0200, David Sterba wrote:
> The helper find_lock_delalloc_range is now conditionally built static,
> dpending on whether the self-tests are enabled or not. There's a macro
> that is supposed to hide the export, used only once. To discourage
> further use, drop it an add a public wrapper for the helper needed by
> tests.

Reviewed-by: Omar Sandoval 

> Signed-off-by: David Sterba 
> ---
> 
> v2:
> - add noinline_for_stack back
>  fs/btrfs/ctree.h |  6 --
>  fs/btrfs/extent_io.c | 13 -
>  fs/btrfs/extent_io.h |  2 +-
>  fs/btrfs/tests/extent-io-tests.c | 10 +-
>  4 files changed, 18 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 2cddfe7806a4..45b7029d0f23 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -41,12 +41,6 @@ extern struct kmem_cache *btrfs_path_cachep;
>  extern struct kmem_cache *btrfs_free_space_cachep;
>  struct btrfs_ordered_sum;
>  
> -#ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
> -#define STATIC noinline
> -#else
> -#define STATIC static noinline
> -#endif
> -
>  #define BTRFS_MAGIC 0x4D5F53665248425FULL /* ascii _BHRfS_M, no null */
>  
>  #define BTRFS_MAX_MIRRORS 3
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 4dd6faab02bb..93108b18b231 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -1568,7 +1568,7 @@ static noinline int lock_delalloc_pages(struct inode 
> *inode,
>   *
>   * 1 is returned if we find something, 0 if nothing was in the tree
>   */
> -STATIC u64 find_lock_delalloc_range(struct inode *inode,
> +static noinline_for_stack u64 find_lock_delalloc_range(struct inode *inode,
>   struct extent_io_tree *tree,
>   struct page *locked_page, u64 *start,
>   u64 *end, u64 max_bytes)
> @@ -1648,6 +1648,17 @@ STATIC u64 find_lock_delalloc_range(struct inode 
> *inode,
>   return found;
>  }
>  
> +#ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
> +u64 btrfs_find_lock_delalloc_range(struct inode *inode,
> + struct extent_io_tree *tree,
> + struct page *locked_page, u64 *start,
> + u64 *end, u64 max_bytes)
> +{
> + return find_lock_delalloc_range(inode, tree, locked_page, start, end,
> + max_bytes);
> +}
> +#endif
> +
>  static int __process_pages_contig(struct address_space *mapping,
> struct page *locked_page,
> pgoff_t start_index, pgoff_t end_index,
> diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
> index b4d03e677e1d..1a7fdcbca49b 100644
> --- a/fs/btrfs/extent_io.h
> +++ b/fs/btrfs/extent_io.h
> @@ -546,7 +546,7 @@ int free_io_failure(struct extent_io_tree *failure_tree,
>   struct extent_io_tree *io_tree,
>   struct io_failure_record *rec);
>  #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
> -noinline u64 find_lock_delalloc_range(struct inode *inode,
> +u64 btrfs_find_lock_delalloc_range(struct inode *inode,
> struct extent_io_tree *tree,
> struct page *locked_page, u64 *start,
> u64 *end, u64 max_bytes);
> diff --git a/fs/btrfs/tests/extent-io-tests.c 
> b/fs/btrfs/tests/extent-io-tests.c
> index d9269a531a4d..9e0f4a01be14 100644
> --- a/fs/btrfs/tests/extent-io-tests.c
> +++ b/fs/btrfs/tests/extent-io-tests.c
> @@ -106,7 +106,7 @@ static int test_find_delalloc(u32 sectorsize)
>   set_extent_delalloc(, 0, sectorsize - 1, 0, NULL);
>   start = 0;
>   end = 0;
> - found = find_lock_delalloc_range(inode, , locked_page, ,
> + found = btrfs_find_lock_delalloc_range(inode, , locked_page, ,
>, max_bytes);
>   if (!found) {
>   test_err("should have found at least one delalloc");
> @@ -137,7 +137,7 @@ static int test_find_delalloc(u32 sectorsize)
>   set_extent_delalloc(, sectorsize, max_bytes - 1, 0, NULL);
>   start = test_start;
>   end = 0;
> - found = find_lock_delalloc_range(inode, , locked_page, ,
> + found = btrfs_find_lock_delalloc_range(inode, , locked_page, ,
>, max_bytes);
>   if (!found) {
>   test_err("couldn't find delalloc in our range");
> @@ -171,7 +171,7 @@ static int test_find_delalloc(u32 sectorsize)
>   }
>   start = test_start;
>   end = 0;
> - found = find_lock_delalloc_range(inode, , locked_page, ,
> + found = btrfs_find_lock_delalloc_range(inode, , locked_page, ,
>, max_bytes);
>   if (found) {
>   test_err("found range when we shouldn't have");
> @@ -192,7 +192,7 @@ static int test_find_delalloc(u32 sectorsize)
>   

Re: [PATCH 4/4 v2] btrfs: tests: polish ifdefs around testing helper

2018-09-17 Thread Omar Sandoval
On Fri, Sep 14, 2018 at 06:42:03PM +0200, David Sterba wrote:
> Avoid the inline ifdefs and use two sections for self-tests enabled and
> disabled.
> 
> Though there could be no ifdef and unconditional test_bit of
> BTRFS_FS_STATE_DUMMY_FS_INFO, the static inline can help to optimize out
> any code that would depend on conditions using btrfs_is_testing.
> 
> As this is only for the testing code, drop unlikely().

Reviewed-by: Omar Sandoval 

> Signed-off-by: David Sterba 
> ---
> 
> v2:
> - remove unlikely
> - simplify to: return test_bit(...)
> 
>  fs/btrfs/ctree.h | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 32d2fce4ac53..1656ada9200b 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -3708,17 +3708,17 @@ static inline int btrfs_defrag_cancelled(struct 
> btrfs_fs_info *fs_info)
>  #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
>  void btrfs_test_inode_set_ops(struct inode *inode);
>  void btrfs_test_destroy_inode(struct inode *inode);
> -#endif
>  
>  static inline int btrfs_is_testing(struct btrfs_fs_info *fs_info)
>  {
> -#ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
> - if (unlikely(test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO,
> -   _info->fs_state)))
> - return 1;
> -#endif
> + return test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, _info->fs_state);
> +}
> +#else
> +static inline int btrfs_is_testing(struct btrfs_fs_info *fs_info)
> +{
>   return 0;
>  }
> +#endif
>  
>  static inline void cond_wake_up(struct wait_queue_head *wq)
>  {
> -- 
> 2.18.0
> 


Re: [PATCH v3 6/7] btrfs-progs: lowmem: optimization and repair for check_inode_extref()

2018-09-17 Thread Nikolay Borisov



On 17.09.2018 10:28, Su Yue wrote:
> inode_extref is much similar with inode_ref, most codes are reused in
> check_inode_extref().
> Exceptions:
> There is no need to check root directory, so remove it.
> Make check_inode_extref() verify hash value with key offset now.
> 
> And lowmem check can detect errors about inode_extref and try to
> repair errors.
> 
> Signed-off-by: Su Yue 
> ---
>  check/mode-lowmem.c | 125 +++-
>  1 file changed, 99 insertions(+), 26 deletions(-)
> 
> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
> index 612e5e28e45b..53e4fdccd740 100644
> --- a/check/mode-lowmem.c
> +++ b/check/mode-lowmem.c
> @@ -1182,37 +1182,86 @@ out:
>   *
>   * Return 0 if no error occurred.
>   */
> -static int check_inode_extref(struct btrfs_root *root,
> -   struct btrfs_key *ref_key,
> -   struct extent_buffer *node, int slot, u64 *refs,
> -   int mode)
> +static int check_inode_extref(struct btrfs_root *root, struct btrfs_key 
> *ref_key,
> +   struct btrfs_path *path, char *name_ret,
> +   u32 *namelen_ret, u64 *refs_ret, int mode)
>  {
>   struct btrfs_key key;
>   struct btrfs_key location;
>   struct btrfs_inode_extref *extref;
> + struct extent_buffer *node;
>   char namebuf[BTRFS_NAME_LEN] = {0};
>   u32 total;
> - u32 cur = 0;
> + u32 cur;
>   u32 len;
>   u32 name_len;
>   u64 index;
>   u64 parent;
> + int err;
> + int tmp_err;
>   int ret;
> - int err = 0;
> + int slot;
> + u64 refs;
> + bool search_again = false;
>  
>   location.objectid = ref_key->objectid;
>   location.type = BTRFS_INODE_ITEM_KEY;
>   location.offset = 0;
> +begin:

Please don't implement loops with labels, instead use the appropriate
loop construct. Otherwise just factor out the repetitive code in a
function and call that in a loop. Labels should ideally be used only for
error cases to terminate the current function.

> + cur = 0;
> + err = 0;
> + refs = *refs_ret;
> + *namelen_ret = 0;
> +
> + /* since after repair, path and the dir item may be changed */
> + if (search_again) {
> + search_again = false;
> + btrfs_release_path(path);
> + ret = btrfs_search_slot(NULL, root, ref_key, path, 0, 0);
> + /*
> +  * The item was deleted, let the path point to the last checked
> +  * item.
> +  */
> + if (ret > 0) {
> + if (path->slots[0] == 0) {
> + ret = btrfs_prev_leaf(root, path);
> + /*
> +  * we are checking the inode item, there must
> +  * be some items before, the case shall never
> +  * happen.
> +  */
> + if (ret > 0)
> + ret = -EIO;
> + if (ret)
> + error(
> + "failed to get item before INODE_REF[%llu, %llu] root %llu: %s",
> +   ref_key->objectid, 
> ref_key->offset,
> +   root->objectid, strerror(-ret));
> + } else {
> + path->slots[0]--;
> + ret = 0;
> + }
> + goto out;
> + } else if (ret < 0) {
> + err |= FATAL_ERROR;
> + goto out;
> + }
> + }
> +
> + node = path->nodes[0];
> + slot = path->slots[0];
>  
>   extref = btrfs_item_ptr(node, slot, struct btrfs_inode_extref);
>   total = btrfs_item_size_nr(node, slot);
>  
> -next:
> - /* update inode ref count */
> - (*refs)++;
> - name_len = btrfs_inode_extref_name_len(node, extref);
> - index = btrfs_inode_extref_index(node, extref);
> +loop:
> + /* Update inode ref count */
> + refs++;
> + tmp_err = 0;
>   parent = btrfs_inode_extref_parent(node, extref);
> + index = btrfs_inode_extref_index(node, extref);
> + name_len = btrfs_inode_extref_name_len(node, extref);
> +
>   if (name_len <= BTRFS_NAME_LEN) {
>   len = name_len;
>   } else {
> @@ -1220,37 +1269,61 @@ next:
>   warning("root %llu INODE_EXTREF[%llu %llu] name too long",
>   root->objectid, ref_key->objectid, ref_key->offset);
>   }
> +
>   read_extent_buffer(node, namebuf, (unsigned long)(extref + 1), len);
>  
> - /* Check root dir ref name */
> - if (index == 0 && strncmp(namebuf, "..", name_len)) {
> - error("root %llu INODE_EXTREF[%llu %llu] ROOT_DIR name 
> shouldn't be %s",
> + /* verify hash value */
> +  

Re: [PATCH v3 5/7] btrfs-progs: lowmem: do missing check of last item after check_inode_item()

2018-09-17 Thread Su Yue




On 2018/9/17 8:53 PM, Qu Wenruo wrote:



On 2018/9/17 下午3:28, Su Yue wrote:

After call of check_inode_item(), path may point to the last unchecked
slot of the leaf. The outer walk_up_tree() always treats the position
as checked slot then skips to the next. The last item will never be
checked.

While checking backrefs, path passed to walk_up_tree() always
points to a checked slot.
While checking fs trees, path passed to walk_up_tree() always
points to an unchecked slot.


Can we unify this behavior?
I has considered in three ways:

1) Change logical of the process_one_leaf. After it returns, path
points to the next slot checked.
To unify it, we can use saved key but will cost one search_slot time 
during serval nodes(>=1). Or call btrfs_previous_item() after every time 
check_inode_items() returns.


But why? why should we cost some time to swing the path. So I abandoned 1).

2) Change logical of the check_leaf_items(). What does the function
is just traverse all items then returns, which seems quite natural.
So I abandoned it.


3) It's what the patch does. The extra argument may seems strange,
I preferred to this way.

Maybe we can do something after check_leaf_items() returns, is it 
acceptable? I have no idea.


Thanks,
Su


E.g, always points to an unchecked slot.

It would make things easier and no need for the extra parameter.

Thanks,
Qu



Solution:
Add an argument @is_checked to walk_up_tree() to decide whether
to skip current slot.

Fixes: 5e2dc770471b ("btrfs-progs: check: skip shared node or leaf check for 
low_memory mode")
Signed-off-by: Su Yue 
---
  check/mode-lowmem.c | 37 +
  1 file changed, 33 insertions(+), 4 deletions(-)

diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
index db44456fd85b..612e5e28e45b 100644
--- a/check/mode-lowmem.c
+++ b/check/mode-lowmem.c
@@ -4597,22 +4597,38 @@ static int walk_down_tree(struct btrfs_root *root, 
struct btrfs_path *path,
return err;
  }
  
+/*

+ * Walk up throuh the path. Make path point to next slot to be checked.
+ * walk_down_tree() should be called after this function.
+ *
+ * @root:  root of the tree
+ * @path:  will point to next slot to check for walk_down_tree()
+ * @level: returns with level of next node to be checked
+ * @is_checked:means is the current node checked or not
+ * if false, the slot is unchecked, do not increase the slot
+ * if true, means increase the slot of the current node
+ *
+ * Returns 0 means success.
+ * Returns >0 means the whole loop of walk up/down should be broken.
+ */
  static int walk_up_tree(struct btrfs_root *root, struct btrfs_path *path,
-   int *level)
+   int *level, bool is_checked)
  {
int i;
struct extent_buffer *leaf;
+   int skip_cur =s_checked ? 1 : 0;
  
  	for (i =level; i < BTRFS_MAX_LEVEL - 1 && path->nodes[i]; i++) {

leaf =ath->nodes[i];
-   if (path->slots[i] + 1 < btrfs_header_nritems(leaf)) {
-   path->slots[i]++;
+   if (path->slots[i] + skip_cur < btrfs_header_nritems(leaf)) {
+   path->slots[i] +=kip_cur;
*level =;
return 0;
}
free_extent_buffer(path->nodes[*level]);
path->nodes[*level] =ULL;
*level = + 1;
+   skip_cur =;
}
return 1;
  }
@@ -4815,7 +4831,20 @@ static int check_btrfs_root(struct btrfs_root *root, int 
check_all)
break;
}
  
-		ret =alk_up_tree(root, , );

+   /*
+* The logical of walk trees are shared between backrefs
+* check and fs trees check.
+* In checking backrefs(check_all is true), after
+* check_leaf_items() returns, path points to
+* last checked item.
+* In checking fs roots(check_all is false), after
+* process_one_leaf() returns, path points to unchecked item.
+*
+* walk_up_tree() is reponsible to make path point to next
+* slot to be checked, above case is handled in
+* walk_up_tree().
+*/
+   ret =alk_up_tree(root, , , check_all);
if (ret !=) {
/* Normal exit, reset ret to err */
ret =rr;



Re: [PATCH v3 4/7] btrfs-progs: lowmem: search key of root again after check_fs_root() after repair

2018-09-17 Thread Nikolay Borisov



On 17.09.2018 15:51, Qu Wenruo wrote:
> 
> 
> On 2018/9/17 下午3:28, Su Yue wrote:
>> In check_fs_roots_lowmem(), we do search and follow the resulted path
>> to call check_fs_root(), then call btrfs_next_item() to check next
>> root.
>> However, if repair is enabled, the root tree can be cowed, the
>> existed path can cause strange errors.
>>
>> Solution:
>>   If repair, save the key before calling check_fs_root,
>>   search the saved key again before checking next root.
>>
>> Signed-off-by: Su Yue 
> 
> The idea is pretty valid, however I'm not pretty sure about one practice
> below.
> 
>> ---
>>  check/mode-lowmem.c | 22 ++
>>  1 file changed, 22 insertions(+)
>>
>> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
>> index 4db12cc7f9fe..db44456fd85b 100644
>> --- a/check/mode-lowmem.c
>> +++ b/check/mode-lowmem.c
>> @@ -4967,9 +4967,13 @@ int check_fs_roots_lowmem(struct btrfs_fs_info 
>> *fs_info)
>>  }
>>  
>>  while (1) {
>> +struct btrfs_key saved_key;
>> +
>>  node = path.nodes[0];
>>  slot = path.slots[0];
>>  btrfs_item_key_to_cpu(node, , slot);
>> +if (repair)
>> +saved_key = key;
> 
> Is this OK? Shouldn't it be something like memcpy(_key, ,
> sizeof(key));
> 
> Or some new C standard make it work?
It's not new, basically assigning one struct to another i.e a value
assign, pretty much results in memory copy. One has to be careful when
the structs contains pointer member because in this case both structures
will point to the same memory.

In this particular case since btrfs_key consists of primitive types
doing a direct assign is fine.
> 
> Although compiler doesn't give any warning on this.
> 
> Thanks,
> Qu
> 
>>  if (key.objectid > BTRFS_LAST_FREE_OBJECTID)
>>  goto out;
>>  if (key.type == BTRFS_ROOT_ITEM_KEY &&
>> @@ -5000,6 +5004,24 @@ int check_fs_roots_lowmem(struct btrfs_fs_info 
>> *fs_info)
>>  err |= ret;
>>  }
>>  next:
>> +/*
>> + * Since root tree can be cowed during repair,
>> + * here search the saved key again.
>> + */
>> +if (repair) {
>> +btrfs_release_path();
>> +ret = btrfs_search_slot(NULL, fs_info->tree_root,
>> +_key, , 0, 0);
>> +/* Repair never deletes trees, search must succeed. */
>> +if (ret > 0)
>> +ret = -ENOENT;
>> +if (ret) {
>> +error(
>> +"search root key[%llu %u %llu] failed after repair: %s",
>> +  saved_key.objectid, saved_key.type,
>> +  saved_key.offset, strerror(-ret));
>> +}
>> +}
>>  ret = btrfs_next_item(tree_root, );
>>  if (ret > 0)
>>  goto out;
>>
> 


Re: [PATCH v3 4/7] btrfs-progs: lowmem: search key of root again after check_fs_root() after repair

2018-09-17 Thread Qu Wenruo



On 2018/9/17 下午8:55, Su Yue wrote:
> 
> 
> On 2018/9/17 8:51 PM, Qu Wenruo wrote:
>>
>>
>> On 2018/9/17 下午3:28, Su Yue wrote:
>>> In check_fs_roots_lowmem(), we do search and follow the resulted path
>>> to call check_fs_root(), then call btrfs_next_item() to check next
>>> root.
>>> However, if repair is enabled, the root tree can be cowed, the
>>> existed path can cause strange errors.
>>>
>>> Solution:
>>>    If repair, save the key before calling check_fs_root,
>>>    search the saved key again before checking next root.
>>>
>>> Signed-off-by: Su Yue 
>>
>> The idea is pretty valid, however I'm not pretty sure about one practice
>> below.
>>
>>> ---
>>>   check/mode-lowmem.c | 22 ++
>>>   1 file changed, 22 insertions(+)
>>>
>>> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
>>> index 4db12cc7f9fe..db44456fd85b 100644
>>> --- a/check/mode-lowmem.c
>>> +++ b/check/mode-lowmem.c
>>> @@ -4967,9 +4967,13 @@ int check_fs_roots_lowmem(struct btrfs_fs_info
>>> *fs_info)
>>>   }
>>>     while (1) {
>>> +    struct btrfs_key saved_key;
>>> +
>>>   node =ath.nodes[0];
>>>   slot =ath.slots[0];
>>>   btrfs_item_key_to_cpu(node, , slot);
>>> +    if (repair)
>>> +    saved_key =ey;
>>
>> Is this OK? Shouldn't it be something like memcpy(_key, ,
>> sizeof(key));
>>
>> Or some new C standard make it work?
>>
> I don't quite know how C89 standard defines this behavior.
> It should work in C99...
> 
> Seems you are not familiar with this style, will use memcpy.

I just want to make sure what's the preferred way, I'm not saying it's
wrong, just wondering if it's OK for btrfs-progs coding style.
(As it indeed saves some chars)

It could completely turn out that I'm just too stubborn to learn new tricks.

Maybe David could answer this better?

Thanks,
Qu

> 
> Thanks,
> Su
>> Although compiler doesn't give any warning on this.
>>
>> Thanks,
>> Qu
>>
>>>   if (key.objectid > BTRFS_LAST_FREE_OBJECTID)
>>>   goto out;
>>>   if (key.type =BTRFS_ROOT_ITEM_KEY &&
>>> @@ -5000,6 +5004,24 @@ int check_fs_roots_lowmem(struct btrfs_fs_info
>>> *fs_info)
>>>   err |=et;
>>>   }
>>>   next:
>>> +    /*
>>> + * Since root tree can be cowed during repair,
>>> + * here search the saved key again.
>>> + */
>>> +    if (repair) {
>>> +    btrfs_release_path();
>>> +    ret =trfs_search_slot(NULL, fs_info->tree_root,
>>> +    _key, , 0, 0);
>>> +    /* Repair never deletes trees, search must succeed. */
>>> +    if (ret > 0)
>>> +    ret =ENOENT;
>>> +    if (ret) {
>>> +    error(
>>> +    "search root key[%llu %u %llu] failed after repair: %s",
>>> +  saved_key.objectid, saved_key.type,
>>> +  saved_key.offset, strerror(-ret));
>>> +    }
>>> +    }
>>>   ret =trfs_next_item(tree_root, );
>>>   if (ret > 0)
>>>   goto out;
>>>


Re: [PATCH v3 6/7] btrfs-progs: lowmem: optimization and repair for check_inode_extref()

2018-09-17 Thread Qu Wenruo



On 2018/9/17 下午3:28, Su Yue wrote:
> inode_extref is much similar with inode_ref, most codes are reused in
> check_inode_extref().
> Exceptions:
> There is no need to check root directory, so remove it.

As root directory should only have one INODE_REF "..", so it shouldn't
have any extref at all.

This part is pretty valid.

> Make check_inode_extref() verify hash value with key offset now.
> 
> And lowmem check can detect errors about inode_extref and try to
> repair errors.
> 
> Signed-off-by: Su Yue 

Looks OK to me.

Reviewed-by: Qu Wenruo 

Thanks,
Qu

> ---
>  check/mode-lowmem.c | 125 +++-
>  1 file changed, 99 insertions(+), 26 deletions(-)
> 
> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
> index 612e5e28e45b..53e4fdccd740 100644
> --- a/check/mode-lowmem.c
> +++ b/check/mode-lowmem.c
> @@ -1182,37 +1182,86 @@ out:
>   *
>   * Return 0 if no error occurred.
>   */
> -static int check_inode_extref(struct btrfs_root *root,
> -   struct btrfs_key *ref_key,
> -   struct extent_buffer *node, int slot, u64 *refs,
> -   int mode)
> +static int check_inode_extref(struct btrfs_root *root, struct btrfs_key 
> *ref_key,
> +   struct btrfs_path *path, char *name_ret,
> +   u32 *namelen_ret, u64 *refs_ret, int mode)
>  {
>   struct btrfs_key key;
>   struct btrfs_key location;
>   struct btrfs_inode_extref *extref;
> + struct extent_buffer *node;
>   char namebuf[BTRFS_NAME_LEN] = {0};
>   u32 total;
> - u32 cur = 0;
> + u32 cur;
>   u32 len;
>   u32 name_len;
>   u64 index;
>   u64 parent;
> + int err;
> + int tmp_err;
>   int ret;
> - int err = 0;
> + int slot;
> + u64 refs;
> + bool search_again = false;
>  
>   location.objectid = ref_key->objectid;
>   location.type = BTRFS_INODE_ITEM_KEY;
>   location.offset = 0;
> +begin:
> + cur = 0;
> + err = 0;
> + refs = *refs_ret;
> + *namelen_ret = 0;
> +
> + /* since after repair, path and the dir item may be changed */
> + if (search_again) {
> + search_again = false;
> + btrfs_release_path(path);
> + ret = btrfs_search_slot(NULL, root, ref_key, path, 0, 0);
> + /*
> +  * The item was deleted, let the path point to the last checked
> +  * item.
> +  */
> + if (ret > 0) {
> + if (path->slots[0] == 0) {
> + ret = btrfs_prev_leaf(root, path);
> + /*
> +  * we are checking the inode item, there must
> +  * be some items before, the case shall never
> +  * happen.
> +  */
> + if (ret > 0)
> + ret = -EIO;
> + if (ret)
> + error(
> + "failed to get item before INODE_REF[%llu, %llu] root %llu: %s",
> +   ref_key->objectid, 
> ref_key->offset,
> +   root->objectid, strerror(-ret));
> + } else {
> + path->slots[0]--;
> + ret = 0;
> + }
> + goto out;
> + } else if (ret < 0) {
> + err |= FATAL_ERROR;
> + goto out;
> + }
> + }
> +
> + node = path->nodes[0];
> + slot = path->slots[0];
>  
>   extref = btrfs_item_ptr(node, slot, struct btrfs_inode_extref);
>   total = btrfs_item_size_nr(node, slot);
>  
> -next:
> - /* update inode ref count */
> - (*refs)++;
> - name_len = btrfs_inode_extref_name_len(node, extref);
> - index = btrfs_inode_extref_index(node, extref);
> +loop:
> + /* Update inode ref count */
> + refs++;
> + tmp_err = 0;
>   parent = btrfs_inode_extref_parent(node, extref);
> + index = btrfs_inode_extref_index(node, extref);
> + name_len = btrfs_inode_extref_name_len(node, extref);
> +
>   if (name_len <= BTRFS_NAME_LEN) {
>   len = name_len;
>   } else {
> @@ -1220,37 +1269,61 @@ next:
>   warning("root %llu INODE_EXTREF[%llu %llu] name too long",
>   root->objectid, ref_key->objectid, ref_key->offset);
>   }
> +
>   read_extent_buffer(node, namebuf, (unsigned long)(extref + 1), len);
>  
> - /* Check root dir ref name */
> - if (index == 0 && strncmp(namebuf, "..", name_len)) {
> - error("root %llu INODE_EXTREF[%llu %llu] ROOT_DIR name 
> shouldn't be %s",
> + /* verify hash value */
> + if (ref_key->offset != btrfs_extref_hash(parent, namebuf, len)) {
> +   

Re: btrfs problems

2018-09-17 Thread Stefan K
> If your primary concern is to make the fs as stable as possible, then
> keep snapshots to a minimal amount, avoid any functionality you won't
> use, like qgroup, routinely balance, RAID5/6.
> 
> And keep the necessary btrfs specific operations to minimal, like
> subvolume/snapshot (and don't keep too many snapshots, say over 20),
> shrink, send/receive.

hehe, that sound like "hey use btrfs, its cool, but please - don't use any 
btrfs specific feature" ;)

best
Stefan


Re: [PATCH v3 4/7] btrfs-progs: lowmem: search key of root again after check_fs_root() after repair

2018-09-17 Thread Su Yue




On 2018/9/17 8:51 PM, Qu Wenruo wrote:



On 2018/9/17 下午3:28, Su Yue wrote:

In check_fs_roots_lowmem(), we do search and follow the resulted path
to call check_fs_root(), then call btrfs_next_item() to check next
root.
However, if repair is enabled, the root tree can be cowed, the
existed path can cause strange errors.

Solution:
   If repair, save the key before calling check_fs_root,
   search the saved key again before checking next root.

Signed-off-by: Su Yue 


The idea is pretty valid, however I'm not pretty sure about one practice
below.


---
  check/mode-lowmem.c | 22 ++
  1 file changed, 22 insertions(+)

diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
index 4db12cc7f9fe..db44456fd85b 100644
--- a/check/mode-lowmem.c
+++ b/check/mode-lowmem.c
@@ -4967,9 +4967,13 @@ int check_fs_roots_lowmem(struct btrfs_fs_info *fs_info)
}
  
  	while (1) {

+   struct btrfs_key saved_key;
+
node = path.nodes[0];
slot = path.slots[0];
btrfs_item_key_to_cpu(node, , slot);
+   if (repair)
+   saved_key = key;


Is this OK? Shouldn't it be something like memcpy(_key, ,
sizeof(key));

Or some new C standard make it work?


I don't quite know how C89 standard defines this behavior.
It should work in C99...

Seems you are not familiar with this style, will use memcpy.

Thanks,
Su

Although compiler doesn't give any warning on this.

Thanks,
Qu


if (key.objectid > BTRFS_LAST_FREE_OBJECTID)
goto out;
if (key.type == BTRFS_ROOT_ITEM_KEY &&
@@ -5000,6 +5004,24 @@ int check_fs_roots_lowmem(struct btrfs_fs_info *fs_info)
err |= ret;
}
  next:
+   /*
+* Since root tree can be cowed during repair,
+* here search the saved key again.
+*/
+   if (repair) {
+   btrfs_release_path();
+   ret = btrfs_search_slot(NULL, fs_info->tree_root,
+   _key, , 0, 0);
+   /* Repair never deletes trees, search must succeed. */
+   if (ret > 0)
+   ret = -ENOENT;
+   if (ret) {
+   error(
+   "search root key[%llu %u %llu] failed after repair: %s",
+ saved_key.objectid, saved_key.type,
+ saved_key.offset, strerror(-ret));
+   }
+   }
ret = btrfs_next_item(tree_root, );
if (ret > 0)
goto out;



Re: [PATCH v3 5/7] btrfs-progs: lowmem: do missing check of last item after check_inode_item()

2018-09-17 Thread Qu Wenruo



On 2018/9/17 下午3:28, Su Yue wrote:
> After call of check_inode_item(), path may point to the last unchecked
> slot of the leaf. The outer walk_up_tree() always treats the position
> as checked slot then skips to the next. The last item will never be
> checked.
> 
> While checking backrefs, path passed to walk_up_tree() always
> points to a checked slot.
> While checking fs trees, path passed to walk_up_tree() always
> points to an unchecked slot.

Can we unify this behavior?

E.g, always points to an unchecked slot.

It would make things easier and no need for the extra parameter.

Thanks,
Qu

> 
> Solution:
> Add an argument @is_checked to walk_up_tree() to decide whether
> to skip current slot.
> 
> Fixes: 5e2dc770471b ("btrfs-progs: check: skip shared node or leaf check for 
> low_memory mode")
> Signed-off-by: Su Yue 
> ---
>  check/mode-lowmem.c | 37 +
>  1 file changed, 33 insertions(+), 4 deletions(-)
> 
> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
> index db44456fd85b..612e5e28e45b 100644
> --- a/check/mode-lowmem.c
> +++ b/check/mode-lowmem.c
> @@ -4597,22 +4597,38 @@ static int walk_down_tree(struct btrfs_root *root, 
> struct btrfs_path *path,
>   return err;
>  }
>  
> +/*
> + * Walk up throuh the path. Make path point to next slot to be checked.
> + * walk_down_tree() should be called after this function.
> + *
> + * @root:root of the tree
> + * @path:will point to next slot to check for walk_down_tree()
> + * @level:   returns with level of next node to be checked
> + * @is_checked:  means is the current node checked or not
> + *   if false, the slot is unchecked, do not increase the slot
> + *   if true, means increase the slot of the current node
> + *
> + * Returns 0 means success.
> + * Returns >0 means the whole loop of walk up/down should be broken.
> + */
>  static int walk_up_tree(struct btrfs_root *root, struct btrfs_path *path,
> - int *level)
> + int *level, bool is_checked)
>  {
>   int i;
>   struct extent_buffer *leaf;
> + int skip_cur = is_checked ? 1 : 0;
>  
>   for (i = *level; i < BTRFS_MAX_LEVEL - 1 && path->nodes[i]; i++) {
>   leaf = path->nodes[i];
> - if (path->slots[i] + 1 < btrfs_header_nritems(leaf)) {
> - path->slots[i]++;
> + if (path->slots[i] + skip_cur < btrfs_header_nritems(leaf)) {
> + path->slots[i] += skip_cur;
>   *level = i;
>   return 0;
>   }
>   free_extent_buffer(path->nodes[*level]);
>   path->nodes[*level] = NULL;
>   *level = i + 1;
> + skip_cur = 1;
>   }
>   return 1;
>  }
> @@ -4815,7 +4831,20 @@ static int check_btrfs_root(struct btrfs_root *root, 
> int check_all)
>   break;
>   }
>  
> - ret = walk_up_tree(root, , );
> + /*
> +  * The logical of walk trees are shared between backrefs
> +  * check and fs trees check.
> +  * In checking backrefs(check_all is true), after
> +  * check_leaf_items() returns, path points to
> +  * last checked item.
> +  * In checking fs roots(check_all is false), after
> +  * process_one_leaf() returns, path points to unchecked item.
> +  *
> +  * walk_up_tree() is reponsible to make path point to next
> +  * slot to be checked, above case is handled in
> +  * walk_up_tree().
> +  */
> + ret = walk_up_tree(root, , , check_all);
>   if (ret != 0) {
>   /* Normal exit, reset ret to err */
>   ret = err;
> 


Re: [PATCH v3 4/7] btrfs-progs: lowmem: search key of root again after check_fs_root() after repair

2018-09-17 Thread Qu Wenruo



On 2018/9/17 下午3:28, Su Yue wrote:
> In check_fs_roots_lowmem(), we do search and follow the resulted path
> to call check_fs_root(), then call btrfs_next_item() to check next
> root.
> However, if repair is enabled, the root tree can be cowed, the
> existed path can cause strange errors.
> 
> Solution:
>   If repair, save the key before calling check_fs_root,
>   search the saved key again before checking next root.
> 
> Signed-off-by: Su Yue 

The idea is pretty valid, however I'm not pretty sure about one practice
below.

> ---
>  check/mode-lowmem.c | 22 ++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
> index 4db12cc7f9fe..db44456fd85b 100644
> --- a/check/mode-lowmem.c
> +++ b/check/mode-lowmem.c
> @@ -4967,9 +4967,13 @@ int check_fs_roots_lowmem(struct btrfs_fs_info 
> *fs_info)
>   }
>  
>   while (1) {
> + struct btrfs_key saved_key;
> +
>   node = path.nodes[0];
>   slot = path.slots[0];
>   btrfs_item_key_to_cpu(node, , slot);
> + if (repair)
> + saved_key = key;

Is this OK? Shouldn't it be something like memcpy(_key, ,
sizeof(key));

Or some new C standard make it work?

Although compiler doesn't give any warning on this.

Thanks,
Qu

>   if (key.objectid > BTRFS_LAST_FREE_OBJECTID)
>   goto out;
>   if (key.type == BTRFS_ROOT_ITEM_KEY &&
> @@ -5000,6 +5004,24 @@ int check_fs_roots_lowmem(struct btrfs_fs_info 
> *fs_info)
>   err |= ret;
>   }
>  next:
> + /*
> +  * Since root tree can be cowed during repair,
> +  * here search the saved key again.
> +  */
> + if (repair) {
> + btrfs_release_path();
> + ret = btrfs_search_slot(NULL, fs_info->tree_root,
> + _key, , 0, 0);
> + /* Repair never deletes trees, search must succeed. */
> + if (ret > 0)
> + ret = -ENOENT;
> + if (ret) {
> + error(
> + "search root key[%llu %u %llu] failed after repair: %s",
> +   saved_key.objectid, saved_key.type,
> +   saved_key.offset, strerror(-ret));
> + }
> + }
>   ret = btrfs_next_item(tree_root, );
>   if (ret > 0)
>   goto out;
> 


Re: [PATCH v3 3/7] btrfs-progs: lowmem check: find dir_item by di_key in check_dir_item()

2018-09-17 Thread Qu Wenruo



On 2018/9/17 下午3:28, Su Yue wrote:
> In check_dir_item, we are going to search corresponding
> dir_item/index.
> 
> Commit 564901eac7a4 ("btrfs-progs: check: introduce
> print_dir_item_err()") Changed argument name from key to di_key but
> forgot to change the key name for dir_item search.
> So @key shouldn't be used here. It should be @di_key.
> Change comment about parameters too.
> 
> To keep compactness, move declarations into while loop in
> check_dir_item().
> 
> Fixes: 564901eac7a4 ("btrfs-progs: check: introduce print_dir_item_err()")
> Signed-off-by: Su Yue 
> ---
>  check/mode-lowmem.c | 22 +++---
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
> index 1bce44f5658a..4db12cc7f9fe 100644
> --- a/check/mode-lowmem.c
> +++ b/check/mode-lowmem.c
> @@ -1529,7 +1529,7 @@ static void print_dir_item_err(struct btrfs_root *root, 
> struct btrfs_key *key,
>   * call find_inode_ref() to check related INODE_REF/INODE_EXTREF.
>   *
>   * @root:the root of the fs/file tree
> - * @key: the key of the INODE_REF/INODE_EXTREF
> + * @di_key:  the key of the dir_item/dir_index
>   * @path:   the path
>   * @size:the st_size of the INODE_ITEM
>   *
> @@ -1540,20 +1540,11 @@ static int check_dir_item(struct btrfs_root *root, 
> struct btrfs_key *di_key,
> struct btrfs_path *path, u64 *size)
>  {
>   struct btrfs_dir_item *di;
> - struct btrfs_inode_item *ii;
> - struct btrfs_key key;
> - struct btrfs_key location;
>   struct extent_buffer *node;
>   int slot;
>   char namebuf[BTRFS_NAME_LEN] = {0};
>   u32 total;
>   u32 cur = 0;
> - u32 len;
> - u32 name_len;
> - u32 data_len;
> - u8 filetype;
> - u32 mode = 0;
> - u64 index;
>   int ret;
>   int err;
>   int tmp_err;
> @@ -1588,6 +1579,15 @@ begin:
>   memset(namebuf, 0, sizeof(namebuf) / sizeof(*namebuf));
>  
>   while (cur < total) {
> + struct btrfs_inode_item *ii;
> + struct btrfs_key key;

If we have several keys, it's better to avoid generic name like @key.

In this case, the @key is only used for find_dir_item(), thus it could
be called @found_dir_key.

Thanks,
Qu

> + struct btrfs_key location;
> + u8 filetype;
> + u32 data_len;
> + u32 name_len;
> + u32 len;
> + u32 mode = 0;
> + u64 index;
>   /*
>* For DIR_ITEM set index to (u64)-1, so that find_inode_ref
>* ignore index check.
> @@ -1658,7 +1658,7 @@ begin:
>  
>   /* check relative INDEX/ITEM */
>   key.objectid = di_key->objectid;
> - if (key.type == BTRFS_DIR_ITEM_KEY) {
> + if (di_key->type == BTRFS_DIR_ITEM_KEY) {
>   key.type = BTRFS_DIR_INDEX_KEY;
>   key.offset = index;
>   } else {
> 


Re: btrfs problems

2018-09-17 Thread Qu Wenruo


On 2018/9/17 下午7:55, Adrian Bastholm wrote:
>> Well, I'd say Debian is really not your first choice for btrfs.
>> The kernel is really old for btrfs.
>>
>> My personal recommend is to use rolling release distribution like
>> vanilla Archlinux, whose kernel is already 4.18.7 now.
> 
> I just upgraded to Debian Testing which has the 4.18 kernel

Then I strongly recommend to use the latest upstream kernel and progs
for btrfs. (thus using Debian Testing)

And if anything went wrong, please report asap to the mail list.

Especially for fs corruption, that's the ghost I'm always chasing for.
So if any corruption happens again (although I hope it won't happen), I
may have a chance to catch it.

> 
>> Anyway, enjoy your stable fs even it's not btrfs anymore.
> 
> My new stable fs is too rigid. Can't grow it, can't shrink it, can't
> remove vdevs from it , so I'm planning a comeback to BTRFS. I guess
> after the dust settled I realize I like the flexibility of BTRFS.
> 
> 
>  This time I'm considering BTRFS as rootfs as well, can I do an
> in-place conversion ? There's this guide
> (https://www.howtoforge.com/how-to-convert-an-ext3-ext4-root-file-system-to-btrfs-on-ubuntu-12.10)
> I was planning on following.

Btrfs-convert is recommended mostly for short term trial (the ability to
rollback to ext* without anything modified)

From the code aspect, the biggest difference is the chunk layout.
Due to the ext* block group usage, each block group header (except some
sparse bg) is always used, thus btrfs can't use them.

This leads to highly fragmented chunk layout.
We doesn't have error report about such layout yet, but if you want
everything to be as stable as possible, I still recommend to use a newly
created fs.

> 
> Another thing is I'd like to see a "first steps after getting started
> " section in the wiki. Something like take your first snapshot, back
> up, how to think when running it - can i just set some cron jobs and
> forget about it, or does it need constant attention, and stuff like
> that.

There are projects do such things automatically, like snapper.

If your primary concern is to make the fs as stable as possible, then
keep snapshots to a minimal amount, avoid any functionality you won't
use, like qgroup, routinely balance, RAID5/6.

And keep the necessary btrfs specific operations to minimal, like
subvolume/snapshot (and don't keep too many snapshots, say over 20),
shrink, send/receive.

Thanks,
Qu

> 
> BR Adrian
> 
> 



signature.asc
Description: OpenPGP digital signature


Re: btrfs problems

2018-09-17 Thread Adrian Bastholm
> Well, I'd say Debian is really not your first choice for btrfs.
> The kernel is really old for btrfs.
>
> My personal recommend is to use rolling release distribution like
> vanilla Archlinux, whose kernel is already 4.18.7 now.

I just upgraded to Debian Testing which has the 4.18 kernel

> Anyway, enjoy your stable fs even it's not btrfs anymore.

My new stable fs is too rigid. Can't grow it, can't shrink it, can't
remove vdevs from it , so I'm planning a comeback to BTRFS. I guess
after the dust settled I realize I like the flexibility of BTRFS.


 This time I'm considering BTRFS as rootfs as well, can I do an
in-place conversion ? There's this guide
(https://www.howtoforge.com/how-to-convert-an-ext3-ext4-root-file-system-to-btrfs-on-ubuntu-12.10)
I was planning on following.

Another thing is I'd like to see a "first steps after getting started
" section in the wiki. Something like take your first snapshot, back
up, how to think when running it - can i just set some cron jobs and
forget about it, or does it need constant attention, and stuff like
that.

BR Adrian


-- 
Vänliga hälsningar / Kind regards,
Adrian Bastholm

``I would change the world, but they won't give me the sourcecode``


Re: Move data and mount point to subvolume

2018-09-17 Thread Rory Campbell-Lange
On 16/09/18, Chris Murphy (li...@colorremedies.com) wrote:
> On Sun, Sep 16, 2018 at 12:40 PM, Rory Campbell-Lange
>  wrote:

> > I'm a bit confused about the difference between / and backup, which is
> > at /bkp/backup.
> 
> top level, subvolid=5, subvolid=0, subvol=/, FS_TREE are all the same
> thing. This is the subvolume that's created at mkfs time, it has no
> name, it can't be deleted, and at mkfs time if you do
> 
> # btrfs sub get-default 
> ID 5 (FS_TREE)
> 
> So long as you haven't changed the default subvolume, the top level
> subvolume is what gets mounted, unless you use "-o subvol=" or "-o
> subvolid=" mount option.
> 
> If you do
> # btrfs sub list -ta /bkp
> 
> It might become a bit more clear what the layout is on disk. And for
> an even more verbose output you can do:
> 
> # btrfs insp dump-t -t fs_tree /dev/### for this you need to
> specify device not mountpoint, you don't need to umount, it's a read
> only command
> 
> Anything that in the "top level" or the "file system root" you will
> see listed. The first number is the inode, you'll see 256 is a special
> inode for subvolumes. You can do 'ls -li' and compare. Any subvolume
> you create is not FS_TREE, it is a "file tree". And note that each
> subvolume has it's own pile of inode numbers, meaning
> files/directories only have unique inode numbers *in a given
> subvolume*. Those inode numbers start over in a new subvolume.
> 
> Subvolumes share extent, chunk, csum, uuid and other trees, so a
> subvolume is not a completely isolated "file system".

Thanks for the wonderfully clear explanation.

> > Anyhow I've verified I can snapshot /bkp/backup to another subvolume.
> > This means I don't need to move anything, simply remount /bkp at
> > /bkp/backup.
> 
> Uhh, that's the reverse of what you said in the first message. I'm not
> sure what you want to do. It sounds like you want to mount the
> subvolume "backup" at /bkp/ so that all the other files/dirs on this
> Btrfs volume are not visible through the /bkp/ mount path?
> 
> Anyway if you want to explicitly mount the subvolume "backup"
> somewhere, you use -o subvol=backup to specify "the subvolume named
> backup, not the top level subvolume".

Yes, apologies, my intentions changed after you pointed out /bkp/backup
was a subvolume. My simple aim is to mount backup at /bkp.

Thanks for the note about the mount subvolume option.

> > Presumably I can therefore remount /bkp at subvolume /backup?
> >
> > # btrfs subvolume show /bkp/backup | egrep -i 'name|uuid|subvol'
> > Name:   backup
> > UUID:   d17cf2ca-a6db-ca43-8054-1fd76533e84b
> > Parent UUID:-
> > Received UUID:  -
> > Subvolume ID:   258
> >
> > My fstab is presently
> >
> > UUID=da90602a-b98e-4f0b-959a-ce431ac0cdfa /bkp  btrfs  
> > noauto,noatime,compress=lzo 0  2
> >
> > I guess it would now be
> >
> > UUID=d17cf2ca-a6db-ca43-8054-1fd76533e84b /bkp  btrfs  
> > noauto,noatime,compress=lzo 0  2
> 
> No you can't mount by subvolume UUID. You continue to specify the
> volume UUID, but then add a mount option
> 
> 
> noauto,noatime,compress=lzo,subvol=backup
> 
> or
> 
> noauto,noatime,compress=lzo,subvolid=258
> 
> The advantage of subvolid is that it doesn't change when you rename
> the subvolume.

That is terrific advice; thanks a lot.

Rory


[PATCH v3 4/7] btrfs-progs: lowmem: search key of root again after check_fs_root() after repair

2018-09-17 Thread Su Yue
In check_fs_roots_lowmem(), we do search and follow the resulted path
to call check_fs_root(), then call btrfs_next_item() to check next
root.
However, if repair is enabled, the root tree can be cowed, the
existed path can cause strange errors.

Solution:
  If repair, save the key before calling check_fs_root,
  search the saved key again before checking next root.

Signed-off-by: Su Yue 
---
 check/mode-lowmem.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
index 4db12cc7f9fe..db44456fd85b 100644
--- a/check/mode-lowmem.c
+++ b/check/mode-lowmem.c
@@ -4967,9 +4967,13 @@ int check_fs_roots_lowmem(struct btrfs_fs_info *fs_info)
}
 
while (1) {
+   struct btrfs_key saved_key;
+
node = path.nodes[0];
slot = path.slots[0];
btrfs_item_key_to_cpu(node, , slot);
+   if (repair)
+   saved_key = key;
if (key.objectid > BTRFS_LAST_FREE_OBJECTID)
goto out;
if (key.type == BTRFS_ROOT_ITEM_KEY &&
@@ -5000,6 +5004,24 @@ int check_fs_roots_lowmem(struct btrfs_fs_info *fs_info)
err |= ret;
}
 next:
+   /*
+* Since root tree can be cowed during repair,
+* here search the saved key again.
+*/
+   if (repair) {
+   btrfs_release_path();
+   ret = btrfs_search_slot(NULL, fs_info->tree_root,
+   _key, , 0, 0);
+   /* Repair never deletes trees, search must succeed. */
+   if (ret > 0)
+   ret = -ENOENT;
+   if (ret) {
+   error(
+   "search root key[%llu %u %llu] failed after repair: %s",
+ saved_key.objectid, saved_key.type,
+ saved_key.offset, strerror(-ret));
+   }
+   }
ret = btrfs_next_item(tree_root, );
if (ret > 0)
goto out;
-- 
2.17.1





[PATCH v3 7/7] btrfs-progs: fsck-tests: add test case inode_extref without dir_item and dir_index

2018-09-17 Thread Su Yue
Inode 257 is only with inode_extref without inode_ref.
And This case contains an inode_extref:
==
...
   item 1 key (257 INODE_EXTREF 3460996356) itemoff 3947 itemsize 24
index 257 parent 256 namelen 6 name: foo255
...
==
The related dir_item and dir_index are missing.

Add the case to ensure both original and lowmem mode check can handle
the case of inode_extref.

Lowmem part is supported since patch named
'btrfs-progs: lowmem: optimization and repair for check_inode_extref()'.

Rename default_case.img to inode_ref_without_dir_item_and_index.img.

Signed-off-by: Su Yue 
---
 .../inode_extref_without_dir_item_and_index.img  | Bin 0 -> 4096 bytes
 ... => inode_ref_without_dir_item_and_index.img} | Bin
 2 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 
tests/fsck-tests/009-no-dir-item-or-index/inode_extref_without_dir_item_and_index.img
 rename tests/fsck-tests/009-no-dir-item-or-index/{default_case.img => 
inode_ref_without_dir_item_and_index.img} (100%)

diff --git 
a/tests/fsck-tests/009-no-dir-item-or-index/inode_extref_without_dir_item_and_index.img
 
b/tests/fsck-tests/009-no-dir-item-or-index/inode_extref_without_dir_item_and_index.img
new file mode 100644
index 
..769dabe9bdeff9b9582a49b6de9a7ff14be9b389
GIT binary patch
literal 4096
zcmeHJeKgeBAO4OpjF(})d7VKBADZ0yK17=|%Q8B2-Kcpp@RG?l?%nb9C`8B}DJ
zyro1NiA>(MvKp;QDlA3VijWBNV_(kso|c|MrjWx#zjhz31F}?(^Jpp3glUm>V
z@7WCe*<{v6(C>YX%p+%IU59x3Is(<}z;Iqiz({%xivQphQfnL!U)v7=7uVrS
z`{KYC2mZ|tum%bn2b!}%$_;S$Taiz+ecOBf6xfaFr%w
zkPPF%J;*=`PgWr6lY(ZzHs6D0Ymhs5x|j_XzRhLdGv2
z^fe#EOw^`7O`e#~gt2;GtH>5lOv>qQHP3%5}dDM^p8g5T^#>-PhI0Qt<
zP*)`4K@aO+ah{Bv2|*2#VB~Bk@~AO+)J!^llLl(HA2sWMLHX`*Z_l@+cmpf?TgUkN
zK67z_-BlwH1$2{?^dNm!n+T=UY?858db24|H40q{}2MYv9@9*GmNO#)_9HC
zsv|7_QQF%NXQjDWiuBos$*GXGB|3q%ha@mb%?^kc@J*Ts~=>;1czG;IY+Omo`
z7&(VAIWV}rKQBg65mc_qFluvj>?6^8V;lI@UF#(~02IJ18djSmV
zUriED#qnuqyZv9)xHV1;4_WPbmgh;DE$t!(YJ(|KV8Sdyu0VETi)!dbp9RP;{sE}!
zM0g7zC0x{8Ba47FB3teSY8iN8JGV*#oPPFLUK6(BtF_~E!eF3(5VnQRX2HePT8_+2zIhTBvtRv4`CxO{>V=(Nl^!EMrd^MR?5UBKvi^Yxp0h4(x0T{
zt4zmqi^4~%w~d-sB+>_ev|q_JbhuQm@l35~_NNO)F$W1_(M(qZ5{R$f~k3B}#qLX{oxBbW~o;NB09+&}3Klg%HIwuAx}W
zYfywGaw%dwubh)?XA3bsq+8~(T6N$R97EB|;JO(QY{Fx6^8I7~v_W!|nYD1P|M*Wi
zmQ5M+O%ze=mFu&^S__T1XyRg&$idCbJtE=ArR(6Z;kZ4nnf3EilR`U!nSRJo7x|yN
zH(e{tGG!s$
z+vo1g)$+YEyThZu)|3x>wK=q6kGP=cg!hYE_^OpSO7tg5a8}>_p--PuXb))vXPjI~
zHogj`Er-x6C|;@f-3VdN+|UxD!-Vk#fC;sUjt3~}I+PIwN;StY^u
z@f?gu&4*w2?82#)>^#QzJ)0f~6K)CQmhmw;GeO)CM8|Xc0ayMtOr>21PP@dY645aj
z2EE5AH-ZRJ+>=5^4p(CKoA*Z1i)ruZO{%d%)SOcdp$0{y-rbZ=0IRB7r
zBszP`T4X=HT159!>{wP*)PlModhmE?JhF!@2X^HFcnZSB7c7_{t{J?m@-e-|Nob
za%?Tb!@+?F@Z&2l>gyr@0*7EO0|hc{0s~E^wSOj_+^x`f!u_x0EzTf?QRDZmdQXho$T8Rd=-OC{F
zB>E>IkLZngJ+L*k6dqr~97P-SjuGegDm$!5!0q!AAFo);m#r3n_GRW+5aSGRnu6VT
zr+SyvPMj_*1r?u`!V*Q1L_>$wn5rne9Mil>RU`YsVzDZOQ%+15VYS5Avv;r>
z#SZnYSVFM_whgOP?66$WD~HTjTR_oFBB0=p5lGiJ%9%~-8F!cePhDR7a!iYW
z==#B7nLTrZ1Xq0pB{lK*^WLUBZ2@!>*^DFW1yaC3s$Jb$6KViL9Nz^LbFCP3szGEL
zAq@5JR2jxY0QK|ZcUc2dJ!)=#tzaWtjd#!IJBunXbvJvP4N?U8-^yd5E
zQH;rzhq*&@(FX=4$CJ>N8EQLCOmwP}^Op)UI6j~(s4{=%f^xeLqam|2mQZ(hrt95b
zmj7x_Pe}CnT(w>k!D>zI4DuEK;@4TaAIQDWZI7LzwP^48{Y_!O#I~)D|K^|Qm%IHx
G9QY4={DzYN

literal 0
HcmV?d1

diff --git a/tests/fsck-tests/009-no-dir-item-or-index/default_case.img 
b/tests/fsck-tests/009-no-dir-item-or-index/inode_ref_without_dir_item_and_index.img
similarity index 100%
rename from tests/fsck-tests/009-no-dir-item-or-index/default_case.img
rename to 
tests/fsck-tests/009-no-dir-item-or-index/inode_ref_without_dir_item_and_index.img
-- 
2.17.1





[PATCH v3 1/7] btrfs-progs: adjust arguments of btrfs_lookup_inode_extref()

2018-09-17 Thread Su Yue
The argument index is not used in btrfs_lookup_inode_extref(),
so remove it.
And adjust positions its arguments to make it consistent with
kernel part.

No functional change.

Fixes: 260675657767 ("btrfs-progs: Import btrfs_insert/del/lookup_extref() 
functions.")
Signed-off-by: Su Yue 
Reviewed-by: Qu Wenruo 
---
 ctree.h  | 9 +
 inode-item.c | 9 +
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/ctree.h b/ctree.h
index 2a2437070ef9..541055a335c2 100644
--- a/ctree.h
+++ b/ctree.h
@@ -2710,10 +2710,11 @@ int btrfs_insert_inode(struct btrfs_trans_handle 
*trans, struct btrfs_root
 int btrfs_lookup_inode(struct btrfs_trans_handle *trans, struct btrfs_root
   *root, struct btrfs_path *path,
   struct btrfs_key *location, int mod);
-struct btrfs_inode_extref *btrfs_lookup_inode_extref(struct btrfs_trans_handle
-   *trans, struct btrfs_path *path, struct btrfs_root *root,
-   u64 ino, u64 parent_ino, u64 index, const char *name,
-   int namelen, int ins_len);
+struct btrfs_inode_extref *
+btrfs_lookup_inode_extref(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root, struct btrfs_path *path,
+ const char *name, int namelen, u64 ino,
+ u64 parent_ino, int ins_len);
 int btrfs_del_inode_extref(struct btrfs_trans_handle *trans,
   struct btrfs_root *root,
   const char *name, int name_len,
diff --git a/inode-item.c b/inode-item.c
index 1cc106670cd4..ca0052bc2f87 100644
--- a/inode-item.c
+++ b/inode-item.c
@@ -227,10 +227,11 @@ static int btrfs_find_name_in_ext_backref(struct 
btrfs_path *path,
return 0;
 }
 
-struct btrfs_inode_extref *btrfs_lookup_inode_extref(struct btrfs_trans_handle
-   *trans, struct btrfs_path *path, struct btrfs_root *root,
-   u64 ino, u64 parent_ino, u64 index, const char *name,
-   int namelen, int ins_len)
+struct btrfs_inode_extref *
+btrfs_lookup_inode_extref(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root, struct btrfs_path *path,
+ const char *name, int namelen, u64 ino,
+ u64 parent_ino, int ins_len)
 {
struct btrfs_key key;
struct btrfs_inode_extref *extref;
-- 
2.17.1





[PATCH v3 6/7] btrfs-progs: lowmem: optimization and repair for check_inode_extref()

2018-09-17 Thread Su Yue
inode_extref is much similar with inode_ref, most codes are reused in
check_inode_extref().
Exceptions:
There is no need to check root directory, so remove it.
Make check_inode_extref() verify hash value with key offset now.

And lowmem check can detect errors about inode_extref and try to
repair errors.

Signed-off-by: Su Yue 
---
 check/mode-lowmem.c | 125 +++-
 1 file changed, 99 insertions(+), 26 deletions(-)

diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
index 612e5e28e45b..53e4fdccd740 100644
--- a/check/mode-lowmem.c
+++ b/check/mode-lowmem.c
@@ -1182,37 +1182,86 @@ out:
  *
  * Return 0 if no error occurred.
  */
-static int check_inode_extref(struct btrfs_root *root,
- struct btrfs_key *ref_key,
- struct extent_buffer *node, int slot, u64 *refs,
- int mode)
+static int check_inode_extref(struct btrfs_root *root, struct btrfs_key 
*ref_key,
+ struct btrfs_path *path, char *name_ret,
+ u32 *namelen_ret, u64 *refs_ret, int mode)
 {
struct btrfs_key key;
struct btrfs_key location;
struct btrfs_inode_extref *extref;
+   struct extent_buffer *node;
char namebuf[BTRFS_NAME_LEN] = {0};
u32 total;
-   u32 cur = 0;
+   u32 cur;
u32 len;
u32 name_len;
u64 index;
u64 parent;
+   int err;
+   int tmp_err;
int ret;
-   int err = 0;
+   int slot;
+   u64 refs;
+   bool search_again = false;
 
location.objectid = ref_key->objectid;
location.type = BTRFS_INODE_ITEM_KEY;
location.offset = 0;
+begin:
+   cur = 0;
+   err = 0;
+   refs = *refs_ret;
+   *namelen_ret = 0;
+
+   /* since after repair, path and the dir item may be changed */
+   if (search_again) {
+   search_again = false;
+   btrfs_release_path(path);
+   ret = btrfs_search_slot(NULL, root, ref_key, path, 0, 0);
+   /*
+* The item was deleted, let the path point to the last checked
+* item.
+*/
+   if (ret > 0) {
+   if (path->slots[0] == 0) {
+   ret = btrfs_prev_leaf(root, path);
+   /*
+* we are checking the inode item, there must
+* be some items before, the case shall never
+* happen.
+*/
+   if (ret > 0)
+   ret = -EIO;
+   if (ret)
+   error(
+   "failed to get item before INODE_REF[%llu, %llu] root %llu: %s",
+ ref_key->objectid, 
ref_key->offset,
+ root->objectid, strerror(-ret));
+   } else {
+   path->slots[0]--;
+   ret = 0;
+   }
+   goto out;
+   } else if (ret < 0) {
+   err |= FATAL_ERROR;
+   goto out;
+   }
+   }
+
+   node = path->nodes[0];
+   slot = path->slots[0];
 
extref = btrfs_item_ptr(node, slot, struct btrfs_inode_extref);
total = btrfs_item_size_nr(node, slot);
 
-next:
-   /* update inode ref count */
-   (*refs)++;
-   name_len = btrfs_inode_extref_name_len(node, extref);
-   index = btrfs_inode_extref_index(node, extref);
+loop:
+   /* Update inode ref count */
+   refs++;
+   tmp_err = 0;
parent = btrfs_inode_extref_parent(node, extref);
+   index = btrfs_inode_extref_index(node, extref);
+   name_len = btrfs_inode_extref_name_len(node, extref);
+
if (name_len <= BTRFS_NAME_LEN) {
len = name_len;
} else {
@@ -1220,37 +1269,61 @@ next:
warning("root %llu INODE_EXTREF[%llu %llu] name too long",
root->objectid, ref_key->objectid, ref_key->offset);
}
+
read_extent_buffer(node, namebuf, (unsigned long)(extref + 1), len);
 
-   /* Check root dir ref name */
-   if (index == 0 && strncmp(namebuf, "..", name_len)) {
-   error("root %llu INODE_EXTREF[%llu %llu] ROOT_DIR name 
shouldn't be %s",
+   /* verify hash value */
+   if (ref_key->offset != btrfs_extref_hash(parent, namebuf, len)) {
+   err |= FATAL_ERROR;
+   error("root %llu INODE_EXTREF[%llu %llu] name %s namelen %u 
mode %d mismatch with its hash, wanted %llu have %llu",
  root->objectid, ref_key->objectid, ref_key->offset,
- namebuf);
-   err |= 

[PATCH v3 2/7] btrfs-progs: make btrfs_unlink() lookup inode_extref

2018-09-17 Thread Su Yue
btrfs_unlink() uses btrfs_lookup_inode_ref() to look up inode_ref
but forget inode_extref case.

Let btrfs_unlink() call btrfs_lookup_inode_extref() if inode_ref is
found and EXTENDED_IREF feature is enabled.

Fixes: 0cc75eddd093 ("btrfs-progs: Add btrfs_unlink() and btrfs_add_link() 
functions.")
Signed-off-by: Su Yue 
Reviewed-by: Qu Wenruo 
---
 inode.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/inode.c b/inode.c
index 2398bca4a109..598ad0ab6b4c 100644
--- a/inode.c
+++ b/inode.c
@@ -277,6 +277,7 @@ int btrfs_unlink(struct btrfs_trans_handle *trans, struct 
btrfs_root *root,
struct btrfs_key key;
struct btrfs_inode_item *inode_item;
struct btrfs_inode_ref *inode_ref;
+   struct btrfs_inode_extref *inode_extref = NULL;
struct btrfs_dir_item *dir_item;
u64 inode_size;
u32 nlinks;
@@ -296,7 +297,18 @@ int btrfs_unlink(struct btrfs_trans_handle *trans, struct 
btrfs_root *root,
ret = PTR_ERR(inode_ref);
goto out;
}
-   if (inode_ref)
+
+   if (!inode_ref && btrfs_fs_incompat(root->fs_info, EXTENDED_IREF)) {
+   btrfs_release_path(path);
+   inode_extref = btrfs_lookup_inode_extref(trans, root, path,
+name, namelen, ino, parent_ino, 0);
+   if (IS_ERR(inode_extref)) {
+   ret = PTR_ERR(inode_extref);
+   goto out;
+   }
+   }
+
+   if (inode_ref || inode_extref)
del_inode_ref = 1;
btrfs_release_path(path);
 
-- 
2.17.1





[PATCH v3 5/7] btrfs-progs: lowmem: do missing check of last item after check_inode_item()

2018-09-17 Thread Su Yue
After call of check_inode_item(), path may point to the last unchecked
slot of the leaf. The outer walk_up_tree() always treats the position
as checked slot then skips to the next. The last item will never be
checked.

While checking backrefs, path passed to walk_up_tree() always
points to a checked slot.
While checking fs trees, path passed to walk_up_tree() always
points to an unchecked slot.

Solution:
Add an argument @is_checked to walk_up_tree() to decide whether
to skip current slot.

Fixes: 5e2dc770471b ("btrfs-progs: check: skip shared node or leaf check for 
low_memory mode")
Signed-off-by: Su Yue 
---
 check/mode-lowmem.c | 37 +
 1 file changed, 33 insertions(+), 4 deletions(-)

diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
index db44456fd85b..612e5e28e45b 100644
--- a/check/mode-lowmem.c
+++ b/check/mode-lowmem.c
@@ -4597,22 +4597,38 @@ static int walk_down_tree(struct btrfs_root *root, 
struct btrfs_path *path,
return err;
 }
 
+/*
+ * Walk up throuh the path. Make path point to next slot to be checked.
+ * walk_down_tree() should be called after this function.
+ *
+ * @root:  root of the tree
+ * @path:  will point to next slot to check for walk_down_tree()
+ * @level: returns with level of next node to be checked
+ * @is_checked:means is the current node checked or not
+ * if false, the slot is unchecked, do not increase the slot
+ * if true, means increase the slot of the current node
+ *
+ * Returns 0 means success.
+ * Returns >0 means the whole loop of walk up/down should be broken.
+ */
 static int walk_up_tree(struct btrfs_root *root, struct btrfs_path *path,
-   int *level)
+   int *level, bool is_checked)
 {
int i;
struct extent_buffer *leaf;
+   int skip_cur = is_checked ? 1 : 0;
 
for (i = *level; i < BTRFS_MAX_LEVEL - 1 && path->nodes[i]; i++) {
leaf = path->nodes[i];
-   if (path->slots[i] + 1 < btrfs_header_nritems(leaf)) {
-   path->slots[i]++;
+   if (path->slots[i] + skip_cur < btrfs_header_nritems(leaf)) {
+   path->slots[i] += skip_cur;
*level = i;
return 0;
}
free_extent_buffer(path->nodes[*level]);
path->nodes[*level] = NULL;
*level = i + 1;
+   skip_cur = 1;
}
return 1;
 }
@@ -4815,7 +4831,20 @@ static int check_btrfs_root(struct btrfs_root *root, int 
check_all)
break;
}
 
-   ret = walk_up_tree(root, , );
+   /*
+* The logical of walk trees are shared between backrefs
+* check and fs trees check.
+* In checking backrefs(check_all is true), after
+* check_leaf_items() returns, path points to
+* last checked item.
+* In checking fs roots(check_all is false), after
+* process_one_leaf() returns, path points to unchecked item.
+*
+* walk_up_tree() is reponsible to make path point to next
+* slot to be checked, above case is handled in
+* walk_up_tree().
+*/
+   ret = walk_up_tree(root, , , check_all);
if (ret != 0) {
/* Normal exit, reset ret to err */
ret = err;
-- 
2.17.1





[PATCH v3 3/7] btrfs-progs: lowmem check: find dir_item by di_key in check_dir_item()

2018-09-17 Thread Su Yue
In check_dir_item, we are going to search corresponding
dir_item/index.

Commit 564901eac7a4 ("btrfs-progs: check: introduce
print_dir_item_err()") Changed argument name from key to di_key but
forgot to change the key name for dir_item search.
So @key shouldn't be used here. It should be @di_key.
Change comment about parameters too.

To keep compactness, move declarations into while loop in
check_dir_item().

Fixes: 564901eac7a4 ("btrfs-progs: check: introduce print_dir_item_err()")
Signed-off-by: Su Yue 
---
 check/mode-lowmem.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c
index 1bce44f5658a..4db12cc7f9fe 100644
--- a/check/mode-lowmem.c
+++ b/check/mode-lowmem.c
@@ -1529,7 +1529,7 @@ static void print_dir_item_err(struct btrfs_root *root, 
struct btrfs_key *key,
  * call find_inode_ref() to check related INODE_REF/INODE_EXTREF.
  *
  * @root:  the root of the fs/file tree
- * @key:   the key of the INODE_REF/INODE_EXTREF
+ * @di_key:the key of the dir_item/dir_index
  * @path:   the path
  * @size:  the st_size of the INODE_ITEM
  *
@@ -1540,20 +1540,11 @@ static int check_dir_item(struct btrfs_root *root, 
struct btrfs_key *di_key,
  struct btrfs_path *path, u64 *size)
 {
struct btrfs_dir_item *di;
-   struct btrfs_inode_item *ii;
-   struct btrfs_key key;
-   struct btrfs_key location;
struct extent_buffer *node;
int slot;
char namebuf[BTRFS_NAME_LEN] = {0};
u32 total;
u32 cur = 0;
-   u32 len;
-   u32 name_len;
-   u32 data_len;
-   u8 filetype;
-   u32 mode = 0;
-   u64 index;
int ret;
int err;
int tmp_err;
@@ -1588,6 +1579,15 @@ begin:
memset(namebuf, 0, sizeof(namebuf) / sizeof(*namebuf));
 
while (cur < total) {
+   struct btrfs_inode_item *ii;
+   struct btrfs_key key;
+   struct btrfs_key location;
+   u8 filetype;
+   u32 data_len;
+   u32 name_len;
+   u32 len;
+   u32 mode = 0;
+   u64 index;
/*
 * For DIR_ITEM set index to (u64)-1, so that find_inode_ref
 * ignore index check.
@@ -1658,7 +1658,7 @@ begin:
 
/* check relative INDEX/ITEM */
key.objectid = di_key->objectid;
-   if (key.type == BTRFS_DIR_ITEM_KEY) {
+   if (di_key->type == BTRFS_DIR_ITEM_KEY) {
key.type = BTRFS_DIR_INDEX_KEY;
key.offset = index;
} else {
-- 
2.17.1





[PATCH v3 0/7] btrfs-progs: lowmem: bug fixes and inode_extref repair

2018-09-17 Thread Su Yue
This patchset can be fetched from my github:
https://github.com/Damenly/btrfs-progs/tree/lowmem_extref
based on kdave/devel whose HEAD is:
commit 4f20c27ab33aab3efffe13cdae1b8837c821d0d7 (kdave/devel)
Author: Nikolay Borisov 
Date:   Fri Jun 15 07:13:50 2018 +
btrfs-progs: tests: test for FST corruption detection/repair

The patchset aims to support check and repair errors about
inode_extref in lowmem mode. 
patch[1-2] let btrfs_unlink() detect inode_extref.
patch[3] fixes a minor bug in check_dir_item due to my careless
 long ago.
patch[4] fixes a bug about inconsistent path in check_fs_roots()
 under repair.
patch[5] fixes a corner case about traversal of inode items.
patch[6] enable inode_extref repair support and remove unnecessary
 checks.
patch[7] add a test image, it can verify above patches except
 patch[3].

Changelog:
v3:
   Handle fatal errors instead of BUG_ON() in patch[4, 6].
  Thanks, Qu Wenruo and Nikolay Borisov.
   Change comments and move declarations in patch[3]. Pointed by Qu Wenruo.
   Handle corner case mentioned in patch[5] in walk_up_tree() not
  process_one_leaf().
   Test image are only with inode_extref now.  Suggested by Qu Wenruo.
v2:
   Resend with patches in right order.
   
Su Yue (7):
  btrfs-progs: adjust arguments of btrfs_lookup_inode_extref()
  btrfs-progs: make btrfs_unlink() lookup inode_extref
  btrfs-progs: lowmem check: find dir_item by di_key in check_dir_item()
  btrfs-progs: lowmem: search key of root again after check_fs_root()
after repair
  btrfs-progs: lowmem: do missing check of last item after
check_inode_item()
  btrfs-progs: lowmem: optimization and repair for check_inode_extref()
  btrfs-progs: fsck-tests: add test case inode_extref without dir_item
and dir_index

 check/mode-lowmem.c   | 206 ++
 ctree.h   |   9 +-
 inode-item.c  |   9 +-
 inode.c   |  14 +-
 ...node_extref_without_dir_item_and_index.img | Bin 0 -> 4096 bytes
 ... inode_ref_without_dir_item_and_index.img} | Bin
 6 files changed, 188 insertions(+), 50 deletions(-)
 create mode 100644 
tests/fsck-tests/009-no-dir-item-or-index/inode_extref_without_dir_item_and_index.img
 rename tests/fsck-tests/009-no-dir-item-or-index/{default_case.img => 
inode_ref_without_dir_item_and_index.img} (100%)

-- 
2.17.1