Re: [PATCH] Btrfs: fix race that makes btrfs_lookup_extent_info miss skinny extent items
Hello,

[...]
> This patch seems to fix https://bugzilla.kernel.org/show_bug.cgi?id=64961
> for me: I've been testing it together with
> [PATCH] Btrfs: fix invalid leaf slot access in btrfs_lookup_extent()
> on top of 3.18-rc2 since yesterday, and so far no crashes during balance
> or device remove.

With a little more data on the fs, btrfs balance aborted with:

ERROR: error during balancing '/mnt/b3' - No space left on device
There may be more info in syslog - try dmesg | tail

[104437.841446] BTRFS info (device sdh): 8 enospc errors during balance

There is still plenty of free space on the fs:

root@fs0:~# df /mnt/b3
Filesystem      1K-blocks      Used  Available Use% Mounted on
/dev/sde       7814042460 989236936 5127850672  17% /mnt/b3

root@fs0:~# btrfs fi df /mnt/b3
Data, RAID10: total=974.00GiB, used=967.21GiB
System, RAID1: total=32.00MiB, used=128.00KiB
Metadata, RAID1: total=24.00GiB, used=20.79GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

root@fs0:~# btrfs fi sh /mnt/b3
Label: 'BTR3'  uuid: 9de82766-9f9a-4605-9be7-7b6da9de720c
	Total devices 5 FS bytes used 985.69GiB
	devid    1 size 2.73TiB used 371.03GiB path /dev/sde
	devid    2 size 2.73TiB used 371.00GiB path /dev/sdf
	devid    3 size 2.73TiB used 371.00GiB path /dev/sdg
	devid    4 size 2.73TiB used 371.00GiB path /dev/sdh
	devid    5 size 3.64TiB used 500.03GiB path /dev/sdp

Btrfs v3.17

There was some activity (copying) on that fs during the balance.

Thanks,

Petr
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: fix race that makes btrfs_lookup_extent_info miss skinny extent items
Hello,

>On Mon, 27 Oct 2014 13:44:22 +, Filipe David Manana wrote:
[...]
>> That wouldn't be the case because BTRFS_EXTENT_ITEM_KEY == 168 and
>> BTRFS_METADATA_ITEM_KEY == 169, with BTRFS_SHARED_BLOCK_REF_KEY ==
>> 182. So in the presence of such non-inlined shared tree block
>> reference items, searching for a skinny extent item leaves us at a
>> slot that points to the first non-inlined ref (regardless of its type,
>> since they're all > 169), and therefore path->slots[0] - 1 is the
>> non-skinny extent item.
>
> You are right. I forget to check the value of key type. Sorry.
>
> This patch seems good for me.

This patch seems to fix https://bugzilla.kernel.org/show_bug.cgi?id=64961
for me: I've been testing it together with
[PATCH] Btrfs: fix invalid leaf slot access in btrfs_lookup_extent()
on top of 3.18-rc2 since yesterday, and so far no crashes during balance
or device remove.

Thanks,

Petr
Re: [PATCH] Btrfs: fix race that makes btrfs_lookup_extent_info miss skinny extent items
On Mon, 27 Oct 2014 13:44:22 +, Filipe David Manana wrote:
> On Mon, Oct 27, 2014 at 12:11 PM, Filipe David Manana wrote:
>> On Mon, Oct 27, 2014 at 11:08 AM, Miao Xie wrote:
>>> On Mon, 27 Oct 2014 09:19:52 +, Filipe Manana wrote:
[...]
>>> I think this analysis is wrong if there are some independent shared ref
>>> metadata for a tree block, just like:
>>> +------------------------+-------------+-------------+
>>> | tree block extent item | shared ref1 | shared ref2 |
>>> +------------------------+-------------+-------------+
>
> Trying to guess what's in your mind.
>
> Is the concern that if after a non-skinny extent item we have
> non-inlined references, the assumption that path->slots[0] - 1 points
> to the extent item would be wrong when searching for a skinny extent?
>
> That wouldn't be the case because BTRFS_EXTENT_ITEM_KEY == 168 and
> BTRFS_METADATA_ITEM_KEY == 169, with BTRFS_SHARED_BLOCK_REF_KEY ==
> 182. So in the presence of such non-inlined shared tree block
> reference items, searching for a skinny extent item leaves us at a
> slot that points to the first non-inlined ref (regardless of its type,
> since they're all > 169), and therefore path->slots[0] - 1 is the
> non-skinny extent item.

You are right. I forgot to check the value of the key type. Sorry.

This patch seems good to me.

Reviewed-by: Miao Xie

[...]
Re: [PATCH] Btrfs: fix race that makes btrfs_lookup_extent_info miss skinny extent items
On Mon, Oct 27, 2014 at 12:11 PM, Filipe David Manana wrote:
> On Mon, Oct 27, 2014 at 11:08 AM, Miao Xie wrote:
>> On Mon, 27 Oct 2014 09:19:52 +, Filipe Manana wrote:
[...]
>> I think this analysis is wrong if there are some independent shared ref
>> metadata for a tree block, just like:
>> +------------------------+-------------+-------------+
>> | tree block extent item | shared ref1 | shared ref2 |
>> +------------------------+-------------+-------------+

Trying to guess what's in your mind.

Is the concern that if after a non-skinny extent item we have
non-inlined references, the assumption that path->slots[0] - 1 points
to the extent item would be wrong when searching for a skinny extent?

That wouldn't be the case because BTRFS_EXTENT_ITEM_KEY == 168 and
BTRFS_METADATA_ITEM_KEY == 169, with BTRFS_SHARED_BLOCK_REF_KEY ==
182. So in the presence of such non-inlined shared tree block
reference items, searching for a skinny extent item leaves us at a
slot that points to the first non-inlined ref (regardless of its type,
since they're all > 169), and therefore path->slots[0] - 1 is the
non-skinny extent item.

thanks.

[...]

--
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
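The key-ordering argument in the message above can be checked with a small standalone sketch. It is not kernel code: the three key type values are the ones quoted in the message, while `key_cmp()`, `insertion_slot()`, the bytenr 4096, the 16384-byte node size and the two parent offsets are hypothetical values chosen only for illustration. It assumes keys sort by objectid, then type, then offset.

```c
#include <assert.h>
#include <stddef.h>

/* Key type values quoted in the thread. */
#define EXTENT_ITEM_KEY      168	/* non-skinny extent item */
#define METADATA_ITEM_KEY    169	/* skinny extent item */
#define SHARED_BLOCK_REF_KEY 182	/* non-inlined shared tree block ref */

struct key {
	unsigned long long objectid;
	int type;
	unsigned long long offset;
};

/* Order keys by objectid, then type, then offset. */
static int key_cmp(const struct key *a, const struct key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/*
 * Binary search for the slot where 'k' would be inserted into the sorted
 * array 'items', i.e. the index of the first item that is not smaller.
 */
size_t insertion_slot(const struct key *items, size_t nr, const struct key *k)
{
	size_t lo = 0, hi = nr;

	assert(items != NULL || nr == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (key_cmp(&items[mid], k) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* The leaf layout from the objection; all numbers are made up. */
const struct key miao_leaf[] = {
	{ 4096, EXTENT_ITEM_KEY,      16384 },		/* tree block extent item */
	{ 4096, SHARED_BLOCK_REF_KEY, 1048576 },	/* shared ref1 */
	{ 4096, SHARED_BLOCK_REF_KEY, 2097152 },	/* shared ref2 */
};
```

Searching this layout for the skinny key { 4096, METADATA_ITEM_KEY, 0 } gives slot 1, the first shared ref, so slot - 1 is the non-skinny extent item, which is the conclusion reached in the message.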
Re: [PATCH] Btrfs: fix race that makes btrfs_lookup_extent_info miss skinny extent items
On Mon, Oct 27, 2014 at 11:08 AM, Miao Xie wrote:
> On Mon, 27 Oct 2014 09:19:52 +, Filipe Manana wrote:
[...]
> I think this analysis is wrong if there are some independent shared ref
> metadata for a tree block, just like:
> +------------------------+-------------+-------------+
> | tree block extent item | shared ref1 | shared ref2 |
> +------------------------+-------------+-------------+

Why does that matter? Can you elaborate on why it's not correct?

We're looking only for the extent item in btrfs_lookup_extent_info(),
and running a delayed ref, regardless of whether it is inlined or shared,
implies inserting a new extent item or updating an existing extent
item (updating its ref count).

thanks

[...]

--
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
Re: [PATCH] Btrfs: fix race that makes btrfs_lookup_extent_info miss skinny extent items
On Mon, 27 Oct 2014 09:19:52 +, Filipe Manana wrote:
[...]
> Fix this by removing completely the path release and re-search logic. This is
> safe, because if we search for a metadata item and we don't find it, we have the
> guarantee that the returned leaf is the one where the item would be inserted,
> and so path->slots[0] > 0 and path->slots[0] - 1 must be the slot where the
> non-skinny extent item is if it exists. The only case where path->slots[0] is

I think this analysis is wrong if there are some independent shared ref
metadata for a tree block, just like:

+------------------------+-------------+-------------+
| tree block extent item | shared ref1 | shared ref2 |
+------------------------+-------------+-------------+

Thanks
Miao

> zero is when there are no smaller keys in the tree (i.e. no left siblings for
> our leaf), in which case the re-search logic isn't needed as well.
[...]
[PATCH] Btrfs: fix race that makes btrfs_lookup_extent_info miss skinny extent items
We have a race that can lead us to miss skinny extent items in the function
btrfs_lookup_extent_info() when the skinny metadata feature is enabled.
So basically the sequence of steps is:

1) We search in the extent tree for the skinny extent, which returns > 0
   (not found);

2) We check the previous item in the returned leaf for a non-skinny extent,
   and we don't find it;

3) Because we didn't find the non-skinny extent in step 2), we release our
   path to search the extent tree again, but this time for a non-skinny
   extent key;

4) Right after we released our path in step 3), a skinny extent was inserted
   in the extent tree (delayed refs were run) - our second extent tree search
   will miss it, because it's not looking for a skinny extent;

5) After the second search returned (with ret > 0), we look for any delayed
   ref for our extent's bytenr (and we do it while holding a read lock on the
   leaf), but we won't find any, as such delayed ref had just run and completed
   after we released our path in step 3) before doing the second search.

Fix this by removing completely the path release and re-search logic. This is
safe, because if we search for a metadata item and we don't find it, we have the
guarantee that the returned leaf is the one where the item would be inserted,
and so path->slots[0] > 0 and path->slots[0] - 1 must be the slot where the
non-skinny extent item is if it exists. The only case where path->slots[0] is
zero is when there are no smaller keys in the tree (i.e. no left siblings for
our leaf), in which case the re-search logic isn't needed either.

This race has been present since the introduction of skinny metadata (change
3173a18f70554fe7880bb2d85c7da566e364eb3c).

Signed-off-by: Filipe Manana
---
 fs/btrfs/extent-tree.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 9141b2b..2cedd06 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -780,7 +780,6 @@ search_again:
 	else
 		key.type = BTRFS_EXTENT_ITEM_KEY;
 
-again:
 	ret = btrfs_search_slot(trans, root->fs_info->extent_root,
 				&key, path, 0, 0);
 	if (ret < 0)
@@ -796,13 +795,6 @@ again:
 			    key.offset == root->nodesize)
 				ret = 0;
 		}
-		if (ret) {
-			key.objectid = bytenr;
-			key.type = BTRFS_EXTENT_ITEM_KEY;
-			key.offset = root->nodesize;
-			btrfs_release_path(path);
-			goto again;
-		}
 	}
 
 	if (ret == 0) {
--
1.9.1
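The interleaving in steps 1) to 5) can be reproduced deterministically with a toy model. This is only a sketch, not the kernel implementation: a flat sorted array stands in for the extent tree, `lookup_old()` and `lookup_fixed()` are hypothetical names for the pre-patch and post-patch logic, and the `racer` argument is an invented hook that plays the delayed ref running between the path release and the second search. The 16384-byte node size is assumed, and the skinny key's offset is taken to hold the tree level.

```c
#include <assert.h>
#include <string.h>

#define EXTENT_ITEM_KEY   168	/* non-skinny extent item */
#define METADATA_ITEM_KEY 169	/* skinny extent item */
#define NODESIZE 16384ULL	/* assumed node size, for illustration */
#define LEAF_MAX 8		/* capacity of the toy leaf */

struct key {
	unsigned long long objectid;	/* bytenr of the tree block */
	int type;
	unsigned long long offset;	/* nodesize (168) or level (169) */
};

/* Toy stand-in for the extent tree: one sorted array of keys. */
struct leaf {
	struct key items[LEAF_MAX];
	int nr;
};

static int key_cmp(const struct key *a, const struct key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Returns 0 and the matching slot, or 1 and the insertion slot. */
static int search_slot(const struct leaf *l, const struct key *k, int *slot)
{
	int i;

	for (i = 0; i < l->nr; i++) {
		int c = key_cmp(&l->items[i], k);

		if (c == 0) {
			*slot = i;
			return 0;
		}
		if (c > 0)
			break;
	}
	*slot = i;
	return 1;
}

static void insert_key(struct leaf *l, const struct key *k)
{
	int slot;

	if (search_slot(l, k, &slot) == 0)
		return;		/* already present */
	assert(l->nr < LEAF_MAX);
	memmove(&l->items[slot + 1], &l->items[slot],
		(size_t)(l->nr - slot) * sizeof(*k));
	l->items[slot] = *k;
	l->nr++;
}

/* Step 2): is the item at slot - 1 the non-skinny item for bytenr? */
static int prev_is_extent_item(const struct leaf *l, int slot,
			       unsigned long long bytenr)
{
	const struct key *p;

	if (slot == 0)
		return 0;
	p = &l->items[slot - 1];
	return p->objectid == bytenr && p->type == EXTENT_ITEM_KEY &&
	       p->offset == NODESIZE;
}

/* Pre-patch logic: returns 1 if an extent item was found, else 0. */
int lookup_old(struct leaf *l, unsigned long long bytenr, int level,
	       const struct key *racer)
{
	struct key k = { bytenr, METADATA_ITEM_KEY, (unsigned long long)level };
	int slot;

	if (search_slot(l, &k, &slot) == 0)		/* step 1) */
		return 1;
	if (prev_is_extent_item(l, slot, bytenr))	/* step 2) */
		return 1;
	/* Step 3): the path is released here, opening the race window. */
	if (racer)
		insert_key(l, racer);			/* step 4) */
	k.type = EXTENT_ITEM_KEY;
	k.offset = NODESIZE;
	return search_slot(l, &k, &slot) == 0;		/* second search */
}

/* Post-patch logic: one search, then only the previous-slot check. */
int lookup_fixed(const struct leaf *l, unsigned long long bytenr, int level)
{
	struct key k = { bytenr, METADATA_ITEM_KEY, (unsigned long long)level };
	int slot;

	if (search_slot(l, &k, &slot) == 0)
		return 1;
	return prev_is_extent_item(l, slot, bytenr);
}
```

Starting from an empty leaf and passing the skinny key { 4096, METADATA_ITEM_KEY, 0 } as `racer`, `lookup_old()` reports the extent as missing even though the item is in the leaf by the time it returns, while `lookup_fixed()` on the same leaf finds it. The fixed variant has no window in the model because it never drops its position between the search and the previous-slot check.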