Re: Moving btrfs fs disk between computers
Adam Borowski posted on Sat, 19 Aug 2017 04:35:26 +0200 as excerpted: > On Fri, Aug 18, 2017 at 11:09:22PM -0300, Hérikz Nawarro wrote: >> Hello everyone, >> >> Can i create a btrfs fs in a machine and move the disk to another >> machine like a ext4 or ntfs? > > Yeah, no problem whatsoever, even for multi-device filesystems. > Btrfs doesn't care about what devices it is on. > > There has been no new incompat features in quite a while, so old kernels > shouldn't be an issue either. > > I think the only way for valid btrfs to be unmountable on another > machine with a modern kernel is to mess with page size, which is AFAIK > not even possible on x86. IOW, as long as the two machines are the same arch, or more precisely, have the same page size if different archs, it should be fine as long as you're not running a really old kernel on one of them. But don't try mounting a filesystem created on x86_64 on something exotic like sparc or ppc or something. If the two archs' page sizes are the same it should work, but I'd still be sure I had an extra set of backups before I tried something like that. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS error (device sda4): failed to read chunk tree: -5
THANK YOU ALL! I just had to truncate the first 1.5 KiB of the image file to get the offsets right (God knows why), but I could then get the btrfs driver to recognize it, and I could MOUNT THE FILESYSTEM! I'm going to run btrfs check on it, free some space on it, PROPERLY remove the extra device, and then boot the system and set up both more cloud backups and probably throw in a 500GB disk I have laying around so I can set up snapshots. Everyone has been very helpful, and sorry for the real issue being my own stupidity. :) On Fri, Aug 18, 2017 at 10:28 PM, Zirconium Hackerwrote: > The image doesn't have a valid superblock. I'm really confused as to > how that could've happened. > > On Fri, Aug 18, 2017 at 7:21 PM, Qu Wenruo wrote: >> >> >> On 2017年08月19日 05:52, Zirconium Hacker wrote: >>> >>> Ok, so since it's clear now that I need that 5 GB device to be >>> present... I found the image file. But how do I get BTRFS to >>> recognize the image as a device? >> >> >> # losetup -f >> Remember the loop*, here use /dev/loop1 as example. >> >> # losetup /dev/loop1 >> # partprobe /dev/loop1 >> Then you should have /dev/loop1p1 >> >> # btrfs dev rescan >> If nothing wrong happened, you should be good to go. >> >> Thanks, >> Qu >> >> >>> I have zero experience with >>> multi-device systems. Setting it up as a loop device doesn't fix >>> mounting, and wipefs doesn't detect the BTRFS magic number, but >>> printing some of it to console shows it does have real data. Writing >>> the magic number onto it (it's a copy of the original to be safe) >>> shows in dump-super, but all other values are zero. >>> >>> I tried sending the above on my phone earlier but it was detected as a >>> "virus" because it contained HTML. Whoops. >>> >>> On Fri, Aug 18, 2017 at 11:00 AM, Chris Murphy >>> wrote: On Fri, Aug 18, 2017 at 2:47 AM, Zirconium Hacker wrote: > I vaguely remember following this guide at some point: > > http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html > -- specifically the "Balance cannot run because the filesystem is > full" part. This may have broken some things? > If you don't do 'btrfs device delete /dev/loop0' or if that command does not complete, then it's possible to get into the situation you're in. Have you ever mounted this file system with -o degraded? I'm going to guess the history is something like: 1. enospc 2. btrfs dev add 3. some kind of filtered balance, which only causes data block groups to be moved to the 2nd device 4. 2nd device is physically removed without first 'btrfs dev del' Zirco's superblock very clearly says num_devices 2, so I'd expect normal mount to always fail unless both devices are present. Is there some weird edge case where Btrfs might permit non-degraded mount when only data bg's are on a 2nd device? And then trouble only happens later when a balance is done and it goes looking for these bg's? And then, boom! -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Moving btrfs fs disk between computers
On Fri, Aug 18, 2017 at 11:09:22PM -0300, Hérikz Nawarro wrote: > Hello everyone, > > Can i create a btrfs fs in a machine and move the disk to another > machine like a ext4 or ntfs? Yeah, no problem whatsoever, even for multi-device filesystems. Btrfs doesn't care about what devices it is on. There has been no new incompat features in quite a while, so old kernels shouldn't be an issue either. I think the only way for valid btrfs to be unmountable on another machine with a modern kernel is to mess with page size, which is AFAIK not even possible on x86. Meow! -- ⢀⣴⠾⠻⢶⣦⠀ ⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!? ⢿⡄⠘⠷⠚⠋⠀-- Genghis Ht'rok'din ⠈⠳⣄ -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS error (device sda4): failed to read chunk tree: -5
The image doesn't have a valid superblock. I'm really confused as to how that could've happened. On Fri, Aug 18, 2017 at 7:21 PM, Qu Wenruowrote: > > > On 2017年08月19日 05:52, Zirconium Hacker wrote: >> >> Ok, so since it's clear now that I need that 5 GB device to be >> present... I found the image file. But how do I get BTRFS to >> recognize the image as a device? > > > # losetup -f > Remember the loop*, here use /dev/loop1 as example. > > # losetup /dev/loop1 > # partprobe /dev/loop1 > Then you should have /dev/loop1p1 > > # btrfs dev rescan > If nothing wrong happened, you should be good to go. > > Thanks, > Qu > > >> I have zero experience with >> multi-device systems. Setting it up as a loop device doesn't fix >> mounting, and wipefs doesn't detect the BTRFS magic number, but >> printing some of it to console shows it does have real data. Writing >> the magic number onto it (it's a copy of the original to be safe) >> shows in dump-super, but all other values are zero. >> >> I tried sending the above on my phone earlier but it was detected as a >> "virus" because it contained HTML. Whoops. >> >> On Fri, Aug 18, 2017 at 11:00 AM, Chris Murphy >> wrote: >>> >>> On Fri, Aug 18, 2017 at 2:47 AM, Zirconium Hacker >>> wrote: >>> I vaguely remember following this guide at some point: http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html -- specifically the "Balance cannot run because the filesystem is full" part. This may have broken some things? >>> >>> If you don't do 'btrfs device delete /dev/loop0' or if that command >>> does not complete, then it's possible to get into the situation you're >>> in. >>> >>> Have you ever mounted this file system with -o degraded? >>> >>> I'm going to guess the history is something like: >>> 1. enospc >>> 2. btrfs dev add >>> 3. some kind of filtered balance, which only causes data block groups >>> to be moved to the 2nd device >>> 4. 2nd device is physically removed without first 'btrfs dev del' >>> >>> Zirco's superblock very clearly says num_devices 2, so I'd expect >>> normal mount to always fail unless both devices are present. Is there >>> some weird edge case where Btrfs might permit non-degraded mount when >>> only data bg's are on a 2nd device? And then trouble only happens >>> later when a balance is done and it goes looking for these bg's? And >>> then, boom! >>> >>> -- >>> Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Moving btrfs fs disk between computers
Hello everyone, Can i create a btrfs fs in a machine and move the disk to another machine like a ext4 or ntfs? Thanks in advance. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: qcow2 images make scrub believe the filesystem is corrupted.
On 2017年08月19日 01:43, Josef Bacik wrote: On Fri, Aug 18, 2017 at 06:23:18PM +0200, Goffredo Baroncelli wrote: On 08/18/2017 01:39 AM, Josef Bacik wrote: [...] This is happening because the app (the guest OS in this case, we saw this a lot with windows guests) is changing the pages while they are in flight. We calculate the checksum of the page before it's written, so if it changes while in flight we'll end up with a csum mismatch. To fix this change kvm to not use O_DIRECT or set NODATASUM on your qcow2 image. You'll have to re-create the image because NODATASUM won't apply to the already invalid checksums. Thanks, Hi Josef, could you elaborate: do you are saying that using O_DIRECT is incompatible with DATASUM ? No, I'm saying using O_DIRECT with applications that don't protect in-flight memory are incompatible with DATASUM. We have no way of making sure nobody touches the page while we're writing it out, so after we calculate the checksum any changes to the page are going to cause a checksum mismatch. O_DIRECT are user space pages, there's nothing we can do to stop user space from doing stupid things. The options I looked into before were things like detecting the page had changed since we calculated the checksum, and re-submitting the write. This punishes applications that do the right thing (databases for example) by forcing us to calculate checksums twice. Just curious about this. Why not just scrubbing data/metadata in commit roots? And don't use any page cache, but always read them out from disk? For datacsum case, it's always cowed, so it won't change in-flight. Although the cost is obvious, such method can only check data/metadata in previous trans and doesn't use page cache means tons of IO. Thanks, Qu This is a shit situation because users aren't going to understand this limitation, and it bites them in the ass with all these weird errors. I think maybe we need to go back to the double-checksum thing by default, and have a flag or something for users to set if they know their application behaves properly. Thanks, Josef -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS error (device sda4): failed to read chunk tree: -5
On 2017年08月19日 05:52, Zirconium Hacker wrote: Ok, so since it's clear now that I need that 5 GB device to be present... I found the image file. But how do I get BTRFS to recognize the image as a device? # losetup -f Remember the loop*, here use /dev/loop1 as example. # losetup /dev/loop1 # partprobe /dev/loop1 Then you should have /dev/loop1p1 # btrfs dev rescan If nothing wrong happened, you should be good to go. Thanks, Qu I have zero experience with multi-device systems. Setting it up as a loop device doesn't fix mounting, and wipefs doesn't detect the BTRFS magic number, but printing some of it to console shows it does have real data. Writing the magic number onto it (it's a copy of the original to be safe) shows in dump-super, but all other values are zero. I tried sending the above on my phone earlier but it was detected as a "virus" because it contained HTML. Whoops. On Fri, Aug 18, 2017 at 11:00 AM, Chris Murphywrote: On Fri, Aug 18, 2017 at 2:47 AM, Zirconium Hacker wrote: I vaguely remember following this guide at some point: http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html -- specifically the "Balance cannot run because the filesystem is full" part. This may have broken some things? If you don't do 'btrfs device delete /dev/loop0' or if that command does not complete, then it's possible to get into the situation you're in. Have you ever mounted this file system with -o degraded? I'm going to guess the history is something like: 1. enospc 2. btrfs dev add 3. some kind of filtered balance, which only causes data block groups to be moved to the 2nd device 4. 2nd device is physically removed without first 'btrfs dev del' Zirco's superblock very clearly says num_devices 2, so I'd expect normal mount to always fail unless both devices are present. Is there some weird edge case where Btrfs might permit non-degraded mount when only data bg's are on a 2nd device? And then trouble only happens later when a balance is done and it goes looking for these bg's? And then, boom! -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: qcow2 images make scrub believe the filesystem is corrupted.
On 08/18/2017 07:43 PM, Josef Bacik wrote: > On Fri, Aug 18, 2017 at 06:23:18PM +0200, Goffredo Baroncelli wrote: >> On 08/18/2017 01:39 AM, Josef Bacik wrote: >> [...] >>> This is happening because the app (the guest OS in this case, we saw this a >>> lot >>> with windows guests) is changing the pages while they are in flight. We >>> calculate the checksum of the page before it's written, so if it changes >>> while >>> in flight we'll end up with a csum mismatch. >>> >>> To fix this change kvm to not use O_DIRECT or set NODATASUM on your qcow2 >>> image. >>> You'll have to re-create the image because NODATASUM won't apply to the >>> already >>> invalid checksums. Thanks, >> >> Hi Josef, >> >> could you elaborate: do you are saying that using O_DIRECT is incompatible >> with DATASUM ? >> > > No, I'm saying using O_DIRECT with applications that don't protect in-flight > memory are incompatible with DATASUM. This is what I call an 'incompatibility'. Even is a "corner" case, it is still an incompatibility. And to be honest, it is still difficult to say that a "VM" is a "corner" case. > We have no way of making sure nobody > touches the page while we're writing it out, so after we calculate the > checksum > any changes to the page are going to cause a checksum mismatch. O_DIRECT are > user space pages, there's nothing we can do to stop user space from doing > stupid > things. I understand the technical difficulties; however I can't agree about "user space [...] doing *stupid* things". If it is not explicitly forbidden, it is legal; not "stupid" How the application know that the page aren't in-flight anymore ? It is sufficient to wait the end of the write() syscall ? Or it has to wait the end of a fsync() ? > The options I looked into before were things like detecting the page had > changed > since we calculated the checksum, and re-submitting the write. This punishes > applications that do the right thing (databases for example) by forcing us to > calculate checksums twice. There are other "cases" where it is possible to have the same problem ? It is the same for mmap() ? > > This is a shit situation because users aren't going to understand this > limitation, and it bites them in the ass with all these weird errors. I think > maybe we need to go back to the double-checksum thing by default, and have a > flag or something for users to set if they know their application behaves > properly. Or... disable checksum for the "O_DIRECT" writings... If you can't trust the checksums at 100%, these don't make sense. > > Josef > -- gpg @keyserver.linux.it: Goffredo Baroncelli Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 3/7] Btrfs: remove BUG() in btrfs_extent_inline_ref_size
Now that btrfs_get_extent_inline_ref_type() can report if type is a valid one and all callers can gracefully deal with that, we don't need to crash here. Signed-off-by: Liu Bo--- fs/btrfs/ctree.h | 1 - 1 file changed, 1 deletion(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 714e779..38b0735 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1799,7 +1799,6 @@ static inline u32 btrfs_extent_inline_ref_size(int type) if (type == BTRFS_EXTENT_DATA_REF_KEY) return sizeof(struct btrfs_extent_data_ref) + offsetof(struct btrfs_extent_inline_ref, offset); - BUG(); return 0; } -- 2.9.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 7/7] Btrfs: add one more sanity check for shared ref type
Every shared ref has a parent tree block, which can be get from btrfs_extent_inline_ref_offset(). And the tree block must be aligned to the nodesize, so we'd know this inline ref is not valid if this block's bytenr is not aligned to the nodesize, in which case, most likely the ref type has been misused. This adds the above mentioned check and also updates print_extent_item() called by btrfs_print_leaf() to point out the invalid ref while printing the tree structure. Signed-off-by: Liu Bo--- fs/btrfs/extent-tree.c | 29 + fs/btrfs/print-tree.c | 27 +-- 2 files changed, 46 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index d75129a..13264cd 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -1158,19 +1158,40 @@ int btrfs_get_extent_inline_ref_type(struct extent_buffer *eb, enum btrfs_inline_ref_type is_data) { int type = btrfs_extent_inline_ref_type(eb, iref); + u64 offset = btrfs_extent_inline_ref_offset(eb, iref); if (type == BTRFS_TREE_BLOCK_REF_KEY || type == BTRFS_SHARED_BLOCK_REF_KEY || type == BTRFS_SHARED_DATA_REF_KEY || type == BTRFS_EXTENT_DATA_REF_KEY) { if (is_data == BTRFS_REF_TYPE_BLOCK) { - if (type == BTRFS_TREE_BLOCK_REF_KEY || - type == BTRFS_SHARED_BLOCK_REF_KEY) + if (type == BTRFS_TREE_BLOCK_REF_KEY) return type; + if (type == BTRFS_SHARED_BLOCK_REF_KEY) { + ASSERT(eb->fs_info); + /* +* Every shared one has parent tree +* block, which must be aligned to +* nodesize. +*/ + if (offset && + IS_ALIGNED(offset, eb->fs_info->nodesize)) + return type; + } } else if (is_data == BTRFS_REF_TYPE_DATA) { - if (type == BTRFS_EXTENT_DATA_REF_KEY || - type == BTRFS_SHARED_DATA_REF_KEY) + if (type == BTRFS_EXTENT_DATA_REF_KEY) return type; + if (type == BTRFS_SHARED_DATA_REF_KEY) { + ASSERT(eb->fs_info); + /* +* Every shared one has parent tree +* block, which must be aligned to +* nodesize. +*/ + if (offset && + IS_ALIGNED(offset, eb->fs_info->nodesize)) + return type; + } } else { ASSERT(is_data == BTRFS_REF_TYPE_ANY); return type; diff --git a/fs/btrfs/print-tree.c b/fs/btrfs/print-tree.c index 0e52e47..9f8c5ee 100644 --- a/fs/btrfs/print-tree.c +++ b/fs/btrfs/print-tree.c @@ -44,7 +44,7 @@ static void print_dev_item(struct extent_buffer *eb, static void print_extent_data_ref(struct extent_buffer *eb, struct btrfs_extent_data_ref *ref) { - pr_info("\t\textent data backref root %llu objectid %llu offset %llu count %u\n", + pr_cont("extent data backref root %llu objectid %llu offset %llu count %u\n", btrfs_extent_data_ref_root(eb, ref), btrfs_extent_data_ref_objectid(eb, ref), btrfs_extent_data_ref_offset(eb, ref), @@ -63,6 +63,7 @@ static void print_extent_item(struct extent_buffer *eb, int slot, int type) u32 item_size = btrfs_item_size_nr(eb, slot); u64 flags; u64 offset; + int ref_index = 0; if (item_size < sizeof(*ei)) { #ifdef BTRFS_COMPAT_EXTENT_TREE_V0 @@ -104,12 +105,20 @@ static void print_extent_item(struct extent_buffer *eb, int slot, int type) iref = (struct btrfs_extent_inline_ref *)ptr; type = btrfs_extent_inline_ref_type(eb, iref); offset = btrfs_extent_inline_ref_offset(eb, iref); + pr_info("\t\tref#%d: ", ref_index++); switch (type) { case BTRFS_TREE_BLOCK_REF_KEY: - pr_info("\t\ttree block backref root %llu\n", offset); + pr_cont("tree block backref root %llu\n", offset); break; case BTRFS_SHARED_BLOCK_REF_KEY: - pr_info("\t\tshared block backref parent %llu\n", offset); + pr_cont("shared block backref parent %llu\n", offset); +
[PATCH v3 0/7] add sanity check for extent inline ref type
An invalid extent inline ref type could be read from a btrfs image and it ends up with a panic[1], this set is to deal with the insane value gracefully in patch 1-2 and clean up BUG() in the code in patch 3-6. Patch 7 adds one more check to see if the ref is a valid shared one. I'm not sure in the real world what may result in this corruption, but I've seen several reports on the ML about __btrfs_free_extent saying something was missing (or simply wrong), while testing this set with btrfs-corrupt-block, I found that switching ref type could end up that situation as well, eg. a data extent's ref type (BTRFS_EXTENT_DATA_REF_KEY) is switched to (BTRFS_TREE_BLOCK_REF_KEY). Hopefully this can give people more sights next time when that happens. [1]:https://www.spinics.net/lists/linux-btrfs/msg65646.html v3: - btrfs_inline_ref_types -> btrfs_inline_ref_type - convert WARN(1) to btrfs_err so that we know which btrfs has that error. v2: - add enum type and return BTRFS_REF_TYPE_INVALID instead of -EINVAL. - remove one more BUG_ON which is in __add_tree_block. - add validation check for shared refs. - improve btrfs_print_leaf to show which refs has something wrong. Liu Bo (7): Btrfs: add a helper to retrive extent inline ref type Btrfs: convert to use btrfs_get_extent_inline_ref_type Btrfs: remove BUG() in btrfs_extent_inline_ref_size Btrfs: remove BUG() in print_extent_item Btrfs: remove BUG() in add_data_reference Btrfs: remove BUG_ON in __add_tree_block Btrfs: add one more sanity check for shared ref type fs/btrfs/backref.c | 11 -- fs/btrfs/ctree.h | 12 ++- fs/btrfs/extent-tree.c | 94 ++ fs/btrfs/print-tree.c | 28 --- fs/btrfs/relocation.c | 30 +--- 5 files changed, 157 insertions(+), 18 deletions(-) -- 2.9.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 2/7] Btrfs: convert to use btrfs_get_extent_inline_ref_type
Since we have a helper which can do sanity check, this converts all btrfs_extent_inline_ref_type to it. Signed-off-by: Liu Bo--- fs/btrfs/backref.c | 11 +-- fs/btrfs/extent-tree.c | 36 ++-- fs/btrfs/relocation.c | 13 +++-- 3 files changed, 50 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c index f723c11..1b3d9df 100644 --- a/fs/btrfs/backref.c +++ b/fs/btrfs/backref.c @@ -1012,7 +1012,11 @@ static int __add_inline_refs(struct btrfs_path *path, u64 bytenr, int type; iref = (struct btrfs_extent_inline_ref *)ptr; - type = btrfs_extent_inline_ref_type(leaf, iref); + type = btrfs_get_extent_inline_ref_type(leaf, iref, + BTRFS_REF_TYPE_ANY); + if (type == BTRFS_REF_TYPE_INVALID) + return -EINVAL; + offset = btrfs_extent_inline_ref_offset(leaf, iref); switch (type) { @@ -1908,7 +1912,10 @@ static int __get_extent_inline_ref(unsigned long *ptr, struct extent_buffer *eb, end = (unsigned long)ei + item_size; *out_eiref = (struct btrfs_extent_inline_ref *)(*ptr); - *out_type = btrfs_extent_inline_ref_type(eb, *out_eiref); + *out_type = btrfs_get_extent_inline_ref_type(eb, *out_eiref, +BTRFS_REF_TYPE_ANY); + if (*out_type == BTRFS_REF_TYPE_INVALID) + return -EINVAL; *ptr += btrfs_extent_inline_ref_size(*out_type); WARN_ON(*ptr > end); diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index c540578..d75129a 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -1454,12 +1454,18 @@ static noinline u32 extent_data_ref_count(struct btrfs_path *path, struct btrfs_extent_data_ref *ref1; struct btrfs_shared_data_ref *ref2; u32 num_refs = 0; + int type; leaf = path->nodes[0]; btrfs_item_key_to_cpu(leaf, , path->slots[0]); if (iref) { - if (btrfs_extent_inline_ref_type(leaf, iref) == - BTRFS_EXTENT_DATA_REF_KEY) { + /* +* If type is invalid, we should have bailed out earlier than +* this call. +*/ + type = btrfs_get_extent_inline_ref_type(leaf, iref, BTRFS_REF_TYPE_DATA); + ASSERT(type != BTRFS_REF_TYPE_INVALID); + if (type == BTRFS_EXTENT_DATA_REF_KEY) { ref1 = (struct btrfs_extent_data_ref *)(>offset); num_refs = btrfs_extent_data_ref_count(leaf, ref1); } else { @@ -1620,6 +1626,7 @@ int lookup_inline_extent_backref(struct btrfs_trans_handle *trans, int ret; int err = 0; bool skinny_metadata = btrfs_fs_incompat(fs_info, SKINNY_METADATA); + int needed; key.objectid = bytenr; key.type = BTRFS_EXTENT_ITEM_KEY; @@ -1711,6 +1718,11 @@ int lookup_inline_extent_backref(struct btrfs_trans_handle *trans, BUG_ON(ptr > end); } + if (owner >= BTRFS_FIRST_FREE_OBJECTID) + needed = BTRFS_REF_TYPE_DATA; + else + needed = BTRFS_REF_TYPE_BLOCK; + err = -ENOENT; while (1) { if (ptr >= end) { @@ -1718,7 +1730,12 @@ int lookup_inline_extent_backref(struct btrfs_trans_handle *trans, break; } iref = (struct btrfs_extent_inline_ref *)ptr; - type = btrfs_extent_inline_ref_type(leaf, iref); + type = btrfs_get_extent_inline_ref_type(leaf, iref, needed); + if (type == BTRFS_REF_TYPE_INVALID) { + err = -EINVAL; + goto out; + } + if (want < type) break; if (want > type) { @@ -1910,7 +1927,12 @@ void update_inline_extent_backref(struct btrfs_fs_info *fs_info, if (extent_op) __run_delayed_extent_op(extent_op, leaf, ei); - type = btrfs_extent_inline_ref_type(leaf, iref); + /* +* If type is invalid, we should have bailed out after +* lookup_inline_extent_backref(). +*/ + type = btrfs_get_extent_inline_ref_type(leaf, iref, BTRFS_REF_TYPE_ANY); + ASSERT(type != BTRFS_REF_TYPE_INVALID); if (type == BTRFS_EXTENT_DATA_REF_KEY) { dref = (struct btrfs_extent_data_ref *)(>offset); @@ -3195,6 +3217,7 @@ static noinline int check_committed_ref(struct btrfs_root *root, struct btrfs_extent_item *ei; struct btrfs_key key; u32 item_size; + int type; int ret; key.objectid = bytenr; @@ -3236,8 +3259,9 @@ static noinline int check_committed_ref(struct btrfs_root *root,
[PATCH v3 6/7] Btrfs: remove BUG_ON in __add_tree_block
The BUG_ON() can be triggered when the caller is processing an invalid extent inline ref, e.g. a shared data ref is offered instead of a extent data ref, such that it tries to find a non-exist tree block and then btrfs_search_slot returns 1 for no such item. This replaces the BUG_ON() with a WARN() followed by calling btrfs_print_leaf() to show more details about what's going on and returning -EINVAL to upper callers. Signed-off-by: Liu Bo--- fs/btrfs/relocation.c | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 53fc798..71d38dd 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -32,6 +32,7 @@ #include "free-space-cache.h" #include "inode-map.h" #include "qgroup.h" +#include "print-tree.h" /* * backref_node, mapping_node and tree_block start with this @@ -3485,7 +3486,16 @@ static int __add_tree_block(struct reloc_control *rc, goto again; } } - BUG_ON(ret); + if (ret) { + ASSERT(ret == 1); + btrfs_print_leaf(rc->extent_root->fs_info, path->nodes[0]); + btrfs_err(fs_info, +"tree block extent item (%llu) is not found in extent tree", +bytenr); + WARN_ON(1); + ret = -EINVAL; + goto out; + } ret = add_tree_block(rc, , path, blocks); out: -- 2.9.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 4/7] Btrfs: remove BUG() in print_extent_item
btrfs_print_leaf() is used in btrfs_get_extent_inline_ref_type, so here we really want to print the invalid value of ref type instead of causing a kernel panic. Signed-off-by: Liu Bo--- fs/btrfs/print-tree.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/print-tree.c b/fs/btrfs/print-tree.c index fcae61e..0e52e47 100644 --- a/fs/btrfs/print-tree.c +++ b/fs/btrfs/print-tree.c @@ -121,7 +121,10 @@ static void print_extent_item(struct extent_buffer *eb, int slot, int type) offset, btrfs_shared_data_ref_count(eb, sref)); break; default: - BUG(); + btrfs_err(eb->fs_info, + "extent %llu has invalid ref type %d\n", + eb->start, type); + return; } ptr += btrfs_extent_inline_ref_size(type); } -- 2.9.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 5/7] Btrfs: remove BUG() in add_data_reference
Now that we have a helper to report invalid value of extent inline ref type, we need to quit gracefully instead of throwing out a kernel panic. Signed-off-by: Liu Bo--- fs/btrfs/relocation.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index f0bef3c..53fc798 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -3774,7 +3774,10 @@ int add_data_references(struct reloc_control *rc, ret = find_data_references(rc, extent_key, eb, dref, blocks); } else { - BUG(); + ret = -EINVAL; + btrfs_err(rc->extent_root->fs_info, +"extent %llu slot %d has an invalid inline ref type", +eb->start, path->slots[0]); } if (ret) { err = ret; -- 2.9.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 1/7] Btrfs: add a helper to retrive extent inline ref type
An invalid value of extent inline ref type may be read from a malicious image which may force btrfs to crash. This adds a helper which does sanity check for the ref type, so we can know if it's sane, return type if so, otherwise return an error. Signed-off-by: Liu Bo--- fs/btrfs/ctree.h | 11 +++ fs/btrfs/extent-tree.c | 37 + 2 files changed, 48 insertions(+) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 3f3eb7b..714e779 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -2561,6 +2561,17 @@ static inline gfp_t btrfs_alloc_write_mask(struct address_space *mapping) /* extent-tree.c */ +enum btrfs_inline_ref_type { + BTRFS_REF_TYPE_INVALID = 0, + BTRFS_REF_TYPE_BLOCK = 1, + BTRFS_REF_TYPE_DATA =2, + BTRFS_REF_TYPE_ANY = 3, +}; + +int btrfs_get_extent_inline_ref_type(struct extent_buffer *eb, +struct btrfs_extent_inline_ref *iref, +enum btrfs_inline_ref_type is_data); + u64 btrfs_csum_bytes_to_leaves(struct btrfs_fs_info *fs_info, u64 csum_bytes); static inline u64 btrfs_calc_trans_metadata_size(struct btrfs_fs_info *fs_info, diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index e3b0b41..c540578 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -1148,6 +1148,43 @@ static int convert_extent_item_v0(struct btrfs_trans_handle *trans, } #endif +/* + * is_data == BTRFS_REF_TYPE_BLOCK, tree block type is required, + * is_data == BTRFS_REF_TYPE_DATA, data type is requried, + * is_data == BTRFS_REF_TYPE_ANY, either type is OK. + */ +int btrfs_get_extent_inline_ref_type(struct extent_buffer *eb, +struct btrfs_extent_inline_ref *iref, +enum btrfs_inline_ref_type is_data) +{ + int type = btrfs_extent_inline_ref_type(eb, iref); + + if (type == BTRFS_TREE_BLOCK_REF_KEY || + type == BTRFS_SHARED_BLOCK_REF_KEY || + type == BTRFS_SHARED_DATA_REF_KEY || + type == BTRFS_EXTENT_DATA_REF_KEY) { + if (is_data == BTRFS_REF_TYPE_BLOCK) { + if (type == BTRFS_TREE_BLOCK_REF_KEY || + type == BTRFS_SHARED_BLOCK_REF_KEY) + return type; + } else if (is_data == BTRFS_REF_TYPE_DATA) { + if (type == BTRFS_EXTENT_DATA_REF_KEY || + type == BTRFS_SHARED_DATA_REF_KEY) + return type; + } else { + ASSERT(is_data == BTRFS_REF_TYPE_ANY); + return type; + } + } + + btrfs_print_leaf(eb->fs_info, eb); + btrfs_err(eb->fs_info, "eb %llu invalid extent inline ref type %d", + eb->start, type); + WARN_ON(1); + + return BTRFS_REF_TYPE_INVALID; +} + static u64 hash_extent_data_ref(u64 root_objectid, u64 owner, u64 offset) { u32 high_crc = ~(u32)0; -- 2.9.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS error (device sda4): failed to read chunk tree: -5
Ok, so since it's clear now that I need that 5 GB device to be present... I found the image file. But how do I get BTRFS to recognize the image as a device? I have zero experience with multi-device systems. Setting it up as a loop device doesn't fix mounting, and wipefs doesn't detect the BTRFS magic number, but printing some of it to console shows it does have real data. Writing the magic number onto it (it's a copy of the original to be safe) shows in dump-super, but all other values are zero. I tried sending the above on my phone earlier but it was detected as a "virus" because it contained HTML. Whoops. On Fri, Aug 18, 2017 at 11:00 AM, Chris Murphywrote: > On Fri, Aug 18, 2017 at 2:47 AM, Zirconium Hacker > wrote: > >> I vaguely remember following this guide at some point: >> http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html >> -- specifically the "Balance cannot run because the filesystem is >> full" part. This may have broken some things? >> > > If you don't do 'btrfs device delete /dev/loop0' or if that command > does not complete, then it's possible to get into the situation you're > in. > > Have you ever mounted this file system with -o degraded? > > I'm going to guess the history is something like: > 1. enospc > 2. btrfs dev add > 3. some kind of filtered balance, which only causes data block groups > to be moved to the 2nd device > 4. 2nd device is physically removed without first 'btrfs dev del' > > Zirco's superblock very clearly says num_devices 2, so I'd expect > normal mount to always fail unless both devices are present. Is there > some weird edge case where Btrfs might permit non-degraded mount when > only data bg's are on a 2nd device? And then trouble only happens > later when a balance is done and it goes looking for these bg's? And > then, boom! > > -- > Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs: change how we decide to commit transactions during flushing
From: Josef BacikNikolay reported that generic/273 was failing currently with ENOSPC. Turns out this is because we get to the point where the outstanding reservations are greater than the pinned space on the fs. This is a mistake, previously we used the current reservation amount in may_commit_transaction, not the entire outstanding reservation amount. Fix this to find the minimum byte size needed to make progress in flushing, and pass that into may_commit_transaction. From there we can make a smarter decision on whether to commit the transaction or not. This fixes the failure in generic/273. Reported-by: Nikolay Borisov Signed-off-by: Josef Bacik --- fs/btrfs/extent-tree.c | 40 +--- 1 file changed, 29 insertions(+), 11 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index fac42b8..4509e47 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -4872,8 +4872,12 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info, return -ENOSPC; spin_lock(_rsv->lock); + if (delayed_rsv->size > bytes) + bytes = 0; + else + bytes -= delayed_rsv->size; if (percpu_counter_compare(_info->total_bytes_pinned, - bytes - delayed_rsv->size) < 0) { + bytes) < 0) { spin_unlock(_rsv->lock); return -ENOSPC; } @@ -4901,7 +4905,7 @@ struct reserve_ticket { */ static void flush_space(struct btrfs_fs_info *fs_info, struct btrfs_space_info *space_info, u64 num_bytes, - int state) + u64 min_bytes, int state) { struct btrfs_root *root = fs_info->extent_root; struct btrfs_trans_handle *trans; @@ -4944,7 +4948,7 @@ static void flush_space(struct btrfs_fs_info *fs_info, break; case COMMIT_TRANS: ret = may_commit_transaction(fs_info, space_info, -num_bytes, 0); +min_bytes, 0); break; default: ret = -ENOSPC; @@ -4959,17 +4963,23 @@ static void flush_space(struct btrfs_fs_info *fs_info, static inline u64 btrfs_calc_reclaim_metadata_size(struct btrfs_fs_info *fs_info, struct btrfs_space_info *space_info, -bool system_chunk) +u64 *min_reclaim, bool system_chunk) { struct reserve_ticket *ticket; u64 used; u64 expected; u64 to_reclaim = 0; - list_for_each_entry(ticket, _info->tickets, list) + list_for_each_entry(ticket, _info->tickets, list) { + if (min_reclaim && *min_reclaim == 0) + *min_reclaim = ticket->bytes; to_reclaim += ticket->bytes; - list_for_each_entry(ticket, _info->priority_tickets, list) + } + list_for_each_entry(ticket, _info->priority_tickets, list) { + if (min_reclaim && *min_reclaim == 0) + *min_reclaim = ticket->bytes; to_reclaim += ticket->bytes; + } if (to_reclaim) return to_reclaim; @@ -4992,6 +5002,8 @@ btrfs_calc_reclaim_metadata_size(struct btrfs_fs_info *fs_info, to_reclaim = 0; to_reclaim = min(to_reclaim, space_info->bytes_may_use + space_info->bytes_reserved); + if (min_reclaim) + *min_reclaim = to_reclaim; return to_reclaim; } @@ -5005,7 +5017,7 @@ static inline int need_do_async_reclaim(struct btrfs_fs_info *fs_info, if ((space_info->bytes_used + space_info->bytes_reserved) >= thresh) return 0; - if (!btrfs_calc_reclaim_metadata_size(fs_info, space_info, + if (!btrfs_calc_reclaim_metadata_size(fs_info, space_info, NULL, system_chunk)) return 0; @@ -5038,13 +5050,14 @@ static void btrfs_async_reclaim_metadata_space(struct work_struct *work) int flush_state; int commit_cycles = 0; u64 last_tickets_id; + u64 min_reclaim = 0; fs_info = container_of(work, struct btrfs_fs_info, async_reclaim_work); space_info = __find_space_info(fs_info, BTRFS_BLOCK_GROUP_METADATA); spin_lock(_info->lock); to_reclaim = btrfs_calc_reclaim_metadata_size(fs_info, space_info, - false); + _reclaim, false); if (!to_reclaim) { space_info->flush = 0; spin_unlock(_info->lock); @@ -5055,15 +5068,18 @@ static void btrfs_async_reclaim_metadata_space(struct work_struct *work)
Re: [PATCH] btrfs: submit superblock io with REQ_META
On Fri, Aug 18, 2017 at 06:21:53PM +0200, David Sterba wrote: > The superblock is also metadata of the filesystem so the relevant IO > should be tagged as such. > Reviewed-by: Liu Bo-liubo > Signed-off-by: David Sterba > --- > fs/btrfs/disk-io.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index 90b967ae46d0..21b55e8caf33 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -3433,9 +3433,11 @@ static int write_dev_supers(struct btrfs_device > *device, >*/ > if (i == 0) { > ret = btrfsic_submit_bh(REQ_OP_WRITE, > - REQ_SYNC | REQ_FUA, bh); > + REQ_SYNC | REQ_FUA | REQ_META, > + bh); > } else { > - ret = btrfsic_submit_bh(REQ_OP_WRITE, REQ_SYNC, bh); > + ret = btrfsic_submit_bh(REQ_OP_WRITE, > + REQ_SYNC | REQ_META, bh); > } > if (ret) > errors++; > -- > 2.14.0 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Btrfs: do not async submit for nodatacsum inodes
While we submit direct writes, if the inode is flagged with nodatasum, there's no benefit to submit asynchronously, because a) we don't have to calculate checksum across processors, b) and direct IO has started a plug, but async submit makes us queue IO on each device's scheduled IO list instead of DIO's plug list, so that IOs get much less merges in general. Lets use sync submit for nodatasum inodes. Signed-off-by: Liu Bo--- fs/btrfs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 95c2120..f4a48d8 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8467,7 +8467,7 @@ static inline int __btrfs_submit_dio_bio(struct bio *bio, struct inode *inode, goto err; } map: - ret = btrfs_map_bio(fs_info, bio, 0, async_submit); + ret = btrfs_map_bio(fs_info, bio, 0, 0); err: bio_put(bio); return ret; -- 2.9.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Btrfs: remove unnecessary skip_sum
Now that we can get inode from dip->inode, it's not necessary to pass skip_sum from callers. Signed-off-by: Liu Bo--- fs/btrfs/inode.c | 18 ++ 1 file changed, 6 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index f4a48d8..91e701c 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8424,8 +8424,7 @@ static inline blk_status_t btrfs_lookup_and_bind_dio_csum(struct inode *inode, } static inline int __btrfs_submit_dio_bio(struct bio *bio, struct inode *inode, -u64 file_offset, int skip_sum, -int async_submit) +u64 file_offset, int async_submit) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_dio_private *dip = bio->bi_private; @@ -8443,7 +8442,7 @@ static inline int __btrfs_submit_dio_bio(struct bio *bio, struct inode *inode, goto err; } - if (skip_sum) + if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM) goto map; if (write && async_submit) { @@ -8473,8 +8472,7 @@ static inline int __btrfs_submit_dio_bio(struct bio *bio, struct inode *inode, return ret; } -static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip, - int skip_sum) +static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip) { struct inode *inode = dip->inode; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); @@ -8537,7 +8535,7 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip, */ atomic_inc(>pending_bios); - ret = __btrfs_submit_dio_bio(bio, inode, file_offset, skip_sum, + ret = __btrfs_submit_dio_bio(bio, inode, file_offset, async_submit); if (ret) { bio_put(bio); @@ -8557,8 +8555,7 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip, } while (submit_len > 0); submit: - ret = __btrfs_submit_dio_bio(bio, inode, file_offset, skip_sum, -async_submit); + ret = __btrfs_submit_dio_bio(bio, inode, file_offset, async_submit); if (!ret) return 0; @@ -8583,12 +8580,9 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, struct btrfs_dio_private *dip = NULL; struct bio *bio = NULL; struct btrfs_io_bio *io_bio; - int skip_sum; bool write = (bio_op(dio_bio) == REQ_OP_WRITE); int ret = 0; - skip_sum = BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM; - bio = btrfs_bio_clone(dio_bio); dip = kzalloc(sizeof(*dip), GFP_NOFS); @@ -8631,7 +8625,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode, dio_data->unsubmitted_oe_range_end; } - ret = btrfs_submit_direct_hook(dip, skip_sum); + ret = btrfs_submit_direct_hook(dip); if (!ret) return; -- 2.9.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Btrfs: make plug in writing meta blocks really work
We have started plug in btrfs_write_and_wait_marked_extents() but the generated IOs actually go to device's schedule IO list where the work is doing in another task, thus the started plug doesn't make any sense. And since we wait for IOs immediately after writing meta blocks, it's the same case as writing log tree, doing sync submit can merge more IOs. Signed-off-by: Liu Bo--- fs/btrfs/disk-io.c | 6 -- fs/btrfs/transaction.c | 2 ++ 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 080e2eb..8d097ba 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1005,8 +1005,10 @@ static blk_status_t __btree_submit_bio_done(void *private_data, struct bio *bio, return ret; } -static int check_async_write(unsigned long bio_flags) +static int check_async_write(struct btrfs_inode *bi, unsigned long bio_flags) { + if (atomic_read(>sync_writers)) + return 0; if (bio_flags & EXTENT_BIO_TREE_LOG) return 0; #ifdef CONFIG_X86 @@ -1022,7 +1024,7 @@ static blk_status_t btree_submit_bio_hook(void *private_data, struct bio *bio, { struct inode *inode = private_data; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); - int async = check_async_write(bio_flags); + int async = check_async_write(BTRFS_I(inode), bio_flags); blk_status_t ret; if (bio_op(bio) != REQ_OP_WRITE) { diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index f615d59..9c5f126 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -950,6 +950,7 @@ int btrfs_write_marked_extents(struct btrfs_fs_info *fs_info, u64 start = 0; u64 end; + atomic_inc(_I(fs_info->btree_inode)->sync_writers); while (!find_first_extent_bit(dirty_pages, start, , , mark, _state)) { bool wait_writeback = false; @@ -985,6 +986,7 @@ int btrfs_write_marked_extents(struct btrfs_fs_info *fs_info, cond_resched(); start = end + 1; } + atomic_dec(_I(fs_info->btree_inode)->sync_writers); return werr; } -- 2.9.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: qcow2 images make scrub believe the filesystem is corrupted.
HI all.. So this sucks, having to check every single disk image so it wont get corrupted stuff, while it works on other filesystems isn't exactly stellar. i changed the qcow2 settings per this page: https://pve.proxmox.com/wiki/Performance_Tweaks i went with cache=writeback since it neither uses O_DSYNC nor O_DIRECT semantics. I understand the reasons checksumming might fail and that isn't exactly btrfs fault but i scoured the entire btrfs wiki and couldnt find anything warning about btrfs + qcow2 (or other image types) in the wiki pages. Maybe adding some warning would help unlucky ppl like myself? Also, is is safe to enable compress=lzo , or is it also a no no, im also starting to suspect that discard wasnt the culprit since i have it enabled for more then a year and the only problem i got with corruption was precisely this images? | Paulo Dias | paulo.miguel.d...@gmail.com Tempora mutantur, nos et mutamur in illis. On Fri, Aug 18, 2017 at 2:59 PM, Liu Bowrote: > On Fri, Aug 18, 2017 at 06:23:18PM +0200, Goffredo Baroncelli wrote: >> On 08/18/2017 01:39 AM, Josef Bacik wrote: >> [...] >> > This is happening because the app (the guest OS in this case, we saw this >> > a lot >> > with windows guests) is changing the pages while they are in flight. We >> > calculate the checksum of the page before it's written, so if it changes >> > while >> > in flight we'll end up with a csum mismatch. >> > >> > To fix this change kvm to not use O_DIRECT or set NODATASUM on your qcow2 >> > image. >> > You'll have to re-create the image because NODATASUM won't apply to the >> > already >> > invalid checksums. Thanks, >> >> Hi Josef, >> >> could you elaborate: do you are saying that using O_DIRECT is incompatible >> with DATASUM ? >> > > They're compatible, but applications need to be careful. > > O_DIRECT takes userspace page, and it works like > DIO write > p = get_user_page(); > add p to bio > #btrfs submits this bio > calc_checksum(bio); > submit_bio(); > > There's a chance that page p got changed between calc_checksum() and > submit_bio(), which then causes the mismatch. > > For buffered IO, dirty page cache pages is synchronized with page > fault by page lock and page writeback bit. > > thanks, > -liubo -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: qcow2 images make scrub believe the filesystem is corrupted.
On Fri, Aug 18, 2017 at 06:23:18PM +0200, Goffredo Baroncelli wrote: > On 08/18/2017 01:39 AM, Josef Bacik wrote: > [...] > > This is happening because the app (the guest OS in this case, we saw this a > > lot > > with windows guests) is changing the pages while they are in flight. We > > calculate the checksum of the page before it's written, so if it changes > > while > > in flight we'll end up with a csum mismatch. > > > > To fix this change kvm to not use O_DIRECT or set NODATASUM on your qcow2 > > image. > > You'll have to re-create the image because NODATASUM won't apply to the > > already > > invalid checksums. Thanks, > > Hi Josef, > > could you elaborate: do you are saying that using O_DIRECT is incompatible > with DATASUM ? > They're compatible, but applications need to be careful. O_DIRECT takes userspace page, and it works like DIO write p = get_user_page(); add p to bio #btrfs submits this bio calc_checksum(bio); submit_bio(); There's a chance that page p got changed between calc_checksum() and submit_bio(), which then causes the mismatch. For buffered IO, dirty page cache pages is synchronized with page fault by page lock and page writeback bit. thanks, -liubo -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: qcow2 images make scrub believe the filesystem is corrupted.
On Fri, Aug 18, 2017 at 06:23:18PM +0200, Goffredo Baroncelli wrote: > On 08/18/2017 01:39 AM, Josef Bacik wrote: > [...] > > This is happening because the app (the guest OS in this case, we saw this a > > lot > > with windows guests) is changing the pages while they are in flight. We > > calculate the checksum of the page before it's written, so if it changes > > while > > in flight we'll end up with a csum mismatch. > > > > To fix this change kvm to not use O_DIRECT or set NODATASUM on your qcow2 > > image. > > You'll have to re-create the image because NODATASUM won't apply to the > > already > > invalid checksums. Thanks, > > Hi Josef, > > could you elaborate: do you are saying that using O_DIRECT is incompatible > with DATASUM ? > No, I'm saying using O_DIRECT with applications that don't protect in-flight memory are incompatible with DATASUM. We have no way of making sure nobody touches the page while we're writing it out, so after we calculate the checksum any changes to the page are going to cause a checksum mismatch. O_DIRECT are user space pages, there's nothing we can do to stop user space from doing stupid things. The options I looked into before were things like detecting the page had changed since we calculated the checksum, and re-submitting the write. This punishes applications that do the right thing (databases for example) by forcing us to calculate checksums twice. This is a shit situation because users aren't going to understand this limitation, and it bites them in the ass with all these weird errors. I think maybe we need to go back to the double-checksum thing by default, and have a flag or something for users to set if they know their application behaves properly. Thanks, Josef -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: qcow2 images make scrub believe the filesystem is corrupted.
On 08/18/2017 01:39 AM, Josef Bacik wrote: [...] > This is happening because the app (the guest OS in this case, we saw this a > lot > with windows guests) is changing the pages while they are in flight. We > calculate the checksum of the page before it's written, so if it changes while > in flight we'll end up with a csum mismatch. > > To fix this change kvm to not use O_DIRECT or set NODATASUM on your qcow2 > image. > You'll have to re-create the image because NODATASUM won't apply to the > already > invalid checksums. Thanks, Hi Josef, could you elaborate: do you are saying that using O_DIRECT is incompatible with DATASUM ? Let me know BR G.Baroncelli > > Josef > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- gpg @keyserver.linux.it: Goffredo Baroncelli Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs: submit superblock io with REQ_META
The superblock is also metadata of the filesystem so the relevant IO should be tagged as such. Signed-off-by: David Sterba--- fs/btrfs/disk-io.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 90b967ae46d0..21b55e8caf33 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3433,9 +3433,11 @@ static int write_dev_supers(struct btrfs_device *device, */ if (i == 0) { ret = btrfsic_submit_bh(REQ_OP_WRITE, - REQ_SYNC | REQ_FUA, bh); + REQ_SYNC | REQ_FUA | REQ_META, + bh); } else { - ret = btrfsic_submit_bh(REQ_OP_WRITE, REQ_SYNC, bh); + ret = btrfsic_submit_bh(REQ_OP_WRITE, + REQ_SYNC | REQ_META, bh); } if (ret) errors++; -- 2.14.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH preview] btrfs: allow to set compression level for zlib
On Sat, Aug 05, 2017 at 02:15:42AM +, Nick Terrell wrote: > Looking at the memory usage of BtrFS zstd, the 128 KB window size keeps the > memory usage very reasonable up to level 19. The zstd compression levels > are computed using a tool that selects the parameters that give the best > compression ratio for a given compression speed target. Since BtrFS has a > fixed window size, the default compression levels might not be optimal. We > could compute our own compression levels for a 128 KB window size. > > | Level | Memory | > |---|| > | 1 | 0.8 MB | > | 2 | 1.0 MB | > | 3 | 1.3 MB | > | 4 | 0.9 MB | > | 5 | 1.4 MB | > | 6 | 1.5 MB | > | 7 | 1.4 MB | > | 8 | 1.8 MB | > | 9 | 1.8 MB | > | 10| 1.8 MB | > | 11| 1.8 MB | > | 12| 1.8 MB | > | 13| 2.4 MB | > | 14| 2.6 MB | > | 15| 2.6 MB | > | 16| 3.1 MB | > | 17| 3.1 MB | > | 18| 3.1 MB | > | 19| 3.1 MB | > > The workspace memory usage for each compression level. That's quite a lot, in kernel. IIRC zlib and lzo use less than 200kb, zstd wants 800kb for level 1. And this needs to be contiguous memory, so if we're lucky and get the memory at the mount time, fine. In general the memory can be fragmented (in the worst case, there are only 4k chunks available), so we'd have to vmalloc and consume the virtual mappings in great numbers. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH preview] btrfs: allow to set compression level for zlib
On Fri, Aug 04, 2017 at 09:51:44PM +, Nick Terrell wrote: > > + * @type_level is encoded algorithm and level, where level 0 means whatever > > + * default the algorithm chooses and is opaque here; > > + * - compression algo are 0-3 > > + * - the level are bits 4-7 > > zstd has 19 levels, but we can either only allow the first 15 + default, or > provide a mapping from zstd-level to BtrFS zstd-level. 19 levels sounds too much to me, I hoped that 15 should be enough for everybody. When I teste various zlib level, there were only small compression gains at a high runtime cost from levels 6-9. So some kind of mapping would be desirable if the levels 16+ prove to be better than < 15 under the btrfs contraints. > > * @out_pages is an in/out parameter, holds maximum number of pages to > > allocate > > * and returns number of actually allocated pages > > * > > @@ -880,7 +885,7 @@ static void free_workspaces(void) > > * @max_out tells us the max number of bytes that we're allowed to > > * stuff into pages > > */ > > -int btrfs_compress_pages(int type, struct address_space *mapping, > > +int btrfs_compress_pages(unsigned int type_level, struct address_space > > *mapping, > > u64 start, struct page **pages, > > unsigned long *out_pages, > > unsigned long *total_in, > > @@ -888,9 +893,11 @@ int btrfs_compress_pages(int type, struct > > address_space *mapping, > > { > > struct list_head *workspace; > > int ret; > > + int type = type_level & 0xF; > > > > workspace = find_workspace(type); > > > > + btrfs_compress_op[type - 1]->set_level(workspace, type_level); > > zlib uses the same amount of memory independently of the compression level, > but zstd uses a different amount of memory for each level. We could extend the code to provide 2 types of workspaces, one to cover the "fast" levels and one for the "high compression" levels. And the larger one can be used for 'fast' so it does not just idle around. > zstd will have > to allocate memory here if it doesn't have enough (or has way to much), > will that be okay? This would be a problem. Imagine that the system is short on free memory, starts to flush dirty data and now some filesystem starts asking for hundreds of kilobytes of new memory just to write the data (and free resources and the memory in turn). That's the case we need to preallocate when there's enough memory, at least one workspace, so there's a guarantee of forward progress. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 3/4] btrfs: Add zstd support
On Wed, Jul 26, 2017 at 12:19:29AM +0100, Giovanni Cabiddu wrote: > Hi Nick, > > On Thu, Jul 20, 2017 at 10:27:42PM +0100, Nick Terrell wrote: > > Add zstd compression and decompression support to BtrFS. zstd at its > > fastest level compresses almost as well as zlib, while offering much > > faster compression and decompression, approaching lzo speeds. > Can we look at integrating the zstd implementation below the acomp API > available in the crypto subsystem? > (https://github.com/torvalds/linux/blob/master/crypto/acompress.c) > Acomp was designed to provide a generic and uniform API for compression > in the kernel which hides algorithm specific details to frameworks. > In future it would be nice to see btrfs using exclusively acomp > for compression. This way when a new compression algorithm is exposed > through acomp, it will be available immediately in btrfs. The compression algorithm is considered part of the filesystem format and must be covered by an incompatibility bit. Regarding the acomp API, I don't see much point adding the extra abstraction layer for the two currently supported algorithms. We only call 2 functions, for compression and decompression, no need to allocate the acomp helper structures and call the functions indirectly. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RESEND PATCH] btrfs-progs: Specify C standard to gnu90 explicitly
On Tue, Aug 15, 2017 at 09:17:12AM +0900, Qu Wenruo wrote: > Different C compilers have different default language standard. > This sometimes causes problem on different system. > > For distribution like CentOS/RHEL7, its gcc is still 4.8 and will report > error for c90 style declaration, while most developers are using newer > gcc which will just ignore it. > This makes us hard to detect such language standard problem. > > This patch will specify standard to gnu90 explicitly to avoid such problem. > Gnu90 is a good mix of c90 while provide a lot of useful gnu extension, > and is supported by all modern gcc and clang. > > Reported-by: Marco Lorenzo Crociani> Signed-off-by: Qu Wenruo Applied, thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] btrfs: remove superfluous chunk_tree argument from btrfs_alloc_dev_extent
Currently this function is always called with the object id of the root key of the chunk_tree, which is always BTRFS_CHUNK_TREE_OBJECTID. So let's subsume it straight into the function itself. No functional change. Signed-off-by: Nikolay Borisov--- fs/btrfs/volumes.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 63608c5f4487..d024f1b07282 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1571,8 +1571,7 @@ static int btrfs_free_dev_extent(struct btrfs_trans_handle *trans, static int btrfs_alloc_dev_extent(struct btrfs_trans_handle *trans, struct btrfs_device *device, - u64 chunk_tree, u64 chunk_offset, u64 start, - u64 num_bytes) + u64 chunk_offset, u64 start, u64 num_bytes) { int ret; struct btrfs_path *path; @@ -1599,7 +1598,8 @@ static int btrfs_alloc_dev_extent(struct btrfs_trans_handle *trans, leaf = path->nodes[0]; extent = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_dev_extent); - btrfs_set_dev_extent_chunk_tree(leaf, extent, chunk_tree); + btrfs_set_dev_extent_chunk_tree(leaf, extent, + BTRFS_CHUNK_TREE_OBJECTID); btrfs_set_dev_extent_chunk_objectid(leaf, extent, BTRFS_FIRST_CHUNK_TREE_OBJECTID); btrfs_set_dev_extent_chunk_offset(leaf, extent, chunk_offset); @@ -4903,10 +4903,8 @@ int btrfs_finish_chunk_alloc(struct btrfs_trans_handle *trans, ret = btrfs_update_device(trans, device); if (ret) break; - ret = btrfs_alloc_dev_extent(trans, device, -chunk_root->root_key.objectid, -chunk_offset, dev_offset, -stripe_size); + ret = btrfs_alloc_dev_extent(trans, device, chunk_offset, +dev_offset, stripe_size); if (ret) break; } -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS error (device sda4): failed to read chunk tree: -5
On Fri, Aug 18, 2017 at 2:47 AM, Zirconium Hackerwrote: > I vaguely remember following this guide at some point: > http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html > -- specifically the "Balance cannot run because the filesystem is > full" part. This may have broken some things? > If you don't do 'btrfs device delete /dev/loop0' or if that command does not complete, then it's possible to get into the situation you're in. Have you ever mounted this file system with -o degraded? I'm going to guess the history is something like: 1. enospc 2. btrfs dev add 3. some kind of filtered balance, which only causes data block groups to be moved to the 2nd device 4. 2nd device is physically removed without first 'btrfs dev del' Zirco's superblock very clearly says num_devices 2, so I'd expect normal mount to always fail unless both devices are present. Is there some weird edge case where Btrfs might permit non-degraded mount when only data bg's are on a 2nd device? And then trouble only happens later when a balance is done and it goes looking for these bg's? And then, boom! -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] btrfs: Remove chunk_objectid parameter of btrfs_alloc_dev_extent
THe function is always called with chunk_objectid set to BTRFS_FIRST_CHUNK_TREE_OBJECTID. Let's collapse the parameter in the function itself. No functional changes Signed-off-by: Nikolay Borisov--- fs/btrfs/volumes.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index a37a31ba6843..63608c5f4487 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1571,8 +1571,8 @@ static int btrfs_free_dev_extent(struct btrfs_trans_handle *trans, static int btrfs_alloc_dev_extent(struct btrfs_trans_handle *trans, struct btrfs_device *device, - u64 chunk_tree, u64 chunk_objectid, - u64 chunk_offset, u64 start, u64 num_bytes) + u64 chunk_tree, u64 chunk_offset, u64 start, + u64 num_bytes) { int ret; struct btrfs_path *path; @@ -1600,7 +1600,8 @@ static int btrfs_alloc_dev_extent(struct btrfs_trans_handle *trans, extent = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_dev_extent); btrfs_set_dev_extent_chunk_tree(leaf, extent, chunk_tree); - btrfs_set_dev_extent_chunk_objectid(leaf, extent, chunk_objectid); + btrfs_set_dev_extent_chunk_objectid(leaf, extent, + BTRFS_FIRST_CHUNK_TREE_OBJECTID); btrfs_set_dev_extent_chunk_offset(leaf, extent, chunk_offset); btrfs_set_dev_extent_length(leaf, extent, num_bytes); @@ -4904,7 +4905,6 @@ int btrfs_finish_chunk_alloc(struct btrfs_trans_handle *trans, break; ret = btrfs_alloc_dev_extent(trans, device, chunk_root->root_key.objectid, -BTRFS_FIRST_CHUNK_TREE_OBJECTID, chunk_offset, dev_offset, stripe_size); if (ret) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] btrfs: Simplify btrfs_alloc_dev_extent
On Fri, Jul 28, 2017 at 08:59:12AM +0300, Nikolay Borisov wrote: > > > On 27.07.2017 20:57, Filipe Manana wrote: > > On Thu, Jul 27, 2017 at 6:36 PM, Nikolay Borisovwrote: > >> Currently btrfs_alloc_dev_extent essentially open codes btrfs_insert_item. > >> So > >> let's remove the superfluous code, leaving only the important bits, namely > >> initialising the device extent and just calling btrfs_insert_item. So > >> first add > >> definition for the stack-based set/get function. And then use them. > >> Additionally, remove the code which sets the uuid of the block header, > >> since > >> this is something which is already handled by the core btree code. > > > > Quite honestly, I don't see the value of this change at all. > > It doesn't make things simpler nor more readable nor nothing. > > We have many, really many places using btrfs_insert_empty_item() > > instead of calling btrfs_insert_item(), are you planning on sending a > > patch to do the replacement for each of them? What's the point? > > I beg you to differ. Some of the code in btrfs is a mess, it's working > but it's messy. There is constant violation of abstractions (as is the > case in this function, heck the uuid setting of the block header > function doesn't even belong here). All of this hampers reading > comprehension of the code for newcomers. You are experienced in the code > and likely this doesn't apply to you but since I'm someone relatively > new to the code this has been my experience. And I believe any effort to > actually simplify the code, make it more coherent and succinct is a win > long-term. > > I will wait for other feedback, if people feel patches like that are > just bikeshedding then I will refrain from such cleanups in the future. What I don't like about this patch, also pointed out by Filipe, is the additional memory allocation. Just for the sake of code beauty, this is not the direction the cleanups should go. The function is IMHO readable and below my current threshold of cleanup need. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] btrfs: remove superfluous chunk_tree argument from btrfs_alloc_dev_extent
On Thu, Jul 27, 2017 at 08:36:36PM +0300, Nikolay Borisov wrote: > Currently this function is always called with the object id of the root key of > the chunk_tree, which is always BTRFS_CHUNK_TREE_OBJECTID. So let's subsume it > straight into the function itself. No functional change. > > Signed-off-by: Nikolay Borisov> --- > fs/btrfs/volumes.c | 12 +--- > 1 file changed, 5 insertions(+), 7 deletions(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 1c0a5fff0e2f..5a1913956f20 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -1578,8 +1578,7 @@ static int btrfs_free_dev_extent(struct > btrfs_trans_handle *trans, > > static int btrfs_alloc_dev_extent(struct btrfs_trans_handle *trans, > struct btrfs_device *device, > - u64 chunk_tree, u64 chunk_offset, u64 start, > - u64 num_bytes) > + u64 chunk_offset, u64 start, u64 num_bytes) This hunk does not apply, the context in git contains u64 chunk_objectid and I haven't found any patch that would remove or rename it. Please refresh and resend. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs: Remove redundant setting of uuid in btrfs_block_header.
On Fri, Jul 28, 2017 at 10:50:14AM +0300, Nikolay Borisov wrote: > btrfs_alloc_dev_extent currently unconditionally sets the uuid in the leaf > block > header the function is working with. This is unnecessary since this operation > is peformed by the core btree handling code (splitting a node, allocating a > new > btree block etc). So let's remove it. > > Signed-off-by: Nikolay BorisovReviewed-by: David Sterba -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Next btrfs development cycle open - 4.15
Hi, a friendly reminder of the timetable and what's expected at this phase. 4.12 - current 4.13 - upcoming, urgent regression fixes only 4.14 - development closed, pull request in prep, fixes or regressions only 4.15 - development open, until 4.14-rc5 (https://btrfs.wiki.kernel.org/index.php/Developer%27s_FAQ#Development_schedule) Besides the the usual cleanups and fixes, you can now start sending any patches that could be more intrusive and would benefit from a longer period of testing, or development revisions. The base of the patches should be the last announced pull request, which is going to be named 'for-4.14-part1' in my k.org tree. Reviewed patches will be collected in a branch that's usually named 'misc-next' in my devel git repos and is part of the for-next at k.org git repo. k.org: https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git devel1: http://repo.or.cz/linux-2.6/btrfs-unstable.git devel2: https://github.com/kdave/btrfs-devel d. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs: use btrfsic_submit_bio instead of submit_bio in write_dev_flush
On Fri, Aug 18, 2017 at 04:38:07PM +0800, Lu Fengqi wrote: > Although this bio has no data attached, it will reach this condition > (bio->bi_opf & REQ_PREFLUSH) and then update the flush_gen of dev_state > in __btrfsic_submit_bio. So we should still submit it through integrity > checker. Otherwise, the integrity checker will throw the following warning > when I mount a newly created btrfs filesystem. Right, thanks. Reviewed-by: David Sterba-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS error (device sda4): failed to read chunk tree: -5
On 2017年08月18日 18:20, Zirconium Hacker wrote: # ./btrfs-debug-tree -b 131072 /dev/sda4 https://pastebin.com/TDa0GuqB At least this output explains everything. ( although the result may not make you happy ) Check this: item 59 key (FIRST_CHUNK_TREE CHUNK_ITEM 62351474688) itemoff 11447 itemsize 80 length 1073741824 owner 2 stripe_len 65536 type DATA io_align 65536 io_width 65536 sector_size 4096 num_stripes 1 sub_stripes 1 stripe 0 devid 2 offset 1074790400 dev_uuid 039224e8-dd3a-4e95-af1d-660a2021ac55 This means that, for logical address in range [62351474688, 62351474688+1073741824), all these very important metadata are in your *2nd* device!! ( And 4G data is also on that device) That's why all tree backup roots and tree roots read fails. Because there is no such device for btrfs to read, and that's why all we get is 0. And considering all your tree root and backup roots are in this range, that's to say, without finding your 2nd device (whose dev uuid is 039224e8-dd3a-4e95-af1d-660a2021ac55) you lost all your possibility to recovery the fs. (This reminds me to enhance btrfs kernel message about missing device) Since both super block and dump-tree result point to a missing device, I really recommend you to double check the history of the fs. ( Maybe you added a usb device to do balance but forget to run "btrfs device remove"? ) BTW, to check the dev uuid, you could use the following command to get dev_uuid: # btrfs inspect dump-super | grep dev_item.uuid Good luck. Thanks, Qu # ./btrfs-debug-tree -b 61809344512 /dev/sda4 btrfs-progs v4.12-dirty bytenr mismatch, want=61809344512, have=0 Couldn't read tree root bytenr mismatch, want=61809344512, have=0 ERROR: failed to read 61809344512 # ./btrfs-debug-tree -b 61807755264 /dev/sda4 btrfs-progs v4.12-dirty bytenr mismatch, want=61809344512, have=0 Couldn't read tree root bytenr mismatch, want=61807755264, have=0 ERROR: failed to read 61807755264 And that last one you wanted me to run debug-tree on was a duplicate. Bonus: # ./btrfs-debug-tree -b 108544 /dev/sda4 btrfs-progs v4.12-dirty bytenr mismatch, want=61809344512, have=0 Couldn't read tree root node 108544 level 1 items 2 free 491 generation 325709 owner 1 fs uuid 29889b3a-1c10-48e4-ad6d-21d03d06e90b chunk uuid 33f664ec-d0bc-42f9-87f1-d2c05046 key (EXTENT_TREE ROOT_ITEM 0) block 1085456384 (66251) gen 325709 key (286 INODE_ITEM 0) block 1085505536 (66254) gen 325709 BTW, thank you for your quick responses and help so far. On Fri, Aug 18, 2017 at 5:46 AM, Qu Wenruowrote: Would you please try this patch? https://patchwork.kernel.org/patch/9908173/ This should allow btrfs-debug-tree to output tree block even tree root is corrupted. You could apply it on lasted master branch (tagged as v4.12). Then re-execute the following command (with patched btrfs-progs): # btrfs-debug-tree -b 131072 /dev/sda4 And some new output: # btrfs-debug-tree -b 61809344512 /dev/sda4 # btrfs-debug-tree -b 61807755264 /dev/sda4 # btrfs-debug-tree -b 61809344512 /dev/sda4 Thanks, Qu On 2017年08月18日 17:29, Zirconium Hacker wrote: $ sudo btrfs check -r 108544 /dev/sda4 parent transid verify failed on 108544 wanted 325966 found 325709 parent transid verify failed on 108544 wanted 325966 found 325709 Ignoring transid failure bytenr mismatch, want=61352312832, have=0 Couldn't setup device tree ERROR: cannot open file system On Fri, Aug 18, 2017 at 5:19 AM, Qu Wenruo wrote: On 2017年08月18日 17:08, Zirconium Hacker wrote: I already ran that earlier, here's the pastebin: https://pastebin.com/KGB8nVRA Running debug-tree on all 1084 of them (I guess that was unnecessary) gave the same errors every time: bytenr mismatch, want=61809344512, have=0 Couldn't read tree root ERROR: unable to open /dev/sda4 Then try using btrfs check with new root: # btrfs check -r 108544 /dev/sda4 Please note that, the generation in superblock differs quite a lot with find-root result. So I'm afraid it will cause quite a lot of problems. But least, it should help btrfs check to get over "Couldn't read tree root" error message. And for btrfs-debug-tree error, I'll submit a patch soon to allow it to be run on such heavily damaged fs. Thanks, Qu On Fri, Aug 18, 2017 at 5:03 AM, Qu Wenruo wrote: On 2017年08月18日 16:47, Zirconium Hacker wrote: $ sudo btrfs-debug-tree -b 131072 /dev/sda4 btrfs-progs v4.12 bytenr mismatch, want=61809344512, have=0 Couldn't read tree root ERROR: unable to open /dev/sda4 I think this can be improved for case like this. I'll try to submit a patch to enhance btrfs-debug-tree. Would you please try "btrfs-find-root /dev/sda4"? This will try to locate on-disk old tree root, and if we're lucky, old tree root can allow us to mount the fs. Mounting with degraded,ro does not fix the multi-device issue. The system was
Re: BTRFS error (device sda4): failed to read chunk tree: -5
# ./btrfs-debug-tree -b 131072 /dev/sda4 https://pastebin.com/TDa0GuqB # ./btrfs-debug-tree -b 61809344512 /dev/sda4 btrfs-progs v4.12-dirty bytenr mismatch, want=61809344512, have=0 Couldn't read tree root bytenr mismatch, want=61809344512, have=0 ERROR: failed to read 61809344512 # ./btrfs-debug-tree -b 61807755264 /dev/sda4 btrfs-progs v4.12-dirty bytenr mismatch, want=61809344512, have=0 Couldn't read tree root bytenr mismatch, want=61807755264, have=0 ERROR: failed to read 61807755264 And that last one you wanted me to run debug-tree on was a duplicate. Bonus: # ./btrfs-debug-tree -b 108544 /dev/sda4 btrfs-progs v4.12-dirty bytenr mismatch, want=61809344512, have=0 Couldn't read tree root node 108544 level 1 items 2 free 491 generation 325709 owner 1 fs uuid 29889b3a-1c10-48e4-ad6d-21d03d06e90b chunk uuid 33f664ec-d0bc-42f9-87f1-d2c05046 key (EXTENT_TREE ROOT_ITEM 0) block 1085456384 (66251) gen 325709 key (286 INODE_ITEM 0) block 1085505536 (66254) gen 325709 BTW, thank you for your quick responses and help so far. On Fri, Aug 18, 2017 at 5:46 AM, Qu Wenruowrote: > Would you please try this patch? > https://patchwork.kernel.org/patch/9908173/ > > This should allow btrfs-debug-tree to output tree block even tree root is > corrupted. > You could apply it on lasted master branch (tagged as v4.12). > > Then re-execute the following command (with patched btrfs-progs): > # btrfs-debug-tree -b 131072 /dev/sda4 > > And some new output: > # btrfs-debug-tree -b 61809344512 /dev/sda4 > # btrfs-debug-tree -b 61807755264 /dev/sda4 > # btrfs-debug-tree -b 61809344512 /dev/sda4 > > Thanks, > Qu > > > On 2017年08月18日 17:29, Zirconium Hacker wrote: >> >> $ sudo btrfs check -r 108544 /dev/sda4 >> parent transid verify failed on 108544 wanted 325966 found 325709 >> parent transid verify failed on 108544 wanted 325966 found 325709 >> Ignoring transid failure >> bytenr mismatch, want=61352312832, have=0 >> Couldn't setup device tree >> ERROR: cannot open file system >> >> On Fri, Aug 18, 2017 at 5:19 AM, Qu Wenruo wrote: >>> >>> >>> >>> On 2017年08月18日 17:08, Zirconium Hacker wrote: I already ran that earlier, here's the pastebin: https://pastebin.com/KGB8nVRA Running debug-tree on all 1084 of them (I guess that was unnecessary) gave the same errors every time: bytenr mismatch, want=61809344512, have=0 Couldn't read tree root ERROR: unable to open /dev/sda4 >>> >>> Then try using btrfs check with new root: >>> >>> # btrfs check -r 108544 /dev/sda4 >>> >>> Please note that, the generation in superblock differs quite a lot with >>> find-root result. >>> So I'm afraid it will cause quite a lot of problems. >>> >>> But least, it should help btrfs check to get over "Couldn't read tree >>> root" >>> error message. >>> >>> And for btrfs-debug-tree error, I'll submit a patch soon to allow it to >>> be >>> run on such heavily damaged fs. >>> >>> >>> Thanks, >>> Qu >>> On Fri, Aug 18, 2017 at 5:03 AM, Qu Wenruo wrote: > > > > > On 2017年08月18日 16:47, Zirconium Hacker wrote: >> >> >> >> $ sudo btrfs-debug-tree -b 131072 /dev/sda4 >> btrfs-progs v4.12 >> bytenr mismatch, want=61809344512, have=0 >> Couldn't read tree root >> ERROR: unable to open /dev/sda4 > > > > > I think this can be improved for case like this. > I'll try to submit a patch to enhance btrfs-debug-tree. > > Would you please try "btrfs-find-root /dev/sda4"? > This will try to locate on-disk old tree root, and if we're lucky, old > tree > root can allow us to mount the fs. > >> >> Mounting with degraded,ro does not fix the multi-device issue. The >> system was never really intended to have a second device, though: > > > > > Wait for a minute, did you mean this btrfs doesn't ever have a second > device? > This seems quite weird now. > >> >> $ sudo btrfs fi show /dev/sda4 >> bytenr mismatch, want=61809344512, have=0 >> Couldn't read tree root >> Label: none uuid: 29889b3a-1c10-48e4-ad6d-21d03d06e90b >> Total devices 2 FS bytes used 49.52GiB >> devid1 size 54.07GiB used 54.07GiB path /dev/sda4 >> *** Some devices missing >> >> I vaguely remember following this guide at some point: >> >> >> >> http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html >> -- specifically the "Balance cannot run because the filesystem is >> full" part. This may have broken some things? > > > > > Not sure, at least from your superblock, too many things are in doubt. > From the number of devices, to strange system chunk. > > > Thanks, > Qu >> >> >> >> >> On Fri, Aug 18, 2017 at 4:15 AM, Qu Wenruo
Re: BTRFS error (device sda4): failed to read chunk tree: -5
Would you please try this patch? https://patchwork.kernel.org/patch/9908173/ This should allow btrfs-debug-tree to output tree block even tree root is corrupted. You could apply it on lasted master branch (tagged as v4.12). Then re-execute the following command (with patched btrfs-progs): # btrfs-debug-tree -b 131072 /dev/sda4 And some new output: # btrfs-debug-tree -b 61809344512 /dev/sda4 # btrfs-debug-tree -b 61807755264 /dev/sda4 # btrfs-debug-tree -b 61809344512 /dev/sda4 Thanks, Qu On 2017年08月18日 17:29, Zirconium Hacker wrote: $ sudo btrfs check -r 108544 /dev/sda4 parent transid verify failed on 108544 wanted 325966 found 325709 parent transid verify failed on 108544 wanted 325966 found 325709 Ignoring transid failure bytenr mismatch, want=61352312832, have=0 Couldn't setup device tree ERROR: cannot open file system On Fri, Aug 18, 2017 at 5:19 AM, Qu Wenruowrote: On 2017年08月18日 17:08, Zirconium Hacker wrote: I already ran that earlier, here's the pastebin: https://pastebin.com/KGB8nVRA Running debug-tree on all 1084 of them (I guess that was unnecessary) gave the same errors every time: bytenr mismatch, want=61809344512, have=0 Couldn't read tree root ERROR: unable to open /dev/sda4 Then try using btrfs check with new root: # btrfs check -r 108544 /dev/sda4 Please note that, the generation in superblock differs quite a lot with find-root result. So I'm afraid it will cause quite a lot of problems. But least, it should help btrfs check to get over "Couldn't read tree root" error message. And for btrfs-debug-tree error, I'll submit a patch soon to allow it to be run on such heavily damaged fs. Thanks, Qu On Fri, Aug 18, 2017 at 5:03 AM, Qu Wenruo wrote: On 2017年08月18日 16:47, Zirconium Hacker wrote: $ sudo btrfs-debug-tree -b 131072 /dev/sda4 btrfs-progs v4.12 bytenr mismatch, want=61809344512, have=0 Couldn't read tree root ERROR: unable to open /dev/sda4 I think this can be improved for case like this. I'll try to submit a patch to enhance btrfs-debug-tree. Would you please try "btrfs-find-root /dev/sda4"? This will try to locate on-disk old tree root, and if we're lucky, old tree root can allow us to mount the fs. Mounting with degraded,ro does not fix the multi-device issue. The system was never really intended to have a second device, though: Wait for a minute, did you mean this btrfs doesn't ever have a second device? This seems quite weird now. $ sudo btrfs fi show /dev/sda4 bytenr mismatch, want=61809344512, have=0 Couldn't read tree root Label: none uuid: 29889b3a-1c10-48e4-ad6d-21d03d06e90b Total devices 2 FS bytes used 49.52GiB devid1 size 54.07GiB used 54.07GiB path /dev/sda4 *** Some devices missing I vaguely remember following this guide at some point: http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html -- specifically the "Balance cannot run because the filesystem is full" part. This may have broken some things? Not sure, at least from your superblock, too many things are in doubt. From the number of devices, to strange system chunk. Thanks, Qu On Fri, Aug 18, 2017 at 4:15 AM, Qu Wenruo wrote: On 2017年08月18日 15:17, Zirconium Hacker wrote: I checked my fstab, and my mount options for that partition are: nodev,nosuid (so no discard). As far as I remember, I had some issues converting from ext4 with existing tools (I think that was on Debian so the tools were likely older) so I did a manual conversion backup, wipe, copy files back). $ sudo btrfs-find-root -o 3 /dev/sda4 Couldn't read tree root Superblock thinks the generation is 311252 Superblock thinks the level is 0 ERROR: tree block bytenr 0 is not aligned to sectorsize 4096 Found tree root at 131072 gen 311252 level 0 So chunk root (and since it's level 0, the whole chunk tree) seems good. Could you please try the following command? # btrfs-debug-tree -b 131072 /dev/sda4 I assume it may fail due to the fact that root tree is corrupted. But maybe we are lucky? And further investigating your super dump and the code, it's shows some clue, mostly related to your multi-device setup. Your find-root output shows that, the only chunk leaf in /dev/sda4 seems good. And in btrfs_read_chunk_tree(), which returned -EIO and caused the error message, will first search chunk root. Since your chunk leaf is good, such search itself should not cause too much problem. Then btrfs_read_chunk_tree() will try to read out each device, by calling read_one_dev(). Which can return -EIO if any device is missing and you're not using degraded mount option. Is your 2nd device missing? If so, would you please try to mount with "degraded,ro" mount option? BTW, if you didn't manually convert chunk profiles, did you first create btrfs on single device, and then added a new device to the btrfs? Thanks, Qu On Fri, Aug 18, 2017 at 12:10 AM,
[PATCH] btrfs-progs: Allow inspect dump-tree to show specified tree block even some tree roots are corrupted
For btrfs inspect-internal dump-tree, if we use "-b" parameter to show specified tree block, then we don't really need extra tree roots. Only chunk root is needed to build up the whole chunk mapping so we can read tree blocks. This patch will add __OPEN_CTREE_RETURN_CHUNK_ROOT flag when show speicifed tree block. So even root tree is corrupted, we can still use inspect-internal dump-tree to do some debugging. Reported-by: Zirconium HackerSigned-off-by: Qu Wenruo --- cmds-inspect-dump-tree.c | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/cmds-inspect-dump-tree.c b/cmds-inspect-dump-tree.c index 93dff086..876ddcb5 100644 --- a/cmds-inspect-dump-tree.c +++ b/cmds-inspect-dump-tree.c @@ -222,6 +222,7 @@ int cmd_inspect_dump_tree(int argc, char **argv) int uuid_tree_only = 0; int roots_only = 0; int root_backups = 0; + unsigned open_ctree_flags = OPEN_CTREE_FS_PARTIAL; u64 block_only = 0; struct btrfs_root *tree_root_scan; u64 tree_id = 0; @@ -260,6 +261,11 @@ int cmd_inspect_dump_tree(int argc, char **argv) root_backups = 1; break; case 'b': + /* +* If only showing one block, no need to fill roots +* other than chunk root +*/ + open_ctree_flags |= __OPEN_CTREE_RETURN_CHUNK_ROOT; block_only = arg_strtou64(optarg); break; case 't': { @@ -299,19 +305,14 @@ int cmd_inspect_dump_tree(int argc, char **argv) printf("%s\n", PACKAGE_STRING); - info = open_ctree_fs_info(argv[optind], 0, 0, 0, OPEN_CTREE_PARTIAL); + info = open_ctree_fs_info(argv[optind], 0, 0, 0, open_ctree_flags); if (!info) { error("unable to open %s", argv[optind]); goto out; } - root = info->fs_root; - if (!root) { - error("unable to open %s", argv[optind]); - goto out; - } - if (block_only) { + root = info->chunk_root; leaf = read_tree_block(info, block_only, info->nodesize, 0); @@ -337,6 +338,12 @@ int cmd_inspect_dump_tree(int argc, char **argv) goto close_root; } + root = info->fs_root; + if (!root) { + error("unable to open %s", argv[optind]); + goto out; + } + if (!(extent_only || uuid_tree_only || tree_id)) { if (roots_only) { printf("root tree: %llu level %d\n", -- 2.14.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS error (device sda4): failed to read chunk tree: -5
$ sudo btrfs check -r 108544 /dev/sda4 parent transid verify failed on 108544 wanted 325966 found 325709 parent transid verify failed on 108544 wanted 325966 found 325709 Ignoring transid failure bytenr mismatch, want=61352312832, have=0 Couldn't setup device tree ERROR: cannot open file system On Fri, Aug 18, 2017 at 5:19 AM, Qu Wenruowrote: > > > On 2017年08月18日 17:08, Zirconium Hacker wrote: >> >> I already ran that earlier, here's the pastebin: >> https://pastebin.com/KGB8nVRA >> >> Running debug-tree on all 1084 of them (I guess that was unnecessary) >> gave the same errors every time: >> bytenr mismatch, want=61809344512, have=0 >> Couldn't read tree root >> ERROR: unable to open /dev/sda4 >> > > Then try using btrfs check with new root: > > # btrfs check -r 108544 /dev/sda4 > > Please note that, the generation in superblock differs quite a lot with > find-root result. > So I'm afraid it will cause quite a lot of problems. > > But least, it should help btrfs check to get over "Couldn't read tree root" > error message. > > And for btrfs-debug-tree error, I'll submit a patch soon to allow it to be > run on such heavily damaged fs. > > > Thanks, > Qu > >> On Fri, Aug 18, 2017 at 5:03 AM, Qu Wenruo wrote: >>> >>> >>> >>> On 2017年08月18日 16:47, Zirconium Hacker wrote: $ sudo btrfs-debug-tree -b 131072 /dev/sda4 btrfs-progs v4.12 bytenr mismatch, want=61809344512, have=0 Couldn't read tree root ERROR: unable to open /dev/sda4 >>> >>> >>> >>> I think this can be improved for case like this. >>> I'll try to submit a patch to enhance btrfs-debug-tree. >>> >>> Would you please try "btrfs-find-root /dev/sda4"? >>> This will try to locate on-disk old tree root, and if we're lucky, old >>> tree >>> root can allow us to mount the fs. >>> Mounting with degraded,ro does not fix the multi-device issue. The system was never really intended to have a second device, though: >>> >>> >>> >>> Wait for a minute, did you mean this btrfs doesn't ever have a second >>> device? >>> This seems quite weird now. >>> $ sudo btrfs fi show /dev/sda4 bytenr mismatch, want=61809344512, have=0 Couldn't read tree root Label: none uuid: 29889b3a-1c10-48e4-ad6d-21d03d06e90b Total devices 2 FS bytes used 49.52GiB devid1 size 54.07GiB used 54.07GiB path /dev/sda4 *** Some devices missing I vaguely remember following this guide at some point: http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html -- specifically the "Balance cannot run because the filesystem is full" part. This may have broken some things? >>> >>> >>> >>> Not sure, at least from your superblock, too many things are in doubt. >>> From the number of devices, to strange system chunk. >>> >>> >>> Thanks, >>> Qu On Fri, Aug 18, 2017 at 4:15 AM, Qu Wenruo wrote: > > > > > On 2017年08月18日 15:17, Zirconium Hacker wrote: >> >> >> >> I checked my fstab, and my mount options for that partition are: >> nodev,nosuid (so no discard). >> As far as I remember, I had some issues converting from ext4 with >> existing tools (I think that was on Debian so the tools were likely >> older) so I did a manual conversion backup, wipe, copy files back). >> >> $ sudo btrfs-find-root -o 3 /dev/sda4 >> Couldn't read tree root >> Superblock thinks the generation is 311252 >> Superblock thinks the level is 0 >> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096 >> Found tree root at 131072 gen 311252 level 0 > > > > > So chunk root (and since it's level 0, the whole chunk tree) seems > good. > > Could you please try the following command? > # btrfs-debug-tree -b 131072 /dev/sda4 > > I assume it may fail due to the fact that root tree is corrupted. > But maybe we are lucky? > > > And further investigating your super dump and the code, it's shows some > clue, mostly related to your multi-device setup. > > Your find-root output shows that, the only chunk leaf in /dev/sda4 > seems > good. > And in btrfs_read_chunk_tree(), which returned -EIO and caused the > error > message, will first search chunk root. > > Since your chunk leaf is good, such search itself should not cause too > much > problem. > > Then btrfs_read_chunk_tree() will try to read out each device, by > calling > read_one_dev(). > Which can return -EIO if any device is missing and you're not using > degraded > mount option. > > Is your 2nd device missing? If so, would you please try to mount with > "degraded,ro" mount option? > > BTW, if you didn't manually convert chunk profiles, did you first > create > btrfs on
Re: BTRFS error (device sda4): failed to read chunk tree: -5
On 2017年08月18日 17:08, Zirconium Hacker wrote: I already ran that earlier, here's the pastebin: https://pastebin.com/KGB8nVRA Running debug-tree on all 1084 of them (I guess that was unnecessary) gave the same errors every time: bytenr mismatch, want=61809344512, have=0 Couldn't read tree root ERROR: unable to open /dev/sda4 Then try using btrfs check with new root: # btrfs check -r 108544 /dev/sda4 Please note that, the generation in superblock differs quite a lot with find-root result. So I'm afraid it will cause quite a lot of problems. But least, it should help btrfs check to get over "Couldn't read tree root" error message. And for btrfs-debug-tree error, I'll submit a patch soon to allow it to be run on such heavily damaged fs. Thanks, Qu On Fri, Aug 18, 2017 at 5:03 AM, Qu Wenruowrote: On 2017年08月18日 16:47, Zirconium Hacker wrote: $ sudo btrfs-debug-tree -b 131072 /dev/sda4 btrfs-progs v4.12 bytenr mismatch, want=61809344512, have=0 Couldn't read tree root ERROR: unable to open /dev/sda4 I think this can be improved for case like this. I'll try to submit a patch to enhance btrfs-debug-tree. Would you please try "btrfs-find-root /dev/sda4"? This will try to locate on-disk old tree root, and if we're lucky, old tree root can allow us to mount the fs. Mounting with degraded,ro does not fix the multi-device issue. The system was never really intended to have a second device, though: Wait for a minute, did you mean this btrfs doesn't ever have a second device? This seems quite weird now. $ sudo btrfs fi show /dev/sda4 bytenr mismatch, want=61809344512, have=0 Couldn't read tree root Label: none uuid: 29889b3a-1c10-48e4-ad6d-21d03d06e90b Total devices 2 FS bytes used 49.52GiB devid1 size 54.07GiB used 54.07GiB path /dev/sda4 *** Some devices missing I vaguely remember following this guide at some point: http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html -- specifically the "Balance cannot run because the filesystem is full" part. This may have broken some things? Not sure, at least from your superblock, too many things are in doubt. From the number of devices, to strange system chunk. Thanks, Qu On Fri, Aug 18, 2017 at 4:15 AM, Qu Wenruo wrote: On 2017年08月18日 15:17, Zirconium Hacker wrote: I checked my fstab, and my mount options for that partition are: nodev,nosuid (so no discard). As far as I remember, I had some issues converting from ext4 with existing tools (I think that was on Debian so the tools were likely older) so I did a manual conversion backup, wipe, copy files back). $ sudo btrfs-find-root -o 3 /dev/sda4 Couldn't read tree root Superblock thinks the generation is 311252 Superblock thinks the level is 0 ERROR: tree block bytenr 0 is not aligned to sectorsize 4096 Found tree root at 131072 gen 311252 level 0 So chunk root (and since it's level 0, the whole chunk tree) seems good. Could you please try the following command? # btrfs-debug-tree -b 131072 /dev/sda4 I assume it may fail due to the fact that root tree is corrupted. But maybe we are lucky? And further investigating your super dump and the code, it's shows some clue, mostly related to your multi-device setup. Your find-root output shows that, the only chunk leaf in /dev/sda4 seems good. And in btrfs_read_chunk_tree(), which returned -EIO and caused the error message, will first search chunk root. Since your chunk leaf is good, such search itself should not cause too much problem. Then btrfs_read_chunk_tree() will try to read out each device, by calling read_one_dev(). Which can return -EIO if any device is missing and you're not using degraded mount option. Is your 2nd device missing? If so, would you please try to mount with "degraded,ro" mount option? BTW, if you didn't manually convert chunk profiles, did you first create btrfs on single device, and then added a new device to the btrfs? Thanks, Qu On Fri, Aug 18, 2017 at 12:10 AM, Chris Murphy wrote: On Thu, Aug 17, 2017 at 4:42 PM, Qu Wenruo wrote: BTW are you using discard mount option? Sometimes it can cause problem. OP did not say if it was using discard mount option; but did say some time before this (I'm not sure how recent) he had used fstrim. The firmware for this SSD model is current. -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS error (device sda4): failed to read chunk tree: -5
I already ran that earlier, here's the pastebin: https://pastebin.com/KGB8nVRA Running debug-tree on all 1084 of them (I guess that was unnecessary) gave the same errors every time: bytenr mismatch, want=61809344512, have=0 Couldn't read tree root ERROR: unable to open /dev/sda4 On Fri, Aug 18, 2017 at 5:03 AM, Qu Wenruowrote: > > > On 2017年08月18日 16:47, Zirconium Hacker wrote: >> >> $ sudo btrfs-debug-tree -b 131072 /dev/sda4 >> btrfs-progs v4.12 >> bytenr mismatch, want=61809344512, have=0 >> Couldn't read tree root >> ERROR: unable to open /dev/sda4 > > > I think this can be improved for case like this. > I'll try to submit a patch to enhance btrfs-debug-tree. > > Would you please try "btrfs-find-root /dev/sda4"? > This will try to locate on-disk old tree root, and if we're lucky, old tree > root can allow us to mount the fs. > >> >> Mounting with degraded,ro does not fix the multi-device issue. The >> system was never really intended to have a second device, though: > > > Wait for a minute, did you mean this btrfs doesn't ever have a second > device? > This seems quite weird now. > >> >> $ sudo btrfs fi show /dev/sda4 >> bytenr mismatch, want=61809344512, have=0 >> Couldn't read tree root >> Label: none uuid: 29889b3a-1c10-48e4-ad6d-21d03d06e90b >> Total devices 2 FS bytes used 49.52GiB >> devid1 size 54.07GiB used 54.07GiB path /dev/sda4 >> *** Some devices missing >> >> I vaguely remember following this guide at some point: >> >> http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html >> -- specifically the "Balance cannot run because the filesystem is >> full" part. This may have broken some things? > > > Not sure, at least from your superblock, too many things are in doubt. > From the number of devices, to strange system chunk. > > > Thanks, > Qu >> >> >> On Fri, Aug 18, 2017 at 4:15 AM, Qu Wenruo wrote: >>> >>> >>> >>> On 2017年08月18日 15:17, Zirconium Hacker wrote: I checked my fstab, and my mount options for that partition are: nodev,nosuid (so no discard). As far as I remember, I had some issues converting from ext4 with existing tools (I think that was on Debian so the tools were likely older) so I did a manual conversion backup, wipe, copy files back). $ sudo btrfs-find-root -o 3 /dev/sda4 Couldn't read tree root Superblock thinks the generation is 311252 Superblock thinks the level is 0 ERROR: tree block bytenr 0 is not aligned to sectorsize 4096 Found tree root at 131072 gen 311252 level 0 >>> >>> >>> >>> So chunk root (and since it's level 0, the whole chunk tree) seems good. >>> >>> Could you please try the following command? >>> # btrfs-debug-tree -b 131072 /dev/sda4 >>> >>> I assume it may fail due to the fact that root tree is corrupted. >>> But maybe we are lucky? >>> >>> >>> And further investigating your super dump and the code, it's shows some >>> clue, mostly related to your multi-device setup. >>> >>> Your find-root output shows that, the only chunk leaf in /dev/sda4 seems >>> good. >>> And in btrfs_read_chunk_tree(), which returned -EIO and caused the error >>> message, will first search chunk root. >>> >>> Since your chunk leaf is good, such search itself should not cause too >>> much >>> problem. >>> >>> Then btrfs_read_chunk_tree() will try to read out each device, by calling >>> read_one_dev(). >>> Which can return -EIO if any device is missing and you're not using >>> degraded >>> mount option. >>> >>> Is your 2nd device missing? If so, would you please try to mount with >>> "degraded,ro" mount option? >>> >>> BTW, if you didn't manually convert chunk profiles, did you first create >>> btrfs on single device, and then added a new device to the btrfs? >>> >>> Thanks, >>> Qu >>> On Fri, Aug 18, 2017 at 12:10 AM, Chris Murphy wrote: > > > On Thu, Aug 17, 2017 at 4:42 PM, Qu Wenruo > wrote: > >> BTW are you using discard mount option? Sometimes it can cause >> problem. > > > > OP did not say if it was using discard mount option; but did say some > time before this (I'm not sure how recent) he had used fstrim. The > firmware for this SSD model is current. > > > -- > Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html >>> > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS error (device sda4): failed to read chunk tree: -5
On 2017年08月18日 16:47, Zirconium Hacker wrote: $ sudo btrfs-debug-tree -b 131072 /dev/sda4 btrfs-progs v4.12 bytenr mismatch, want=61809344512, have=0 Couldn't read tree root ERROR: unable to open /dev/sda4 I think this can be improved for case like this. I'll try to submit a patch to enhance btrfs-debug-tree. Would you please try "btrfs-find-root /dev/sda4"? This will try to locate on-disk old tree root, and if we're lucky, old tree root can allow us to mount the fs. Mounting with degraded,ro does not fix the multi-device issue. The system was never really intended to have a second device, though: Wait for a minute, did you mean this btrfs doesn't ever have a second device? This seems quite weird now. $ sudo btrfs fi show /dev/sda4 bytenr mismatch, want=61809344512, have=0 Couldn't read tree root Label: none uuid: 29889b3a-1c10-48e4-ad6d-21d03d06e90b Total devices 2 FS bytes used 49.52GiB devid1 size 54.07GiB used 54.07GiB path /dev/sda4 *** Some devices missing I vaguely remember following this guide at some point: http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html -- specifically the "Balance cannot run because the filesystem is full" part. This may have broken some things? Not sure, at least from your superblock, too many things are in doubt. From the number of devices, to strange system chunk. Thanks, Qu On Fri, Aug 18, 2017 at 4:15 AM, Qu Wenruowrote: On 2017年08月18日 15:17, Zirconium Hacker wrote: I checked my fstab, and my mount options for that partition are: nodev,nosuid (so no discard). As far as I remember, I had some issues converting from ext4 with existing tools (I think that was on Debian so the tools were likely older) so I did a manual conversion backup, wipe, copy files back). $ sudo btrfs-find-root -o 3 /dev/sda4 Couldn't read tree root Superblock thinks the generation is 311252 Superblock thinks the level is 0 ERROR: tree block bytenr 0 is not aligned to sectorsize 4096 Found tree root at 131072 gen 311252 level 0 So chunk root (and since it's level 0, the whole chunk tree) seems good. Could you please try the following command? # btrfs-debug-tree -b 131072 /dev/sda4 I assume it may fail due to the fact that root tree is corrupted. But maybe we are lucky? And further investigating your super dump and the code, it's shows some clue, mostly related to your multi-device setup. Your find-root output shows that, the only chunk leaf in /dev/sda4 seems good. And in btrfs_read_chunk_tree(), which returned -EIO and caused the error message, will first search chunk root. Since your chunk leaf is good, such search itself should not cause too much problem. Then btrfs_read_chunk_tree() will try to read out each device, by calling read_one_dev(). Which can return -EIO if any device is missing and you're not using degraded mount option. Is your 2nd device missing? If so, would you please try to mount with "degraded,ro" mount option? BTW, if you didn't manually convert chunk profiles, did you first create btrfs on single device, and then added a new device to the btrfs? Thanks, Qu On Fri, Aug 18, 2017 at 12:10 AM, Chris Murphy wrote: On Thu, Aug 17, 2017 at 4:42 PM, Qu Wenruo wrote: BTW are you using discard mount option? Sometimes it can cause problem. OP did not say if it was using discard mount option; but did say some time before this (I'm not sure how recent) he had used fstrim. The firmware for this SSD model is current. -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS error (device sda4): failed to read chunk tree: -5
$ sudo btrfs-debug-tree -b 131072 /dev/sda4 btrfs-progs v4.12 bytenr mismatch, want=61809344512, have=0 Couldn't read tree root ERROR: unable to open /dev/sda4 Mounting with degraded,ro does not fix the multi-device issue. The system was never really intended to have a second device, though: $ sudo btrfs fi show /dev/sda4 bytenr mismatch, want=61809344512, have=0 Couldn't read tree root Label: none uuid: 29889b3a-1c10-48e4-ad6d-21d03d06e90b Total devices 2 FS bytes used 49.52GiB devid1 size 54.07GiB used 54.07GiB path /dev/sda4 *** Some devices missing I vaguely remember following this guide at some point: http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html -- specifically the "Balance cannot run because the filesystem is full" part. This may have broken some things? On Fri, Aug 18, 2017 at 4:15 AM, Qu Wenruowrote: > > > On 2017年08月18日 15:17, Zirconium Hacker wrote: >> >> I checked my fstab, and my mount options for that partition are: >> nodev,nosuid (so no discard). >> As far as I remember, I had some issues converting from ext4 with >> existing tools (I think that was on Debian so the tools were likely >> older) so I did a manual conversion backup, wipe, copy files back). >> >> $ sudo btrfs-find-root -o 3 /dev/sda4 >> Couldn't read tree root >> Superblock thinks the generation is 311252 >> Superblock thinks the level is 0 >> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096 >> Found tree root at 131072 gen 311252 level 0 > > > So chunk root (and since it's level 0, the whole chunk tree) seems good. > > Could you please try the following command? > # btrfs-debug-tree -b 131072 /dev/sda4 > > I assume it may fail due to the fact that root tree is corrupted. > But maybe we are lucky? > > > And further investigating your super dump and the code, it's shows some > clue, mostly related to your multi-device setup. > > Your find-root output shows that, the only chunk leaf in /dev/sda4 seems > good. > And in btrfs_read_chunk_tree(), which returned -EIO and caused the error > message, will first search chunk root. > > Since your chunk leaf is good, such search itself should not cause too much > problem. > > Then btrfs_read_chunk_tree() will try to read out each device, by calling > read_one_dev(). > Which can return -EIO if any device is missing and you're not using degraded > mount option. > > Is your 2nd device missing? If so, would you please try to mount with > "degraded,ro" mount option? > > BTW, if you didn't manually convert chunk profiles, did you first create > btrfs on single device, and then added a new device to the btrfs? > > Thanks, > Qu > >> >> On Fri, Aug 18, 2017 at 12:10 AM, Chris Murphy >> wrote: >>> >>> On Thu, Aug 17, 2017 at 4:42 PM, Qu Wenruo >>> wrote: >>> BTW are you using discard mount option? Sometimes it can cause problem. >>> >>> >>> OP did not say if it was using discard mount option; but did say some >>> time before this (I'm not sure how recent) he had used fstrim. The >>> firmware for this SSD model is current. >>> >>> >>> -- >>> Chris Murphy >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] btrfs-progs: add necessary close(fd) in mkfs
On 18.08.2017 11:32, Gu Jinxiang wrote: > Add some close(fd) when error occures in mkfs. > And add close(fd) when end use it. When a process exits all of its fds (and all other runtime resources tracked by the kernel) are freed, so we don't leak fds if that's your worry. Admittedly, it's good practice to manage the lifetime of fds correctly in a program, so I don't have any strong preferences either ways. > > Signed-off-by: Gu Jinxiang> --- > mkfs/main.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/mkfs/main.c b/mkfs/main.c > index 2b109a5..ec82565 100644 > --- a/mkfs/main.c > +++ b/mkfs/main.c > @@ -1694,6 +1694,7 @@ int main(int argc, char **argv) > file, > (unsigned long long)block_count, > (unsigned long long)dev_block_count); > + close(fd); > exit(1); > } > } else { > @@ -1711,6 +1712,7 @@ int main(int argc, char **argv) > ret = zero_output_file(fd, block_count); > if (ret) { > error("unable to zero the output file"); > + close(fd); > exit(1); > } > /* our "device" is the new image file */ > @@ -1721,6 +1723,7 @@ int main(int argc, char **argv) > if (dev_block_count < BTRFS_MKFS_SYSTEM_GROUP_SIZE) { > error("device is too small to make filesystem, must be at least > %llu", > (unsigned long > long)BTRFS_MKFS_SYSTEM_GROUP_SIZE); > + close(fd); > exit(1); > } > > @@ -1740,6 +1743,7 @@ int main(int argc, char **argv) > ret = make_btrfs(fd, _cfg); > if (ret) { > error("error during mkfs: %s", strerror(-ret)); > + close(fd); > exit(1); > } > > @@ -1750,6 +1754,7 @@ int main(int argc, char **argv) > close(fd); > exit(1); > } > + close(fd); > root = fs_info->fs_root; > fs_info->alloc_start = alloc_start; > > @@ -1827,6 +1832,7 @@ int main(int argc, char **argv) > sectorsize, sectorsize, sectorsize); > if (ret) { > error("unable to add %s to filesystem: %d", file, ret); > + close(fd); > goto out; > } > if (verbose >= 2) { > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs: use btrfsic_submit_bio instead of submit_bio in write_dev_flush
Although this bio has no data attached, it will reach this condition (bio->bi_opf & REQ_PREFLUSH) and then update the flush_gen of dev_state in __btrfsic_submit_bio. So we should still submit it through integrity checker. Otherwise, the integrity checker will throw the following warning when I mount a newly created btrfs filesystem. [10264.755497] btrfs: attempt to write superblock which references block M @29523968 (sdb1/654400/0) which is not flushed out of disk's write cache (block flush_gen=1, dev->flush_gen=0)! [10264.755498] btrfs: attempt to write superblock which references block M @29523968 (sdb1/37912576/0) which is not flushed out of disk's write cache (block flush_gen=1, dev->flush_gen=0)! Signed-off-by: Lu Fengqi--- fs/btrfs/disk-io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 9e187df3c106..0dac37439bb9 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3515,7 +3515,7 @@ static void write_dev_flush(struct btrfs_device *device) init_completion(>flush_wait); bio->bi_private = >flush_wait; - submit_bio(bio); + btrfsic_submit_bio(bio); device->flush_bio_sent = 1; } -- 2.14.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs fi du -s gives Inappropriate ioctl for device
On Thu, Aug 17, 2017 at 10:59:33PM +0200, Goffredo Baroncelli wrote: > Patches will follow shortly With your patches applied: ~ # btrfs fi du -s /mnt/red/\@backup/ /mnt/red/\@backup/.snapshot/monthly_2017-08-01_05\:30\:01/ /mnt/red/\@svn/ /mnt/red/\@svn/.snapshot/weekly_2017-08-05_04\:20\:02/ Total Exclusive Set shared Filename 1.86TiB75.73MiB 312.77GiB /mnt/red/@backup/ 320.29GiB 0.00B 312.19GiB /mnt/red/@backup/.snapshot/monthly_2017-08-01_05:30:01/ 52.24GiB10.19MiB 4.13GiB /mnt/red/@svn/ 4.35GiB 1.05MiB 4.12GiB /mnt/red/@svn/.snapshot/weekly_2017-08-05_04:20:02/ Best regards, Piotr Szymaniak. -- Mysle, ze gdyby diabel nie istnial i mialby go stworzyc czlowiek, to stworzylby go na swoj obraz i podobienstwo. -- Fiodor Dostojewski signature.asc Description: Digital signature
[PATCH 1/2] btrfs-progs: add necessary close(fd) in mkfs
Add some close(fd) when error occures in mkfs. And add close(fd) when end use it. Signed-off-by: Gu Jinxiang--- mkfs/main.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/mkfs/main.c b/mkfs/main.c index 2b109a5..ec82565 100644 --- a/mkfs/main.c +++ b/mkfs/main.c @@ -1694,6 +1694,7 @@ int main(int argc, char **argv) file, (unsigned long long)block_count, (unsigned long long)dev_block_count); + close(fd); exit(1); } } else { @@ -1711,6 +1712,7 @@ int main(int argc, char **argv) ret = zero_output_file(fd, block_count); if (ret) { error("unable to zero the output file"); + close(fd); exit(1); } /* our "device" is the new image file */ @@ -1721,6 +1723,7 @@ int main(int argc, char **argv) if (dev_block_count < BTRFS_MKFS_SYSTEM_GROUP_SIZE) { error("device is too small to make filesystem, must be at least %llu", (unsigned long long)BTRFS_MKFS_SYSTEM_GROUP_SIZE); + close(fd); exit(1); } @@ -1740,6 +1743,7 @@ int main(int argc, char **argv) ret = make_btrfs(fd, _cfg); if (ret) { error("error during mkfs: %s", strerror(-ret)); + close(fd); exit(1); } @@ -1750,6 +1754,7 @@ int main(int argc, char **argv) close(fd); exit(1); } + close(fd); root = fs_info->fs_root; fs_info->alloc_start = alloc_start; @@ -1827,6 +1832,7 @@ int main(int argc, char **argv) sectorsize, sectorsize, sectorsize); if (ret) { error("unable to add %s to filesystem: %d", file, ret); + close(fd); goto out; } if (verbose >= 2) { -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] btrfs-progs: delete not-used parameter fd
Parameter fd is not used in function make_image and traverse_directory of mkfs. Delete it. Signed-off-by: Gu Jinxiang--- mkfs/main.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/mkfs/main.c b/mkfs/main.c index ec82565..3e9c1b8 100644 --- a/mkfs/main.c +++ b/mkfs/main.c @@ -801,7 +801,7 @@ static char *make_path(const char *dir, const char *name) static int traverse_directory(struct btrfs_trans_handle *trans, struct btrfs_root *root, const char *dir_name, - struct directory_name_entry *dir_head, int out_fd) + struct directory_name_entry *dir_head) { int ret = 0; @@ -1029,8 +1029,7 @@ static int create_chunks(struct btrfs_trans_handle *trans, return ret; } -static int make_image(const char *source_dir, struct btrfs_root *root, - int out_fd) +static int make_image(const char *source_dir, struct btrfs_root *root) { int ret; struct btrfs_trans_handle *trans; @@ -1048,7 +1047,7 @@ static int make_image(const char *source_dir, struct btrfs_root *root, INIT_LIST_HEAD(_head.list); trans = btrfs_start_transaction(root, 1); - ret = traverse_directory(trans, root, source_dir, _head, out_fd); + ret = traverse_directory(trans, root, source_dir, _head); if (ret) { error("unable to traverse directory %s: %d", source_dir, ret); goto fail; @@ -1882,7 +1881,7 @@ raid_groups: goto out; } - ret = make_image(source_dir, root, fd); + ret = make_image(source_dir, root); if (ret) { error("error wihle filling filesystem: %d", ret); goto out; -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS error (device sda4): failed to read chunk tree: -5
On 2017年08月18日 15:17, Zirconium Hacker wrote: I checked my fstab, and my mount options for that partition are: nodev,nosuid (so no discard). As far as I remember, I had some issues converting from ext4 with existing tools (I think that was on Debian so the tools were likely older) so I did a manual conversion backup, wipe, copy files back). $ sudo btrfs-find-root -o 3 /dev/sda4 Couldn't read tree root Superblock thinks the generation is 311252 Superblock thinks the level is 0 ERROR: tree block bytenr 0 is not aligned to sectorsize 4096 Found tree root at 131072 gen 311252 level 0 So chunk root (and since it's level 0, the whole chunk tree) seems good. Could you please try the following command? # btrfs-debug-tree -b 131072 /dev/sda4 I assume it may fail due to the fact that root tree is corrupted. But maybe we are lucky? And further investigating your super dump and the code, it's shows some clue, mostly related to your multi-device setup. Your find-root output shows that, the only chunk leaf in /dev/sda4 seems good. And in btrfs_read_chunk_tree(), which returned -EIO and caused the error message, will first search chunk root. Since your chunk leaf is good, such search itself should not cause too much problem. Then btrfs_read_chunk_tree() will try to read out each device, by calling read_one_dev(). Which can return -EIO if any device is missing and you're not using degraded mount option. Is your 2nd device missing? If so, would you please try to mount with "degraded,ro" mount option? BTW, if you didn't manually convert chunk profiles, did you first create btrfs on single device, and then added a new device to the btrfs? Thanks, Qu On Fri, Aug 18, 2017 at 12:10 AM, Chris Murphywrote: On Thu, Aug 17, 2017 at 4:42 PM, Qu Wenruo wrote: BTW are you using discard mount option? Sometimes it can cause problem. OP did not say if it was using discard mount option; but did say some time before this (I'm not sure how recent) he had used fstrim. The firmware for this SSD model is current. -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs fi du -s gives Inappropriate ioctl for device
On 08/18/2017 09:12 AM, Nikolay Borisov wrote: > > > On 18.08.2017 10:09, Goffredo Baroncelli wrote: >> On 08/18/2017 08:34 AM, Nikolay Borisov wrote: [...] >>> It would be awesome if you manage to introduce xfstests for this case >> >> I am not sure if this is the right thing to do: the bugs are related to >> a) an incorrect error handling of the function lookup_path_rootid() >> b) the strange behavior of the BTRFS_EMPTY_SUBVOL_DIR_OBJECTID directory >> >> For a) it is impossible to check for each lookup_path_rootid() call from a >> xfstest; for b) I am quite sure that there are a lot of corner case not >> properly handled in btrfs progs (and still btrfs fi-du handle it wrongly) > > So the bug is actually in the btrfsprogs not in the fs, am I correct? > Yes, definitively BR G.Baroncelli -- gpg @keyserver.linux.it: Goffredo Baroncelli Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS error (device sda4): failed to read chunk tree: -5
I checked my fstab, and my mount options for that partition are: nodev,nosuid (so no discard). As far as I remember, I had some issues converting from ext4 with existing tools (I think that was on Debian so the tools were likely older) so I did a manual conversion backup, wipe, copy files back). $ sudo btrfs-find-root -o 3 /dev/sda4 Couldn't read tree root Superblock thinks the generation is 311252 Superblock thinks the level is 0 ERROR: tree block bytenr 0 is not aligned to sectorsize 4096 Found tree root at 131072 gen 311252 level 0 On Fri, Aug 18, 2017 at 12:10 AM, Chris Murphywrote: > On Thu, Aug 17, 2017 at 4:42 PM, Qu Wenruo wrote: > >> BTW are you using discard mount option? Sometimes it can cause problem. > > OP did not say if it was using discard mount option; but did say some > time before this (I'm not sure how recent) he had used fstrim. The > firmware for this SSD model is current. > > > -- > Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs fi du -s gives Inappropriate ioctl for device
On 18.08.2017 10:09, Goffredo Baroncelli wrote: > On 08/18/2017 08:34 AM, Nikolay Borisov wrote: >> >> >> On 17.08.2017 23:59, Goffredo Baroncelli wrote: >>> Hi, >>> >>> On 08/17/2017 08:43 PM, Chris Murphy wrote: # btrfs sub create test1 Create subvolume './test1' # btrfs sub create test1/test2 Create subvolume 'test1/test2' # btrfs sub snap test1 test1.snap Create a snapshot of 'test1' in './test1.snap' # btrfs fi du -s test1 Total Exclusive Set shared Filename 0.00B 0.00B 0.00B test1 # btrfs fi du -s test1.snap Total Exclusive Set shared Filename ERROR: cannot check space of 'test1.snap': Inappropriate ioctl for device # >>> >>> tanks for the test case. Now I was able to reproduce the problem. The >>> bug(s) are two: >>> >>> 1) to get the treeid of the files/directory, the function >>> lookup_path_rootid() is used, which behaves strangely when it is called on >>> a directory that is BTRFS_EMPTY_SUBVOL_DIR_OBJECTID. In fact this function >>> is commented as following: >>> [] >>> /* >>> * For a given: >>> * - file or directory return the containing tree root id >>> * - subvolume return its own tree id >>> * - BTRFS_EMPTY_SUBVOL_DIR_OBJECTID (directory with ino == 2) the result is >>> * undefined and function returns -1 >>> */ >>> int lookup_path_rootid(int fd, u64 *rootid) >>> { >>> [] >>> >>> The caller (du_add_file()) doesn't consider this case. >>> >>> 2) in the function du_walk_dir(), an error returned by du_add_file() is >>> ignored, but the return value is not reset. So if the last entry has ino >>> BTRFS_EMPTY_SUBVOL_DIR_OBJECTID (directory with ino == 2) an error is >>> returned. But if there is another directory item no error is returned ! >>> >>> See the following tests cases: >>> >>> # btrfs sub create test1 >>> # btrfs sub create test1/test2 >>> # btrfs sub snap test1 test1.snap >>> # btrfs fi du -s test1 >>> Total Exclusive Set shared Filename >>> 0.00B 0.00B 0.00B test1 >>> # btrfs fi du -s test1.snap >>> Total Exclusive Set shared Filename >>> ERROR: cannot check space of 'test1.snap': Inappropriate ioctl for device >>> >>> But if you add *another* file/dir in test1.snap you got: >>> >>> # mkdir test1.snap/dir >>> # btrfs fi du -s test1.snap >>> Total Exclusive Set shared Filename >>> 0.00B 0.00B 0.00B test1.snap >>> >>> The error disappea> Patches will follow shortly >> >> It would be awesome if you manage to introduce xfstests for this case > > I am not sure if this is the right thing to do: the bugs are related to > a) an incorrect error handling of the function lookup_path_rootid() > b) the strange behavior of the BTRFS_EMPTY_SUBVOL_DIR_OBJECTID directory > > For a) it is impossible to check for each lookup_path_rootid() call from a > xfstest; for b) I am quite sure that there are a lot of corner case not > properly handled in btrfs progs (and still btrfs fi-du handle it wrongly) So the bug is actually in the btrfsprogs not in the fs, am I correct? > > BR > G.Baroncelli > >> >>> # uname -r 4.13.0-0.rc4.git1.1.fc27.x86_64 # rpm -q btrfs-progs btrfs-progs-4.12-1.fc27.x86_64 Chris Murphy >>> >>> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs fi du -s gives Inappropriate ioctl for device
On 08/18/2017 08:34 AM, Nikolay Borisov wrote: > > > On 17.08.2017 23:59, Goffredo Baroncelli wrote: >> Hi, >> >> On 08/17/2017 08:43 PM, Chris Murphy wrote: >>> # btrfs sub create test1 >>> Create subvolume './test1' >>> # btrfs sub create test1/test2 >>> Create subvolume 'test1/test2' >>> # btrfs sub snap test1 test1.snap >>> Create a snapshot of 'test1' in './test1.snap' >>> # btrfs fi du -s test1 >>> Total Exclusive Set shared Filename >>> 0.00B 0.00B 0.00B test1 >>> # btrfs fi du -s test1.snap >>> Total Exclusive Set shared Filename >>> ERROR: cannot check space of 'test1.snap': Inappropriate ioctl for device >>> # >>> >> >> tanks for the test case. Now I was able to reproduce the problem. The bug(s) >> are two: >> >> 1) to get the treeid of the files/directory, the function >> lookup_path_rootid() is used, which behaves strangely when it is called on a >> directory that is BTRFS_EMPTY_SUBVOL_DIR_OBJECTID. In fact this function is >> commented as following: >> [] >> /* >> * For a given: >> * - file or directory return the containing tree root id >> * - subvolume return its own tree id >> * - BTRFS_EMPTY_SUBVOL_DIR_OBJECTID (directory with ino == 2) the result is >> * undefined and function returns -1 >> */ >> int lookup_path_rootid(int fd, u64 *rootid) >> { >> [] >> >> The caller (du_add_file()) doesn't consider this case. >> >> 2) in the function du_walk_dir(), an error returned by du_add_file() is >> ignored, but the return value is not reset. So if the last entry has ino >> BTRFS_EMPTY_SUBVOL_DIR_OBJECTID (directory with ino == 2) an error is >> returned. But if there is another directory item no error is returned ! >> >> See the following tests cases: >> >> # btrfs sub create test1 >> # btrfs sub create test1/test2 >> # btrfs sub snap test1 test1.snap >> # btrfs fi du -s test1 >> Total Exclusive Set shared Filename >> 0.00B 0.00B 0.00B test1 >> # btrfs fi du -s test1.snap >> Total Exclusive Set shared Filename >> ERROR: cannot check space of 'test1.snap': Inappropriate ioctl for device >> >> But if you add *another* file/dir in test1.snap you got: >> >> # mkdir test1.snap/dir >> # btrfs fi du -s test1.snap >> Total Exclusive Set shared Filename >> 0.00B 0.00B 0.00B test1.snap >> >> The error disappea> Patches will follow shortly > > It would be awesome if you manage to introduce xfstests for this case I am not sure if this is the right thing to do: the bugs are related to a) an incorrect error handling of the function lookup_path_rootid() b) the strange behavior of the BTRFS_EMPTY_SUBVOL_DIR_OBJECTID directory For a) it is impossible to check for each lookup_path_rootid() call from a xfstest; for b) I am quite sure that there are a lot of corner case not properly handled in btrfs progs (and still btrfs fi-du handle it wrongly) BR G.Baroncelli > >> >>> >>> # uname -r >>> 4.13.0-0.rc4.git1.1.fc27.x86_64 >>> # rpm -q btrfs-progs >>> btrfs-progs-4.12-1.fc27.x86_64 >>> >>> >>> >>> Chris Murphy >>> >> >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- gpg @keyserver.linux.it: Goffredo Baroncelli Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] Don't call lookup_path_rootid() when ino=BTRFS_EMPTY_SUBVOL_DIR_OBJECTID
From: Goffredo BaroncelliWhen ino = BTRFS_EMPTY_SUBVOL_DIR_OBJECTID, the item is not referred to any file-tree. So lookup_path_rootid() doesn't return any sensate value. Signed-off-by: Goffredo Baroncelli --- cmds-fi-du.c | 25 - ctree.h | 2 ++ 2 files changed, 18 insertions(+), 9 deletions(-) diff --git a/cmds-fi-du.c b/cmds-fi-du.c index 4bf6af3e..013f8dfd 100644 --- a/cmds-fi-du.c +++ b/cmds-fi-du.c @@ -433,7 +433,6 @@ static int du_add_file(const char *filename, int dirfd, u64 file_total = 0; u64 file_shared = 0; u64 dir_set_shared = 0; - u64 subvol; int fd; DIR *dirstream = NULL; @@ -462,16 +461,24 @@ static int du_add_file(const char *filename, int dirfd, goto out; } - ret = lookup_path_rootid(fd, ); - if (ret) - goto out_close; + /* +* st.st_ino == BTRFS_EMPTY_SUBVOL_DIR_OBJECTID (==2) +* there is no any related tree +*/ + if (st.st_ino != BTRFS_EMPTY_SUBVOL_DIR_OBJECTID) { + u64 subvol; - if (inode_seen(st.st_ino, subvol)) - goto out_close; + ret = lookup_path_rootid(fd, ); + if (ret) + goto out_close; - ret = mark_inode_seen(st.st_ino, subvol); - if (ret) - goto out_close; + if (inode_seen(st.st_ino, subvol)) + goto out_close; + + ret = mark_inode_seen(st.st_ino, subvol); + if (ret) + goto out_close; + } if (S_ISREG(st.st_mode)) { ret = du_calc_file_space(fd, shared_extents, _total, diff --git a/ctree.h b/ctree.h index 48ae8903..2fc67b06 100644 --- a/ctree.h +++ b/ctree.h @@ -138,6 +138,8 @@ struct btrfs_free_space_ctl; */ #define BTRFS_DEV_ITEMS_OBJECTID 1ULL +#define BTRFS_EMPTY_SUBVOL_DIR_OBJECTID 2 + /* * the max metadata block size. This limit is somewhat artificial, * but the memmove costs go through the roof for larger blocks. -- 2.14.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs: incorrect invalid lookup_path_rootid() error handling
Hi All, Piotr and Chris, pointed me at these bugs which could be triggered by this test case: # btrfs sub create test1 # btrfs sub create test1/test2 # btrfs sub snap test1 test1.snap # btrfs fi du -s test1 Total Exclusive Set shared Filename 0.00B 0.00B 0.00B test1 # btrfs fi du -s test1.snap Total Exclusive Set shared Filename ERROR: cannot check space of 'test1.snap': Inappropriate ioctl for device These are two bugs: 1) in the function du_walk_dir() the error returned by du_add_file() normally is ignored. But if du_walk_dir() "walks" the last items, the error is returned to the caller 2) in the function du_add_file() it doesn't make sense to call lookup_path_rootid() when the inode is BTRFS_EMPTY_SUBVOL_DIR_OBJECTID: in this case the function doesn't return a valid value. BR G.Baroncelli -- gpg @keyserver.linux.it: Goffredo Baroncelli Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] Reset the ret value when ignore an error from du_add_file()
From: Goffredo BaroncelliIn du_walk_dir(), when du_add_file() returns an error it is usually ignored. However if the error is returned querying the last item, the error is returned to the caller. Signed-off-by: Goffredo Baroncelli --- cmds-fi-du.c | 1 + 1 file changed, 1 insertion(+) diff --git a/cmds-fi-du.c b/cmds-fi-du.c index f106f45b..4bf6af3e 100644 --- a/cmds-fi-du.c +++ b/cmds-fi-du.c @@ -403,6 +403,7 @@ static int du_walk_dir(struct du_dir_ctxt *ctxt, struct rb_root *shared_extents) shared_extents, , , 0); if (ret == -ENOTTY) { + ret = 0; continue; } else if (ret) { fprintf(stderr, -- 2.14.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs fi du -s gives Inappropriate ioctl for device
On 17.08.2017 23:59, Goffredo Baroncelli wrote: > Hi, > > On 08/17/2017 08:43 PM, Chris Murphy wrote: >> # btrfs sub create test1 >> Create subvolume './test1' >> # btrfs sub create test1/test2 >> Create subvolume 'test1/test2' >> # btrfs sub snap test1 test1.snap >> Create a snapshot of 'test1' in './test1.snap' >> # btrfs fi du -s test1 >> Total Exclusive Set shared Filename >> 0.00B 0.00B 0.00B test1 >> # btrfs fi du -s test1.snap >> Total Exclusive Set shared Filename >> ERROR: cannot check space of 'test1.snap': Inappropriate ioctl for device >> # >> > > tanks for the test case. Now I was able to reproduce the problem. The bug(s) > are two: > > 1) to get the treeid of the files/directory, the function > lookup_path_rootid() is used, which behaves strangely when it is called on a > directory that is BTRFS_EMPTY_SUBVOL_DIR_OBJECTID. In fact this function is > commented as following: > [] > /* > * For a given: > * - file or directory return the containing tree root id > * - subvolume return its own tree id > * - BTRFS_EMPTY_SUBVOL_DIR_OBJECTID (directory with ino == 2) the result is > * undefined and function returns -1 > */ > int lookup_path_rootid(int fd, u64 *rootid) > { > [] > > The caller (du_add_file()) doesn't consider this case. > > 2) in the function du_walk_dir(), an error returned by du_add_file() is > ignored, but the return value is not reset. So if the last entry has ino > BTRFS_EMPTY_SUBVOL_DIR_OBJECTID (directory with ino == 2) an error is > returned. But if there is another directory item no error is returned ! > > See the following tests cases: > > # btrfs sub create test1 > # btrfs sub create test1/test2 > # btrfs sub snap test1 test1.snap > # btrfs fi du -s test1 > Total Exclusive Set shared Filename > 0.00B 0.00B 0.00B test1 > # btrfs fi du -s test1.snap > Total Exclusive Set shared Filename > ERROR: cannot check space of 'test1.snap': Inappropriate ioctl for device > > But if you add *another* file/dir in test1.snap you got: > > # mkdir test1.snap/dir > # btrfs fi du -s test1.snap > Total Exclusive Set shared Filename > 0.00B 0.00B 0.00B test1.snap > > The error disappea> Patches will follow shortly It would be awesome if you manage to introduce xfstests for this case > >> >> # uname -r >> 4.13.0-0.rc4.git1.1.fc27.x86_64 >> # rpm -q btrfs-progs >> btrfs-progs-4.12-1.fc27.x86_64 >> >> >> >> Chris Murphy >> > > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html