Re: [PATCH 0/4] 3- and 4- copy RAID1
On 07/15/2018 04:37 PM, waxhead wrote: > David Sterba wrote: >> An interesting question is the naming of the extended profiles. I picked >> something that can be easily understood but it's not a final proposal. >> Years ago, Hugo proposed a naming scheme that described the >> non-standard raid varieties of the btrfs flavor: >> >> https://marc.info/?l=linux-btrfs=136286324417767 >> >> Switching to this naming would be a good addition to the extended raid. >> > As just a humble BTRFS user I agree and really think it is about time to move > far away from the RAID terminology. However adding some more descriptive > profile names (or at least some aliases) would be much better for the > commoners (such as myself). > > For example: > > Old format / New Format / My suggested alias > SINGLE / 1C / SINGLE > DUP / 2CD / DUP (or even MIRRORLOCAL1) > RAID0 / 1CmS / STRIPE > RAID1 / 2C / MIRROR1 > RAID1c3 / 3C / MIRROR2 > RAID1c4 / 4C / MIRROR3 > RAID10 / 2CmS / STRIPE.MIRROR1 Striping and mirroring/pairing are orthogonal properties; mirror and parity are mutually exclusive. What about RAID1 -> MIRROR1 RAID10 -> MIRROR1S RAID1c3 -> MIRROR2 RAID1c3+striping -> MIRROR2S and so on... > RAID5 / 1CmS1P / STRIPE.PARITY1 > RAID6 / 1CmS2P / STRIPE.PARITY2 To me these should be called something like RAID5 -> PARITY1S RAID6 -> PARITY2S The S final is due to the fact that usually RAID5/6 spread the data on all available disks Question #1: for "parity" profiles, does make sense to limit the maximum disks number where the data may be spread ? If the answer is not, we could omit the last S. IMHO it should. Question #2: historically RAID10 is requires 4 disks. However I am guessing if the stripe could be done on a different number of disks: What about RAID1+Striping on 3 (or 5 disks) ? The key of striping is that every 64k, the data are stored on a different disk > > I find that writing something like "btrfs balance start > -dconvert=stripe5.parity2 /mnt" is far less confusing and therefore less > error prone than writing "-dconvert=1C5S2P". > > While Hugo's suggestion is compact and to the point I would call for > expanding that so it is a bit more descriptive and human readable. > > So for example : STRIPE where obviously is the same as Hugo > proposed - the number of storage devices for the stripe and no would be > best to mean 'use max devices'. > For PARITY then is obviously required > > Keep in mind that most people (...and I am willing to bet even Duncan which > probably HAS backups ;) ) get a bit stressed when their storage system is > degraded. With that in mind I hope for more elaborate, descriptive and human > readable profile names to be used to avoid making mistakes using the > "compact" layout. > > ...and yes, of course this could go both ways. A more compact (and dare I say > cryptic) variant can cause people to stop and think before doing something > and thus avoid errors, > > Now that I made my point I can't help being a bit extra hash, obnoxious and > possibly difficult so I would also suggest that Hugo's format could have been > changed (dare I say improved?) from > > numCOPIESnumSTRIPESnumPARITY > > to. > > REPLICASnum.STRIPESnum.PARITYnum > > Which would make the above table look like so: > > Old format / My Format / My suggested alias > SINGLE / R0.S0.P0 / SINGLE > DUP / R1.S1.P0 / DUP (or even MIRRORLOCAL1) > RAID0 / R0.Sm.P0 / STRIPE > RAID1 / R1.S0.P0 / MIRROR1 > RAID1c3 / R2.S0.P0 / MIRROR2 > RAID1c4 / R3.S0.P0 / MIRROR3 > RAID10 / R1.Sm.P0 / STRIPE.MIRROR1 > RAID5 / R1.Sm.P1 / STRIPE.PARITY1 > RAID6 / R1.Sm.P2 / STRIPE.PARITY2 > > And i think this is much more readable, but others may disagree. And as a > side note... from a (hobby) coders perspective this is probably simpler to > parse as well. > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- gpg @keyserver.linux.it: Goffredo Baroncelli Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] btrfs-progs: check: orig: Don't panic out when unexpected root item is referring to one extent
On Thu, Jul 05, 2018 at 03:45:56PM +0800, Qu Wenruo wrote: > With crafted image, expected root item can refer to certain extent, and > original mode uses BUG_ON() to handle such case. > > Fix it by gracefully return error. > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=200403 > Signed-off-by: Qu Wenruo Please send cover lettter for patchsets with more than one patch. Patches 1-3 applied, thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RESEND 2/2] btrfs-progs: make all programs and libraries optional
On Mon, Jul 16, 2018 at 04:56:57PM +0200, David Sterba wrote: > On Thu, Jul 12, 2018 at 04:11:19PM -0700, Omar Sandoval wrote: > > From: Omar Sandoval > > > > We have a build system internally which only needs to build the > > libraries out of a repository, not any binaries. I looked at how this > > works with other projects, and the best example was util-linux, which > > makes it possible to enable or disable everything individually. This is > > nice and really flexible, so let's do the same. This way, if you only > > want to build and install libbtrfsutil, you can simply do > > > > ./configure --disable-documentation --disable-all-programs > > --enable-libbtrfsutil > > make > > make install > > I think this is an overkill and abusing the --enable-XXX options. You > want to avoid building the tools by default, so adding an option for > that is fine. Selectively building only certain tools can utilize that > option too and just follow with 'make btrfs-image' etc. Yeah, it's easy to build stuff selectively, but `make install` will still try to build everything, that's the part I'm more concerned with. > The number of --enable-* will stay minimal and we don't even have to > discuss how to find a good naming scheme (that works for util-linux but > looks a bit confusing for btrfs-progs). Ok, I can collapse these into just --disable-programs/--enable-programs, and --disable-libraries/--enable-libraries? That would be enough for me. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs-progs: tests/fuzz: Use correct suffix for raw image
On Thu, Jul 05, 2018 at 01:41:24PM +0800, Qu Wenruo wrote: > * Please fold this patch into original commit in devel branch * > > It's raw image, so it should be ".raw.xz" not ".img.xz" Ah right, thanks, commit updated. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] btrfs-progs: lowmem: fix false alerts of referencer count mismatch for blocks relocated
On Tue, Jul 03, 2018 at 03:58:50PM +0800, Su Yue wrote: > Btrfs lowmem check reports such false alerts: ... 1 and 2 applied, thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] btrfs-progs: check: enhanced progress indicator
On Wed, Jul 11, 2018 at 01:18:55PM +0800, Qu Wenruo wrote: > On 2018年07月05日 03:20, Stéphane Lesimple wrote: > > We reuse the task_position enum and task_ctx struct of the original progress > > indicator, adding more values and fields for our needs. > > > > Then add hooks in all steps of the check to properly record progress. > > > > Signed-off-by: Stéphane Lesimple > > Looks pretty good. > > Just some small nitpicks related to code style. The coding style rules need to be more relaxed for one-time or infrequent contributors so we don't scare them away. I go through such patches line by line and fix where needed but it's ok to point them out so I don't miss something. > > @@ -173,28 +161,48 @@ static int compare_extent_backref(struct rb_node > > *node1, struct rb_node *node2) > > return compare_tree_backref(node1, node2); > > } > > > > +static void print_status_check_line(void *p) > > +{ > > + struct task_ctx *priv = p; > > + char *task_position_string[] = { > > + "[1/7] checking root items ", > > + "[2/7] checking extents", > > + is_free_space_tree ? > > + "[3/7] checking free space tree": > > The extra intent makes it a little hard to align the output. > > What about using something like below to format the string? > "[%d/%d] %-20s" > > And this makes it more flex since lowmem and original mode has different > progress number (lowmem doesn't need to check root refs separately). I've kept it as is only did minor updates so it looks visually more compact. If you have further ideas how to clean up the code, feel free to send patches. > > > + "[3/7] checking free space cache ", > > + "[4/7] checking fs roots ", > > + check_data_csum ? > > + "[5/7] checking csums against data ": > > + "[5/7] checking csums (without verifying data) ", > > + "[6/7] checking root refs ", > > + "[7/7] checking quota groups ", > > + }; > > + > > + time_t elapsed = time(NULL) - priv->start_time; > > + int hours = elapsed / 3600; > > + elapsed-= hours * 3600; > > It's not common in btrfs-progs to mix declaration and code. Fixed. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Healthy amount of free space?
Greetings, I would like to ask what what is healthy amount of free space to keep on each device for btrfs to be happy? This is how my disk array currently looks like [root@dennas ~]# btrfs fi usage /raid Overall: Device size: 29.11TiB Device allocated: 21.26TiB Device unallocated:7.85TiB Device missing: 0.00B Used: 21.18TiB Free (estimated): 3.96TiB (min: 3.96TiB) Data ratio: 2.00 Metadata ratio: 2.00 Global reserve: 512.00MiB (used: 0.00B) Data,RAID1: Size:10.61TiB, Used:10.58TiB /dev/mapper/data1 1.75TiB /dev/mapper/data2 1.75TiB /dev/mapper/data3 856.00GiB /dev/mapper/data4 856.00GiB /dev/mapper/data5 1.75TiB /dev/mapper/data6 1.75TiB /dev/mapper/data7 6.29TiB /dev/mapper/data8 6.29TiB Metadata,RAID1: Size:15.00GiB, Used:13.00GiB /dev/mapper/data1 2.00GiB /dev/mapper/data2 3.00GiB /dev/mapper/data3 1.00GiB /dev/mapper/data4 1.00GiB /dev/mapper/data5 3.00GiB /dev/mapper/data6 1.00GiB /dev/mapper/data7 9.00GiB /dev/mapper/data8 10.00GiB System,RAID1: Size:64.00MiB, Used:1.50MiB /dev/mapper/data2 32.00MiB /dev/mapper/data6 32.00MiB /dev/mapper/data7 32.00MiB /dev/mapper/data8 32.00MiB Unallocated: /dev/mapper/data11004.52GiB /dev/mapper/data21004.49GiB /dev/mapper/data31006.01GiB /dev/mapper/data41006.01GiB /dev/mapper/data51004.52GiB /dev/mapper/data61004.49GiB /dev/mapper/data71005.00GiB /dev/mapper/data81005.00GiB Btrfs does quite good job of evenly using space on all devices. No, how low can I let that go? In other words, with how much space free/unallocated remaining space should I consider adding new disk? Thanks for advice :) W. -- There are only two hard things in Computer Science: cache invalidation, naming things and off-by-one errors. signature.asc Description: PGP signature
Re: [PATCH 0/4] 3- and 4- copy RAID1
waxhead wrote: David Sterba wrote: An interesting question is the naming of the extended profiles. I picked something that can be easily understood but it's not a final proposal. Years ago, Hugo proposed a naming scheme that described the non-standard raid varieties of the btrfs flavor: https://marc.info/?l=linux-btrfs=136286324417767 Switching to this naming would be a good addition to the extended raid. As just a humble BTRFS user I agree and really think it is about time to move far away from the RAID terminology. However adding some more descriptive profile names (or at least some aliases) would be much better for the commoners (such as myself). ...snip... > Which would make the above table look like so: Old format / My Format / My suggested alias SINGLE / R0.S0.P0 / SINGLE DUP / R1.S1.P0 / DUP (or even MIRRORLOCAL1) RAID0 / R0.Sm.P0 / STRIPE RAID1 / R1.S0.P0 / MIRROR1 RAID1c3 / R2.S0.P0 / MIRROR2 RAID1c4 / R3.S0.P0 / MIRROR3 RAID10 / R1.Sm.P0 / STRIPE.MIRROR1 RAID5 / R1.Sm.P1 / STRIPE.PARITY1 RAID6 / R1.Sm.P2 / STRIPE.PARITY2 And i think this is much more readable, but others may disagree. And as a side note... from a (hobby) coders perspective this is probably simpler to parse as well. ...snap... ...and before someone else points this out that my suggestion has an ugly flaw , I got a bit copy / paste happy and messed up the RAID 5 and 6 like profiles. The below table are corrected and hopefully it make the point why using the word 'replicas' is easier to understand than 'copies' even if I messed it up :) Old format / My Format / My suggested alias SINGLE / R0.S0.P0 / SINGLE DUP / R1.S1.P0 / DUP (or even MIRRORLOCAL1) RAID0 / R0.Sm.P0 / STRIPE RAID1 / R1.S0.P0 / MIRROR1 RAID1c3 / R2.S0.P0 / MIRROR2 RAID1c4 / R3.S0.P0 / MIRROR3 RAID10 / R1.Sm.P0 / STRIPE.MIRROR1 RAID5 / R0.Sm.P1 / STRIPE.PARITY1 RAID6 / R0.Sm.P2 / STRIPE.PARITY2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs-progs: inspect-internal: add option '-k u64,u8,u64' of dump-tree
On 2018年07月16日 13:55, Su Yue wrote: > For big filesystems, there are many items in trees(extent tree > specially). > For example, to dump one extent, we usually dump extent tree then pipe > result to grep. The time-consuming part is that dump tree traverses > items. And it eats cpu and memory too. > > This patch introduces an option '-k u64,u8,u64' for users(developer more > likely) to appoint a specific key to search and dump, then the path > from root node down the leaf will be dumped. > The search of the key costs most time of this way. Indeed a pretty useful debug oriented feature. > > Signed-off-by: Su Yue > --- > In my experiment, not usual case though. > Image was provided by Chris Murphy, thanks. > > #btrfs filesystem df /mnt > Data, single: total=767.00GiB, used=766.58GiB > System, DUP: total=32.00MiB, used=112.00KiB > Metadata, DUP: total=5.50GiB, used=4.69GiB > Metadata, single: total=16.00MiB, used=0.00B > GlobalReserve, single: total=512.00MiB, used=0.00B > > Before: > #time bash -c 'btrfs inspect-internal dump-tree -t 2 ~/test/test.img | grep > "1295194308608" >> /dev/null' > real0m34.024s > user0m3.269s > sys 0m4.047s > > Patched and use -k: > #time bash -c './btrfs inspect-internal dump-tree -t 2 -k 1295194308608,0,0 > ~/test/test.img >> /dev/null' > > real0m1.408s > user0m0.006s > sys 0m0.014s > > The new way is 34 times faster than 'grep' way, uses less memory and > cpu, and it prints path useful for debug. > > --- > cmds-inspect-dump-tree.c | 54 +-- > print-tree.c | 112 +++ > print-tree.h | 2 + > 3 files changed, 163 insertions(+), 5 deletions(-) > > diff --git a/cmds-inspect-dump-tree.c b/cmds-inspect-dump-tree.c > index c8acd55a0c3a..2a1a40c8270d 100644 > --- a/cmds-inspect-dump-tree.c > +++ b/cmds-inspect-dump-tree.c > @@ -184,6 +184,21 @@ static u64 treeid_from_string(const char *str, const > char **end) > return id; > } > > +static int parse_key(struct btrfs_key *key) > +{ > + > + int ret = sscanf(optarg, "%llu,%hhu,%llu", >objectid, > + >type, >offset); > + if (ret != 3) { > + error("error parsing key '%s'\n", optarg); > + ret = -EINVAL; > + } else { > + ret = 0; > + } > + > + return ret; > +} > + > const char * const cmd_inspect_dump_tree_usage[] = { > "btrfs inspect-internal dump-tree [options] device", > "Dump tree structures from a given device", > @@ -199,6 +214,7 @@ const char * const cmd_inspect_dump_tree_usage[] = { > "-u|--uuid print only the uuid tree", > "-b|--block print info from the specified block only", > "-t|--tree print only tree with the given id (string or > number)", > + "-k|--key search the specific key then print path, must > with -t", Shouldn't it conflict with -b and -r? > "--follow use with -b, to show all children tree blocks > of ", > NULL > }; > @@ -224,8 +240,10 @@ int cmd_inspect_dump_tree(int argc, char **argv) > unsigned open_ctree_flags; > u64 block_only = 0; > struct btrfs_root *tree_root_scan; > + struct btrfs_key spec_key = { 0 }; > u64 tree_id = 0; > bool follow = false; > + bool is_key_spec = false; What about @key_set? @is_key_spec looks a little strange here. > > /* >* For debug-tree, we care nothing about extent tree (it's just backref > @@ -248,10 +266,11 @@ int cmd_inspect_dump_tree(int argc, char **argv) > { "block", required_argument, NULL, 'b'}, > { "tree", required_argument, NULL, 't'}, > { "follow", no_argument, NULL, GETOPT_VAL_FOLLOW }, > + {"key", required_argument, NULL, 'k'}, Small alignment mismatch, missing a space before "key". > { NULL, 0, NULL, 0 } > }; > > - c = getopt_long(argc, argv, "deb:rRut:", long_options, NULL); > + c = getopt_long(argc, argv, "deb:rRut:k:", long_options, NULL); > if (c < 0) > break; > switch (c) { > @@ -300,6 +319,12 @@ int cmd_inspect_dump_tree(int argc, char **argv) > } > break; > } > + case 'k': > + ret = parse_key(_key); > + if (ret) > + exit(1); > + is_key_spec = true; > + break; > case GETOPT_VAL_FOLLOW: > follow = true; > break; > @@ -308,6 +333,9 @@ int cmd_inspect_dump_tree(int argc, char **argv) > } > } > > + if (!tree_id && is_key_spec) > + usage(cmd_inspect_dump_tree_usage); > + > if (check_argc_exact(argc - optind, 1)) >
Re: [PATCH] btrfs-progs: inspect-internal: add option '-k u64,u8,u64' of dump-tree
On 07/16/2018 03:45 PM, Qu Wenruo wrote: On 2018年07月16日 13:55, Su Yue wrote: For big filesystems, there are many items in trees(extent tree specially). For example, to dump one extent, we usually dump extent tree then pipe result to grep. The time-consuming part is that dump tree traverses items. And it eats cpu and memory too. This patch introduces an option '-k u64,u8,u64' for users(developer more likely) to appoint a specific key to search and dump, then the path from root node down the leaf will be dumped. The search of the key costs most time of this way. Indeed a pretty useful debug oriented feature. Signed-off-by: Su Yue --- In my experiment, not usual case though. Image was provided by Chris Murphy, thanks. #btrfs filesystem df /mnt Data, single: total=767.00GiB, used=766.58GiB System, DUP: total=32.00MiB, used=112.00KiB Metadata, DUP: total=5.50GiB, used=4.69GiB Metadata, single: total=16.00MiB, used=0.00B GlobalReserve, single: total=512.00MiB, used=0.00B Before: #time bash -c 'btrfs inspect-internal dump-tree -t 2 ~/test/test.img | grep "1295194308608" >> /dev/null' real0m34.024s user0m3.269s sys 0m4.047s Patched and use -k: #time bash -c './btrfs inspect-internal dump-tree -t 2 -k 1295194308608,0,0 ~/test/test.img >> /dev/null' real0m1.408s user0m0.006s sys 0m0.014s The new way is 34 times faster than 'grep' way, uses less memory and cpu, and it prints path useful for debug. --- cmds-inspect-dump-tree.c | 54 +-- print-tree.c | 112 +++ print-tree.h | 2 + 3 files changed, 163 insertions(+), 5 deletions(-) diff --git a/cmds-inspect-dump-tree.c b/cmds-inspect-dump-tree.c index c8acd55a0c3a..2a1a40c8270d 100644 --- a/cmds-inspect-dump-tree.c +++ b/cmds-inspect-dump-tree.c @@ -184,6 +184,21 @@ static u64 treeid_from_string(const char *str, const char **end) return id; } +static int parse_key(struct btrfs_key *key) +{ + + int ret = sscanf(optarg, "%llu,%hhu,%llu", >objectid, +>type, >offset); + if (ret != 3) { + error("error parsing key '%s'\n", optarg); + ret = -EINVAL; + } else { + ret = 0; + } + + return ret; +} + const char * const cmd_inspect_dump_tree_usage[] = { "btrfs inspect-internal dump-tree [options] device", "Dump tree structures from a given device", @@ -199,6 +214,7 @@ const char * const cmd_inspect_dump_tree_usage[] = { "-u|--uuid print only the uuid tree", "-b|--block print info from the specified block only", "-t|--tree print only tree with the given id (string or number)", + "-k|--key search the specific key then print path, must with -t", Shouldn't it conflict with -b and -r? "--follow use with -b, to show all children tree blocks of ", NULL }; @@ -224,8 +240,10 @@ int cmd_inspect_dump_tree(int argc, char **argv) unsigned open_ctree_flags; u64 block_only = 0; struct btrfs_root *tree_root_scan; + struct btrfs_key spec_key = { 0 }; u64 tree_id = 0; bool follow = false; + bool is_key_spec = false; What about @key_set? @is_key_spec looks a little strange here. Ok, will call it is_spec_key_set in next version. /* * For debug-tree, we care nothing about extent tree (it's just backref @@ -248,10 +266,11 @@ int cmd_inspect_dump_tree(int argc, char **argv) { "block", required_argument, NULL, 'b'}, { "tree", required_argument, NULL, 't'}, { "follow", no_argument, NULL, GETOPT_VAL_FOLLOW }, + {"key", required_argument, NULL, 'k'}, Small alignment mismatch, missing a space before "key". { NULL, 0, NULL, 0 } }; - c = getopt_long(argc, argv, "deb:rRut:", long_options, NULL); + c = getopt_long(argc, argv, "deb:rRut:k:", long_options, NULL); if (c < 0) break; switch (c) { @@ -300,6 +319,12 @@ int cmd_inspect_dump_tree(int argc, char **argv) } break; } + case 'k': + ret = parse_key(_key); + if (ret) + exit(1); + is_key_spec = true; + break; case GETOPT_VAL_FOLLOW: follow = true; break; @@ -308,6 +333,9 @@ int cmd_inspect_dump_tree(int argc, char **argv) } } + if (!tree_id && is_key_spec) + usage(cmd_inspect_dump_tree_usage); + if (check_argc_exact(argc - optind, 1)) usage(cmd_inspect_dump_tree_usage); @@ -398,7 +426,11 @@ again:
Re: Corrupted FS with "open_ctree failed" and "failed to recover balance: -5"
On 2018年07月16日 16:15, Udo Waechter wrote: > Hello, > > noone any ideas? Do you need more information? > > Cheers, > udo. > > On 11/07/18 17:37, Udo Waechter wrote: >> Hello everyone, >> >> I have a corrupted filesystem which I can't seem to recover. >> >> The machine is: >> Debian Linux, kernel 4.9 and btrfs-progs v4.13.3 >> >> I have a HDD RAID5 with LVM and the volume in question is a LVM volume. >> On top of that I had a RAID1 SSD cache with lvm-cache. >> >> Yesterday both! SSDs died within minutes. This lead to the corruped >> filesystem that I have now. >> >> I hope I followed the procedure correctly. >> >> What I tried so far: >> * "mount -o usebackuproot,ro " and "nospace_cache" "clear_cache" and all >> permutations of these mount options >> >> I'm getting: >> >> [96926.830400] BTRFS info (device dm-2): trying to use backup root at >> mount time >> [96926.830406] BTRFS info (device dm-2): disk space caching is enabled >> [96926.927978] BTRFS error (device dm-2): parent transid verify failed >> on 321269628928 wanted 3276017 found 3275985 >> [96926.938619] BTRFS error (device dm-2): parent transid verify failed >> on 321269628928 wanted 3276017 found 3275985 >> [96926.940705] BTRFS error (device dm-2): failed to recover balance: -5 This means your fs failed to recover the balance. And it should mostly be caused by transid error just one line above. Normally this means your fs is more or less corrupted, could be caused by powerloss or something else. >> [96926.985801] BTRFS error (device dm-2): open_ctree failed >> >> The weird thing is that I can't really find information about the >> "failed to recover balance: -5" error. - There was no rebalancing >> running when during the crash. Can only be determined by tree dump. # btrfs ins dump-tree -t root >> >> * btrfs-find-root: https://pastebin.com/qkjnSUF7 - It bothers me that I >> don't see any "good generations" as described here: >> https://btrfs.wiki.kernel.org/index.php/Restore >> >> * "btrfs rescue" - it starts, then goes to "looping on XYZ" then stops >> >> * "btrfs rescue super-recover -v" gives: >> >> All Devices: >> Device: id = 1, name = /dev/vg00/... >> Before Recovering: >> [All good supers]: >> device name = /dev/vg00/... >> superblock bytenr = 65536 >> >> device name = /dev/vg00/... >> superblock bytenr = 67108864 >> >> device name = /dev/vg00/... >> superblock bytenr = 274877906944 >> >> [All bad supers]: >> >> All supers are valid, no need to recover >> >> >> * Unfortunatly I did a "btrfs rescue zero-log" at some point :( - As it >> turns out that might have been a bad idea >> >> >> * Also, a "btrfs check --init-extent-tree" - https://pastebin.com/jATDCFZy Then it is making things worse, fortunately it should terminate before it causes more damage. I'm just curious why people doesn't try the safest "btrfs check" without any options, but goes the most dangerous option. And "btrfs check" output please. If possible, "btrfs check --mode=lowmem" is also good for debug. Thanks, Qu >> >> The volume contained qcow2 images for VMs. I need only one of those, >> since one piece of important software decided to not do backups :( >> >> Any help is highly appreciated. >> >> Many thanks, >> udo. >> > signature.asc Description: OpenPGP digital signature
Re: [PATCH] btrfs-progs: inspect-internal: add option '-k u64,u8,u64' of dump-tree
On 2018年07月16日 16:14, Su Yue wrote: > > > On 07/16/2018 03:45 PM, Qu Wenruo wrote: >> >> >> On 2018年07月16日 13:55, Su Yue wrote: >>> For big filesystems, there are many items in trees(extent tree >>> specially). >>> For example, to dump one extent, we usually dump extent tree then pipe >>> result to grep. The time-consuming part is that dump tree traverses >>> items. And it eats cpu and memory too. >>> >>> This patch introduces an option '-k u64,u8,u64' for users(developer more >>> likely) to appoint a specific key to search and dump, then the path >>> from root node down the leaf will be dumped. >>> The search of the key costs most time of this way. >> >> Indeed a pretty useful debug oriented feature. >> >>> >>> Signed-off-by: Su Yue >>> --- >>> In my experiment, not usual case though. >>> Image was provided by Chris Murphy, thanks. >>> >>> #btrfs filesystem df /mnt >>> Data, single: total=767.00GiB, used=766.58GiB >>> System, DUP: total=32.00MiB, used=112.00KiB >>> Metadata, DUP: total=5.50GiB, used=4.69GiB >>> Metadata, single: total=16.00MiB, used=0.00B >>> GlobalReserve, single: total=512.00MiB, used=0.00B >>> >>> Before: >>> #time bash -c 'btrfs inspect-internal dump-tree -t 2 ~/test/test.img >>> | grep "1295194308608" >> /dev/null' >>> real 0m34.024s >>> user 0m3.269s >>> sys 0m4.047s >>> >>> Patched and use -k: >>> #time bash -c './btrfs inspect-internal dump-tree -t 2 -k >>> 1295194308608,0,0 ~/test/test.img >> /dev/null' >>> >>> real 0m1.408s >>> user 0m0.006s >>> sys 0m0.014s >>> >>> The new way is 34 times faster than 'grep' way, uses less memory and >>> cpu, and it prints path useful for debug. >>> >>> --- >>> cmds-inspect-dump-tree.c | 54 +-- >>> print-tree.c | 112 +++ >>> print-tree.h | 2 + >>> 3 files changed, 163 insertions(+), 5 deletions(-) >>> >>> diff --git a/cmds-inspect-dump-tree.c b/cmds-inspect-dump-tree.c >>> index c8acd55a0c3a..2a1a40c8270d 100644 >>> --- a/cmds-inspect-dump-tree.c >>> +++ b/cmds-inspect-dump-tree.c >>> @@ -184,6 +184,21 @@ static u64 treeid_from_string(const char *str, >>> const char **end) >>> return id; >>> } >>> +static int parse_key(struct btrfs_key *key) >>> +{ >>> + >>> + int ret = sscanf(optarg, "%llu,%hhu,%llu", >objectid, >>> + >type, >offset); >>> + if (ret != 3) { >>> + error("error parsing key '%s'\n", optarg); >>> + ret = -EINVAL; >>> + } else { >>> + ret = 0; >>> + } >>> + >>> + return ret; >>> +} >>> + >>> const char * const cmd_inspect_dump_tree_usage[] = { >>> "btrfs inspect-internal dump-tree [options] device", >>> "Dump tree structures from a given device", >>> @@ -199,6 +214,7 @@ const char * const cmd_inspect_dump_tree_usage[] = { >>> "-u|--uuid print only the uuid tree", >>> "-b|--block print info from the specified block only", >>> "-t|--tree print only tree with the given id >>> (string or number)", >>> + "-k|--key search the specific key then print path, >>> must with -t", >> >> Shouldn't it conflict with -b and -r? >> >>> "--follow use with -b, to show all children tree >>> blocks of ", >>> NULL >>> }; >>> @@ -224,8 +240,10 @@ int cmd_inspect_dump_tree(int argc, char **argv) >>> unsigned open_ctree_flags; >>> u64 block_only = 0; >>> struct btrfs_root *tree_root_scan; >>> + struct btrfs_key spec_key = { 0 }; >>> u64 tree_id = 0; >>> bool follow = false; >>> + bool is_key_spec = false; >> >> What about @key_set? @is_key_spec looks a little strange here. >> > Ok, will call it is_spec_key_set in next version. > >>> /* >>> * For debug-tree, we care nothing about extent tree (it's just >>> backref >>> @@ -248,10 +266,11 @@ int cmd_inspect_dump_tree(int argc, char **argv) >>> { "block", required_argument, NULL, 'b'}, >>> { "tree", required_argument, NULL, 't'}, >>> { "follow", no_argument, NULL, GETOPT_VAL_FOLLOW }, >>> + {"key", required_argument, NULL, 'k'}, >> >> Small alignment mismatch, missing a space before "key". >> >>> { NULL, 0, NULL, 0 } >>> }; >>> - c = getopt_long(argc, argv, "deb:rRut:", long_options, NULL); >>> + c = getopt_long(argc, argv, "deb:rRut:k:", long_options, NULL); >>> if (c < 0) >>> break; >>> switch (c) { >>> @@ -300,6 +319,12 @@ int cmd_inspect_dump_tree(int argc, char **argv) >>> } >>> break; >>> } >>> + case 'k': >>> + ret = parse_key(_key); >>> + if (ret) >>> + exit(1); >>> + is_key_spec = true; >>> + break; >>> case GETOPT_VAL_FOLLOW: >>> follow = true; >>> break; >>> @@ -308,6 +333,9 @@ int cmd_inspect_dump_tree(int argc, char **argv) >>>
Re: Corrupted FS with "open_ctree failed" and "failed to recover balance: -5"
Hello, noone any ideas? Do you need more information? Cheers, udo. On 11/07/18 17:37, Udo Waechter wrote: > Hello everyone, > > I have a corrupted filesystem which I can't seem to recover. > > The machine is: > Debian Linux, kernel 4.9 and btrfs-progs v4.13.3 > > I have a HDD RAID5 with LVM and the volume in question is a LVM volume. > On top of that I had a RAID1 SSD cache with lvm-cache. > > Yesterday both! SSDs died within minutes. This lead to the corruped > filesystem that I have now. > > I hope I followed the procedure correctly. > > What I tried so far: > * "mount -o usebackuproot,ro " and "nospace_cache" "clear_cache" and all > permutations of these mount options > > I'm getting: > > [96926.830400] BTRFS info (device dm-2): trying to use backup root at > mount time > [96926.830406] BTRFS info (device dm-2): disk space caching is enabled > [96926.927978] BTRFS error (device dm-2): parent transid verify failed > on 321269628928 wanted 3276017 found 3275985 > [96926.938619] BTRFS error (device dm-2): parent transid verify failed > on 321269628928 wanted 3276017 found 3275985 > [96926.940705] BTRFS error (device dm-2): failed to recover balance: -5 > [96926.985801] BTRFS error (device dm-2): open_ctree failed > > The weird thing is that I can't really find information about the > "failed to recover balance: -5" error. - There was no rebalancing > running when during the crash. > > * btrfs-find-root: https://pastebin.com/qkjnSUF7 - It bothers me that I > don't see any "good generations" as described here: > https://btrfs.wiki.kernel.org/index.php/Restore > > * "btrfs rescue" - it starts, then goes to "looping on XYZ" then stops > > * "btrfs rescue super-recover -v" gives: > > All Devices: > Device: id = 1, name = /dev/vg00/... > Before Recovering: > [All good supers]: > device name = /dev/vg00/... > superblock bytenr = 65536 > > device name = /dev/vg00/... > superblock bytenr = 67108864 > > device name = /dev/vg00/... > superblock bytenr = 274877906944 > > [All bad supers]: > > All supers are valid, no need to recover > > > * Unfortunatly I did a "btrfs rescue zero-log" at some point :( - As it > turns out that might have been a bad idea > > > * Also, a "btrfs check --init-extent-tree" - https://pastebin.com/jATDCFZy > > The volume contained qcow2 images for VMs. I need only one of those, > since one piece of important software decided to not do backups :( > > Any help is highly appreciated. > > Many thanks, > udo. > signature.asc Description: OpenPGP digital signature
Re: [PATCH v2] btrfs-progs: inspect-internal: add option '-k u64,u8,u64' of dump-tree
On 2018年07月16日 17:13, Su Yue wrote: > For big filesystems, there are many items in trees(extent tree > specially). > For example, to dump one extent, we usually dump extent tree then pipe > result to grep. The time-consuming part is that dump tree traverses > items. And it eats cpu and memory too. > > This patch introduces an option '-k u64,u8,u64' for users(developers more > likely) to appoint a specific key to search and dump, then the path > from root node down the leaf will be dumped. > The search of the key costs most time of this way. > > Signed-off-by: Su Yue Looks pretty good, just small nitpicks. Basically you could add my reviewed-by tag: Reviewed-by: Qu Wenruo > --- > Change log: > v2: > Rename btrfs_search_spec_key() to btrfs_search_print_path(). > Rename is_key_spec to is_spec_key_set. > Move btrfs_search_print_path() after open_ctree(), call it only once. > Remove duplicate code and replaced with btrfs_print_tree(). > Modify the usage about '-k'. > All are suggested by Qu Wenruo. > > cmds-inspect-dump-tree.c | 41 - > print-tree.c | 77 > print-tree.h | 2 ++ > 3 files changed, 119 insertions(+), 1 deletion(-) > > diff --git a/cmds-inspect-dump-tree.c b/cmds-inspect-dump-tree.c > index c8acd55a0c3a..80df826af0fd 100644 > --- a/cmds-inspect-dump-tree.c > +++ b/cmds-inspect-dump-tree.c > @@ -184,6 +184,21 @@ static u64 treeid_from_string(const char *str, const > char **end) > return id; > } > > +static int parse_key(struct btrfs_key *key) > +{ > + > + int ret = sscanf(optarg, "%llu,%hhu,%llu", >objectid, > + >type, >offset); > + if (ret != 3) { > + error("error parsing key '%s'\n", optarg); > + ret = -EINVAL; > + } else { > + ret = 0; > + } > + > + return ret; > +} > + > const char * const cmd_inspect_dump_tree_usage[] = { > "btrfs inspect-internal dump-tree [options] device", > "Dump tree structures from a given device", > @@ -199,6 +214,7 @@ const char * const cmd_inspect_dump_tree_usage[] = { > "-u|--uuid print only the uuid tree", > "-b|--block print info from the specified block only", > "-t|--tree print only tree with the given id (string or > number)", > + "-k|--key search the specific key then print path, must > with -t(conflicts with others)", What are the "others"? At least for me, other includes -r and -b. > "--follow use with -b, to show all children tree blocks > of ", > NULL > }; > @@ -224,8 +240,10 @@ int cmd_inspect_dump_tree(int argc, char **argv) > unsigned open_ctree_flags; > u64 block_only = 0; > struct btrfs_root *tree_root_scan; > + struct btrfs_key spec_key = { 0 }; > u64 tree_id = 0; > bool follow = false; > + bool is_spec_key_set = false; > > /* >* For debug-tree, we care nothing about extent tree (it's just backref > @@ -248,10 +266,11 @@ int cmd_inspect_dump_tree(int argc, char **argv) > { "block", required_argument, NULL, 'b'}, > { "tree", required_argument, NULL, 't'}, > { "follow", no_argument, NULL, GETOPT_VAL_FOLLOW }, > + { "key", required_argument, NULL, 'k'}, > { NULL, 0, NULL, 0 } > }; > > - c = getopt_long(argc, argv, "deb:rRut:", long_options, NULL); > + c = getopt_long(argc, argv, "deb:rRut:k:", long_options, NULL); > if (c < 0) > break; > switch (c) { > @@ -300,6 +319,12 @@ int cmd_inspect_dump_tree(int argc, char **argv) > } > break; > } > + case 'k': > + ret = parse_key(_key); > + if (ret) > + exit(1); > + is_spec_key_set = true; > + break; > case GETOPT_VAL_FOLLOW: > follow = true; > break; > @@ -308,6 +333,9 @@ int cmd_inspect_dump_tree(int argc, char **argv) > } > } > > + if (!tree_id && is_spec_key_set) > + usage(cmd_inspect_dump_tree_usage); > + And options conflict should also be done here. Thanks, Qu > if (check_argc_exact(argc - optind, 1)) > usage(cmd_inspect_dump_tree_usage); > > @@ -325,6 +353,17 @@ int cmd_inspect_dump_tree(int argc, char **argv) > goto out; > } > > + if (is_spec_key_set) { > + root = info->tree_root; > + if (IS_ERR(root) || !root) { > + ret = root ? PTR_ERR(root) : -1; > + error("unable to open %s", argv[optind]); > + goto out; > + } > + > +
[PATCH v2] btrfs-progs: inspect-internal: add option '-k u64,u8,u64' of dump-tree
For big filesystems, there are many items in trees(extent tree specially). For example, to dump one extent, we usually dump extent tree then pipe result to grep. The time-consuming part is that dump tree traverses items. And it eats cpu and memory too. This patch introduces an option '-k u64,u8,u64' for users(developers more likely) to appoint a specific key to search and dump, then the path from root node down the leaf will be dumped. The search of the key costs most time of this way. Signed-off-by: Su Yue --- Change log: v2: Rename btrfs_search_spec_key() to btrfs_search_print_path(). Rename is_key_spec to is_spec_key_set. Move btrfs_search_print_path() after open_ctree(), call it only once. Remove duplicate code and replaced with btrfs_print_tree(). Modify the usage about '-k'. All are suggested by Qu Wenruo. cmds-inspect-dump-tree.c | 41 - print-tree.c | 77 print-tree.h | 2 ++ 3 files changed, 119 insertions(+), 1 deletion(-) diff --git a/cmds-inspect-dump-tree.c b/cmds-inspect-dump-tree.c index c8acd55a0c3a..80df826af0fd 100644 --- a/cmds-inspect-dump-tree.c +++ b/cmds-inspect-dump-tree.c @@ -184,6 +184,21 @@ static u64 treeid_from_string(const char *str, const char **end) return id; } +static int parse_key(struct btrfs_key *key) +{ + + int ret = sscanf(optarg, "%llu,%hhu,%llu", >objectid, +>type, >offset); + if (ret != 3) { + error("error parsing key '%s'\n", optarg); + ret = -EINVAL; + } else { + ret = 0; + } + + return ret; +} + const char * const cmd_inspect_dump_tree_usage[] = { "btrfs inspect-internal dump-tree [options] device", "Dump tree structures from a given device", @@ -199,6 +214,7 @@ const char * const cmd_inspect_dump_tree_usage[] = { "-u|--uuid print only the uuid tree", "-b|--block print info from the specified block only", "-t|--tree print only tree with the given id (string or number)", + "-k|--key search the specific key then print path, must with -t(conflicts with others)", "--follow use with -b, to show all children tree blocks of ", NULL }; @@ -224,8 +240,10 @@ int cmd_inspect_dump_tree(int argc, char **argv) unsigned open_ctree_flags; u64 block_only = 0; struct btrfs_root *tree_root_scan; + struct btrfs_key spec_key = { 0 }; u64 tree_id = 0; bool follow = false; + bool is_spec_key_set = false; /* * For debug-tree, we care nothing about extent tree (it's just backref @@ -248,10 +266,11 @@ int cmd_inspect_dump_tree(int argc, char **argv) { "block", required_argument, NULL, 'b'}, { "tree", required_argument, NULL, 't'}, { "follow", no_argument, NULL, GETOPT_VAL_FOLLOW }, + { "key", required_argument, NULL, 'k'}, { NULL, 0, NULL, 0 } }; - c = getopt_long(argc, argv, "deb:rRut:", long_options, NULL); + c = getopt_long(argc, argv, "deb:rRut:k:", long_options, NULL); if (c < 0) break; switch (c) { @@ -300,6 +319,12 @@ int cmd_inspect_dump_tree(int argc, char **argv) } break; } + case 'k': + ret = parse_key(_key); + if (ret) + exit(1); + is_spec_key_set = true; + break; case GETOPT_VAL_FOLLOW: follow = true; break; @@ -308,6 +333,9 @@ int cmd_inspect_dump_tree(int argc, char **argv) } } + if (!tree_id && is_spec_key_set) + usage(cmd_inspect_dump_tree_usage); + if (check_argc_exact(argc - optind, 1)) usage(cmd_inspect_dump_tree_usage); @@ -325,6 +353,17 @@ int cmd_inspect_dump_tree(int argc, char **argv) goto out; } + if (is_spec_key_set) { + root = info->tree_root; + if (IS_ERR(root) || !root) { + ret = root ? PTR_ERR(root) : -1; + error("unable to open %s", argv[optind]); + goto out; + } + + btrfs_search_print_path(info, tree_id, _key); + goto close_root; + } if (block_only) { root = info->chunk_root; leaf = read_tree_block(info, block_only, 0); diff --git a/print-tree.c b/print-tree.c index a09ecfbb28f0..3868e42d5d29 100644 --- a/print-tree.c +++ b/print-tree.c @@ -1459,3 +1459,80 @@ void btrfs_print_tree(struct extent_buffer *eb, int
[PATCH v5 2/2] btrfs: get device pointer from btrfs_scan_one_device
Instead of pointer to btrfs_fs_devices as an arg in btrfs_scan_one_device, better to make it as a return value. And since btrfs_fs_devices can be get by btrfs_device, better to return btrfs_device than fs_btrfs_devices. Signed-off-by: Gu Jinxiang --- Changelog: v5: rebase to misc-next. v4: as suggested by Anand, change return value of btrfs_scan_one_device from btrfs_fs_devices to btrfs_device. v3: as comment by robot, use PTR_ERR_OR_ZERO, and rebase to misc-next. v2: as comment by Nikolay, use ERR_CAST instead of cast type manually. fs/btrfs/super.c | 35 ++- fs/btrfs/volumes.c | 22 -- fs/btrfs/volumes.h | 4 ++-- 3 files changed, 32 insertions(+), 29 deletions(-) diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index dcab4a0244e5..413bbe2d6db2 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -888,7 +888,7 @@ static int btrfs_parse_early_options(const char *options, fmode_t flags, { substring_t args[MAX_OPT_ARGS]; char *device_name, *opts, *orig, *p; - struct btrfs_fs_devices *fs_devices = NULL; + struct btrfs_device *device = NULL; int error = 0; lockdep_assert_held(_mutex); @@ -918,11 +918,13 @@ static int btrfs_parse_early_options(const char *options, fmode_t flags, error = -ENOMEM; goto out; } - error = btrfs_scan_one_device(device_name, flags, - holder, _devices); + device = btrfs_scan_one_device(device_name, flags, + holder); kfree(device_name); - if (error) + if (IS_ERR(device)) { + error = PTR_ERR(device); goto out; + } } } @@ -1518,6 +1520,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type, { struct block_device *bdev = NULL; struct super_block *s; + struct btrfs_device *device = NULL; struct btrfs_fs_devices *fs_devices = NULL; struct btrfs_fs_info *fs_info = NULL; struct security_mnt_opts new_sec_opts; @@ -1561,12 +1564,15 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type, goto error_fs_info; } - error = btrfs_scan_one_device(device_name, mode, fs_type, _devices); - if (error) { + device = btrfs_scan_one_device(device_name, mode, fs_type); + if (IS_ERR(device)) { + error = PTR_ERR(device); mutex_unlock(_mutex); goto error_fs_info; } + fs_devices = device->fs_devices; + fs_info->fs_devices = fs_devices; error = btrfs_open_devices(fs_devices, mode, fs_type); @@ -2227,7 +2233,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { struct btrfs_ioctl_vol_args *vol; - struct btrfs_fs_devices *fs_devices; + struct btrfs_device *device = NULL; int ret = -ENOTTY; if (!capable(CAP_SYS_ADMIN)) @@ -2240,19 +2246,22 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd, switch (cmd) { case BTRFS_IOC_SCAN_DEV: mutex_lock(_mutex); - ret = btrfs_scan_one_device(vol->name, FMODE_READ, - _root_fs_type, _devices); + device = btrfs_scan_one_device(vol->name, FMODE_READ, + _root_fs_type); + ret = PTR_ERR_OR_ZERO(device); mutex_unlock(_mutex); break; case BTRFS_IOC_DEVICES_READY: mutex_lock(_mutex); - ret = btrfs_scan_one_device(vol->name, FMODE_READ, - _root_fs_type, _devices); - if (ret) { + device = btrfs_scan_one_device(vol->name, FMODE_READ, + _root_fs_type); + if (IS_ERR(device)) { + ret = PTR_ERR(device); mutex_unlock(_mutex); break; } - ret = !(fs_devices->num_devices == fs_devices->total_devices); + ret = !(device->fs_devices->num_devices == + device->fs_devices->total_devices); mutex_unlock(_mutex); break; case BTRFS_IOC_GET_SUPPORTED_FEATURES: diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 88d37bfa99c8..9f485696461b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1213,15 +1213,14 @@ static int btrfs_read_disk_super(struct block_device *bdev, u64 bytenr, * and we are
[PATCH v5 1/2] btrfs: make fs_devices to be a local variable
fs_devices is always passed to btrfs_scan_one_device which overrides it. And in the call stack below fs_devices is passed to btrfs_scan_one_device from btrfs_mount_root. And in btrfs_mount_root the output fs_devices of this call stack is not used. btrfs_mount_root -> btrfs_parse_early_options ->btrfs_scan_one_device So, there is no necessary to pass fs_devices from btrfs_mount_root, use a local variable in btrfs_parse_early_options is enough. Signed-off-by: Gu Jinxiang Reviewed-by: Anand Jain --- Changelog: v5: change a line wrap, and rebase to misc-next. v4: changed a line warp, and adjusted the order of two rows. v3: rebase to misc-next. v2: deal with Nikolay's comment, make changelog more clair. fs/btrfs/super.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 28ab75ebb983..dcab4a0244e5 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -884,10 +884,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options, * only when we need to allocate a new super block. */ static int btrfs_parse_early_options(const char *options, fmode_t flags, - void *holder, struct btrfs_fs_devices **fs_devices) + void *holder) { substring_t args[MAX_OPT_ARGS]; char *device_name, *opts, *orig, *p; + struct btrfs_fs_devices *fs_devices = NULL; int error = 0; lockdep_assert_held(_mutex); @@ -917,8 +918,8 @@ static int btrfs_parse_early_options(const char *options, fmode_t flags, error = -ENOMEM; goto out; } - error = btrfs_scan_one_device(device_name, - flags, holder, fs_devices); + error = btrfs_scan_one_device(device_name, flags, + holder, _devices); kfree(device_name); if (error) goto out; @@ -1554,7 +1555,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type, } mutex_lock(_mutex); - error = btrfs_parse_early_options(data, mode, fs_type, _devices); + error = btrfs_parse_early_options(data, mode, fs_type); if (error) { mutex_unlock(_mutex); goto error_fs_info; -- 2.17.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v5 0/2] btrfs: do some parameters optimization
Do parameters optimization for function btrfs_parse_early_options and btrfs_scan_one_device. Gu Jinxiang (2): btrfs: make fs_devices to be a local variable btrfs: get device pointer from btrfs_scan_one_device fs/btrfs/super.c | 38 -- fs/btrfs/volumes.c | 22 -- fs/btrfs/volumes.h | 4 ++-- 3 files changed, 34 insertions(+), 30 deletions(-) -- 2.17.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 2/2] btrfs: get device pointer from btrfs_scan_one_device
:: @@ -1561,12 +1564,15 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type, goto error_fs_info; } - error = btrfs_scan_one_device(device_name, mode, fs_type, _devices); - if (error) { + device = btrfs_scan_one_device(device_name, mode, fs_type); + if (IS_ERR(device)) { + error = PTR_ERR(device); mutex_unlock(_mutex); goto error_fs_info; } + fs_devices = device->fs_devices; + The version at misc-next is missing this assignment and it panics. :: Thanks, Anand -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH v5 2/2] btrfs: get device pointer from btrfs_scan_one_device
> -Original Message- > From: David Sterba [mailto:dste...@suse.cz] > Sent: Monday, July 16, 2018 8:34 PM > To: Gu, Jinxiang/顾 金香 > Cc: linux-btrfs@vger.kernel.org; anand.j...@oracle.com > Subject: Re: [PATCH v5 2/2] btrfs: get device pointer from > btrfs_scan_one_device > > On Mon, Jul 16, 2018 at 05:11:17PM +0800, Gu Jinxiang wrote: > > Instead of pointer to btrfs_fs_devices as an arg in > > btrfs_scan_one_device, better to make it as a return value. > > > > And since btrfs_fs_devices can be get by btrfs_device, > > better to return btrfs_device than fs_btrfs_devices. > > > > Signed-off-by: Gu Jinxiang > > --- > > > > Changelog: > > v5: rebase to misc-next. > > v4: as suggested by Anand, change return value of > > btrfs_scan_one_device from btrfs_fs_devices to btrfs_device. > > v3: as comment by robot, use PTR_ERR_OR_ZERO, and rebase to misc-next. > > v2: as comment by Nikolay, use ERR_CAST instead of cast type manually. > > > > fs/btrfs/super.c | 35 ++- > > fs/btrfs/volumes.c | 22 -- > > fs/btrfs/volumes.h | 4 ++-- > > 3 files changed, 32 insertions(+), 29 deletions(-) > > > > diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c > > index dcab4a0244e5..413bbe2d6db2 100644 > > --- a/fs/btrfs/super.c > > +++ b/fs/btrfs/super.c > > @@ -888,7 +888,7 @@ static int btrfs_parse_early_options(const char > > *options, fmode_t flags, > > { > > substring_t args[MAX_OPT_ARGS]; > > char *device_name, *opts, *orig, *p; > > - struct btrfs_fs_devices *fs_devices = NULL; > > + struct btrfs_device *device = NULL; > > int error = 0; > > > > lockdep_assert_held(_mutex); > > @@ -918,11 +918,13 @@ static int btrfs_parse_early_options(const char > > *options, fmode_t flags, > > error = -ENOMEM; > > goto out; > > } > > - error = btrfs_scan_one_device(device_name, flags, > > - holder, _devices); > > + device = btrfs_scan_one_device(device_name, flags, > > + holder); > > kfree(device_name); > > - if (error) > > + if (IS_ERR(device)) { > > + error = PTR_ERR(device); > > goto out; > > + } > > } > > } > > > > @@ -1518,6 +1520,7 @@ static struct dentry *btrfs_mount_root(struct > > file_system_type *fs_type, > > { > > struct block_device *bdev = NULL; > > struct super_block *s; > > + struct btrfs_device *device = NULL; > > struct btrfs_fs_devices *fs_devices = NULL; > > struct btrfs_fs_info *fs_info = NULL; > > struct security_mnt_opts new_sec_opts; > > @@ -1561,12 +1564,15 @@ static struct dentry *btrfs_mount_root(struct > > file_system_type *fs_type, > > goto error_fs_info; > > } > > > > - error = btrfs_scan_one_device(device_name, mode, fs_type, _devices); > > - if (error) { > > + device = btrfs_scan_one_device(device_name, mode, fs_type); > > + if (IS_ERR(device)) { > > + error = PTR_ERR(device); > > mutex_unlock(_mutex); > > goto error_fs_info; > > } > > > > + fs_devices = device->fs_devices; > > + > > fs_info->fs_devices = fs_devices; > > > > error = btrfs_open_devices(fs_devices, mode, fs_type); > > @@ -2227,7 +2233,7 @@ static long btrfs_control_ioctl(struct file *file, > > unsigned int cmd, > > unsigned long arg) > > { > > struct btrfs_ioctl_vol_args *vol; > > - struct btrfs_fs_devices *fs_devices; > > + struct btrfs_device *device = NULL; > > int ret = -ENOTTY; > > > > if (!capable(CAP_SYS_ADMIN)) > > @@ -2240,19 +2246,22 @@ static long btrfs_control_ioctl(struct file *file, > > unsigned int cmd, > > switch (cmd) { > > case BTRFS_IOC_SCAN_DEV: > > mutex_lock(_mutex); > > - ret = btrfs_scan_one_device(vol->name, FMODE_READ, > > - _root_fs_type, _devices); > > + device = btrfs_scan_one_device(vol->name, FMODE_READ, > > + _root_fs_type); > > + ret = PTR_ERR_OR_ZERO(device); > > mutex_unlock(_mutex); > > break; > > case BTRFS_IOC_DEVICES_READY: > > mutex_lock(_mutex); > > - ret = btrfs_scan_one_device(vol->name, FMODE_READ, > > - _root_fs_type, _devices); > > - if (ret) { > > + device = btrfs_scan_one_device(vol->name, FMODE_READ, > > + _root_fs_type); > > + if (IS_ERR(device)) { > > + ret = PTR_ERR(device); > > mutex_unlock(_mutex); > > break; > > } > > - ret = !(fs_devices->num_devices
Re: [PATCH 0/2] btrfs-progs: check: enhanced progress indicator
On Wed, Jul 04, 2018 at 09:20:12PM +0200, Stéphane Lesimple wrote: > This patch replaces the current ".oOo."-style progress indicator of > btrfs check (-p), by a more helpful live indication of the check progress. > I've been using it on a custom btrfs-progs version since 2015. > > Here's how the output looks like on a 22 Tb 5-disk RAID1 FS: > > # btrfs check -p /dev/mapper/luks-ST1VN0004- > Opening filesystem to check... > Checking filesystem on /dev/mapper/luks-ST1VN0004- > UUID: ---- > [1/7] checking extents (0:20:21 elapsed, 950958 items checked) > [2/7] checking root items(0:01:29 elapsed, 15121 items checked) > [3/7] checking free space cache (0:00:11 elapsed, 4928 items checked) > [4/7] checking fs roots (0:51:31 elapsed, 600892 items checked) > [5/7] checking csums (0:14:35 elapsed, 754522 items checked) > [6/7] checking root refs (0:00:00 elapsed, 232 items checked) > [7/7] checking quota groups skipped (not enabled on this FS) > found 5286458060800 bytes used, no error found Nice, this looks better than what we have now. Patches applied, with some minor coding style adjustments and I changed the update period back to 1 second, 50ms is IMHO too frequent for an operation that can run for tens of minutes. Also as a safety against dumping too much information to the log into if the output is redirected. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 2018-07-16 14:29, Goffredo Baroncelli wrote: On 07/15/2018 04:37 PM, waxhead wrote: David Sterba wrote: An interesting question is the naming of the extended profiles. I picked something that can be easily understood but it's not a final proposal. Years ago, Hugo proposed a naming scheme that described the non-standard raid varieties of the btrfs flavor: https://marc.info/?l=linux-btrfs=136286324417767 Switching to this naming would be a good addition to the extended raid. As just a humble BTRFS user I agree and really think it is about time to move far away from the RAID terminology. However adding some more descriptive profile names (or at least some aliases) would be much better for the commoners (such as myself). For example: Old format / New Format / My suggested alias SINGLE / 1C / SINGLE DUP / 2CD / DUP (or even MIRRORLOCAL1) RAID0 / 1CmS / STRIPE RAID1 / 2C / MIRROR1 RAID1c3 / 3C / MIRROR2 RAID1c4 / 4C / MIRROR3 RAID10 / 2CmS / STRIPE.MIRROR1 Striping and mirroring/pairing are orthogonal properties; mirror and parity are mutually exclusive. What about RAID1 -> MIRROR1 RAID10 -> MIRROR1S RAID1c3 -> MIRROR2 RAID1c3+striping -> MIRROR2S and so on... RAID5 / 1CmS1P / STRIPE.PARITY1 RAID6 / 1CmS2P / STRIPE.PARITY2 To me these should be called something like RAID5 -> PARITY1S RAID6 -> PARITY2S The S final is due to the fact that usually RAID5/6 spread the data on all available disks Question #1: for "parity" profiles, does make sense to limit the maximum disks number where the data may be spread ? If the answer is not, we could omit the last S. IMHO it should. Currently, there is no ability to cap the number of disks that striping can happen across. Ideally, that will change in the future, in which case not only the S will be needed, but also a number indicating how wide the stripe is. Question #2: historically RAID10 is requires 4 disks. However I am guessing if the stripe could be done on a different number of disks: What about RAID1+Striping on 3 (or 5 disks) ? The key of striping is that every 64k, the data are stored on a different disk This is what MD and LVM RAID10 do. They work somewhat differently from what BTRFS calls raid10 (actually, what we currently call raid1 works almost identically to MD and LVM RAID10 when more than 3 disks are involved, except that the chunk size is 1G or larger). Short of drastic internal changes to how that profile works, this isn't likely to happen. In spite of both of these, there is practical need for indicating the stripe width. Depending on the configuration of the underlying storage, it's fully possible (and sometimes even certain) that you will see chunks with differing stripe widths, so properly reporting the stripe width (in devices, not bytes) is useful for monitoring purposes). Consider for example a 6-device array using what's currently called a raid10 profile where 2 of the disks are smaller than the other four. On such an array, chunks will span all six disks (resulting in 2 copies striped across 3 disks each) until those two smaller disks are full, at which point new chunks will span only the remaining four disks (resulting in 2 copies striped across 2 disks each). -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] fstests: btrfs/168 verify device ready after device delete
On Fri, Jul 06, 2018 at 02:01:37PM +0800, Anand Jain wrote: > This test case verifies if the device ready return success after the > device delete. > > Signed-off-by: Anand Jain Need some helps from btrfs folks to see if it's a valid test. Thanks! Eryu > --- > v1->v2: use _run_btrfs_util_prog instead of open coding it. > > tests/btrfs/168 | 68 > + > tests/btrfs/168.out | 2 ++ > tests/btrfs/group | 1 + > 3 files changed, 71 insertions(+) > create mode 100755 tests/btrfs/168 > create mode 100644 tests/btrfs/168.out > > diff --git a/tests/btrfs/168 b/tests/btrfs/168 > new file mode 100755 > index ..0d3e99839209 > --- /dev/null > +++ b/tests/btrfs/168 > @@ -0,0 +1,68 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (C) 2018 Oracle. All Rights Reserved. > +# > +# FS QA Test 168 > +# > +# Test if btrfs is still reported ready after the device delete > +# > +# This could be fixed by the following kernel commit: > +# btrfs: fix missing superblock update in the device delete commit > transaction > +# > +seq=`basename $0` > +seqres=$RESULT_DIR/$seq > +echo "QA output created by $seq" > + > +here=`pwd` > +tmp=/tmp/$$ > +status=1 # failure is the default! > +trap "_cleanup; exit \$status" 0 1 2 3 15 > + > +_cleanup() > +{ > + cd / > + rm -f $tmp.* > +} > + > +# get standard environment, filters and checks > +. ./common/rc > +. ./common/filter > + > +# remove previous $seqres.full before test > +rm -f $seqres.full > + > +# real QA test starts here > + > +# Modify as appropriate. > +_supported_fs btrfs > +_supported_os Linux > +_require_command "$BTRFS_TUNE_PROG" btrfstune > +_require_scratch_dev_pool 2 > + > +_scratch_dev_pool_get 2 > +dev_1=$(echo $SCRATCH_DEV_POOL | awk '{print $1}') > +dev_2=$(echo $SCRATCH_DEV_POOL | awk '{print $2}') > + > +# normal delete device and then check for ready > +run_check _scratch_pool_mkfs "-d single -m single" > +_scratch_mount > +_run_btrfs_util_prog device delete $dev_1 $SCRATCH_MNT > +_run_btrfs_util_prog device ready $dev_2 > + > +_scratch_unmount > +# delete a seed device and then check for ready > +run_check $BTRFS_TUNE_PROG -S 1 $dev_2 > +run_check _mount $dev_2 $SCRATCH_MNT > +_run_btrfs_util_prog device add -f $dev_1 $SCRATCH_MNT > +run_check mount -o rw,remount $dev_1 $SCRATCH_MNT > +_run_btrfs_util_prog device delete $dev_2 $SCRATCH_MNT > +_run_btrfs_util_prog device ready $dev_1 > + > +_scratch_unmount > +_scratch_dev_pool_put > + > +echo "Silence is golden" > + > +# success, all done > +status=0 > +exit > diff --git a/tests/btrfs/168.out b/tests/btrfs/168.out > new file mode 100644 > index ..893a41d859c8 > --- /dev/null > +++ b/tests/btrfs/168.out > @@ -0,0 +1,2 @@ > +QA output created by 168 > +Silence is golden > diff --git a/tests/btrfs/group b/tests/btrfs/group > index 5cff3bd6cc03..7bc3ea457992 100644 > --- a/tests/btrfs/group > +++ b/tests/btrfs/group > @@ -170,3 +170,4 @@ > 165 auto quick subvol > 166 auto quick qgroup > 167 auto quick replace volume > +168 auto quick volume > -- > 2.15.0 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] btrfs: Use btrfs_mark_bg_unused() to replace open code
On Sun, May 27, 2018 at 09:25:25AM +0800, Qu Wenruo wrote: > > > On 2018年05月22日 20:14, David Sterba wrote: > > On Tue, May 22, 2018 at 04:43:47PM +0800, Qu Wenruo wrote: > >> Introduce a small helper, btrfs_mark_bg_unused(), to accquire needed > >> locks and add a block group to unused_bgs list. > > > > The helper is nice but hides that there's a reference taken on the 'bg'. > > This would be good to add at least to the function comment or somehow > > squeeze it into the function name itself, like > > btrfs_get_and_mark_bg_unused. > > That btrfs_get_block_group() call is only for later removal. > The reference will be removed in btrfs_delete_unused_bgs(). > > So I don't think the function name needs the "_get" part. Agreed, it's inside the helpers. Patch added to misc-next, thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 1/2] btrfs: make fs_devices to be a local variable
On 16.07.2018 12:11, Gu Jinxiang wrote: > fs_devices is always passed to btrfs_scan_one_device which > overrides it. And in the call stack below fs_devices is passed to > btrfs_scan_one_device from btrfs_mount_root. > And in btrfs_mount_root the output fs_devices of this call stack > is not used. > btrfs_mount_root > -> btrfs_parse_early_options > ->btrfs_scan_one_device > So, there is no necessary to pass fs_devices from btrfs_mount_root, > use a local variable in btrfs_parse_early_options is enough. > > Signed-off-by: Gu Jinxiang > Reviewed-by: Anand Jain > --- > > Changelog: > v5: change a line wrap, and rebase to misc-next. > v4: changed a line warp, and adjusted the order of two rows. > v3: rebase to misc-next. > v2: deal with Nikolay's comment, make changelog more clair. > > fs/btrfs/super.c | 9 + > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c > index 28ab75ebb983..dcab4a0244e5 100644 > --- a/fs/btrfs/super.c > +++ b/fs/btrfs/super.c > @@ -884,10 +884,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, > char *options, > * only when we need to allocate a new super block. > */ > static int btrfs_parse_early_options(const char *options, fmode_t flags, > - void *holder, struct btrfs_fs_devices **fs_devices) > + void *holder) > { > substring_t args[MAX_OPT_ARGS]; > char *device_name, *opts, *orig, *p; > + struct btrfs_fs_devices *fs_devices = NULL; > int error = 0; > > lockdep_assert_held(_mutex); > @@ -917,8 +918,8 @@ static int btrfs_parse_early_options(const char *options, > fmode_t flags, > error = -ENOMEM; > goto out; > } > - error = btrfs_scan_one_device(device_name, > - flags, holder, fs_devices); > + error = btrfs_scan_one_device(device_name, flags, > + holder, _devices); You are line wraps are in fact worse than before, the first letter of the "holder.." line has to be aligned exactly under the first letter of "device_name". I assume David will fix this while merging so no need to resend but fix your editor for future submissions. > kfree(device_name); > if (error) > goto out; > @@ -1554,7 +1555,7 @@ static struct dentry *btrfs_mount_root(struct > file_system_type *fs_type, > } > > mutex_lock(_mutex); > - error = btrfs_parse_early_options(data, mode, fs_type, _devices); > + error = btrfs_parse_early_options(data, mode, fs_type); > if (error) { > mutex_unlock(_mutex); > goto error_fs_info; > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 2/2] btrfs: get device pointer from btrfs_scan_one_device
On Mon, Jul 16, 2018 at 05:11:17PM +0800, Gu Jinxiang wrote: > Instead of pointer to btrfs_fs_devices as an arg in > btrfs_scan_one_device, better to make it as a return value. > > And since btrfs_fs_devices can be get by btrfs_device, > better to return btrfs_device than fs_btrfs_devices. > > Signed-off-by: Gu Jinxiang > --- > > Changelog: > v5: rebase to misc-next. > v4: as suggested by Anand, change return value of > btrfs_scan_one_device from btrfs_fs_devices to btrfs_device. > v3: as comment by robot, use PTR_ERR_OR_ZERO, and rebase to misc-next. > v2: as comment by Nikolay, use ERR_CAST instead of cast type manually. > > fs/btrfs/super.c | 35 ++- > fs/btrfs/volumes.c | 22 -- > fs/btrfs/volumes.h | 4 ++-- > 3 files changed, 32 insertions(+), 29 deletions(-) > > diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c > index dcab4a0244e5..413bbe2d6db2 100644 > --- a/fs/btrfs/super.c > +++ b/fs/btrfs/super.c > @@ -888,7 +888,7 @@ static int btrfs_parse_early_options(const char *options, > fmode_t flags, > { > substring_t args[MAX_OPT_ARGS]; > char *device_name, *opts, *orig, *p; > - struct btrfs_fs_devices *fs_devices = NULL; > + struct btrfs_device *device = NULL; > int error = 0; > > lockdep_assert_held(_mutex); > @@ -918,11 +918,13 @@ static int btrfs_parse_early_options(const char > *options, fmode_t flags, > error = -ENOMEM; > goto out; > } > - error = btrfs_scan_one_device(device_name, flags, > - holder, _devices); > + device = btrfs_scan_one_device(device_name, flags, > + holder); > kfree(device_name); > - if (error) > + if (IS_ERR(device)) { > + error = PTR_ERR(device); > goto out; > + } > } > } > > @@ -1518,6 +1520,7 @@ static struct dentry *btrfs_mount_root(struct > file_system_type *fs_type, > { > struct block_device *bdev = NULL; > struct super_block *s; > + struct btrfs_device *device = NULL; > struct btrfs_fs_devices *fs_devices = NULL; > struct btrfs_fs_info *fs_info = NULL; > struct security_mnt_opts new_sec_opts; > @@ -1561,12 +1564,15 @@ static struct dentry *btrfs_mount_root(struct > file_system_type *fs_type, > goto error_fs_info; > } > > - error = btrfs_scan_one_device(device_name, mode, fs_type, _devices); > - if (error) { > + device = btrfs_scan_one_device(device_name, mode, fs_type); > + if (IS_ERR(device)) { > + error = PTR_ERR(device); > mutex_unlock(_mutex); > goto error_fs_info; > } > > + fs_devices = device->fs_devices; > + > fs_info->fs_devices = fs_devices; > > error = btrfs_open_devices(fs_devices, mode, fs_type); > @@ -2227,7 +2233,7 @@ static long btrfs_control_ioctl(struct file *file, > unsigned int cmd, > unsigned long arg) > { > struct btrfs_ioctl_vol_args *vol; > - struct btrfs_fs_devices *fs_devices; > + struct btrfs_device *device = NULL; > int ret = -ENOTTY; > > if (!capable(CAP_SYS_ADMIN)) > @@ -2240,19 +2246,22 @@ static long btrfs_control_ioctl(struct file *file, > unsigned int cmd, > switch (cmd) { > case BTRFS_IOC_SCAN_DEV: > mutex_lock(_mutex); > - ret = btrfs_scan_one_device(vol->name, FMODE_READ, > - _root_fs_type, _devices); > + device = btrfs_scan_one_device(vol->name, FMODE_READ, > + _root_fs_type); > + ret = PTR_ERR_OR_ZERO(device); > mutex_unlock(_mutex); > break; > case BTRFS_IOC_DEVICES_READY: > mutex_lock(_mutex); > - ret = btrfs_scan_one_device(vol->name, FMODE_READ, > - _root_fs_type, _devices); > - if (ret) { > + device = btrfs_scan_one_device(vol->name, FMODE_READ, > + _root_fs_type); > + if (IS_ERR(device)) { > + ret = PTR_ERR(device); > mutex_unlock(_mutex); > break; > } > - ret = !(fs_devices->num_devices == fs_devices->total_devices); > + ret = !(device->fs_devices->num_devices == > + device->fs_devices->total_devices); > mutex_unlock(_mutex); > break; > case BTRFS_IOC_GET_SUPPORTED_FEATURES: > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 88d37bfa99c8..9f485696461b 100644 >
Re: [PATCH v2] btrfs: Rewrite retry logic in do_chunk_alloc
On Wed, Apr 18, 2018 at 10:27:57AM +0300, Nikolay Borisov wrote: > do_chunk_alloc implements logic to detect whether there is currently > pending chunk allocation (by means of space_info->chunk_alloc being > set) and if so it loops around to the 'again' label. Additionally, > based on the state of the space_info (e.g. whether it's full or not) > and the return value of should_alloc_chunk() it decides whether this > is a "hard" error (ENOSPC) or we can just return 0. > > This patch refactors all of this: > > 1. Put order to the scattered ifs handling the various cases in an > easy-to-read if {} else if{} branches. This makes clear the various > cases we are interested in handling. > > 2. Call should_alloc_chunk only once and use the result in the > if/else if constructs. All of this is done under space_info->lock, so > even before multiple calls of should_alloc_chunk were unnecessary. > > 3. Rewrite the "do {} while()" loop currently implemented via label > into an explicit loop construct. > > 4. Move the mutex locking for the case where the caller is the one doing the > allocation. For the case where the caller needs to wait a concurrent > allocation, > introduce a pair of mutex_lock/mutex_unlock to act as a barrier and reword > the > comment. > > 5. Switch local vars to bool type where pertinent. > > All in all this shouldn't introduce any functional changes. > > Signed-off-by: Nikolay Borisov Reviewed-by: David Sterba -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs: rename btrfs_parse_early_options
Rename btrfs_parse_early_options() to btrfs_parse_device_options(). As btrfs_parse_early_options() parses the -o device options and scan the device provided. So this rename specifies its action. Also the function name is inline with btrfs_parse_subvol_options(). No functional changes. Signed-off-by: Anand Jain --- fs/btrfs/super.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 9d340f8d3457..9dc56d3eaa66 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -465,9 +465,9 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options, case Opt_subvolrootid: case Opt_device: /* -* These are parsed by btrfs_parse_subvol_options -* and btrfs_parse_early_options -* and can be happily ignored here. +* These are parsed by btrfs_parse_subvol_options and +* btrfs_parse_device_options and can be happily +* ignored here. */ break; case Opt_nodatasum: @@ -883,8 +883,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options, * All other options will be parsed on much later in the mount process and * only when we need to allocate a new super block. */ -static int btrfs_parse_early_options(const char *options, fmode_t flags, -void *holder) +static int btrfs_parse_device_options(const char *options, fmode_t flags, + void *holder) { substring_t args[MAX_OPT_ARGS]; char *device_name, *opts, *orig, *p; @@ -951,7 +951,7 @@ static int btrfs_parse_subvol_options(const char *options, char **subvol_name, /* * strsep changes the string, duplicate it because -* btrfs_parse_early_options gets called later +* btrfs_parse_device_options gets called later */ opts = kstrdup(options, GFP_KERNEL); if (!opts) @@ -1558,7 +1558,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type, } mutex_lock(_mutex); - error = btrfs_parse_early_options(data, mode, fs_type); + error = btrfs_parse_device_options(data, mode, fs_type); if (error) { mutex_unlock(_mutex); goto error_fs_info; -- 2.15.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2nd try v4] btrfs: drop uuid_mutex in btrfs_free_extra_devids()
On 05/29/2018 07:19 PM, David Sterba wrote: On Tue, May 29, 2018 at 01:40:34PM +0800, Anand Jain wrote: v3->v4: As we traverse through the seed device, fs_device gets updated with the child seed fs_devices, so make sure we use the same fs_devices pointer for the mutex_unlock as used for the mutex_lock. Well, now that I see the change, shouldn't we always hold the device_list_mutex of the fs_devices that's being processed? Ie. each time it's switched, the previous is unlocked and new one locked. No David. That's because we organize seed device under its parent fs_devices ((fs_devices::seed)::seed)..so on, and they are a local cloned copy of the original seed fs_devices. So parent's fs_devices::device_list_mutex lock will suffice. On the 2nd thought, though theoretically you are right. But practically there isn't any use case which can benefit by using the intricate locking as you suggested above. I don't think it's intricate or complex, just lock the fs_devices that can be potentially modified in the following loop. Schematically: function called with some fs_devices again: lock fs_devices->device_list_mutex foreach device in fs_devices if ... fs_devices counters get changed after device deletion endfor if (fs_devices->seed) unlock fs_devices->device_list_mutex fs_devices = fs_devices->seed lock fs_devices->device_list_mutex <-- lock the new one goto again endif unlock fs_devices->device_list_mutex<-- correctly unlock endfunc I am following the following method:- By holding the parent fs_devices (which is also the mounted fs_devices lock) it would imply to lock its dependent cloned seed fs_devices, as to reach the cloned seed device, first we need to traverse from the parent fs_devices. Locking the parent fs_devices might be the right scheme, but if any child fs_devices pointer gets passed to a random function that will change it, then the locking will not protect it. If its not holding the parent fs_devices lock then it would a bug and if its already holding the parent lock then child lock isn't necessary at all. Also sent the [DOC] BTRFS Volume operations, Device Lists and Locks all in one page to have a holistic vewi. Its not too convincing to me the approach of using the per fs_devices lock when fs_devices::seed is traversed its not necessary. Thanks, Anand This might need a deeper audit how the seeding device is done and maybe add some helpers that eg. lock the whole chain so all of them are protected. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] btrfs: Introduce mount time chunk <-> dev extent mapping check
On Mon, Jul 09, 2018 at 02:42:02PM +0800, Qu Wenruo wrote: > This patch will introduce chunk <-> dev extent mapping check, to protect > us against invalid dev extents or chunks. > > Since chunk mapping is the fundamental infrastructure of btrfs, extra > check at mount time could prevent a lot of unexpected behavior (BUG_ON). > > Reported-by: Xu Wen > Links: https://bugzilla.kernel.org/show_bug.cgi?id=200403 Link: > Links: https://bugzilla.kernel.org/show_bug.cgi?id=200407 > Signed-off-by: Qu Wenruo > --- > fs/btrfs/disk-io.c | 7 ++ > fs/btrfs/volumes.c | 173 + > fs/btrfs/volumes.h | 2 + > 3 files changed, 182 insertions(+) > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index 205092dc9390..068ca7498e94 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -3075,6 +3075,13 @@ int open_ctree(struct super_block *sb, > fs_info->generation = generation; > fs_info->last_trans_committed = generation; > > + ret = btrfs_verify_dev_extents(fs_info); > + if (ret) { > + btrfs_err(fs_info, > + "failed to verify dev extents against chunks: %d", > + ret); > + goto fail_block_groups; > + } > ret = btrfs_recover_balance(fs_info); > if (ret) { > btrfs_err(fs_info, "failed to recover balance: %d", ret); > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index e6a8e4aabc66..05e418cb37f3 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -6440,6 +6440,7 @@ static int read_one_chunk(struct btrfs_fs_info > *fs_info, struct btrfs_key *key, > map->stripe_len = btrfs_chunk_stripe_len(leaf, chunk); > map->type = btrfs_chunk_type(leaf, chunk); > map->sub_stripes = btrfs_chunk_sub_stripes(leaf, chunk); > + map->verified_stripes = 0; > for (i = 0; i < num_stripes; i++) { > map->stripes[i].physical = > btrfs_stripe_offset_nr(leaf, chunk, i); > @@ -7295,3 +7296,175 @@ void btrfs_reset_fs_info_ptr(struct btrfs_fs_info > *fs_info) > fs_devices = fs_devices->seed; > } > } > + > +static u64 calc_stripe_length(u64 type, u64 chunk_len, int num_stripes) > +{ > + switch (type & BTRFS_BLOCK_GROUP_PROFILE_MASK) { > + case BTRFS_BLOCK_GROUP_RAID0: > + return div_u64(chunk_len, num_stripes); > + case BTRFS_BLOCK_GROUP_RAID10: > + return div_u64(chunk_len * 2, num_stripes); > + case BTRFS_BLOCK_GROUP_RAID5: > + return div_u64(chunk_len, num_stripes - 1); > + case BTRFS_BLOCK_GROUP_RAID6: > + return div_u64(chunk_len, num_stripes - 2); > + default: > + return chunk_len; > + } > +} There are already too many hardcoded values for the raid profiles, please don't add another one unless really necessary and use existing predefined constants or helpers (eg. nr_data_stripes or btrfs_raid_array). > +static int verify_one_dev_extent(struct btrfs_fs_info *fs_info, > + u64 chunk_offset, u64 devid, > + u64 physical_offset, u64 physical_len) > +{ > + struct extent_map_tree *em_tree = _info->mapping_tree.map_tree; > + struct extent_map *em; > + struct map_lookup *map; > + u64 stripe_len; > + bool found = false; This variable is only set and never checked. > + int ret = 0; > + int i; > + > + read_lock(_tree->lock); > + em = lookup_extent_mapping(em_tree, chunk_offset, 1); > + read_unlock(_tree->lock); > + > + if (!em) { > + ret = -EUCLEAN; > + btrfs_err(fs_info, > + "dev extent (%llu, %llu) doesn't have corresponding chunk", > + devid, physical_offset); > + goto out; > + } > + > + map = em->map_lookup; > + stripe_len = calc_stripe_length(map->type, em->len, map->num_stripes); > + if (physical_len != stripe_len) { > + btrfs_err(fs_info, > +"dev extent (%llu, %llu) length doesn't match with chunk %llu, have %llu > expect %llu", > + devid, physical_offset, em->start, physical_len, > + stripe_len); > + ret = -EUCLEAN; > + goto out; > + } > + > + for (i = 0; i < map->num_stripes; i++) { > + if (map->stripes[i].dev->devid == devid && > + map->stripes[i].physical == physical_offset) { > + found = true; 2nd time set > + if (map->verified_stripes >= map->num_stripes) { > + btrfs_err(fs_info, > + "too many dev extent for chunk %llu is detected", > + em->start); > + ret = -EUCLEAN; > + goto out; > + } > + map->verified_stripes++; > + break; >
Re: [PATCH 5/5] btrfs: Verify every chunk has corresponding block group at mount time
On 2018年07月16日 21:16, David Sterba wrote: > On Fri, Jul 06, 2018 at 07:44:37AM +0800, Qu Wenruo wrote: >> >> >> On 2018年07月05日 23:18, David Sterba wrote: >>> On Tue, Jul 03, 2018 at 05:10:09PM +0800, Qu Wenruo wrote: If a crafted btrfs has missing block group items, it could cause unexpected behavior and breaks our expectation on 1:1 chunk<->block group mapping. Although we added block group -> chunk mapping check, we still need chunk -> block group mapping check. This patch will do extra check to ensure each chunk has its corresponding block group. Link: https://bugzilla.kernel.org/show_bug.cgi?id=199847 Reported-by: Xu Wen Signed-off-by: Qu Wenruo --- fs/btrfs/extent-tree.c | 52 +- 1 file changed, 51 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 82b446f014b9..746095034ca2 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -10038,6 +10038,56 @@ static int check_exist_chunk(struct btrfs_fs_info *fs_info, u64 start, u64 len, return ret; } +/* + * Iterate all chunks and verify each of them has corresponding block group + */ +static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info) +{ + struct btrfs_mapping_tree *map_tree = _info->mapping_tree; + struct extent_map *em; + struct btrfs_block_group_cache *bg; + u64 start = 0; + int ret = 0; + + while (1) { + read_lock(_tree->map_tree.lock); + em = lookup_extent_mapping(_tree->map_tree, start, + (u64)-1 - start); >>> >>> This needs a comment. >> >> For the @len part? > > Yes, for the expression how it's calculated. > >>> + read_unlock(_tree->map_tree.lock); + if (!em) + break; + + bg = btrfs_lookup_block_group(fs_info, em->start); + if (!bg) { + btrfs_err_rl(fs_info, + "chunk start=%llu len=%llu doesn't have corresponding block group", + em->start, em->len); + ret = -ENOENT; >>> >>> -EUCLEAN ? >> >> Either works for me. > > That's not just a cosmetic change, there's a semantic difference between > the error codes, I maybe make that more explicit and not expect that this > is obvious. > > ENOENT does not make much sense in this context, the caller (mount in > this case) cannot do anything about a code that says 'some internal > structure not found'. The point here is, if every self-checker should only return -EUCLEAN, it won't really indicate what's going wrong, except points to some self-checker (and such self-checkers are growing larger than our expectation already). My practice here is, put some human readable and meaningful error message. No matter what we choose to return, the error message should tell us what's going wrong. In this case, I don't really care the return value. If it's explicitly needed to return -EUCLEAN, I could make all existing checker (from tree-checker to chunk/bg/dev-extent checker) to return -EUCLEAN if anything is wrong (and save several "ret = -EUCLEAN" lines). The return value doesn't really have much meaning nowadays, it's the error message important now. Thanks, Qu > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > signature.asc Description: OpenPGP digital signature
Re: [PATCH 0/2] btrfs: chunk and dev-extent related error handler enhancement
On Mon, Jul 09, 2018 at 02:42:01PM +0800, Qu Wenruo wrote: > Can be fetched with all existing tree-checker/bg<->chunk error detector > from github: > https://github.com/adam900710/linux/tree/tree_checker_enhance > > Still some fuzzed images reported from Xu Wen. > This time, 2 can be fixed by chunk <-> dev extent mapping verification. > One BUG_ON() can be removed. > > The remaining 2 images are all mostly about extent tree corruption, > which is not as easy to detect, since extent tree has way more complex > reference relationship. > > At least, fix what we can first. > > And for chunk <-> dev extent mapping verification, it will trigger > read on the whole device tree, to iterate through all DEV_EXTENT items. > This will introduce an extra overhead on mount. > > However since device tree is pretty small (the same level as chunk tree), > and according to previous analyse on mount time, chunk tree iteration is > only a pretty small fraction of the whole mount time (less than 5%), it > shouldn't bring obvious impact to users. Ok, that's acceptable. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/5] btrfs: Verify every chunk has corresponding block group at mount time
On Fri, Jul 06, 2018 at 07:44:37AM +0800, Qu Wenruo wrote: > > > On 2018年07月05日 23:18, David Sterba wrote: > > On Tue, Jul 03, 2018 at 05:10:09PM +0800, Qu Wenruo wrote: > >> If a crafted btrfs has missing block group items, it could cause > >> unexpected behavior and breaks our expectation on 1:1 > >> chunk<->block group mapping. > >> > >> Although we added block group -> chunk mapping check, we still need > >> chunk -> block group mapping check. > >> > >> This patch will do extra check to ensure each chunk has its > >> corresponding block group. > >> > >> Link: https://bugzilla.kernel.org/show_bug.cgi?id=199847 > >> Reported-by: Xu Wen > >> Signed-off-by: Qu Wenruo > >> --- > >> fs/btrfs/extent-tree.c | 52 +- > >> 1 file changed, 51 insertions(+), 1 deletion(-) > >> > >> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c > >> index 82b446f014b9..746095034ca2 100644 > >> --- a/fs/btrfs/extent-tree.c > >> +++ b/fs/btrfs/extent-tree.c > >> @@ -10038,6 +10038,56 @@ static int check_exist_chunk(struct btrfs_fs_info > >> *fs_info, u64 start, u64 len, > >>return ret; > >> } > >> > >> +/* > >> + * Iterate all chunks and verify each of them has corresponding block > >> group > >> + */ > >> +static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info) > >> +{ > >> + struct btrfs_mapping_tree *map_tree = _info->mapping_tree; > >> + struct extent_map *em; > >> + struct btrfs_block_group_cache *bg; > >> + u64 start = 0; > >> + int ret = 0; > >> + > >> + while (1) { > >> + read_lock(_tree->map_tree.lock); > >> + em = lookup_extent_mapping(_tree->map_tree, start, > >> + (u64)-1 - start); > > > > This needs a comment. > > For the @len part? Yes, for the expression how it's calculated. > > > >> + read_unlock(_tree->map_tree.lock); > >> + if (!em) > >> + break; > >> + > >> + bg = btrfs_lookup_block_group(fs_info, em->start); > >> + if (!bg) { > >> + btrfs_err_rl(fs_info, > >> + "chunk start=%llu len=%llu doesn't have corresponding block group", > >> + em->start, em->len); > >> + ret = -ENOENT; > > > > -EUCLEAN ? > > Either works for me. That's not just a cosmetic change, there's a semantic difference between the error codes, I maybe make that more explicit and not expect that this is obvious. ENOENT does not make much sense in this context, the caller (mount in this case) cannot do anything about a code that says 'some internal structure not found'. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RESEND 1/2] btrfs-progs: remove stale dir-test and quick-test
On Thu, Jul 12, 2018 at 04:11:18PM -0700, Omar Sandoval wrote: > From: Omar Sandoval > > These don't build anymore and don't appear to be used for anything. I've fixed build of quick-test and will keep the code for now, it can be used for some stress testing. The dir-test does not seem to be even potentially useful in the fucute, so it's dropped now. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/7] btrfs: do device clone using the btrfs_scan_one_device
When we add a device to the RO mounted seed device, it becomes a RW sprout FS. The following steps are used to hold the seed and sprout fs_devices. (first two steps are not mandatory for the sprouting, they are there to ensure the seed device remains in the scanned state) . Clone the (mounted) fs_devices, lets call it as old_devices . Now add old_devices to fs_uuids (yeah, there is duplicate fsid in the list, as we are under uuid_mutex so its fine). . Alloc a new fs_devices, lets call it as seed_devices . Copy fs_devices into the seed_devices . Move fs_devices::devices into seed_devices::devices . Bring seed_devices to under fs_devices::seed (fs_devices->seed = seed_devices) . Assign a new FSID to the fs_devices and add the new writable device to the fs_devices. This patch makes the following changes.. As we clone fs_devices to make sure the device remains scanned after the sprouting. So use the btrfs_scan_one_device() code instead. And do it at the end of the sprouting. Signed-off-by: Anand Jain --- fs/btrfs/volumes.c | 21 - 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 8450bcfed4cb..c6f3f0dfbabe 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2179,7 +2179,7 @@ int btrfs_find_device_by_devspec(struct btrfs_fs_info *fs_info, u64 devid, static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) { struct btrfs_fs_devices *fs_devices = fs_info->fs_devices; - struct btrfs_fs_devices *old_devices; + struct btrfs_fs_devices *old_fs_devices; struct btrfs_fs_devices *seed_devices; struct btrfs_super_block *disk_super = fs_info->super_copy; struct btrfs_device *device; @@ -2193,14 +2193,6 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) if (IS_ERR(seed_devices)) return PTR_ERR(seed_devices); - old_devices = clone_fs_devices(fs_devices); - if (IS_ERR(old_devices)) { - kfree(seed_devices); - return PTR_ERR(old_devices); - } - - list_add(_devices->fs_list, _uuids); - memcpy(seed_devices, fs_devices, sizeof(*seed_devices)); seed_devices->opened = 1; INIT_LIST_HEAD(_devices->devices); @@ -2233,6 +2225,17 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) ~BTRFS_SUPER_FLAG_SEEDING; btrfs_set_super_flags(disk_super, super_flags); + /* +* As the above code hijacked the original seed fs_devices, now +* create a new one for the original seed FSID. +*/ + list_for_each_entry(device, _devices->seed->devices, dev_list) { + if (!device->name) + continue; + btrfs_scan_one_device(device->name->str, FMODE_READ, + fs_info->bdev_holder, _fs_devices); + } + return 0; } -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 1/7] btrfs: drop uuid_mutex in btrfs_free_extra_devids()
btrfs_free_extra_devids() is called only in the mount context which traverses through the fs_devices::devices and frees the orphan devices devices in the given %fs_devices if any. As the search for the orphan device is limited to fs_devices::devices so we don't need the global uuid_mutex. There can't be any mount-point based ioctl threads in this context as the mount thread is not yet returned. But there can be the btrfs-control based scan ioctls thread which calls device_list_add(). Here in the mount thread the fs_devices::opened is incremented way before btrfs_free_extra_devids() is called and in the scan context the fs_devices which are already opened neither be freed or alloc-able at device_list_add(). But lets say you change the device-path and call the scan again, then scan would update the new device path and this operation could race against the btrfs_free_extra_devids() thread, which might be in the process of free-ing the same device. So synchronize it by using the device_list_mutex. This scenario is a very corner case, and practically the scan and mount are anyway serialized by the usage so unless the race is instrumented its very difficult to achieve. Signed-off-by: Anand Jain Reviewed-by: David Sterba --- v3->v4: As we traverse through the seed device, fs_device gets updated with the child seed fs_devices, so make sure we use the same fs_devices pointer for the mutex_unlock as used for the mutex_lock. v2->v3: Update change log. (Currently device_list_add() is very lean on its device_list_mutex usage, a cleanup and fix is wip. Given the practicality of the above race condition this patch is good to merge). v1->v2: replace uuid_mutex with device_list_mutex instead of delete. change log updated. fs/btrfs/volumes.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 88d37bfa99c8..870c9f69a6a4 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -934,8 +934,9 @@ void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices, int step) { struct btrfs_device *device, *next; struct btrfs_device *latest_dev = NULL; + struct btrfs_fs_devices *parent_fs_devices = fs_devices; - mutex_lock(_mutex); + mutex_lock(_fs_devices->device_list_mutex); again: /* This is the initialized path, it is safe to release the devices. */ list_for_each_entry_safe(device, next, _devices->devices, dev_list) { @@ -989,8 +990,7 @@ void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices, int step) } fs_devices->latest_bdev = latest_dev->bdev; - - mutex_unlock(_mutex); + mutex_unlock(_fs_devices->device_list_mutex); } static void free_device_rcu(struct rcu_head *head) -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 2/7] btrfs: fix race between free_stale_devices and close_fs_devices
From: Anand Jain %fs_devices can be free-ed by btrfs_free_stale_devices() when the close_fs_devices() drops fs_devices::opened to zero, but close_fs_devices tries to access the %fs_devices again without the device_list_mutex. Fix this by bringing the %fs_devices access with in the device_list_mutex. Stack trace as below. WARNING: CPU: 1 PID: 4499 at fs/btrfs/volumes.c:1071 close_fs_devices+0xbc7/0xfa0 fs/btrfs/volumes.c:1071 Kernel panic - not syncing: panic_on_warn set ... :: RIP: 0010:close_fs_devices+0xbc7/0xfa0 fs/btrfs/volumes.c:1071 :: btrfs_close_devices+0x29/0x150 fs/btrfs/volumes.c:1085 open_ctree+0x589/0x7898 fs/btrfs/disk-io.c:3358 btrfs_fill_super fs/btrfs/super.c:1202 [inline] btrfs_mount_root+0x16df/0x1e70 fs/btrfs/super.c:1593 mount_fs+0xae/0x328 fs/super.c:1277 vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037 vfs_kern_mount+0x40/0x60 fs/namespace.c:1027 btrfs_mount+0x4a1/0x213e fs/btrfs/super.c:1661 mount_fs+0xae/0x328 fs/super.c:1277 Reported-by: syzbot+ceb2606025ec1cc34...@syzkaller.appspotmail.com Signed-off-by: Anand Jain --- v1->v2: Commit log update. Satisfy checkpatch.pl. Remove HEAD.. fs/btrfs/volumes.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 870c9f69a6a4..8450bcfed4cb 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1062,13 +1062,13 @@ static int close_fs_devices(struct btrfs_fs_devices *fs_devices) list_for_each_entry_safe(device, tmp, _devices->devices, dev_list) { btrfs_close_one_device(device); } - mutex_unlock(_devices->device_list_mutex); - WARN_ON(fs_devices->open_devices); WARN_ON(fs_devices->rw_devices); fs_devices->opened = 0; fs_devices->seeding = 0; + mutex_unlock(_devices->device_list_mutex); + return 0; } -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/7] Misc volume patch set part2
As the last set if the pull wasn't integrated due to its pending review corrections. Here, I have done them (not much except for adding comments and function renames) and so sending it again. Also this set clubs other independent patches which are in the mailing list. I am explaining the changes of the individual patches in its change list. And most of it still remains at v1. This can also be pulled from: g...@github.com:asj/btrfs-devel.git misc-next-for-kdave-16jul Anand Jain (7): btrfs: drop uuid_mutex in btrfs_free_extra_devids() btrfs: fix race between free_stale_devices and close_fs_devices btrfs: do device clone using the btrfs_scan_one_device btrfs: use the assigned fs_devices instead of the dereference btrfs: warn for num_devices below 0 btrfs: add helper btrfs_num_devices() to deduce num_devices btrfs: add helper function check device delete able fs/btrfs/volumes.c | 106 +++-- 1 file changed, 62 insertions(+), 44 deletions(-) -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/7] btrfs: use the assigned fs_devices instead of the dereference
We have assigned the %fs_info->fs_devices in %fs_devices as its not modified just use it for the mutex_lock(). Signed-off-by: Anand Jain --- I don't see this in the ML at all, I remember sending this. Anyway I am marking this as V1. fs/btrfs/volumes.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index c6f3f0dfbabe..7f4973fc2b52 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2199,7 +2199,7 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) INIT_LIST_HEAD(_devices->alloc_list); mutex_init(_devices->device_list_mutex); - mutex_lock(_info->fs_devices->device_list_mutex); + mutex_lock(_devices->device_list_mutex); list_splice_init_rcu(_devices->devices, _devices->devices, synchronize_rcu); list_for_each_entry(device, _devices->devices, dev_list) @@ -2219,7 +2219,7 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) generate_random_uuid(fs_devices->fsid); memcpy(fs_info->fsid, fs_devices->fsid, BTRFS_FSID_SIZE); memcpy(disk_super->fsid, fs_devices->fsid, BTRFS_FSID_SIZE); - mutex_unlock(_info->fs_devices->device_list_mutex); + mutex_unlock(_devices->device_list_mutex); super_flags = btrfs_super_flags(disk_super) & ~BTRFS_SUPER_FLAG_SEEDING; -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/7] btrfs: warn for num_devices below 0
In preparation to de-duplicate a section of code where we deduce the num_devices, use warn instead of bug. Signed-off-by: Anand Jain --- fs/btrfs/volumes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 7f4973fc2b52..0f4c512aa6b4 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -3726,7 +3726,7 @@ int btrfs_balance(struct btrfs_fs_info *fs_info, num_devices = fs_info->fs_devices->num_devices; btrfs_dev_replace_read_lock(_info->dev_replace); if (btrfs_dev_replace_is_ongoing(_info->dev_replace)) { - BUG_ON(num_devices < 1); + WARN_ON(num_devices < 1); num_devices--; } btrfs_dev_replace_read_unlock(_info->dev_replace); -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 6/7] btrfs: add helper btrfs_num_devices() to deduce num_devices
When the replace is running the fs_devices::num_devices also includes the replace device, however in some operations like device delete and balance it needs the actual num_devices without the repalce devices, so now the function btrfs_num_devices() just provides that. Signed-off-by: Anand Jain --- v2: add comments. Thanks Nikolay. fs/btrfs/volumes.c | 32 ++-- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 0f4c512aa6b4..1c0b56374992 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1844,6 +1844,21 @@ void btrfs_assign_next_active_device(struct btrfs_fs_info *fs_info, fs_info->fs_devices->latest_bdev = next_device->bdev; } +/* Returns btrfs_fs_devices::num_devices excluding replace device if any */ +static inline u64 btrfs_num_devices(struct btrfs_fs_info *fs_info) +{ + u64 num_devices = fs_info->fs_devices->num_devices; + + btrfs_dev_replace_read_lock(_info->dev_replace); + if (btrfs_dev_replace_is_ongoing(_info->dev_replace)) { + WARN_ON(num_devices < 1); + num_devices--; + } + btrfs_dev_replace_read_unlock(_info->dev_replace); + + return num_devices; +} + int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path, u64 devid) { @@ -1857,13 +1872,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path, mutex_lock(_mutex); - num_devices = fs_devices->num_devices; - btrfs_dev_replace_read_lock(_info->dev_replace); - if (btrfs_dev_replace_is_ongoing(_info->dev_replace)) { - WARN_ON(num_devices < 1); - num_devices--; - } - btrfs_dev_replace_read_unlock(_info->dev_replace); + num_devices = btrfs_num_devices(fs_info); ret = btrfs_check_raid_min_devices(fs_info, num_devices - 1); if (ret) @@ -3723,13 +3732,8 @@ int btrfs_balance(struct btrfs_fs_info *fs_info, } } - num_devices = fs_info->fs_devices->num_devices; - btrfs_dev_replace_read_lock(_info->dev_replace); - if (btrfs_dev_replace_is_ongoing(_info->dev_replace)) { - WARN_ON(num_devices < 1); - num_devices--; - } - btrfs_dev_replace_read_unlock(_info->dev_replace); + num_devices = btrfs_num_devices(fs_info); + allowed = BTRFS_AVAIL_ALLOC_BIT_SINGLE | BTRFS_BLOCK_GROUP_DUP; if (num_devices > 1) allowed |= (BTRFS_BLOCK_GROUP_RAID0 | BTRFS_BLOCK_GROUP_RAID1); -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 7/7] btrfs: add helper function check device delete able
Move the section of the code which performs the check if the device is indelible, move that into a helper function. Signed-off-by: Anand Jain --- v1->v2: Rename function to btrfs_get_device_for_delete(), thanks Nikolay. fs/btrfs/volumes.c | 49 ++--- 1 file changed, 30 insertions(+), 19 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 1c0b56374992..0cefc24b028c 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1859,6 +1859,33 @@ static inline u64 btrfs_num_devices(struct btrfs_fs_info *fs_info) return num_devices; } +static struct btrfs_device *btrfs_get_device_for_delete( + struct btrfs_fs_info *fs_info, + const char *device_path, u64 devid) +{ + int ret; + struct btrfs_device *device; + + ret = btrfs_check_raid_min_devices(fs_info, + btrfs_num_devices(fs_info) - 1); + if (ret) + return ERR_PTR(ret); + + ret = btrfs_find_device_by_devspec(fs_info, devid, device_path, + ); + if (ret) + return ERR_PTR(ret); + + if (test_bit(BTRFS_DEV_STATE_REPLACE_TGT, >dev_state)) + return ERR_PTR(BTRFS_ERROR_DEV_TGT_REPLACE); + + if (test_bit(BTRFS_DEV_STATE_WRITEABLE, >dev_state) && + fs_info->fs_devices->rw_devices == 1) + return ERR_PTR(BTRFS_ERROR_DEV_ONLY_WRITABLE); + + return device; +} + int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path, u64 devid) { @@ -1872,25 +1899,9 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path, mutex_lock(_mutex); - num_devices = btrfs_num_devices(fs_info); - - ret = btrfs_check_raid_min_devices(fs_info, num_devices - 1); - if (ret) - goto out; - - ret = btrfs_find_device_by_devspec(fs_info, devid, device_path, - ); - if (ret) - goto out; - - if (test_bit(BTRFS_DEV_STATE_REPLACE_TGT, >dev_state)) { - ret = BTRFS_ERROR_DEV_TGT_REPLACE; - goto out; - } - - if (test_bit(BTRFS_DEV_STATE_WRITEABLE, >dev_state) && - fs_info->fs_devices->rw_devices == 1) { - ret = BTRFS_ERROR_DEV_ONLY_WRITABLE; + device = btrfs_get_device_for_delete(fs_info, device_path, devid); + if (IS_ERR(device)) { + ret = PTR_ERR(device); goto out; } -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RESEND 2/2] btrfs-progs: make all programs and libraries optional
On Thu, Jul 12, 2018 at 04:11:19PM -0700, Omar Sandoval wrote: > From: Omar Sandoval > > We have a build system internally which only needs to build the > libraries out of a repository, not any binaries. I looked at how this > works with other projects, and the best example was util-linux, which > makes it possible to enable or disable everything individually. This is > nice and really flexible, so let's do the same. This way, if you only > want to build and install libbtrfsutil, you can simply do > > ./configure --disable-documentation --disable-all-programs > --enable-libbtrfsutil > make > make install I think this is an overkill and abusing the --enable-XXX options. You want to avoid building the tools by default, so adding an option for that is fine. Selectively building only certain tools can utilize that option too and just follow with 'make btrfs-image' etc. The number of --enable-* will stay minimal and we don't even have to discuss how to find a good naming scheme (that works for util-linux but looks a bit confusing for btrfs-progs). -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs: rename btrfs_parse_early_options
On 07/16/2018 10:18 PM, Anand Jain wrote: Rename btrfs_parse_early_options() to btrfs_parse_device_options(). As btrfs_parse_early_options() parses the -o device options and scan the device provided. So this rename specifies its action. Also the function name is inline with btrfs_parse_subvol_options(). No functional changes. Signed-off-by: Anand Jain --- Forgot to mention: This should be applied on top of Gu Jinxiang (2): btrfs: make fs_devices to be a local variable btrfs: get device pointer from btrfs_scan_one_device Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/15] Add delayed-refs support to btrfs-progs
On Fri, Jun 08, 2018 at 03:47:43PM +0300, Nikolay Borisov wrote: > Hello, > > > > Here is a series which adds support for delayed refs. This is needed to > enable > later work on adding freespace tree repair code. Additionally, it results in > more code sharing between kernel/user space. > > Patches 1-9 are simple prep patches removing some arguments, causing problems > later. They can go independently of the delayed refs work. They don't > introduce > any functional changes. Next, patches 10-13 introduce the needed > infrastructure > to for delayed refs without actually activating it. Patch 14 finally wires it > up by adding the necessary call outs to btrfs_run_delayed refs and reworking > the > extent addition/freeing functions. With all of this done, patch 15 finally > removes the old code. > > This series passes all btrfs progs fsck and misc tests + fuzz tests apart from > fuzz-003/007/009 - but those fail without this series so it's unlikely it's > caused by it. > > Nikolay Borisov (15): > btrfs-progs: Remove root argument from pin_down_bytes > btrfs-progs: Remove root argument from btrfs_del_csums > btrfs-progs: Add functions to modify the used space by a root > btrfs-progs: Refactor the root used bytes are updated > btrfs-progs: Make update_block_group take fs_info instead of root > btrfs-progs: check: Drop trans/root arguments from free_extent_hook > btrfs-progs: Remove root argument from __free_extent > btrfs-progs: Remove root argument from alloc_reserved_tree_block > btrfs-progs: Always pass 0 for offset when calling btrfs_free_extent > for btree blocks. > btrfs-progs: Add boolean to signal whether we are re-initing extent > tree > btrfs-progs: Add delayed refs infrastructure > btrfs-progs: Add __free_extent2 function > btrfs-progs: Add alloc_reserved_tree_block2 function > btrfs-progs: Wire up delayed refs > btrfs-progs: Remove old delayed refs infrastructure Added to devel. There were some patch-to-patch compilation issues, function alloc_reserved_tree_block2 used earlier than defined so I reordered the patches to fix that. The CI fails at test check/020-extent-ref-cases but it works on my machine so it's not caused by the patchset. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/7] Misc volume patch set part2
On 07/16/2018 10:58 PM, Anand Jain wrote: As the last set if the pull wasn't integrated due to its pending review corrections. Here, I have done them (not much except for adding comments and function renames) and so sending it again. Also this set clubs other independent patches which are in the mailing list. I am explaining the changes of the individual patches in its change list. And most of it still remains at v1. This can also be pulled from: g...@github.com:asj/btrfs-devel.git misc-next-for-kdave-16jul Sorry please pull from use the branch 'misc-next-for-kdave-17jul' instead. I notice that there were some changes since my last fast-forward. And also here, I have added the following patch which is non function change patch. 3696c3aaf81b btrfs: rename btrfs_parse_early_options Thanks, Anand Anand Jain (7): btrfs: drop uuid_mutex in btrfs_free_extra_devids() btrfs: fix race between free_stale_devices and close_fs_devices btrfs: do device clone using the btrfs_scan_one_device btrfs: use the assigned fs_devices instead of the dereference btrfs: warn for num_devices below 0 btrfs: add helper btrfs_num_devices() to deduce num_devices btrfs: add helper function check device delete able fs/btrfs/volumes.c | 106 +++-- 1 file changed, 62 insertions(+), 44 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] btrfs-progs: Exit gracefully when overlap chunks are detected
On Mon, Jul 09, 2018 at 02:50:53PM +0800, Qu Wenruo wrote: > BUG_ON() can be triggered if some image contains overlap chunks. > volumes.c:1930: read_one_chunk: BUG_ON `ret` triggered, value -17 > btrfs(+0x2cf12)[0x5601efa17f12] > btrfs(+0x2fd8b)[0x5601efa1ad8b] > btrfs(btrfs_read_chunk_tree+0x2bf)[0x5601efa1b30f] > btrfs(btrfs_setup_chunk_tree_and_device_map+0xe8)[0x5601efa07718] > btrfs(+0x1c944)[0x5601efa07944] > btrfs(open_ctree_fs_info+0x90)[0x5601efa07b90] > btrfs(cmd_check+0x4d7)[0x5601efa4f8c7] > btrfs(main+0x88)[0x5601ef9fd768] > /usr/lib/libc.so.6(__libc_start_main+0xeb)[0x7f3c7787306b] > btrfs(_start+0x2a)[0x5601ef9fd88a] > > Extent cache code can already detect it without problem, we only need to > remove the BUG_ON() and exit gracefully. > > Reported-by: Xu Wen > Link: https://bugzilla.kernel.org/show_bug.cgi?id=200409 > Signed-off-by: Qu Wenruo 1 and 2 applied, thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] btrfs-progs: inspect-internal: add option '-k u64,u8,u64' of dump-tree
On Mon, Jul 16, 2018 at 05:13:25PM +0800, Su Yue wrote: > For big filesystems, there are many items in trees(extent tree > specially). > For example, to dump one extent, we usually dump extent tree then pipe > result to grep. The time-consuming part is that dump tree traverses > items. And it eats cpu and memory too. > > This patch introduces an option '-k u64,u8,u64' for users(developers more > likely) to appoint a specific key to search and dump, then the path > from root node down the leaf will be dumped. > The search of the key costs most time of this way. > > Signed-off-by: Su Yue > --- > Change log: > v2: > Rename btrfs_search_spec_key() to btrfs_search_print_path(). > Rename is_key_spec to is_spec_key_set. > Move btrfs_search_print_path() after open_ctree(), call it only once. > Remove duplicate code and replaced with btrfs_print_tree(). > Modify the usage about '-k'. > All are suggested by Qu Wenruo. > > cmds-inspect-dump-tree.c | 41 - > print-tree.c | 77 > print-tree.h | 2 ++ > 3 files changed, 119 insertions(+), 1 deletion(-) > > diff --git a/cmds-inspect-dump-tree.c b/cmds-inspect-dump-tree.c > index c8acd55a0c3a..80df826af0fd 100644 > --- a/cmds-inspect-dump-tree.c > +++ b/cmds-inspect-dump-tree.c > @@ -184,6 +184,21 @@ static u64 treeid_from_string(const char *str, const > char **end) > return id; > } > > +static int parse_key(struct btrfs_key *key) > +{ > + Extra newline. > + int ret = sscanf(optarg, "%llu,%hhu,%llu", >objectid, > + >type, >offset); Please don't do complex initializations in the declaration block. And use of the global variable 'optarg' is not good, pass the scanned string as a parameter. > + if (ret != 3) { > + error("error parsing key '%s'\n", optarg); no "\n" at the end of the string in error(...) > + ret = -EINVAL; > + } else { > + ret = 0; > + } > + > + return ret; > +} > + > const char * const cmd_inspect_dump_tree_usage[] = { > "btrfs inspect-internal dump-tree [options] device", > "Dump tree structures from a given device", > @@ -199,6 +214,7 @@ const char * const cmd_inspect_dump_tree_usage[] = { > "-u|--uuid print only the uuid tree", > "-b|--block print info from the specified block only", > "-t|--tree print only tree with the given id (string or > number)", > + "-k|--key search the specific key then print path, must > with -t(conflicts with others)", As there are more conflicting options, it would be better to group them and just put a note about that. The message as you write it does not provide clear usage instructions. > "--follow use with -b, to show all children tree blocks > of ", > NULL > }; > @@ -224,8 +240,10 @@ int cmd_inspect_dump_tree(int argc, char **argv) > unsigned open_ctree_flags; > u64 block_only = 0; > struct btrfs_root *tree_root_scan; > + struct btrfs_key spec_key = { 0 }; > u64 tree_id = 0; > bool follow = false; > + bool is_spec_key_set = false; > > /* >* For debug-tree, we care nothing about extent tree (it's just backref > @@ -248,10 +266,11 @@ int cmd_inspect_dump_tree(int argc, char **argv) > { "block", required_argument, NULL, 'b'}, > { "tree", required_argument, NULL, 't'}, > { "follow", no_argument, NULL, GETOPT_VAL_FOLLOW }, > + { "key", required_argument, NULL, 'k'}, > { NULL, 0, NULL, 0 } > }; > > - c = getopt_long(argc, argv, "deb:rRut:", long_options, NULL); > + c = getopt_long(argc, argv, "deb:rRut:k:", long_options, NULL); > if (c < 0) > break; > switch (c) { > @@ -300,6 +319,12 @@ int cmd_inspect_dump_tree(int argc, char **argv) > } > break; > } > + case 'k': > + ret = parse_key(_key); > + if (ret) > + exit(1); > + is_spec_key_set = true; What if the key is already set? > + break; > case GETOPT_VAL_FOLLOW: > follow = true; > break; > @@ -308,6 +333,9 @@ int cmd_inspect_dump_tree(int argc, char **argv) > } > } > > + if (!tree_id && is_spec_key_set) > + usage(cmd_inspect_dump_tree_usage); > + > if (check_argc_exact(argc - optind, 1)) > usage(cmd_inspect_dump_tree_usage); > > @@ -325,6 +353,17 @@ int cmd_inspect_dump_tree(int argc, char **argv) > goto out; > } > > + if (is_spec_key_set) { > + root = info->tree_root; > + if (IS_ERR(root) || !root) {
Re: [PATCH 2/2] btrfs-progs: check: enhanced progress indicator
On 2018/07/05 4:20, Stéphane Lesimple wrote: > We reuse the task_position enum and task_ctx struct of the original progress > indicator, adding more values and fields for our needs. > > Then add hooks in all steps of the check to properly record progress. > > Signed-off-by: Stéphane Lesimple > --- > check/main.c| 176 > ++-- > check/mode-common.h | 20 ++ > check/mode-lowmem.c | 1 + > convert/main.c | 2 +- > qgroup-verify.c | 7 +++ > qgroup-verify.h | 2 + > task-utils.c| 8 ++- > task-utils.h| 3 +- > 8 files changed, 154 insertions(+), 65 deletions(-) > > diff --git a/check/main.c b/check/main.c > index 3190b5d..bb3ebea 100644 > --- a/check/main.c > +++ b/check/main.c > @@ -25,6 +25,7 @@ > #include > #include > #include > +#include > #include "ctree.h" > #include "volumes.h" > #include "repair.h" > @@ -47,20 +48,6 @@ > #include "check/mode-original.h" > #include "check/mode-lowmem.h" > > -enum task_position { > - TASK_EXTENTS, > - TASK_FREE_SPACE, > - TASK_FS_ROOTS, > - TASK_NOTHING, /* have to be the last element */ > -}; > - > -struct task_ctx { > - int progress_enabled; > - enum task_position tp; > - > - struct task_info *info; > -}; > - > u64 bytes_used = 0; > u64 total_csum_bytes = 0; > u64 total_btree_bytes = 0; > @@ -72,6 +59,7 @@ u64 data_bytes_referenced = 0; > LIST_HEAD(duplicate_extents); > LIST_HEAD(delete_items); > int no_holes = 0; > +static int is_free_space_tree = 0; > int init_extent_tree = 0; > int check_data_csum = 0; > struct btrfs_fs_info *global_info; > @@ -173,28 +161,48 @@ static int compare_extent_backref(struct rb_node > *node1, struct rb_node *node2) > return compare_tree_backref(node1, node2); > } > > +static void print_status_check_line(void *p) > +{ > + struct task_ctx *priv = p; > + char *task_position_string[] = { > + "[1/7] checking root items ", > + "[2/7] checking extents", > + is_free_space_tree ? > + "[3/7] checking free space tree": > + "[3/7] checking free space cache ", > + "[4/7] checking fs roots ", > + check_data_csum ? > + "[5/7] checking csums against data ": > + "[5/7] checking csums (without verifying data) ", > + "[6/7] checking root refs ", > + "[7/7] checking quota groups ", > + }; > + > + time_t elapsed = time(NULL) - priv->start_time; > + int hours = elapsed / 3600; > + elapsed-= hours * 3600; > + int minutes = elapsed / 60; > + elapsed-= minutes * 60; > + int seconds = elapsed; > + printf("%s (%d:%02d:%02d elapsed", task_position_string[priv->tp], > hours, minutes, seconds); > + if (priv->item_count > 0) > + printf(", %llu items checked)\r", priv->item_count); > + else > + printf(")\r"); > + fflush(stdout); > +} > > static void *print_status_check(void *p) > { > struct task_ctx *priv = p; > - const char work_indicator[] = { '.', 'o', 'O', 'o' }; > - uint32_t count = 0; > - static char *task_position_string[] = { > - "checking extents", > - "checking free space cache", > - "checking fs roots", > - }; > > - task_period_start(priv->info, 1000 /* 1s */); > + task_period_start(priv->info, 50); > > if (priv->tp == TASK_NOTHING) > return NULL; > > while (1) { > - printf("%s [%c]\r", task_position_string[priv->tp], > - work_indicator[count % 4]); > - count++; > - fflush(stdout); > + print_status_check_line(p); > task_period_wait(priv->info); > } > return NULL; > @@ -202,6 +210,7 @@ static void *print_status_check(void *p) > > static int print_status_return(void *p) > { > + print_status_check_line(p); > printf("\n"); > fflush(stdout); > > @@ -2942,6 +2951,7 @@ static int check_root_refs(struct btrfs_root *root, > loop = 0; > cache = search_cache_extent(root_cache, 0); > while (1) { > + ctx.item_count++; > if (!cache) > break; > rec = container_of(cache, struct root_record, cache); > @@ -3263,6 +3273,7 @@ static int check_fs_root(struct btrfs_root *root, > } > > while (1) { > + ctx.item_count++; > wret = walk_down_tree(root, , wc, , ); > if (wret < 0) > ret = wret; > @@ -3340,11 +3351,6 @@ static int check_fs_roots(struct btrfs_fs_info > *fs_info, > int