Re: [PATCH v2 3/3] fstests: generic: Check the fs after each FUA writes
On 2018年03月28日 13:04, Amir Goldstein wrote: > On Wed, Mar 28, 2018 at 7:40 AM, Qu Wenruowrote: >> Basic test case which triggers fsstress with dm-log-writes, and then >> check the fs after each FUA writes. >> With needed infrastructure and special handlers for journal based fs. >> >> Signed-off-by: Qu Wenruo >> --- >> changelog: >> v2: >> Use proper "SUSE Linux Products GmbH" instead of "SuSE" >> Get rid of dm-snapshot which is pretty slow if we're creating and >> deleting snapshots repeatedly. >> (Maybe LVM thin provision would be much better, but current replay >>solution is good so far, and no slower than dm-snapshot) >> Add back run_check so that we can get the seed. >> --- >> Unfortunately, neither xfs nor ext4 survies this test for even single >> success, while btrfs survives. >> (Although not what I want, I'm just trying my luck >> to reproduce a serious btrfs corruption situation) >> >> Although btrfs may be the fastest fs for the test, since it has fixed >> small amount of write in mkfs and almost nothing to replay, it still >> takes about 240s~300s to finish (the same level using snapshot). >> >> It would take longer time for ext4 for its large amount of write during >> mkfs, if it can survive the test in the first space. > > Hmm, how much time would the total test would take if you don't fail > it on fsck? I am asking because it may be possible to improve this with > only a single snapshot after mkfs. The only fs which can pass the test right now is btrfs, so other estimation is mostly based on guess. > > Anyway, if total test run time is expected to be under 10min I wouldn't > bother with this optimization, at least not right now. IMO it is more > important to get the test out there to get the corruption bugs floating. I'd say from current status, if XFS doesn't fail, it would definitely finish in 10min. For EXT4 I'm not pretty sure. I'd like to keep current test case as simple as possible right now, and for later enhancement, I have several different ideas: 1) Reflink fs + loopback Yep, use btrfs/xfs as base filesystem and create copy using reflink, then use such files as loopback device. The good thing is, AFAIK btrfs/xfs reflink is really fast. Much much faster than dm-snapshot or even LVM thin. The much much smaller block size (4K default) makes CoW overhead (LVM thin is 64K, not sure about dm-snapshot though). The problem is, such setup needs extra mount point and can be a little hard to setup, and we're introducing another layer of fs, if the fs itself has some hidden bug, it would screw up the test case. 2) LVM thin provision LVM thin provision looks much like btrfs/xfs for block level, and smaller default block size (64K vs original 2M) makes CoW overhead smaller. I'm currently testing this method, the good thing is it's a little easier to setup and we can use single mount point. Anyway, with proper and efficient snapshot ability implemented, I will definitely convert this test case, and add Flush test case. Thanks for your review too, Qu > > Thanks for working on this! > You can add > Reviewed-by: Amir Goldstein > > >> >> As a comparison, btrfs takes about 5 seconds to replay log, mount, >> unmount and run fsck at the end of the test. >> But for ext4, it already takes about 5 seconds to do the same thing >> before triggering fsck error. >> >> Fsck fail for ext4: >> _check_generic_filesystem: filesystem on /dev/mapper/test-scratch1 is >> inconsistent >> *** fsck.ext4 output *** >> fsck from util-linux 2.31.1 >> e2fsck 1.43.8 (1-Jan-2018) >> Pass 1: Checking inodes, blocks, and sizes >> Inode 131076 extent tree (at level 1) could be shorter. Fix? no >> >> Inode 131262, i_size is 0, should be 258048. Fix? no >> >> Pass 2: Checking directory structure >> Pass 3: Checking directory connectivity >> Pass 4: Checking reference counts >> Pass 5: Checking group summary information >> >> For xfs: >> _check_xfs_filesystem: filesystem on /dev/mapper/test-scratch1 is >> inconsistent (r) >> *** xfs_repair -n output *** >> Phase 1 - find and verify superblock... >> Phase 2 - using internal log >> - zero log... >> - scan filesystem freespace and inode maps... >> - found root inode chunk >> Phase 3 - for each AG... >> - scan (but don't clear) agi unlinked lists... >> - process known inodes and perform inode discovery... >> - agno = 0 >> - agno = 1 >> - agno = 2 >> bad symlink header ino 8409190, file block 0, disk block 1051147 >> problem with symbolic link in inode 8409190 >> would have cleared inode 8409190 >> - agno = 3 >> - process newly discovered inodes... >> Phase 4 - check for duplicate blocks... >> - setting up duplicate extent list... >> - check for inodes claiming duplicate blocks... >> - agno = 0 >> - agno = 1 >> - agno = 3 >> -
Re: [PATCH v2 3/3] fstests: generic: Check the fs after each FUA writes
On Wed, Mar 28, 2018 at 7:40 AM, Qu Wenruowrote: > Basic test case which triggers fsstress with dm-log-writes, and then > check the fs after each FUA writes. > With needed infrastructure and special handlers for journal based fs. > > Signed-off-by: Qu Wenruo > --- > changelog: > v2: > Use proper "SUSE Linux Products GmbH" instead of "SuSE" > Get rid of dm-snapshot which is pretty slow if we're creating and > deleting snapshots repeatedly. > (Maybe LVM thin provision would be much better, but current replay >solution is good so far, and no slower than dm-snapshot) > Add back run_check so that we can get the seed. > --- > Unfortunately, neither xfs nor ext4 survies this test for even single > success, while btrfs survives. > (Although not what I want, I'm just trying my luck > to reproduce a serious btrfs corruption situation) > > Although btrfs may be the fastest fs for the test, since it has fixed > small amount of write in mkfs and almost nothing to replay, it still > takes about 240s~300s to finish (the same level using snapshot). > > It would take longer time for ext4 for its large amount of write during > mkfs, if it can survive the test in the first space. Hmm, how much time would the total test would take if you don't fail it on fsck? I am asking because it may be possible to improve this with only a single snapshot after mkfs. Anyway, if total test run time is expected to be under 10min I wouldn't bother with this optimization, at least not right now. IMO it is more important to get the test out there to get the corruption bugs floating. Thanks for working on this! You can add Reviewed-by: Amir Goldstein > > As a comparison, btrfs takes about 5 seconds to replay log, mount, > unmount and run fsck at the end of the test. > But for ext4, it already takes about 5 seconds to do the same thing > before triggering fsck error. > > Fsck fail for ext4: > _check_generic_filesystem: filesystem on /dev/mapper/test-scratch1 is > inconsistent > *** fsck.ext4 output *** > fsck from util-linux 2.31.1 > e2fsck 1.43.8 (1-Jan-2018) > Pass 1: Checking inodes, blocks, and sizes > Inode 131076 extent tree (at level 1) could be shorter. Fix? no > > Inode 131262, i_size is 0, should be 258048. Fix? no > > Pass 2: Checking directory structure > Pass 3: Checking directory connectivity > Pass 4: Checking reference counts > Pass 5: Checking group summary information > > For xfs: > _check_xfs_filesystem: filesystem on /dev/mapper/test-scratch1 is > inconsistent (r) > *** xfs_repair -n output *** > Phase 1 - find and verify superblock... > Phase 2 - using internal log > - zero log... > - scan filesystem freespace and inode maps... > - found root inode chunk > Phase 3 - for each AG... > - scan (but don't clear) agi unlinked lists... > - process known inodes and perform inode discovery... > - agno = 0 > - agno = 1 > - agno = 2 > bad symlink header ino 8409190, file block 0, disk block 1051147 > problem with symbolic link in inode 8409190 > would have cleared inode 8409190 > - agno = 3 > - process newly discovered inodes... > Phase 4 - check for duplicate blocks... > - setting up duplicate extent list... > - check for inodes claiming duplicate blocks... > - agno = 0 > - agno = 1 > - agno = 3 > - agno = 2 > entry "lb" in shortform directory 8409188 references free inode 8409190 > would have junked entry "lb" in directory inode 8409188 > bad symlink header ino 8409190, file block 0, disk block 1051147 > problem with symbolic link in inode 8409190 > would have cleared inode 8409190 > No modify flag set, skipping phase 5 > Phase 6 - check inode connectivity... > - traversing filesystem ... > entry "lb" in shortform directory inode 8409188 points to free inode 8409190 > would junk entry > - traversal finished ... > - moving disconnected inodes to lost+found ... > Phase 7 - verify link counts... > Maximum metadata LSN (1:396) is ahead of log (1:63). > Would format log to cycle 4. > No modify flag set, skipping filesystem flush and exiting. > > And special note for XFS guys, I also hit an XFS internal metadata > warning during journal replay: > [ 7901.423659] XFS (dm-4): Starting recovery (logdev: internal) > [ 7901.577511] XFS (dm-4): Metadata corruption detected at > xfs_dinode_verify+0x467/0x570 [xfs], inode 0x805067 dinode > [ 7901.580529] XFS (dm-4): Unmount and run xfs_repair > [ 7901.581901] XFS (dm-4): First 128 bytes of corrupted metadata buffer: > [ 7901.583205] b8963f41: 49 4e a1 ff 03 02 00 00 00 00 00 00 00 00 00 > 00 IN.. > [ 7901.584659] f35a50e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 > [ 7901.586075] 386eea9e: 5a b2 0e 69 0a f3 43 27 5a b2 0e 69 0a f3 43 > 27 Z..i..C'Z..i..C' > [ 7901.587561] ac636661: 5a b2 0e 69 0d
[PATCH v2 2/3] fstests: log-writes: Add support for METADATA flag
Signed-off-by: Qu WenruoReviewed-by: Amir Goldstein --- changelog: v2: Add Amir's reviewed-by tag. --- src/log-writes/log-writes.c | 3 ++- src/log-writes/log-writes.h | 9 + 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/src/log-writes/log-writes.c b/src/log-writes/log-writes.c index a872429d3f41..5dc22c244cea 100644 --- a/src/log-writes/log-writes.c +++ b/src/log-writes/log-writes.c @@ -130,7 +130,8 @@ struct flags_to_str_entry { DEFINE_LOG_FLAGS_STR_ENTRY(FLUSH), DEFINE_LOG_FLAGS_STR_ENTRY(FUA), DEFINE_LOG_FLAGS_STR_ENTRY(DISCARD), - DEFINE_LOG_FLAGS_STR_ENTRY(MARK) + DEFINE_LOG_FLAGS_STR_ENTRY(MARK), + DEFINE_LOG_FLAGS_STR_ENTRY(METADATA) }; #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) diff --git a/src/log-writes/log-writes.h b/src/log-writes/log-writes.h index 35ca35838aac..75fb8ac0bf79 100644 --- a/src/log-writes/log-writes.h +++ b/src/log-writes/log-writes.h @@ -20,10 +20,11 @@ typedef __u32 u32; /* * Constants copied from kernel file drivers/md/dm-log-writes.c */ -#define LOG_FLUSH_FLAG (1 << 0) -#define LOG_FUA_FLAG (1 << 1) -#define LOG_DISCARD_FLAG (1 << 2) -#define LOG_MARK_FLAG (1 << 3) +#define LOG_FLUSH_FLAG (1 << 0) +#define LOG_FUA_FLAG (1 << 1) +#define LOG_DISCARD_FLAG (1 << 2) +#define LOG_MARK_FLAG (1 << 3) +#define LOG_METADATA_FLAG (1 << 4) #define WRITE_LOG_VERSION 1 #define WRITE_LOG_MAGIC 0x6a736677736872 -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 1/3] fstests: log-writes: Add support to output human readable flags
Also change the flag numeric output to hex. Signed-off-by: Qu WenruoReviewed-by: Amir Goldstein --- changelog: v2: Add Amir's reviewed-by tag. --- src/log-writes/log-writes.c | 70 - 1 file changed, 63 insertions(+), 7 deletions(-) diff --git a/src/log-writes/log-writes.c b/src/log-writes/log-writes.c index 09391574c4d2..a872429d3f41 100644 --- a/src/log-writes/log-writes.c +++ b/src/log-writes/log-writes.c @@ -120,6 +120,58 @@ int log_discard(struct log *log, struct log_write_entry *entry) return 0; } +#define DEFINE_LOG_FLAGS_STR_ENTRY(x) \ + {LOG_##x##_FLAG, #x} + +struct flags_to_str_entry { + u64 flags; + const char *str; +} log_flags_table[] = { + DEFINE_LOG_FLAGS_STR_ENTRY(FLUSH), + DEFINE_LOG_FLAGS_STR_ENTRY(FUA), + DEFINE_LOG_FLAGS_STR_ENTRY(DISCARD), + DEFINE_LOG_FLAGS_STR_ENTRY(MARK) +}; + +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) +#define LOG_FLAGS_BUF_SIZE 128 +/* + * Convert numeric flags to human readable flags. + * @flags: numeric flags + * @buf: output buffer for human readable string. + * must have enough space (LOG_FLAGS_BUF_SIZE) to contain all + * the string + */ +static void entry_flags_to_str(u64 flags, char *buf) +{ + int empty = 1; + int left_len; + int i; + + buf[0] = '\0'; + for (i = 0; i < ARRAY_SIZE(log_flags_table); i++) { + if (flags & log_flags_table[i].flags) { + if (!empty) + strncat(buf, "|", LOG_FLAGS_BUF_SIZE); + empty = 0; + strncat(buf, log_flags_table[i].str, LOG_FLAGS_BUF_SIZE); + flags &= ~log_flags_table[i].flags; + } + } + if (flags) { + if (!empty) + strncat(buf, "|", LOG_FLAGS_BUF_SIZE); + empty = 0; + left_len = LOG_FLAGS_BUF_SIZE - strnlen(buf, + LOG_FLAGS_BUF_SIZE); + if (left_len > 0) + snprintf(buf + strnlen(buf, LOG_FLAGS_BUF_SIZE), +left_len, "UNKNOWN.0x%llx", flags); + } + if (empty) + strncpy(buf, "NONE", LOG_FLAGS_BUF_SIZE); +} + /* * @log: the log we are replaying. * @entry: entry to be replayed. @@ -179,6 +231,7 @@ int log_replay_next_entry(struct log *log, struct log_write_entry *entry, size_t read_size = read_data ? log->sectorsize : sizeof(struct log_write_entry); char *buf; + char flags_buf[LOG_FLAGS_BUF_SIZE]; ssize_t ret; off_t offset; int skip = 0; @@ -210,19 +263,20 @@ int log_replay_next_entry(struct log *log, struct log_write_entry *entry, log->cur_pos += read_size; } + flags = le64_to_cpu(entry->flags); + entry_flags_to_str(flags, flags_buf); skip = log_should_skip(log, entry); if (log_writes_verbose > 1 || (log_writes_verbose && !skip)) { - printf("%s %d@%llu: sector %llu, size %llu, flags %llu\n", + printf("%s %d@%llu: sector %llu, size %llu, flags 0x%llx(%s)\n", skip ? "skipping" : "replaying", (int)log->cur_entry - 1, log->cur_pos / log->sectorsize, (unsigned long long)le64_to_cpu(entry->sector), (unsigned long long)size, - (unsigned long long)le64_to_cpu(entry->flags)); + (unsigned long long)flags, flags_buf); } if (!size) return 0; - flags = le64_to_cpu(entry->flags); if (flags & LOG_DISCARD_FLAG) return log_discard(log, entry); @@ -301,7 +355,7 @@ int log_seek_entry(struct log *log, u64 entry_num) return -1; } if (log_writes_verbose > 1) - printf("seek entry %d@%llu: %llu, size %llu, flags %llu\n", + printf("seek entry %d@%llu: %llu, size %llu, flags 0x%llx\n", (int)i, log->cur_pos / log->sectorsize, (unsigned long long)le64_to_cpu(entry.sector), (unsigned long long)le64_to_cpu(entry.nr_sectors), @@ -339,6 +393,7 @@ int log_seek_next_entry(struct log *log, struct log_write_entry *entry, size_t read_size = read_data ? log->sectorsize : sizeof(struct log_write_entry); u64 flags; + char flags_buf[LOG_FLAGS_BUF_SIZE]; ssize_t ret; if (log->cur_entry >= log->nr_entries) @@ -366,14 +421,15 @@ int log_seek_next_entry(struct log *log, struct log_write_entry *entry, } else { log->cur_pos += read_size; } + flags =
[PATCH v2 3/3] fstests: generic: Check the fs after each FUA writes
Basic test case which triggers fsstress with dm-log-writes, and then check the fs after each FUA writes. With needed infrastructure and special handlers for journal based fs. Signed-off-by: Qu Wenruo--- changelog: v2: Use proper "SUSE Linux Products GmbH" instead of "SuSE" Get rid of dm-snapshot which is pretty slow if we're creating and deleting snapshots repeatedly. (Maybe LVM thin provision would be much better, but current replay solution is good so far, and no slower than dm-snapshot) Add back run_check so that we can get the seed. --- Unfortunately, neither xfs nor ext4 survies this test for even single success, while btrfs survives. (Although not what I want, I'm just trying my luck to reproduce a serious btrfs corruption situation) Although btrfs may be the fastest fs for the test, since it has fixed small amount of write in mkfs and almost nothing to replay, it still takes about 240s~300s to finish (the same level using snapshot). It would take longer time for ext4 for its large amount of write during mkfs, if it can survive the test in the first space. As a comparison, btrfs takes about 5 seconds to replay log, mount, unmount and run fsck at the end of the test. But for ext4, it already takes about 5 seconds to do the same thing before triggering fsck error. Fsck fail for ext4: _check_generic_filesystem: filesystem on /dev/mapper/test-scratch1 is inconsistent *** fsck.ext4 output *** fsck from util-linux 2.31.1 e2fsck 1.43.8 (1-Jan-2018) Pass 1: Checking inodes, blocks, and sizes Inode 131076 extent tree (at level 1) could be shorter. Fix? no Inode 131262, i_size is 0, should be 258048. Fix? no Pass 2: Checking directory structure Pass 3: Checking directory connectivity Pass 4: Checking reference counts Pass 5: Checking group summary information For xfs: _check_xfs_filesystem: filesystem on /dev/mapper/test-scratch1 is inconsistent (r) *** xfs_repair -n output *** Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan (but don't clear) agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 bad symlink header ino 8409190, file block 0, disk block 1051147 problem with symbolic link in inode 8409190 would have cleared inode 8409190 - agno = 3 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - check for inodes claiming duplicate blocks... - agno = 0 - agno = 1 - agno = 3 - agno = 2 entry "lb" in shortform directory 8409188 references free inode 8409190 would have junked entry "lb" in directory inode 8409188 bad symlink header ino 8409190, file block 0, disk block 1051147 problem with symbolic link in inode 8409190 would have cleared inode 8409190 No modify flag set, skipping phase 5 Phase 6 - check inode connectivity... - traversing filesystem ... entry "lb" in shortform directory inode 8409188 points to free inode 8409190 would junk entry - traversal finished ... - moving disconnected inodes to lost+found ... Phase 7 - verify link counts... Maximum metadata LSN (1:396) is ahead of log (1:63). Would format log to cycle 4. No modify flag set, skipping filesystem flush and exiting. And special note for XFS guys, I also hit an XFS internal metadata warning during journal replay: [ 7901.423659] XFS (dm-4): Starting recovery (logdev: internal) [ 7901.577511] XFS (dm-4): Metadata corruption detected at xfs_dinode_verify+0x467/0x570 [xfs], inode 0x805067 dinode [ 7901.580529] XFS (dm-4): Unmount and run xfs_repair [ 7901.581901] XFS (dm-4): First 128 bytes of corrupted metadata buffer: [ 7901.583205] b8963f41: 49 4e a1 ff 03 02 00 00 00 00 00 00 00 00 00 00 IN.. [ 7901.584659] f35a50e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 7901.586075] 386eea9e: 5a b2 0e 69 0a f3 43 27 5a b2 0e 69 0a f3 43 27 Z..i..C'Z..i..C' [ 7901.587561] ac636661: 5a b2 0e 69 0d 92 bc 00 00 00 00 00 00 00 00 00 Z..i [ 7901.588969] d75f9093: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 7901.590475] d2af5688: 00 00 00 02 00 00 00 00 00 00 00 00 84 d7 6a e9 ..j. [ 7901.591907] e8dd8211: ff ff ff ff 34 93 a9 3a 00 00 00 00 00 00 00 04 4..: [ 7901.593319] b7610e4e: 00 00 00 01 00 00 00 45 00 00 00 00 00 00 00 00 ...E --- common/dmlogwrites| 56 +++ tests/generic/481 | 104 ++ tests/generic/481.out | 2 + tests/generic/group | 1 + 4 files changed, 163 insertions(+) create mode 100755 tests/generic/481 create mode 100644
Re: [PATCH] fstests: test btrfs fsync after hole punching with no-holes mode
On Mon, Mar 26, 2018 at 11:59:21PM +0100, fdman...@kernel.org wrote: > From: Filipe Manana> > Test that when we have the no-holes mode enabled and a specific metadata > layout, if we punch a hole and fsync the file, at replay time the whole > hole was preserved. > > This issue is fixed by the following btrfs patch for the linux kernel: > > "Btrfs: fix fsync after hole punching when using no-holes feature" I'd expect a test failure with 4.16-rc6 kernel, as the mentioned fix above is not there. But test always passes for me. Did I miss anything? btrfs-progs version is btrfs-progs-4.11.1-3.fc27. > > Signed-off-by: Filipe Manana > --- > tests/btrfs/159 | 100 > > tests/btrfs/159.out | 5 +++ > tests/btrfs/group | 1 + > 3 files changed, 106 insertions(+) > create mode 100755 tests/btrfs/159 > create mode 100644 tests/btrfs/159.out > > diff --git a/tests/btrfs/159 b/tests/btrfs/159 > new file mode 100755 > index ..6083975a > --- /dev/null > +++ b/tests/btrfs/159 > @@ -0,0 +1,100 @@ > +#! /bin/bash > +# FSQA Test No. 159 > +# > +# Test that when we have the no-holes mode enabled and a specific metadata > +# layout, if we punch a hole and fsync the file, at replay time the whole > +# hole was preserved. > +# > +#--- > +# > +# Copyright (C) 2018 SUSE Linux Products GmbH. All Rights Reserved. > +# Author: Filipe Manana > +# > +# This program is free software; you can redistribute it and/or > +# modify it under the terms of the GNU General Public License as > +# published by the Free Software Foundation. > +# > +# This program is distributed in the hope that it would be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with this program; if not, write the Free Software Foundation, > +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA > +#--- > +# > + > +seq=`basename $0` > +seqres=$RESULT_DIR/$seq > +echo "QA output created by $seq" > +tmp=/tmp/$$ > +status=1 # failure is the default! > +trap "_cleanup; exit \$status" 0 1 2 3 15 > + > +_cleanup() > +{ > + _cleanup_flakey > + cd / > + rm -f $tmp.* > +} > + > +# get standard environment, filters and checks > +. ./common/rc > +. ./common/filter > +. ./common/dmflakey > + > +# real QA test starts here > +_supported_fs generic ^^^ btrfs > +_supported_os Linux > +_require_scratch > +_require_dm_target flakey > +_require_xfs_io_command "fpunch" > + > +rm -f $seqres.full > + > +# We create the filesystem with a node size of 64Kb because we need to > create a > +# specific metadata layout in order to trigger the bug we are testing. At the > +# moment the node size can not be smaller then the system's page size, so > given > +# that the largest possible page size is 64Kb and by default the node size is > +# set to the system's page size value, we explictly create a filesystem with > a > +# 64Kb node size. > +_scratch_mkfs -O no-holes -n $((64 * 1024)) >>$seqres.full 2>&1 > +_require_metadata_journaling $SCRATCH_DEV > +_init_flakey > +_mount_flakey > + > +# Create our test file with 831 extents of 256Kb each. Before each extent, > there I think it's 832 extents in total? As the index starts with 0. Otherwise looks good to me. Thanks, Eryu > +# is a 256Kb hole (except for the first extent, which starts at offset 0). > This > +# creates two leafs in the filesystem tree, with the extent at offset > 216530944 > +# being the last item in the first leaf and the extent at offset 217055232 > being > +# the first item in the second leaf. > +for ((i = 0; i <= 831; i++)); do > + offset=$((i * 2 * 256 * 1024)) > + $XFS_IO_PROG -f -c "pwrite -S 0xab -b 256K $offset 256K" \ > + $SCRATCH_MNT/foobar >/dev/null > +done > + > +# Now persist everything done so far. > +sync > + > +# Now punch a hole that covers part of the extent at offset 216530944. > +$XFS_IO_PROG -c "fpunch $((216530944 + 128 * 1024 - 4000)) 256K" \ > + -c "fsync" \ > + $SCRATCH_MNT/foobar > + > +echo "File digest before power failure:" > +md5sum $SCRATCH_MNT/foobar | _filter_scratch > + > +# Simulate a power failure and mount the filesystem to check that replaying > the > +# fsync log/journal succeeds and our test file has the expected content. > +_flakey_drop_and_remount > + > +echo "File digest after power failure and log replay:" > +md5sum $SCRATCH_MNT/foobar | _filter_scratch > + > +_unmount_flakey > +_cleanup_flakey > + > +status=0 > +exit > diff --git a/tests/btrfs/159.out b/tests/btrfs/159.out > new file mode 100644
Re: [PATCH v2 2/3] btrfs: Allow rmdir(2) to delete a subvolume
On 2018/03/27 21:39, Nikolay Borisov wrote: > > > On 26.03.2018 11:30, Misono Tomohiro wrote: >> This patch changes the behavior of rmdir(2) to allow it to delete >> an empty subvolume by default, unless it is not a default subvolume >> and send is not in progress. >> >> New function btrfs_delete_subvolume() is almost equal to the second half >> of btrfs_ioctl_snap_destroy(). This function requires inode_lock for both >> @dir and inode of @dentry. For rmdir(2) it is already acquired in vfs >> layer before calling btrfs_rmdir(). >> >> Note that while a non-privileged user cannot delete a read-only subvolume >> by "btrfs subvolume delete" when user_subvol_rm_allowd mount option is >> enabled, rmdir(2) can delete an empty read-only subvolume. >> (However, rm -r cannot use for read-only subvolume containing files.) >> >> Signed-off-by: Tomohiro Misono>> --- >> fs/btrfs/inode.c | 143 >> ++- >> 1 file changed, 142 insertions(+), 1 deletion(-) >> >> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c >> index db66fa4fede6..b778776eee8e 100644 >> --- a/fs/btrfs/inode.c >> +++ b/fs/btrfs/inode.c >> @@ -4387,6 +4387,147 @@ noinline int may_destroy_subvol(struct btrfs_root >> *root) >> return ret; >> } >> >> +static int btrfs_delete_subvolume(struct inode *dir, struct dentry *dentry) >> +{ >> +struct btrfs_fs_info *fs_info = btrfs_sb(dentry->d_sb); >> +struct btrfs_root *root = BTRFS_I(dir)->root; >> +struct inode *inode = d_inode(dentry); >> +struct btrfs_root *dest = BTRFS_I(inode)->root; >> +struct btrfs_trans_handle *trans; >> +struct btrfs_block_rsv block_rsv; >> +u64 root_flags; >> +u64 qgroup_reserved; >> +int ret; >> +int err; >> + >> +/* >> + * Don't allow to delete a subvolume with send in progress. This is >> + * inside the i_mutex so the error handling that has to drop the bit >> + * again is not run concurrently. >> + */ >> +spin_lock(>root_item_lock); >> +root_flags = btrfs_root_flags(>root_item); >> +if (dest->send_in_progress == 0) { >> +btrfs_set_root_flags(>root_item, >> +root_flags | BTRFS_ROOT_SUBVOL_DEAD); >> +spin_unlock(>root_item_lock); >> +} else { >> +spin_unlock(>root_item_lock); >> +btrfs_warn(fs_info, >> + "Attempt to delete subvolume %llu during send", >> + dest->root_key.objectid); >> +err = -EPERM; >> +return err; >> +} >> + >> +down_write(_info->subvol_sem); >> + >> +err = may_destroy_subvol(dest); >> +if (err) >> +goto out_up_write; >> + >> +btrfs_init_block_rsv(_rsv, BTRFS_BLOCK_RSV_TEMP); >> +/* >> + * One for dir inode, two for dir entries, two for root >> + * ref/backref. >> + */ >> +err = btrfs_subvolume_reserve_metadata(root, _rsv, >> + 5, _reserved, true); >> +if (err) >> +goto out_up_write; >> + >> +trans = btrfs_start_transaction(root, 0); >> +if (IS_ERR(trans)) { >> +err = PTR_ERR(trans); >> +goto out_release; >> +} >> +trans->block_rsv = _rsv; >> +trans->bytes_reserved = block_rsv.size; >> + >> +btrfs_record_snapshot_destroy(trans, BTRFS_I(dir)); >> + >> +ret = btrfs_unlink_subvol(trans, root, dir, >> +dest->root_key.objectid, >> +dentry->d_name.name, >> +dentry->d_name.len); >> +if (ret) { >> +err = ret; >> +btrfs_abort_transaction(trans, ret); >> +goto out_end_trans; >> +} >> + >> +btrfs_record_root_in_trans(trans, dest); >> + >> +memset(>root_item.drop_progress, 0, >> +sizeof(dest->root_item.drop_progress)); >> +dest->root_item.drop_level = 0; >> +btrfs_set_root_refs(>root_item, 0); >> + >> +if (!test_and_set_bit(BTRFS_ROOT_ORPHAN_ITEM_INSERTED, >state)) { >> +ret = btrfs_insert_orphan_item(trans, >> +fs_info->tree_root, >> +dest->root_key.objectid); >> +if (ret) { >> +btrfs_abort_transaction(trans, ret); >> +err = ret; >> +goto out_end_trans; >> +} >> +} >> + >> +ret = btrfs_uuid_tree_rem(trans, fs_info, dest->root_item.uuid, >> + BTRFS_UUID_KEY_SUBVOL, >> + dest->root_key.objectid); >> +if (ret && ret != -ENOENT) { >> +btrfs_abort_transaction(trans, ret); >> +err = ret; >> +goto out_end_trans; >> +} >> +if (!btrfs_is_empty_uuid(dest->root_item.received_uuid)) { >> +ret = btrfs_uuid_tree_rem(trans, fs_info, >> +
Re: [PATCH v2 1/3] btrfs: Move may_destroy_subvol() from ioctl.c to inode.c
On 2018/03/26 17:51, Nikolay Borisov wrote: > > > On 26.03.2018 11:28, Misono Tomohiro wrote: >> This is a preparation work to allow rmdir(2) to delete a subvolume. >> >> Signed-off-by: Tomohiro Misono>> --- >> fs/btrfs/ctree.h | 1 + >> fs/btrfs/inode.c | 54 ++ >> fs/btrfs/ioctl.c | 54 -- >> 3 files changed, 55 insertions(+), 54 deletions(-) >> >> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h >> index da308774b8a4..2dbead106aab 100644 >> --- a/fs/btrfs/ctree.h >> +++ b/fs/btrfs/ctree.h >> @@ -3166,6 +3166,7 @@ int btrfs_unlink_subvol(struct btrfs_trans_handle >> *trans, >> struct btrfs_root *root, >> struct inode *dir, u64 objectid, >> const char *name, int name_len); >> +noinline int may_destroy_subvol(struct btrfs_root *root); > > Why the explicit noinline? If it's for "documentation"/appearing in > stack traces purposes then you should use noinline_for_stack. But I > don't think it's really needed. Just to keep the original declaration. Also, the third patch returns this function into static again (in ioctl.c). > >> int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len, >> int front); >> int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans, >> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c >> index f53470112670..db66fa4fede6 100644 >> --- a/fs/btrfs/inode.c >> +++ b/fs/btrfs/inode.c >> @@ -4333,6 +4333,60 @@ int btrfs_unlink_subvol(struct btrfs_trans_handle >> *trans, >> return ret; >> } >> >> +/* >> + * helper to check if the subvolume references other subvolumes >> + */ >> +noinline int may_destroy_subvol(struct btrfs_root *root) >> +{ >> +struct btrfs_fs_info *fs_info = root->fs_info; >> +struct btrfs_path *path; >> +struct btrfs_dir_item *di; >> +struct btrfs_key key; >> +u64 dir_id; >> +int ret; >> + >> +path = btrfs_alloc_path(); >> +if (!path) >> +return -ENOMEM; >> + >> +/* Make sure this root isn't set as the default subvol */ >> +dir_id = btrfs_super_root_dir(fs_info->super_copy); >> +di = btrfs_lookup_dir_item(NULL, fs_info->tree_root, path, >> + dir_id, "default", 7, 0); >> +if (di && !IS_ERR(di)) { >> +btrfs_dir_item_key_to_cpu(path->nodes[0], di, ); >> +if (key.objectid == root->root_key.objectid) { >> +ret = -EPERM; >> +btrfs_err(fs_info, >> + "deleting default subvolume %llu is not >> allowed", >> + key.objectid); >> +goto out; >> +} >> +btrfs_release_path(path); >> +} >> + >> +key.objectid = root->root_key.objectid; >> +key.type = BTRFS_ROOT_REF_KEY; >> +key.offset = (u64)-1; >> + >> +ret = btrfs_search_slot(NULL, fs_info->tree_root, , path, 0, 0); >> +if (ret < 0) >> +goto out; >> +BUG_ON(ret == 0); >> + >> +ret = 0; >> +if (path->slots[0] > 0) { >> +path->slots[0]--; >> +btrfs_item_key_to_cpu(path->nodes[0], , path->slots[0]); >> +if (key.objectid == root->root_key.objectid && >> +key.type == BTRFS_ROOT_REF_KEY) >> +ret = -ENOTEMPTY; >> +} >> +out: >> +btrfs_free_path(path); >> +return ret; >> +} >> + >> static int btrfs_rmdir(struct inode *dir, struct dentry *dentry) >> { >> struct inode *inode = d_inode(dentry); >> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c >> index 111ee282b777..11af9eb449ef 100644 >> --- a/fs/btrfs/ioctl.c >> +++ b/fs/btrfs/ioctl.c >> @@ -1843,60 +1843,6 @@ static noinline int >> btrfs_ioctl_subvol_setflags(struct file *file, >> return ret; >> } >> >> -/* >> - * helper to check if the subvolume references other subvolumes >> - */ >> -static noinline int may_destroy_subvol(struct btrfs_root *root) >> -{ >> -struct btrfs_fs_info *fs_info = root->fs_info; >> -struct btrfs_path *path; >> -struct btrfs_dir_item *di; >> -struct btrfs_key key; >> -u64 dir_id; >> -int ret; >> - >> -path = btrfs_alloc_path(); >> -if (!path) >> -return -ENOMEM; >> - >> -/* Make sure this root isn't set as the default subvol */ >> -dir_id = btrfs_super_root_dir(fs_info->super_copy); >> -di = btrfs_lookup_dir_item(NULL, fs_info->tree_root, path, >> - dir_id, "default", 7, 0); >> -if (di && !IS_ERR(di)) { >> -btrfs_dir_item_key_to_cpu(path->nodes[0], di, ); >> -if (key.objectid == root->root_key.objectid) { >> -ret = -EPERM; >> -btrfs_err(fs_info, >> - "deleting default subvolume %llu is not >> allowed", >> -
error report: misc-test 006 sometimes fails in current devel branch
current devel branch of btrfs-progs (github) occasionally fails at misc-test 006: (kernel is 4.16.0-rc7) == $ sudo make test-misc TEST=006\* [LD] fssum [TEST] misc-tests.sh [TEST/misc] 006-image-on-missing-device /usr/data/src/btrfs-progs/tests//common: line 177: 10819 Aborted (core dumped) $INSTRUMENT "$@" >> "$RESULTS" 2>&1 mayfail: returned code 134 (SIGABRT), not ignored test failed for case 006-image-on-missing-device make: *** [Makefile:329: test-misc] Error 1 == git bisect points the commit fa754dddc204 "btrfs-progs: mkfs: precreate the uuid tree" and after reverting this, it seems that the test always runs correctly. Full log of misc-tests-resut.txt when the test fails is below: -- === Entering /usr/data/src/btrfs-progs/tests//misc-tests/006-image-on-missing-device ### losetup --find --show img1 /dev/loop0 ### losetup --find --show img2 /dev/loop1 ### /usr/data/src/btrfs-progs/mkfs.btrfs -f -d raid1 -m raid1 /dev/loop0 /dev/loop1 btrfs-progs v4.15.1 See http://btrfs.wiki.kernel.org for more information. Performing full device TRIM /dev/loop0 (2.00GiB) ... Performing full device TRIM /dev/loop1 (2.00GiB) ... Label: (null) UUID: 220272d0-2454-429c-a198-83eaf954aceb Node size: 16384 Sector size:4096 Filesystem size:4.00GiB Block group profiles: Data: RAID1 204.75MiB Metadata: RAID1 204.75MiB System: RAID1 8.00MiB SSD detected: no Incompat features: extref, skinny-metadata Number of devices: 2 Devices: IDSIZE PATH 1 2.00GiB /dev/loop0 2 2.00GiB /dev/loop1 ### mount /dev/loop0 /usr/data/src/btrfs-progs/tests//mnt ### dd if=/dev/zero of=/usr/data/src/btrfs-progs/tests//mnt/a bs=1M count=10 10+0 records in 10+0 records out 10485760 bytes (10 MB, 10 MiB) copied, 0.00643606 s, 1.6 GB/s ### dd if=/dev/zero of=/usr/data/src/btrfs-progs/tests//mnt/b bs=4k count=1000 conv=sync 1000+0 records in 1000+0 records out 4096000 bytes (4.1 MB, 3.9 MiB) copied, 0.0056388 s, 726 MB/s ### umount /usr/data/src/btrfs-progs/tests//mnt ### /usr/data/src/btrfs-progs/btrfs check /dev/loop0 checking extents checking free space cache checking fs roots checking csums checking root refs Checking filesystem on /dev/loop0 UUID: 220272d0-2454-429c-a198-83eaf954aceb found 14843904 bytes used, no error found total csum bytes: 14240 total tree bytes: 131072 total fs tree bytes: 32768 total extent tree bytes: 16384 btree space waste bytes: 108171 file data blocks allocated: 14712832 referenced 14712832 ### /usr/data/src/btrfs-progs/btrfs-image /dev/loop0 /tmp/test-img.dump ### /usr/data/src/btrfs-progs/btrfs filesystem show /dev/loop0 Label: none uuid: 220272d0-2454-429c-a198-83eaf954aceb Total devices 2 FS bytes used 14.16MiB devid1 size 2.00GiB used 417.50MiB path /dev/loop0 devid2 size 2.00GiB used 417.50MiB path /dev/loop1 ### wipefs -a /dev/loop1 /dev/loop1: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d ### losetup -d /dev/loop1 ### /usr/data/src/btrfs-progs/btrfs filesystem show /dev/loop0 warning, device 2 is missing Label: none uuid: 220272d0-2454-429c-a198-83eaf954aceb Total devices 2 FS bytes used 14.16MiB devid1 size 2.00GiB used 417.50MiB path /dev/loop0 *** Some devices missing ### /usr/data/src/btrfs-progs/btrfs check /dev/loop0 checking extents checking free space cache failed to load free space cache for block group 30408704 failed to load free space cache for block group 245104640 checking fs roots checking csums checking root refs warning, device 2 is missing Checking filesystem on /dev/loop0 UUID: 220272d0-2454-429c-a198-83eaf954aceb found 14843904 bytes used, no error found total csum bytes: 14240 total tree bytes: 131072 total fs tree bytes: 32768 total extent tree bytes: 16384 btree space waste bytes: 108171 file data blocks allocated: 14712832 referenced 14712832 ### /usr/data/src/btrfs-progs/btrfs-image /dev/loop0 /tmp/test-img.dump Couldn't map the block 459800576 free(): invalid pointer failed (ignored, ret=134): /usr/data/src/btrfs-progs/btrfs-image /dev/loop0 /tmp/test-img.dump mayfail: returned code 134 (SIGABRT), not ignored test failed for case 006-image-on-missing-device -- So, the last btrfs-image failed. Also, it seems that even when the test succeeds, the last btrfs-image actually fails: -- ### /usr/data/src/btrfs-progs/btrfs-image /dev/loop0 /tmp/test-img.dump Couldn't map the block 459800576 ERROR: failed to flush pending data: -5 ERROR: create failed: Bad address warning, device 2 is missing failed (ignored, ret=1): /usr/data/src/btrfs-progs/btrfs-image /dev/loop0 /tmp/test-img.dump
[PATCH v2 5/8] btrfs: check if the fsid in the primary sb and copy sb are same
During the btrfs dev scan make sure that other copies of superblock contain the same fsid as the primary SB. So that we bring to the user notice if the superblock has been overwritten. mkfs.btrfs -fq /dev/sdc mkfs.btrfs -fq /dev/sdb dd if=/dev/sdb of=/dev/sdc count=4K skip=64K seek=64K obs=1 ibs=1 mount /dev/sdc /btrfs Caveat: Pls note that older btrfs-progs do not wipe the non-overwriting stale superblock like copy2 if a smaller mkfs.btrfs -b is created. Thus this patch in the kernel will report error. The workaround is to wipe the superblock manually, like dd if=/dev/zero of= seek=274877906944 ibs=1 obs=1 count4K OR apply the btrfs-progs patch btrfs-progs: wipe copies of the stale superblock beyond -b size which shall find and wipe the non overwriting superblock during mkfs. Signed-off-by: Anand Jain--- v1->v2: Do an explicit read for primary superblock. Drop kzalloc(). Fix split pr_err(). fs/btrfs/volumes.c | 47 --- 1 file changed, 36 insertions(+), 11 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index e63723f23227..b099823f60d1 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1198,40 +1198,65 @@ static int btrfs_read_disk_super(struct block_device *bdev, u64 bytenr, int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, struct btrfs_fs_devices **fs_devices_ret) { + struct btrfs_super_block *disk_super_primary; struct btrfs_super_block *disk_super; struct btrfs_device *device; struct block_device *bdev; + struct page *page_primary; struct page *page; int ret = 0; u64 bytenr; + int i; - /* -* we would like to check all the supers, but that would make -* a btrfs mount succeed after a mkfs from a different FS. -* So, we need to add a special mount option to scan for -* later supers, using BTRFS_SUPER_MIRROR_MAX instead -*/ - bytenr = btrfs_sb_offset(0); flags |= FMODE_EXCL; bdev = blkdev_get_by_path(path, flags, holder); if (IS_ERR(bdev)) return PTR_ERR(bdev); - ret = btrfs_read_disk_super(bdev, bytenr, , _super); + /* +* We would like to check all the supers and use one good copy, +* but that would make a btrfs mount succeed after a mkfs from +* a different FS. +* So, we need to add a special mount option to scan for +* later supers, using BTRFS_SUPER_MIRROR_MAX instead. +* So, just validate if all copies of the superblocks are ok +* and have the same fsid. +*/ + bytenr = btrfs_sb_offset(0); + ret = btrfs_read_disk_super(bdev, bytenr, _primary, + _super_primary); if (ret < 0) goto error_bdev_put; + for (i = 1; i < BTRFS_SUPER_MIRROR_MAX; i++) { + bytenr = btrfs_sb_offset(i); + ret = btrfs_read_disk_super(bdev, bytenr, , _super); + if (ret < 0) { + ret = 0; + continue; + } + + if (memcmp(disk_super_primary->fsid, disk_super->fsid, + BTRFS_FSID_SIZE)) { + pr_err("BTRFS (device %pg): superblock fsid mismatch, primary %pU copy%d %pU", + bdev, disk_super_primary->fsid, i, + disk_super->fsid); + ret = -EINVAL; + btrfs_release_disk_super(page); + goto error_bdev_put; + } + btrfs_release_disk_super(page); + } + mutex_lock(_mutex); - device = device_list_add(path, disk_super); + device = device_list_add(path, disk_super_primary); if (IS_ERR(device)) ret = PTR_ERR(device); else *fs_devices_ret = device->fs_devices; mutex_unlock(_mutex); - btrfs_release_disk_super(page); - error_bdev_put: blkdev_put(bdev, flags); -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 8/8] btrfs: drop the redundant invalidate_bdev()
During the mount context btrfs_open_devices() calls invalidate_bdev() for all the devices. So drop the invalidate_bdev() in open_ctree() which is only for the btrfs_fs_devices::latest_bdev. The call trace is as shown below. btrfs_mount_root() | |_btrfs_parse_early_options (-o device only) | |_btrfs_scan_one_device | |_btrfs_read_disk_super() | |_read_cache_page_gfp() | |_btrfs_scan_one_device(mount-arg-dev only) | |_btrfs_read_disk_super() | |_read_cache_page_gfp() | | |_btrfs_open_devices(fsid:all) | |_btrfs_open_one_device() ||_btrfs_get_bdev_and_sb() <--- invalidate_bdev(fsid:all) | |_btrfs_read_dev_super() ||_btrfs_read_dev_one_super() | |___bread() | |_btrfs_fill_super() |_btrfs_open_ctree() <-- invalidate_bdev(latest_bdev) <-- redundant |_btrfs_read_dev_super(latest_bdev only) |_btrfs_read_dev_one_super(latest_bdev only) |___bread(latest_bdev) Signed-off-by: Anand Jain--- fs/btrfs/disk-io.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 110465bec775..628c2b1d1349 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2553,8 +2553,6 @@ int open_ctree(struct super_block *sb, __setup_root(tree_root, fs_info, BTRFS_ROOT_TREE_OBJECTID); - invalidate_bdev(fs_devices->latest_bdev); - /* * Read super block and check the signature bytes only */ -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 3/8] btrfs: cleanup btrfs_read_disk_super() to return std error
The only caller btrfs_scan_one_device() sets -EINVAL for error from btrfs_read_disk_super(), so this patch returns -EINVAL from the latter function. A preparatory patch to add csum check in btrfs_read_disk_super(). Signed-off-by: Anand Jain--- v1->v2: Check btrfs_read_disk_super() to be more explicit ret < 0 fs/btrfs/volumes.c | 15 +++ 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 87d183accdb2..e63723f23227 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1154,23 +1154,23 @@ static int btrfs_read_disk_super(struct block_device *bdev, u64 bytenr, /* make sure our super fits in the device */ if (bytenr + PAGE_SIZE >= i_size_read(bdev->bd_inode)) - return 1; + return -EINVAL; /* make sure our super fits in the page */ if (sizeof(**disk_super) > PAGE_SIZE) - return 1; + return -EINVAL; /* make sure our super doesn't straddle pages on disk */ index = bytenr >> PAGE_SHIFT; if ((bytenr + sizeof(**disk_super) - 1) >> PAGE_SHIFT != index) - return 1; + return -EINVAL; /* pull in the page with our super */ *page = read_cache_page_gfp(bdev->bd_inode->i_mapping, index, GFP_KERNEL); if (IS_ERR_OR_NULL(*page)) - return 1; + return -EINVAL; p = kmap(*page); @@ -1180,7 +1180,7 @@ static int btrfs_read_disk_super(struct block_device *bdev, u64 bytenr, if (btrfs_super_bytenr(*disk_super) != bytenr || btrfs_super_magic(*disk_super) != BTRFS_MAGIC) { btrfs_release_disk_super(*page); - return 1; + return -EINVAL; } if ((*disk_super)->label[0] && @@ -1218,10 +1218,9 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, if (IS_ERR(bdev)) return PTR_ERR(bdev); - if (btrfs_read_disk_super(bdev, bytenr, , _super)) { - ret = -EINVAL; + ret = btrfs_read_disk_super(bdev, bytenr, , _super); + if (ret < 0) goto error_bdev_put; - } mutex_lock(_mutex); device = device_list_add(path, disk_super); -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs-progs: wipe copies of the stale superblock beyond -b size
During the mkfs.btrfs -b btrfs_prepare_device() zeros all the superblock bytenr locations only if the bytenr is below the blockcount. The problem with this is that if the BTRFS is recreated with a smaller size then we will leave the stale superblock in the disk which shall confuse the recovery. As shown in the test case below. mkfs.btrfs -qf /dev/mapper/vg-lv mkfs.btrfs -qf -b1G /dev/mapper/vg-lv btrfs in dump-super -a /dev/mapper/vg-lv | grep '.fsid|superblock:' superblock: bytenr=65536, device=/dev/mapper/vg-lv dev_item.fsid ebc67d01-7fc5-43f0-90b4-d1925002551e [match] superblock: bytenr=67108864, device=/dev/mapper/vg-lv dev_item.fsid ebc67d01-7fc5-43f0-90b4-d1925002551e [match] superblock: bytenr=274877906944, device=/dev/mapper/vg-lv dev_item.fsid b97a9206-593b-4933-a424-c6a6ee23fe7c [match] So if we find a valid superblock zero it even if it's beyond the blockcount. Signed-off-by: Anand Jain--- utils.c | 35 +++ 1 file changed, 35 insertions(+) diff --git a/utils.c b/utils.c index e9cb3a82fda6..6a9408b06e73 100644 --- a/utils.c +++ b/utils.c @@ -365,6 +365,41 @@ int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret, return 1; } + /* +* Check for the BTRFS SB copies up until btrfs_device_size() and zero +* it. So that kernel (or user for the manual recovery) don't have to +* confuse with the stale SB copy during recovery. +*/ + if (block_count != btrfs_device_size(fd, )) { + for (i = 1; i < BTRFS_SUPER_MIRROR_MAX; i++) { + struct btrfs_super_block *disk_super; + char buf[BTRFS_SUPER_INFO_SIZE]; + disk_super = (struct btrfs_super_block *)buf; + + /* Already zeroed above */ + if (btrfs_sb_offset(i) < block_count) + continue; + + /* Beyond actual disk size */ + if (btrfs_sb_offset(i) >= btrfs_device_size(fd, )) + continue; + + /* Does not contain any stale SB */ + if (btrfs_read_dev_super(fd, disk_super, +btrfs_sb_offset(i), 0)) + continue; + + ret = zero_dev_clamped(fd, btrfs_sb_offset(i), + BTRFS_SUPER_INFO_SIZE, + btrfs_device_size(fd, )); + if (ret < 0) { + error("failed to zero device '%s' bytenr %llu: %s", + file, btrfs_sb_offset(i), strerror(-ret)); + return 1; + } + } + } + *block_count_ret = block_count; return 0; } -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 6/8] btrfs: verify superblock checksum during scan
During the scan context, we aren't verifying the superblock- checksum when read. This patch fixes it by adding the checksum verification function btrfs_check_super_csum() in the function btrfs_read_disk_super(). And makes device scan to error fail if the primary superblock csum is wrong, whereas if the copy-superblock csum is wrong it will just just report mismatch and continue mount/scan as usual. When the mount is successful We anyway overwrite all superblocks upon unmount. The context in which this will be called is - device scan, device ready, and mount -o device option. Test script: Corrupt primary superblock and check if device scan and mount fails: mkfs.btrfs -fq /dev/sdc dd if=/dev/urandom of=/dev/sdc ibs=1 obs=1 count=1 seek=64K btrfs dev scan /dev/sdc mount /dev/sdc /btrfs Corrupt secondary superblock and check if device scan and mount is succcessful, check for the dmesg for errors. mkfs.btrfs -fq /dev/sdc dd if=/dev/urandom of=/dev/sdc ibs=1 obs=1 count=1 seek=67108864 btrfs dev scan /dev/sdc mount /dev/sdc /btrfs Signed-off-by: Anand Jain--- v1->v2: changed title. use explicit (< 0) check for %errr. Un-split pr_err() string. Fix typo in the git commit log. Move the csum check after bytenr and btrfs magic verified. fs/btrfs/volumes.c | 13 + 1 file changed, 13 insertions(+) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index b099823f60d1..eda86ba258fc 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1149,6 +1149,7 @@ static int btrfs_read_disk_super(struct block_device *bdev, u64 bytenr, struct page **page, struct btrfs_super_block **disk_super) { + int err; void *p; pgoff_t index; @@ -1183,6 +1184,18 @@ static int btrfs_read_disk_super(struct block_device *bdev, u64 bytenr, return -EINVAL; } + err = btrfs_check_super_csum((char *) *disk_super); + if (err < 0) { + if (err == -EINVAL) + pr_err("BTRFS error (device %pg): unsupported checksum type, bytenr=%llu", + bdev, bytenr); + else + pr_err("BTRFS error (device %pg): superblock checksum failed, bytenr=%llu", + bdev, bytenr); + btrfs_release_disk_super(*page); + return err; + } + if ((*disk_super)->label[0] && (*disk_super)->label[BTRFS_LABEL_SIZE - 1]) (*disk_super)->label[BTRFS_LABEL_SIZE - 1] = '\0'; -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 2/8] btrfs: return required error from btrfs_check_super_csum
Return the required -EINVAL and -EUCLEAN from the function btrfs_check_super_csum(). And more the error log into the parent function. Signed-off-by: Anand Jain--- v1->v2: Fix commit indent reported by checkpatch.pl Fix pr_err split string fs/btrfs/disk-io.c | 25 ++--- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index b9b435579527..4c573602480c 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -392,9 +392,10 @@ static int verify_parent_transid(struct extent_io_tree *io_tree, /* * Return 0 if the superblock checksum type matches the checksum value of that * algorithm. Pass the raw disk superblock data. + * Otherwise: -EINVAL if csum type is not found + * -EUCLEAN if csum does not match */ -static int btrfs_check_super_csum(struct btrfs_fs_info *fs_info, - char *raw_disk_sb) +static int btrfs_check_super_csum(char *raw_disk_sb) { struct btrfs_super_block *disk_sb = (struct btrfs_super_block *)raw_disk_sb; @@ -402,11 +403,8 @@ static int btrfs_check_super_csum(struct btrfs_fs_info *fs_info, u32 crc = ~(u32)0; char result[sizeof(crc)]; - if (csum_type >= ARRAY_SIZE(btrfs_csum_sizes)) { - btrfs_err(fs_info, "unsupported checksum algorithm %u", - csum_type); - return 1; - } + if (csum_type >= ARRAY_SIZE(btrfs_csum_sizes)) + return -EINVAL; /* * The super_block structure does not span the whole @@ -418,7 +416,7 @@ static int btrfs_check_super_csum(struct btrfs_fs_info *fs_info, btrfs_csum_final(crc, result); if (memcmp(raw_disk_sb, result, sizeof(result))) - return 1; + return -EUCLEAN; return 0; } @@ -2570,9 +2568,14 @@ int open_ctree(struct super_block *sb, * We want to check superblock checksum, the type is stored inside. * Pass the whole disk block of size BTRFS_SUPER_INFO_SIZE (4k). */ - if (btrfs_check_super_csum(fs_info, bh->b_data)) { - btrfs_err(fs_info, "superblock checksum mismatch"); - err = -EINVAL; + err = btrfs_check_super_csum(bh->b_data); + if (err) { + if (err == -EINVAL) + pr_err("BTRFS error (device %pg): unsupported checksum algorithm", + fs_devices->latest_bdev); + else + pr_err("BTRFS error (device %pg): superblock checksum mismatch", + fs_devices->latest_bdev); brelse(bh); goto fail_alloc; } -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 7/8] btrfs: verify checksum for all devices in mount context
During mount context, we aren't verifying the superblock checksum for all the devices, instead, we verify it only for the struct btrfs_fs_device::latest_bdev. This patch fixes it by moving the checksum verification code from the function open_ctree() into the function btrfs_read_dev_one_super(). By doing this now we are verifying the superblock checksum in the mount-context, device-replace and, device-delete context. The device-replace and device-delete call-chain is as show below after this patch. delete-device/replace: btrfs_rm_device() || btrfs_dev_replace_by_ioctl() |_btrfs_find_device_by_devspec() |_btrfs_find_device_missing_or_by_path() |_btrfs_find_device_by_path() |_btrfs_get_bdev_and_sb() |_btrfs_read_dev_super() |_btrfs_read_dev_one_super() |_btrfs_check_super_csum() Test case: Before: mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc dd if=/dev/urandom of=/dev/sdc ibs=1 obs=1 count=1 seek=64K mount /dev/sdb /btrfs <-- success as kernel does not check csum for non-btrfs_fs_devices::latest_bdev. After: mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc dd if=/dev/urandom of=/dev/sdc ibs=1 obs=1 count=1 seek=64K mount /dev/sdb /btrfs mount: mount /dev/sdc on /btrfs failed: Structure needs cleaning mount -o degraded /dev/sdc /btrfs mount: mount /dev/sdc on /btrfs failed: Structure needs cleaning So the current recovery step is to fix the primary superblock, by using the btrfs-progs cli. Signed-off-by: Anand JainReviewed-by: Nikolay Borisov --- v1->v2: git commit log update. With the call-chain (which I believe will go away in the long term, we need the uuid from the userland not the disk-path). And add test case. Check err < 0 explicitly and drop check for "else if (err == -EUCLEAN)" v1: err = btrfs_check_super_csum(bh->b_data); if (err) { if (err == -EINVAL) pr_err("BTRFS error (device %pg): unsupported checksum algorithm", bdev); else if (err == -EUCLEAN) pr_err("BTRFS error (device %pg): superblock checksum mismatch", bdev); brelse(bh); return err; } fs/btrfs/disk-io.c | 33 + 1 file changed, 17 insertions(+), 16 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 35dbbdc613cd..110465bec775 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2565,22 +2565,6 @@ int open_ctree(struct super_block *sb, } /* -* We want to check superblock checksum, the type is stored inside. -* Pass the whole disk block of size BTRFS_SUPER_INFO_SIZE (4k). -*/ - err = btrfs_check_super_csum(bh->b_data); - if (err) { - if (err == -EINVAL) - pr_err("BTRFS error (device %pg): unsupported checksum algorithm", - fs_devices->latest_bdev); - else - pr_err("BTRFS error (device %pg): superblock checksum mismatch", - fs_devices->latest_bdev); - brelse(bh); - goto fail_alloc; - } - - /* * super_copy is zeroed at allocation time and we never touch the * following bytes up to INFO_SIZE, the checksum is calculated from * the whole block of INFO_SIZE @@ -3126,6 +3110,7 @@ int btrfs_read_dev_one_super(struct block_device *bdev, int copy_num, struct buffer_head *bh; struct btrfs_super_block *super; u64 bytenr; + int err; bytenr = btrfs_sb_offset(copy_num); if (bytenr + BTRFS_SUPER_INFO_SIZE >= i_size_read(bdev->bd_inode)) @@ -3146,6 +3131,22 @@ int btrfs_read_dev_one_super(struct block_device *bdev, int copy_num, return -EINVAL; } + /* +* Check the superblock checksum, the type is stored inside. +* Pass the whole disk block of size BTRFS_SUPER_INFO_SIZE (4k). +*/ + err = btrfs_check_super_csum(bh->b_data); + if (err < 0) { + if (err == -EINVAL) + pr_err("BTRFS error (device %pg): unsupported checksum algorithm", + bdev); + else + pr_err("BTRFS error (device %pg): superblock checksum mismatch", + bdev); + brelse(bh); + return err; + } + *bh_ret = bh; return 0; } -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 0/9] Superblock read and verify cleanups
v1->v2: Various changes suggested by Nicokay. Thanks. For specific changes pls ref to the patch. Patch 1-4/8 are preparatory patches adds cleanups and nonstatic requisites. Patch 5/8 makes sure that all copies of the superblock have the same fsid when we scan the device. Patch 6/8 verifies superblock csum when we read it in the scan context. Patch 7/8 fixes a bug that we weren't verifying the superblock csum for the non-latest_bdev. And 8/8 patch drops the redundant invalidate_bdev() call during mount. There is a btrfs-progs patch which is a kind of related, as its found that we weren't wiping the non-overwritten superblock, so it could cause confusion during the superblock recovery process. So the patch btrfs-progs 1/1 adds code to wipe superblock if we aren't overwriting it. Now since kernel patch 5/8 checks if all the superblock copies are pointing to the same fsid on the disk, so the scan will fail if without the above 1/1 btrfs-progs, as in the example below [1]. However the simple workaround is to wipe the superblock manually [2] or apply the btrfs-progs patch below. [1] mkfs.btrfs -qf /dev/sdb <-- 1T disk mkfs.btrfs -b 256M /dev/sdb ERROR: device scan failed on /dev/sdb [2] dd if=/dev/zero of= seek=274877906944 ibs=1 obs=1 count4K Unfortunately, the error messages should have been failed to register [3] device into the kernel to be more appropriate to the error. [3] ret = ioctl(fd, BTRFS_IOC_SCAN_DEV, ); if (ret < 0) { error("device scan failed on '%s': %m", fname); ret = -errno; } Patches 1-7/8 were sent independently before. And I found few more things to fix alongs the line, and since they are related, so I am sending these all together. Also, as there are minor changes, like in pr_err strings, and splitting the unrelated changes into a separate patch, so though I am thankful for the received reviewed-by, I couldn't include them here. Sorry. Finally, here I am including the function relations [4] so that it will help to review the code. And this flow is before these patches were applied. [4] In the long term, I suggest deprecating ioctl args which pass device path (where possible), like in delete-device/replace. And btrfs_read_dev_one_super() should replace btrfs_read_disk_super() delete-device/replace: btrfs_rm_device() || btrfs_dev_replace_by_ioctl() |_btrfs_find_device_by_devspec() |_btrfs_find_device_missing_or_by_path() |_btrfs_find_device_by_path() |_btrfs_get_bdev_and_sb() |_btrfs_read_dev_super() |_btrfs_read_dev_one_super() |___bread() btrfs_mount_root() | |_btrfs_parse_early_options (-o device only) | |_btrfs_scan_one_device | |_btrfs_read_disk_super() | |_read_cache_page_gfp() | |_btrfs_scan_one_device(mount-arg-dev only) | |_btrfs_read_disk_super() | |_read_cache_page_gfp() | | |_btrfs_open_devices(fsid:all) | |_btrfs_open_one_device() ||_btrfs_get_bdev_and_sb() <--- invalidate_bdev(fsid:all) | |_btrfs_read_dev_super() ||_btrfs_read_dev_one_super() | |___bread() | |_btrfs_fill_super() |_btrfs_open_ctree() <-- invalidate_bdev(latest_bdev) <-- redundant |_btrfs_read_dev_super(latest_bdev only) | |_btrfs_read_dev_one_super(latest_bdev only) | |___bread(latest_bdev) | |_btrfs_check_super_csum(latest_bdev only) [*] | |_btrfs_read_chunk_tree | |_read_one_dev() | |_open_seed_devices() | |_btrfs_open_devices(fs_devices->seed only) scan/ready | |_btrfs_scan_one_device(ioctl-arg-dev only) |_btrfs_read_disk_super() |_read_cache_page_gfp() Anand Jain (8): btrfs: cleanup btrfs_check_super_csum() for better code flow btrfs: return required error from btrfs_check_super_csum btrfs: cleanup btrfs_read_disk_super() to return std error btrfs: make btrfs_check_super_csum() non static btrfs: check if the fsid in the primary sb and copy sb are same btrfs: verify checksum when superblock is read for scan btrfs: verify checksum for all devices in mount context btrfs: drop the redundant invalidate_bdev() fs/btrfs/disk-io.c | 72 +++--- fs/btrfs/disk-io.h | 1 + fs/btrfs/volumes.c | 84 ++ 3 files changed, 102 insertions(+), 55 deletions(-) Anand Jain (1): btrfs-progs: wipe copies of the stale superblock beyond -b size utils.c | 35 +++ 1 file changed, 35 insertions(+) -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/8] btrfs: make btrfs_check_super_csum() non-static
In preparation to add the superblock csum verification for the scan context, make btrfs_check_super_csum() non-static. Signed-off-by: Anand JainReviewed-by: Nikolay Borisov --- fs/btrfs/disk-io.c | 2 +- fs/btrfs/disk-io.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 4c573602480c..35dbbdc613cd 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -395,7 +395,7 @@ static int verify_parent_transid(struct extent_io_tree *io_tree, * Otherwise: -EINVAL if csum type is not found * -EUCLEAN if csum does not match */ -static int btrfs_check_super_csum(char *raw_disk_sb) +int btrfs_check_super_csum(char *raw_disk_sb) { struct btrfs_super_block *disk_sb = (struct btrfs_super_block *)raw_disk_sb; diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h index 70a88d61b547..c400cc68f913 100644 --- a/fs/btrfs/disk-io.h +++ b/fs/btrfs/disk-io.h @@ -69,6 +69,7 @@ int write_all_supers(struct btrfs_fs_info *fs_info, int max_mirrors); struct buffer_head *btrfs_read_dev_super(struct block_device *bdev); int btrfs_read_dev_one_super(struct block_device *bdev, int copy_num, struct buffer_head **bh_ret); +int btrfs_check_super_csum(char *raw_disk_sb); int btrfs_commit_super(struct btrfs_fs_info *fs_info); struct btrfs_root *btrfs_read_fs_root(struct btrfs_root *tree_root, struct btrfs_key *location); -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/8] btrfs: cleanup btrfs_check_super_csum() for better code flow
We check the %csum_type twice. Drop one. Check for the csum_type and then for the csum. Which also matches with the progs which have better logic. This is preparatory patch to get proper error code from btrfs_check_super_csum(). Signed-off-by: Anand JainReviewed-by: Nikolay Borisov --- fs/btrfs/disk-io.c | 38 +- 1 file changed, 17 insertions(+), 21 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 1657d6aa4fa6..b9b435579527 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -399,32 +399,28 @@ static int btrfs_check_super_csum(struct btrfs_fs_info *fs_info, struct btrfs_super_block *disk_sb = (struct btrfs_super_block *)raw_disk_sb; u16 csum_type = btrfs_super_csum_type(disk_sb); - int ret = 0; - - if (csum_type == BTRFS_CSUM_TYPE_CRC32) { - u32 crc = ~(u32)0; - char result[sizeof(crc)]; - - /* -* The super_block structure does not span the whole -* BTRFS_SUPER_INFO_SIZE range, we expect that the unused space -* is filled with zeros and is included in the checksum. -*/ - crc = btrfs_csum_data(raw_disk_sb + BTRFS_CSUM_SIZE, - crc, BTRFS_SUPER_INFO_SIZE - BTRFS_CSUM_SIZE); - btrfs_csum_final(crc, result); - - if (memcmp(raw_disk_sb, result, sizeof(result))) - ret = 1; - } + u32 crc = ~(u32)0; + char result[sizeof(crc)]; if (csum_type >= ARRAY_SIZE(btrfs_csum_sizes)) { btrfs_err(fs_info, "unsupported checksum algorithm %u", - csum_type); - ret = 1; + csum_type); + return 1; } - return ret; + /* +* The super_block structure does not span the whole +* BTRFS_SUPER_INFO_SIZE range, we expect that the unused space +* is filled with zeros and is included in the checksum. +*/ + crc = btrfs_csum_data(raw_disk_sb + BTRFS_CSUM_SIZE, + crc, BTRFS_SUPER_INFO_SIZE - BTRFS_CSUM_SIZE); + btrfs_csum_final(crc, result); + + if (memcmp(raw_disk_sb, result, sizeof(result))) + return 1; + + return 0; } /* -- 2.7.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs-progs: wipe copies of the stale superblock beyond -b size
On 03/27/2018 02:19 PM, Nikolay Borisov wrote: On 27.03.2018 02:50, Anand Jain wrote: I told you this code can be made a lot simpler, simply by modifying the last argument passed to zero_dev_clamped. I even posted the resulting diff which was just 3 lines changed. I agree that it's a good idea to wipe all available superblock when we use -b. However I don't agree with your approach - it adds a loop, it adds a bunch of checks and makes the complexity orders of magnitude higher than it could be. So I'm asking again - is there any inherent benefit which I'm missing in your newly added 35 lines of code against just passing the block device to zero_dev_clamped and letting the existing logic take care of everything? I had that idea for v1 as well, but I didn't do it because it would zero bytenr_copy#2 even when there is no btrfs superblock, (which is fine only with in block_count). Some might view it as corrupting the usr data (for which they have specified -b option?)? I am just discussing I don't have any usecase to prove it though. Do you have any idea? If you think it should be ok, I shall go ahead and zero without checking for the btrfs superblock beyond block_count. Right, so the pertinent question is - is anyone expecting to do mkfs on a volume irrespective of the options and expect to be able to recover data prior to the mkfs? I think the answer is 'no', but let's see what the community says. Writing zero even if there is no superblock is not a question of recovering btrfs, its a question whether there was any other good data that is intersting to the user, for which they have specified -b option. Asking community is a good idea, lets take a conservative approach to zero stale superblock only if we find a valid superblock, atleast for now, we can add cleanups later. Thanks, Aannd Thanks, Anand -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/8] btrfs: verify checksum when superblock is read for scan
Test script: Corrupt primary superblock and check if device scan and mount fails: mkfs.btrfs -fq /dev/sdc dd if=/dev/urandom of=/dev/sdc ibs=1 obs=1 count=1 seek=64K btrfs dev scan mount /dev/sdb /btrfs Corrupt secondary superblock and check if device scan and mount is succcessful, check for the dmesg for errors. mkfs.btrfs -fq /dev/sdc dd if=/dev/urandom of=/dev/sdc ibs=1 obs=1 count=1 seek=64K btrfs dev scan mount /dev/sdb /btrfs Have you considered adding fstests, it will be very easy to test for this behavior? This is one off kind of bug, not sure if it would value add for checking it all the time in xfstests? Same comment as before regarding string literals splitting across lines. accepted. + else + pr_err("BTRFS error (device %pg): "\ + "superblock checksum failed, bytenr=%llu", + bdev, bytenr); + btrfs_release_disk_super(*page); + return err; + } Also it will be better to have the checksum check after we have verified some basic invariants - that bytenr and magic have sane values. accepted. Thanks, Anand -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/8] btrfs: verify checksum for all devices in mount context
On 03/27/2018 08:21 PM, Nikolay Borisov wrote: On 26.03.2018 11:27, Anand Jain wrote: During mount context, we aren't verifying the superblock checksum for all the devices, instead, we verify it only for the struct btrfs_fs_device::latest_bdev. This patch fixes it by moving the checksum verification code from the function open_ctree() into the function btrfs_read_dev_one_super(). By doing this now we are verifying the superblock checksum in the mount-context, device-replace and, device-delete context. Which call chain provides the device-replace and device-delete contexts? I think it is worth it documenting them in the changelog. Will copy it here from the cover-letter. Signed-off-by: Anand JainReviewed-by: Nikolay Borisov --- fs/btrfs/disk-io.c | 35 +-- 1 file changed, 17 insertions(+), 18 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 6299ab18da5f..3cc50041c0b9 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2565,24 +2565,6 @@ int open_ctree(struct super_block *sb, } /* -* We want to check superblock checksum, the type is stored inside. -* Pass the whole disk block of size BTRFS_SUPER_INFO_SIZE (4k). -*/ - err = btrfs_check_super_csum(bh->b_data); - if (err) { - if (err == -EINVAL) - pr_err("BTRFS error (device %pg): "\ - "unsupported checksum algorithm", - fs_devices->latest_bdev); - else - pr_err("BTRFS error (device %pg): "\ - "superblock checksum mismatch", - fs_devices->latest_bdev); - brelse(bh); - goto fail_alloc; - } - - /* * super_copy is zeroed at allocation time and we never touch the * following bytes up to INFO_SIZE, the checksum is calculated from * the whole block of INFO_SIZE @@ -3128,6 +3110,7 @@ int btrfs_read_dev_one_super(struct block_device *bdev, int copy_num, struct buffer_head *bh; struct btrfs_super_block *super; u64 bytenr; + int err; bytenr = btrfs_sb_offset(copy_num); if (bytenr + BTRFS_SUPER_INFO_SIZE >= i_size_read(bdev->bd_inode)) @@ -3148,6 +3131,22 @@ int btrfs_read_dev_one_super(struct block_device *bdev, int copy_num, return -EINVAL; } + /* +* Check the superblock checksum, the type is stored inside. +* Pass the whole disk block of size BTRFS_SUPER_INFO_SIZE (4k). +*/ + err = btrfs_check_super_csum(bh->b_data); + if (err) { I am fixing this to if (err < 0) { + if (err == -EINVAL) + pr_err("BTRFS error (device %pg): unsupported checksum algorithm", + bdev); + else if (err == -EUCLEAN) + pr_err("BTRFS error (device %pg): superblock checksum mismatch", + bdev); I am also dropping else if to else in v2. Thanks, Anand + brelse(bh); + return err; + } + *bh_ret = bh; return 0; } -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/8] btrfs: check if the fsid in the primary sb and copy sb are same
On 03/27/2018 04:49 PM, Nikolay Borisov wrote: On 26.03.2018 11:27, Anand Jain wrote: During the btrfs dev scan make sure that other copies of superblock contain the same fsid as the primary SB. So that we bring to the user notice if the superblock has been overwritten. mkfs.btrfs -fq /dev/sdc mkfs.btrfs -fq /dev/sdb dd if=/dev/sdb of=/dev/sdc count=4K skip=64K seek=64K obs=1 ibs=1 mount /dev/sdc /btrfs Caveat: Pls note that older btrfs-progs do not wipe the non-overwriting stale superblock like copy2 if a smaller mkfs.btrfs -bis created. So this patch in the kernel will report error. The workaround is to wipe the superblock manually, like dd if=/dev/zero of= seek=274877906944 ibs=1 obs=1 count4K OR apply btrfs-progs patch btrfs-progs: wipe copies of the stale superblock beyond -b size which shall find and wipe the non overwriting superblock. Signed-off-by: Anand Jain--- fs/btrfs/volumes.c | 60 ++ 1 file changed, 47 insertions(+), 13 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index ed22f0a3d239..45dd0674571b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1198,40 +1198,74 @@ static int btrfs_read_disk_super(struct block_device *bdev, u64 bytenr, int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, struct btrfs_fs_devices **fs_devices_ret) { + struct btrfs_super_block *disk_super_primary; struct btrfs_super_block *disk_super; struct btrfs_device *device; struct block_device *bdev; struct page *page; int ret = 0; - u64 bytenr; + int i; - /* -* we would like to check all the supers, but that would make -* a btrfs mount succeed after a mkfs from a different FS. -* So, we need to add a special mount option to scan for -* later supers, using BTRFS_SUPER_MIRROR_MAX instead -*/ - bytenr = btrfs_sb_offset(0); flags |= FMODE_EXCL; bdev = blkdev_get_by_path(path, flags, holder); if (IS_ERR(bdev)) return PTR_ERR(bdev); - ret = btrfs_read_disk_super(bdev, bytenr, , _super); - if (ret) + disk_super_primary = kzalloc(sizeof(*disk_super_primary), GFP_KERNEL); + if (!disk_super_primary) { + ret = -ENOMEM; goto error_bdev_put; + } + + /* +* We would like to check all the supers and use one good copy, +* but that would make a btrfs mount succeed after a mkfs from +* a different FS. +* So, we need to add a special mount option to scan for +* later supers, using BTRFS_SUPER_MIRROR_MAX instead. +* So, just validate if all copies of the superblocks are ok +* and have the same fsid. +*/ + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { + u64 bytenr = btrfs_sb_offset(i); + + ret = btrfs_read_disk_super(bdev, bytenr, , _super); + if (ret) { + if (i == 0) + goto error_kfree; + /* copy2 is optional */ + ret = 0; + continue; + } + + if (i == 0) { + memcpy(disk_super_primary, disk_super, + sizeof(*disk_super_primary)); + btrfs_release_disk_super(page); + continue; Doing the memcpy is enough here, the bottom of the loop already releases the disk page and continues on the next iteration. The page map happens inside btrfs_read_disk_super(), we need unmap before going for the next superblock. + } else if (memcmp(disk_super_primary->fsid, disk_super->fsid, + BTRFS_FSID_SIZE)) { + pr_err("BTRFS (device %pg): superblock fsid missmatch "\ + "primary %pU copy%d %pU", bdev, + disk_super_primary->fsid, i, disk_super->fsid); + ret = -EINVAL; + btrfs_release_disk_super(page); + goto error_kfree; + } + btrfs_release_disk_super(page); I'd say split the "read first sb" from the loop, because alway want to read it and return an error if it fails. And then have the loop begin at i = 1 and handle only the possible mirrors of the sb. That would clean up the nested 'if' in handling the ret. Also you could introduce another struct *page primary_page where you read the first super block. That way you save a memcpy + kzalloc but you'd have to always free it on function exit so I am not sure how much value it brings in terms of readability. Right. Also with this approach I don't have to kzallo() anymore. Will fix. Thanks, Anand + } mutex_lock(_mutex); - device =
Re: [PATCH 2/8] btrfs: return required error from btrfs_check_super_csum
On 03/28/2018 03:16 AM, David Sterba wrote: On Tue, Mar 27, 2018 at 11:05:53AM +0300, Nikolay Borisov wrote: + if (err) { + if (err == -EINVAL) + pr_err("BTRFS error (device %pg): "\ + "unsupported checksum algorithm", + fs_devices->latest_bdev); I don't think strings should be idented. I.e the correct thing here should be to have only fs_devices->latest_bdev on a new line, even if the string goes above the char limit of 80. In any case the \ is not needed due to gcc's support for string literal concatenation: pr_err("BTRFS error (device %pg): " "unsupported checksum algorithm", fs_devices->latest_bdev) should work equally well. But as I said I don't think this is even needed. Let's wait and see what David says. The strings should not be split, because we want to be able to grep the sources for the error messages. If the string does not fit 80 chars per line, then it can be moved to the next line and un-indented to the left. Plenty of examples. I understand reason behind string should not be split, but I thought here will be an exception, as if you want to search for the string you will still use "unsupported checksum algorithm" which is still in one line. About the grep not able to find the string which are split, I look at it as fixing the wrong end of the problem. That is Grep end problem being fixes at the c code end. Anyway as its kind of linux general guidlines, I shall fix my patch. Thanks, Anand -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
HI
-- Greeting, once again is me Lucy Boston this is twice am contacting you please is very urgent respond to me for more details through my. Email: dr.lucybos...@gmail.com -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/6] btrfs: use RCU in btrfs_show_devname for device list traversal
On 03/28/2018 02:48 AM, David Sterba wrote: The show_devname callback is used to print device name in /proc/self/mounts, we need to traverse the device list consistently and read the name that's copied to a seq buffer so we don't need further locking. If the first device is being deleted at the same time, the RCU will allow us to read the device name, though it will become stale right after the RCU protection ends. This is unavoidable and the user can expect that the device will disappear from the filesystem's list at some point. The device_list_mutex was pretty heavy as it is used eg. for writing superblock and a few other IO related contexts. This can stall any application that reads the proc file for no reason. Signed-off-by: David SterbaReviewed-by: Anand Jain -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/8] btrfs: return required error from btrfs_check_super_csum
On Tue, Mar 27, 2018 at 11:05:53AM +0300, Nikolay Borisov wrote: > > + if (err) { > > + if (err == -EINVAL) > > + pr_err("BTRFS error (device %pg): "\ > > + "unsupported checksum algorithm", > > + fs_devices->latest_bdev); > > I don't think strings should be idented. I.e the correct thing here > should be to have only fs_devices->latest_bdev on a new line, even if > the string goes above the char limit of 80. In any case the \ is not > needed due to gcc's support for string literal concatenation: > > pr_err("BTRFS error (device %pg): " >"unsupported checksum algorithm", >fs_devices->latest_bdev) > > should work equally well. But as I said I don't think this is even > needed. Let's wait and see what David says. The strings should not be split, because we want to be able to grep the sources for the error messages. If the string does not fit 80 chars per line, then it can be moved to the next line and un-indented to the left. Plenty of examples. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Delayed refs cleanups/streamlining
On Wed, Mar 21, 2018 at 10:45:04AM +0200, Nikolay Borisov wrote: > This patchkit aims to simplify and streamline the logic involved in > initialising and adding delayed refs/delayed head structures which deal with > modification of the extent tree. Currently the logic for init and add was > contained in one function for each type of structure. This resulted in very > awkward interface with shitloads of arguments, this in turn made it really > hard > to understand the gist of the code. This series rectifies the situation by > extracting common parts and using those rather than open coding duplicated > logic for every type of delayed ref (tree or data ref). > > The first 5 patches deal with the delayed_ref structs. Each patch is > incremental and makes the code bisectable. The last tree factor out the init > code for delayed_ref_head into a separate function and begin to use it. The > final completely splits the two. > > The net result is both a cleaner interface as well as somewhat reduced > critical section under delayed_refs->lock spinlock. > > Nikolay Borisov (8): > btrfs: Factor out common delayed refs init code > btrfs: Use init_delayed_ref_common in add_delayed_tree_ref > btrfs: Use init_delayed_ref_common in add_delayed_data_ref > btrfs: Open-code add_delayed_tree_ref > btrfs: Open-code add_delayed_data_ref > btrfs: Introduce init_delayed_ref_head > btrfs: Use init_delayed_ref_head in add_delayed_ref_head > btrfs: split delayed ref head initialization and addition I'll add the patchset to for-next testing branch so it can get picked by the git repo scrapers, but I haven't reviewed nor tested it. There were some merge conflict with current misc-next (fs_info exists for add_delayed_data_ref) so I had a chance to look at some patches closely. The overall impression is good, so the review sholud make sure there are no typos or similar issues that can arise when moving code around. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/6] btrfs: use RCU in btrfs_show_devname for device list traversal
The show_devname callback is used to print device name in /proc/self/mounts, we need to traverse the device list consistently and read the name that's copied to a seq buffer so we don't need further locking. If the first device is being deleted at the same time, the RCU will allow us to read the device name, though it will become stale right after the RCU protection ends. This is unavoidable and the user can expect that the device will disappear from the filesystem's list at some point. The device_list_mutex was pretty heavy as it is used eg. for writing superblock and a few other IO related contexts. This can stall any application that reads the proc file for no reason. Signed-off-by: David Sterba--- fs/btrfs/super.c | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 4b817947e00f..9c18f05b954d 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -2290,11 +2290,18 @@ static int btrfs_show_devname(struct seq_file *m, struct dentry *root) struct list_head *head; struct rcu_string *name; - mutex_lock(_info->fs_devices->device_list_mutex); + /* +* Lightweight locking of the devices. We should not need +* device_list_mutex here as we only read the device data and the list +* is protected by RCU. Even if a device is deleted during the list +* traversals, we'll get valid data, the freeing callback will wait at +* least until until the rcu_read_unlock. +*/ + rcu_read_lock(); cur_devices = fs_info->fs_devices; while (cur_devices) { head = _devices->devices; - list_for_each_entry(dev, head, dev_list) { + list_for_each_entry_rcu(dev, head, dev_list) { if (test_bit(BTRFS_DEV_STATE_MISSING, >dev_state)) continue; if (!dev->name) @@ -2306,14 +2313,12 @@ static int btrfs_show_devname(struct seq_file *m, struct dentry *root) } if (first_dev) { - rcu_read_lock(); name = rcu_dereference(first_dev->name); seq_escape(m, name->str, " \t\n\\"); - rcu_read_unlock(); } else { WARN_ON(1); } - mutex_unlock(_info->fs_devices->device_list_mutex); + rcu_read_unlock(); return 0; } -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 6/6] btrfs: split dev-replace locking helpers for read and write
The current calls are unclear in what way btrfs_dev_replace_lock takes the locks, so drop the argument, split the helpers and use similar naming as for read and write locks. Signed-off-by: David Sterba--- fs/btrfs/dev-replace.c | 98 +- fs/btrfs/dev-replace.h | 6 ++-- fs/btrfs/reada.c | 10 +++--- fs/btrfs/scrub.c | 14 fs/btrfs/volumes.c | 18 +- 5 files changed, 74 insertions(+), 72 deletions(-) diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c index 7efbc4d1128b..91d0815dd514 100644 --- a/fs/btrfs/dev-replace.c +++ b/fs/btrfs/dev-replace.c @@ -200,13 +200,13 @@ int btrfs_run_dev_replace(struct btrfs_trans_handle *trans, struct btrfs_dev_replace_item *ptr; struct btrfs_dev_replace *dev_replace = _info->dev_replace; - btrfs_dev_replace_lock(dev_replace, 0); + btrfs_dev_replace_read_lock(dev_replace); if (!dev_replace->is_valid || !dev_replace->item_needs_writeback) { - btrfs_dev_replace_unlock(dev_replace, 0); + btrfs_dev_replace_read_unlock(dev_replace); return 0; } - btrfs_dev_replace_unlock(dev_replace, 0); + btrfs_dev_replace_read_unlock(dev_replace); key.objectid = 0; key.type = BTRFS_DEV_REPLACE_KEY; @@ -264,7 +264,7 @@ int btrfs_run_dev_replace(struct btrfs_trans_handle *trans, ptr = btrfs_item_ptr(eb, path->slots[0], struct btrfs_dev_replace_item); - btrfs_dev_replace_lock(dev_replace, 1); + btrfs_dev_replace_write_lock(dev_replace); if (dev_replace->srcdev) btrfs_set_dev_replace_src_devid(eb, ptr, dev_replace->srcdev->devid); @@ -287,7 +287,7 @@ int btrfs_run_dev_replace(struct btrfs_trans_handle *trans, btrfs_set_dev_replace_cursor_right(eb, ptr, dev_replace->cursor_right); dev_replace->item_needs_writeback = 0; - btrfs_dev_replace_unlock(dev_replace, 1); + btrfs_dev_replace_write_unlock(dev_replace); btrfs_mark_buffer_dirty(eb); @@ -352,7 +352,7 @@ int btrfs_dev_replace_start(struct btrfs_fs_info *fs_info, return PTR_ERR(trans); } - btrfs_dev_replace_lock(dev_replace, 1); + btrfs_dev_replace_write_lock(dev_replace); switch (dev_replace->replace_state) { case BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED: case BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED: @@ -390,7 +390,7 @@ int btrfs_dev_replace_start(struct btrfs_fs_info *fs_info, dev_replace->item_needs_writeback = 1; atomic64_set(_replace->num_write_errors, 0); atomic64_set(_replace->num_uncorrectable_read_errors, 0); - btrfs_dev_replace_unlock(dev_replace, 1); + btrfs_dev_replace_write_unlock(dev_replace); ret = btrfs_sysfs_add_device_link(tgt_device->fs_devices, tgt_device); if (ret) @@ -402,7 +402,7 @@ int btrfs_dev_replace_start(struct btrfs_fs_info *fs_info, trans = btrfs_start_transaction(root, 0); if (IS_ERR(trans)) { ret = PTR_ERR(trans); - btrfs_dev_replace_lock(dev_replace, 1); + btrfs_dev_replace_write_lock(dev_replace); goto leave; } @@ -426,7 +426,7 @@ int btrfs_dev_replace_start(struct btrfs_fs_info *fs_info, leave: dev_replace->srcdev = NULL; dev_replace->tgtdev = NULL; - btrfs_dev_replace_unlock(dev_replace, 1); + btrfs_dev_replace_write_unlock(dev_replace); btrfs_destroy_dev_replace_tgtdev(fs_info, tgt_device); return ret; } @@ -493,18 +493,18 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info, /* don't allow cancel or unmount to disturb the finishing procedure */ mutex_lock(_replace->lock_finishing_cancel_unmount); - btrfs_dev_replace_lock(dev_replace, 0); + btrfs_dev_replace_read_lock(dev_replace); /* was the operation canceled, or is it finished? */ if (dev_replace->replace_state != BTRFS_IOCTL_DEV_REPLACE_STATE_STARTED) { - btrfs_dev_replace_unlock(dev_replace, 0); + btrfs_dev_replace_read_unlock(dev_replace); mutex_unlock(_replace->lock_finishing_cancel_unmount); return 0; } tgt_device = dev_replace->tgtdev; src_device = dev_replace->srcdev; - btrfs_dev_replace_unlock(dev_replace, 0); + btrfs_dev_replace_read_unlock(dev_replace); /* * flush all outstanding I/O and inode extent mappings before the @@ -529,7 +529,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info, /* keep away write_all_supers() during the finishing procedure */ mutex_lock(_info->fs_devices->device_list_mutex); mutex_lock(_info->chunk_mutex); -
[PATCH 5/6] btrfs: remove stale comments about fs_mutex
The fs_mutex has been killed in 2008, a213501153fd66e2 ("Btrfs: Replace the big fs_mutex with a collection of other locks"), still remembered in some comments. We don't have any extra needs for locking in the ACL handlers. Signed-off-by: David Sterba--- fs/btrfs/acl.c | 8 1 file changed, 8 deletions(-) diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c index 1ba49ebe67da..700a3dfd3129 100644 --- a/fs/btrfs/acl.c +++ b/fs/btrfs/acl.c @@ -65,9 +65,6 @@ struct posix_acl *btrfs_get_acl(struct inode *inode, int type) return acl; } -/* - * Needs to be called with fs_mutex held - */ static int __btrfs_set_acl(struct btrfs_trans_handle *trans, struct inode *inode, struct posix_acl *acl, int type) { @@ -127,11 +124,6 @@ int btrfs_set_acl(struct inode *inode, struct posix_acl *acl, int type) return ret; } -/* - * btrfs_init_acl is already generally called under fs_mutex, so the locking - * stuff has been fixed to work with that. If the locking stuff changes, we - * need to re-evaluate the acl locking stuff. - */ int btrfs_init_acl(struct btrfs_trans_handle *trans, struct inode *inode, struct inode *dir) { -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/6] btrfs: update barrier in should_cow_block
Once there was a simple int force_cow that was used with the plain barriers, and then converted to a bit, so we should use the appropriate barrier helper. Other variables in the complex if condition do not depend on a barrier, so we should be fine in case the atomic barrier becomes a no-op. Signed-off-by: David Sterba--- fs/btrfs/ctree.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index b88a79e69ddf..4c75df2eea2e 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1502,8 +1502,8 @@ static inline int should_cow_block(struct btrfs_trans_handle *trans, if (btrfs_is_testing(root->fs_info)) return 0; - /* ensure we can see the force_cow */ - smp_rmb(); + /* Ensure we can see the FORCE_COW bit */ + smp_mb__before_atomic(); /* * We do not need to cow a block if -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/6] btrfs: use lockdep_assert_held for mutexes
Using lockdep_assert_held is preferred, replace mutex_is_locked. Signed-off-by: David Sterba--- fs/btrfs/extent-tree.c | 2 +- fs/btrfs/scrub.c | 4 ++-- fs/btrfs/volumes.c | 10 +- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index c1618ab9fecf..4116fb20b1af 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -4542,7 +4542,7 @@ void check_system_chunk(struct btrfs_trans_handle *trans, * Needed because we can end up allocating a system chunk and for an * atomic and race free space reservation in the chunk block reserve. */ - ASSERT(mutex_is_locked(_info->chunk_mutex)); + lockdep_assert_held(_info->chunk_mutex); info = __find_space_info(fs_info, BTRFS_BLOCK_GROUP_SYSTEM); spin_lock(>lock); diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index ec56f33feea9..9cfbccdeacf3 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -371,7 +371,7 @@ static struct full_stripe_lock *insert_full_stripe_lock( struct full_stripe_lock *entry; struct full_stripe_lock *ret; - WARN_ON(!mutex_is_locked(_root->lock)); + lockdep_assert_held(_root->lock); p = _root->root.rb_node; while (*p) { @@ -413,7 +413,7 @@ static struct full_stripe_lock *search_full_stripe_lock( struct rb_node *node; struct full_stripe_lock *entry; - WARN_ON(!mutex_is_locked(_root->lock)); + lockdep_assert_held(_root->lock); node = locks_root->root.rb_node; while (node) { diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index b2d05c6b1c56..350c12f1f19e 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2047,7 +2047,7 @@ void btrfs_rm_dev_replace_remove_srcdev(struct btrfs_fs_info *fs_info, { struct btrfs_fs_devices *fs_devices; - WARN_ON(!mutex_is_locked(_info->fs_devices->device_list_mutex)); + lockdep_assert_held(_info->fs_devices->device_list_mutex); /* * in case of fs with no seed, srcdev->fs_devices will point @@ -2237,7 +2237,7 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) struct btrfs_device *device; u64 super_flags; - BUG_ON(!mutex_is_locked(_mutex)); + lockdep_assert_held(_mutex); if (!fs_devices->seeding) return -EINVAL; @@ -2984,7 +2984,7 @@ static int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset) * we release the path used to search the chunk/dev tree and before * the current task acquires this mutex and calls us. */ - ASSERT(mutex_is_locked(_info->delete_unused_bgs_mutex)); + lockdep_assert_held(_info->delete_unused_bgs_mutex); ret = btrfs_can_relocate(fs_info, chunk_offset); if (ret) @@ -5068,7 +5068,7 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle *trans, { u64 chunk_offset; - ASSERT(mutex_is_locked(_info->chunk_mutex)); + lockdep_assert_held(_info->chunk_mutex); chunk_offset = find_next_chunk(fs_info); return __btrfs_alloc_chunk(trans, chunk_offset, type); } @@ -6618,7 +6618,7 @@ static struct btrfs_fs_devices *open_seed_devices(struct btrfs_fs_info *fs_info, struct btrfs_fs_devices *fs_devices; int ret; - BUG_ON(!mutex_is_locked(_mutex)); + lockdep_assert_held(_mutex); ASSERT(fsid); fs_devices = fs_info->fs_devices->seed; -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/6] btrfs: use lockdep_assert_held for spinlocks
Using lockdep_assert_held is preferred, replace assert_spin_locked. Signed-off-by: David Sterba--- fs/btrfs/delayed-ref.c | 6 +++--- fs/btrfs/qgroup.c | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c index 7ab5e0128f0c..6dcb0128ea7f 100644 --- a/fs/btrfs/delayed-ref.c +++ b/fs/btrfs/delayed-ref.c @@ -216,7 +216,7 @@ int btrfs_delayed_ref_lock(struct btrfs_trans_handle *trans, struct btrfs_delayed_ref_root *delayed_refs; delayed_refs = >transaction->delayed_refs; - assert_spin_locked(_refs->lock); + lockdep_assert_held(_refs->lock); if (mutex_trylock(>mutex)) return 0; @@ -239,7 +239,7 @@ static inline void drop_delayed_ref(struct btrfs_trans_handle *trans, struct btrfs_delayed_ref_head *head, struct btrfs_delayed_ref_node *ref) { - assert_spin_locked(>lock); + lockdep_assert_held(>lock); rb_erase(>ref_node, >ref_tree); RB_CLEAR_NODE(>ref_node); if (!list_empty(>add_list)) @@ -307,7 +307,7 @@ void btrfs_merge_delayed_refs(struct btrfs_trans_handle *trans, struct rb_node *node; u64 seq = 0; - assert_spin_locked(>lock); + lockdep_assert_held(>lock); if (RB_EMPTY_ROOT(>ref_tree)) return; diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index aa259d6986e1..ad09845b618c 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -1414,7 +1414,7 @@ int btrfs_qgroup_trace_extent_nolock(struct btrfs_fs_info *fs_info, struct btrfs_qgroup_extent_record *entry; u64 bytenr = record->bytenr; - assert_spin_locked(_refs->lock); + lockdep_assert_held(_refs->lock); trace_btrfs_qgroup_trace_extent(fs_info, record); while (*p) { -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/6] Btrfs lockdep and locking cleanups
A few cleanups related to lockdep and locking. The btrfs_show_devname is a functional change, uses the same mechanism as btrfs_ioctl_dev_info or btrfs_ioctl_fs_info so that's not a new thing. David Sterba (6): btrfs: use lockdep_assert_held for spinlocks btrfs: use lockdep_assert_held for mutexes btrfs: update barrier in should_cow_block btrfs: use RCU in btrfs_show_devname for device list traversal btrfs: remove stale comments about fs_mutex btrfs: split dev-replace locking helpers for read and write fs/btrfs/acl.c | 8 - fs/btrfs/ctree.c | 4 +-- fs/btrfs/delayed-ref.c | 6 ++-- fs/btrfs/dev-replace.c | 98 +- fs/btrfs/dev-replace.h | 6 ++-- fs/btrfs/extent-tree.c | 2 +- fs/btrfs/qgroup.c | 2 +- fs/btrfs/reada.c | 10 +++--- fs/btrfs/scrub.c | 18 +- fs/btrfs/super.c | 15 +--- fs/btrfs/volumes.c | 28 +++ 11 files changed, 98 insertions(+), 99 deletions(-) -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/14] Qgroup metadata reservation rework
On Tue, Mar 27, 2018 at 4:23 PM, David Sterbawrote: > On Tue, Mar 27, 2018 at 07:49:02AM +0800, Qu Wenruo wrote: >> > Merging plan for this patchset from last week was to postpone until >> > 4.18 due to lack of final testing here. I've tried to run this with >> > quotas enabled an fstests that led to warnings in the power failure >> > simulation tests. >> >> Which test case? > > eg shared/298, the log is then flooded by this message > > [153085.406444] [ cut here ] > [153085.411426] qgroup 5 data reserved space underflow, have 32661504 to free > 268435456 > [153085.411584] WARNING: CPU: 7 PID: 22475 at fs/btrfs/qgroup.c:102 > qgroup_rsv_release+0x1fa/0x210 [btrfs] > [153085.499574] CPU: 7 PID: 22475 Comm: fio Tainted: GW > 4.16.0-rc6-1.ge195904-vanilla+ #192 > [153085.509186] Hardware name: empty empty/S3993, BIOS PAQEX0-3 02/24/2008 > [153085.516008] RIP: 0010:qgroup_rsv_release+0x1fa/0x210 [btrfs] > [153085.521899] RSP: 0018:b1de01bf7c18 EFLAGS: 00010296 > [153085.527367] RAX: 0047 RBX: 928ce4950c00 RCX: > 0002 > [153085.534799] RDX: 0007 RSI: 0001 RDI: > 0202 > [153085.542236] RBP: R08: R09: > > [153085.549670] R10: R11: b1de01bf7aa0 R12: > 928ce4950c00 > [153085.557106] R13: R14: 1000 R15: > 928c3bb54000 > [153085.564545] FS: 7f90afdc5740() GS:928ce740() > knlGS: > [153085.572945] CS: 0010 DS: ES: CR0: 80050033 > [153085.578925] CR2: 0076daa8 CR3: 0001f86e CR4: > 06e0 > [153085.586367] Call Trace: > [153085.589112] btrfs_qgroup_free_refroot+0x174/0x2c0 [btrfs] > [153085.594873] __btrfs_run_delayed_refs+0xb9e/0x17e0 [btrfs] > [153085.600671] btrfs_run_delayed_refs+0x10d/0x1c0 [btrfs] > [153085.606179] btrfs_commit_transaction+0x30/0xad0 [btrfs] > [153085.611782] btrfs_sync_file+0x393/0x4f0 [btrfs] > [153085.616652] do_fsync+0x38/0x60 > [153085.620021] SyS_fsync+0xc/0x10 > [153085.623402] do_syscall_64+0x71/0x1a0 > [153085.627296] entry_SYSCALL_64_after_hwframe+0x42/0xb7 > [153085.632590] RIP: 0033:0x7f90af0942cd > [153085.636379] RSP: 002b:7ffcb59a4c20 EFLAGS: 0293 ORIG_RAX: > 004a > [153085.644249] RAX: ffda RBX: RCX: > 7f90af0942cd > [153085.651675] RDX: 1000 RSI: 013e5960 RDI: > 0004 > [153085.659111] RBP: R08: 0008 R09: > 9e37fffc0001 > [153085.666549] R10: 0f75e468 R11: 0293 R12: > > [153085.673985] R13: 0640 R14: 7f90968acfb8 R15: > 7f90adad4db0 > [153085.700848] ---[ end trace 48a650f3d217fbac ]--- > > > or generic/475 This one is totally unrelated with this patchset, as it happens without it and without qgroups enabled. It happens very often actually, it's just some proper cleanup missing after some fatal error that lead to a transaction abort. > > [69595.290236] WARNING: CPU: 1 PID: 3659 at fs/btrfs/extent-tree.c:124 > btrfs_put_block_group+0x52/0x60 [btrfs] > [69595.300499] CPU: 1 PID: 3659 Comm: umount Tainted: GW > 4.16.0-rc6-1.ge195904-vanilla+ #192 > [69595.300502] Hardware name: empty empty/S3993, BIOS PAQEX0-3 02/24/2008 > [69595.300555] RIP: 0010:btrfs_put_block_group+0x52/0x60 [btrfs] > [69595.300569] RSP: 0018:b1de02d67dd0 EFLAGS: 00010286 > [69595.393015] RAX: 0001 RBX: 928ca4b7dc00 RCX: > 0002 > [69595.393018] RDX: 928ca4b7ddd8 RSI: 0001 RDI: > 928ca4b7dc00 > [69595.393024] RBP: 928ca4b7dc00 R08: 928cd6c70e10 R09: > > [69595.420315] R10: R11: 928cd6c70e10 R12: > 928c9c7640d0 > [69595.420318] R13: 928c9c764000 R14: 928c9c764120 R15: > > [69595.420322] FS: 7ff5e66a2840() GS:928ce680() > knlGS: > [69595.420328] CS: 0010 DS: ES: CR0: 80050033 > [69595.449136] CR2: 7efdd6e6b594 CR3: 000211943000 CR4: > 06e0 > [69595.449140] Call Trace: > [69595.449207] btrfs_free_block_groups+0x164/0x420 [btrfs] > [69595.464610] close_ctree+0x127/0x2f0 [btrfs] > [69595.469046] generic_shutdown_super+0x69/0x110 > [69595.469054] kill_anon_super+0xe/0x20 > [69595.469102] btrfs_kill_super+0x13/0x100 [btrfs] > [69595.469114] deactivate_locked_super+0x39/0x70 > [69595.469122] cleanup_mnt+0x3b/0x70 > [69595.469128] task_work_run+0x80/0xb0 > [69595.469139] exit_to_usermode_loop+0x71/0xa0 > [69595.469148] do_syscall_64+0x157/0x1a0 > [69595.469158] entry_SYSCALL_64_after_hwframe+0x42/0xb7 > [69595.469164] RIP: 0033:0x7ff5e5f99d37 > [69595.469167] RSP: 002b:7ffeb054adf8 EFLAGS: 0246 ORIG_RAX: > 00a6 > [69595.469172] RAX: RBX:
[PATCH] btrfs: lift errors from add_extent_changeset to the callers
The missing error handling in add_extent_changeset was hidden, so make it at least visible in the callers. Signed-off-by: David Sterba--- fs/btrfs/extent_io.c | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index f27bad003f8e..f141d0fd7a65 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -119,23 +119,22 @@ struct extent_page_data { unsigned int sync_io:1; }; -static void add_extent_changeset(struct extent_state *state, unsigned bits, +static int add_extent_changeset(struct extent_state *state, unsigned bits, struct extent_changeset *changeset, int set) { int ret; if (!changeset) - return; + return 0; if (set && (state->state & bits) == bits) - return; + return 0; if (!set && (state->state & bits) == 0) - return; + return 0; changeset->bytes_changed += state->end - state->start + 1; ret = ulist_add(>range_changed, state->start, state->end, GFP_ATOMIC); - /* ENOMEM */ - BUG_ON(ret < 0); + return ret; } static void flush_write_bio(struct extent_page_data *epd); @@ -527,6 +526,7 @@ static struct extent_state *clear_state_bit(struct extent_io_tree *tree, { struct extent_state *next; unsigned bits_to_clear = *bits & ~EXTENT_CTLBITS; + int ret; if ((bits_to_clear & EXTENT_DIRTY) && (state->state & EXTENT_DIRTY)) { u64 range = state->end - state->start + 1; @@ -534,7 +534,8 @@ static struct extent_state *clear_state_bit(struct extent_io_tree *tree, tree->dirty_bytes -= range; } clear_state_cb(tree, state, bits); - add_extent_changeset(state, bits_to_clear, changeset, 0); + ret = add_extent_changeset(state, bits_to_clear, changeset, 0); + BUG_ON(ret); state->state &= ~bits_to_clear; if (wake) wake_up(>wq); @@ -805,13 +806,15 @@ static void set_state_bits(struct extent_io_tree *tree, unsigned *bits, struct extent_changeset *changeset) { unsigned bits_to_set = *bits & ~EXTENT_CTLBITS; + int ret; set_state_cb(tree, state, bits); if ((bits_to_set & EXTENT_DIRTY) && !(state->state & EXTENT_DIRTY)) { u64 range = state->end - state->start + 1; tree->dirty_bytes += range; } - add_extent_changeset(state, bits_to_set, changeset, 1); + ret = add_extent_changeset(state, bits_to_set, changeset, 1); + BUG_ON(ret); state->state |= bits_to_set; } -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] btrfs fiemap related BUG fix.
On Wed, Mar 07, 2018 at 04:20:17PM +0800, robbieko wrote: > From: Robbie Ko> > This patchset intends to fix btrfs fiemap related bug. > > The fiemap has the following problems: > > 1) Wrong extent count when fm_extent_count is zero. > > > 2) SHARED bit is not correct > I have two ideas, but I do not know which one is the best. > > Like: > # dd if=/dev/zero bs=16K count=2 oflag=dsync of=/mnt/btrfs/file > # xfs_io -c "fiemap -v" /mnt/btrfs/file > /mnt/btrfs/file: > EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS >0: [0..63]: 4241424..424148764 0x1 > # cloner -s $((16*1024)) /mnt/btrfs/file /mnt/btrfs/file_clone > > 1. When any extent is shared in extent map, the entire extent map is shared > # xfs_io -c "fiemap -v" /mnt/btrfs/file > /mnt/btrfs/file: > EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS >0: [0..63]: 4241424..4241487 64 0x2001 > > 2. Split into different extent > # xfs_io -c "fiemap -v" /mnt/btrfs/file > /mnt/btrfs/file: > EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS >0: [0..31]: 4241424..4241455 32 0x0 >1: [32..63]:4241456..4241487 32 0x2001 > > Robbie Ko (2): > Btrfs: fiemap: pass correct bytenr when fm_extent_count is zero > Btrfs: fix fiemap extent SHARED flag error with range clone. There were a lot of comments, I'm not sure I have a clear picture of what and how is addressed. Can you please update the patches and resend? Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/14] Qgroup metadata reservation rework
On Tue, Mar 27, 2018 at 07:49:02AM +0800, Qu Wenruo wrote: > > Merging plan for this patchset from last week was to postpone until > > 4.18 due to lack of final testing here. I've tried to run this with > > quotas enabled an fstests that led to warnings in the power failure > > simulation tests. > > Which test case? eg shared/298, the log is then flooded by this message [153085.406444] [ cut here ] [153085.411426] qgroup 5 data reserved space underflow, have 32661504 to free 268435456 [153085.411584] WARNING: CPU: 7 PID: 22475 at fs/btrfs/qgroup.c:102 qgroup_rsv_release+0x1fa/0x210 [btrfs] [153085.499574] CPU: 7 PID: 22475 Comm: fio Tainted: GW 4.16.0-rc6-1.ge195904-vanilla+ #192 [153085.509186] Hardware name: empty empty/S3993, BIOS PAQEX0-3 02/24/2008 [153085.516008] RIP: 0010:qgroup_rsv_release+0x1fa/0x210 [btrfs] [153085.521899] RSP: 0018:b1de01bf7c18 EFLAGS: 00010296 [153085.527367] RAX: 0047 RBX: 928ce4950c00 RCX: 0002 [153085.534799] RDX: 0007 RSI: 0001 RDI: 0202 [153085.542236] RBP: R08: R09: [153085.549670] R10: R11: b1de01bf7aa0 R12: 928ce4950c00 [153085.557106] R13: R14: 1000 R15: 928c3bb54000 [153085.564545] FS: 7f90afdc5740() GS:928ce740() knlGS: [153085.572945] CS: 0010 DS: ES: CR0: 80050033 [153085.578925] CR2: 0076daa8 CR3: 0001f86e CR4: 06e0 [153085.586367] Call Trace: [153085.589112] btrfs_qgroup_free_refroot+0x174/0x2c0 [btrfs] [153085.594873] __btrfs_run_delayed_refs+0xb9e/0x17e0 [btrfs] [153085.600671] btrfs_run_delayed_refs+0x10d/0x1c0 [btrfs] [153085.606179] btrfs_commit_transaction+0x30/0xad0 [btrfs] [153085.611782] btrfs_sync_file+0x393/0x4f0 [btrfs] [153085.616652] do_fsync+0x38/0x60 [153085.620021] SyS_fsync+0xc/0x10 [153085.623402] do_syscall_64+0x71/0x1a0 [153085.627296] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [153085.632590] RIP: 0033:0x7f90af0942cd [153085.636379] RSP: 002b:7ffcb59a4c20 EFLAGS: 0293 ORIG_RAX: 004a [153085.644249] RAX: ffda RBX: RCX: 7f90af0942cd [153085.651675] RDX: 1000 RSI: 013e5960 RDI: 0004 [153085.659111] RBP: R08: 0008 R09: 9e37fffc0001 [153085.666549] R10: 0f75e468 R11: 0293 R12: [153085.673985] R13: 0640 R14: 7f90968acfb8 R15: 7f90adad4db0 [153085.700848] ---[ end trace 48a650f3d217fbac ]--- or generic/475 [69595.290236] WARNING: CPU: 1 PID: 3659 at fs/btrfs/extent-tree.c:124 btrfs_put_block_group+0x52/0x60 [btrfs] [69595.300499] CPU: 1 PID: 3659 Comm: umount Tainted: GW 4.16.0-rc6-1.ge195904-vanilla+ #192 [69595.300502] Hardware name: empty empty/S3993, BIOS PAQEX0-3 02/24/2008 [69595.300555] RIP: 0010:btrfs_put_block_group+0x52/0x60 [btrfs] [69595.300569] RSP: 0018:b1de02d67dd0 EFLAGS: 00010286 [69595.393015] RAX: 0001 RBX: 928ca4b7dc00 RCX: 0002 [69595.393018] RDX: 928ca4b7ddd8 RSI: 0001 RDI: 928ca4b7dc00 [69595.393024] RBP: 928ca4b7dc00 R08: 928cd6c70e10 R09: [69595.420315] R10: R11: 928cd6c70e10 R12: 928c9c7640d0 [69595.420318] R13: 928c9c764000 R14: 928c9c764120 R15: [69595.420322] FS: 7ff5e66a2840() GS:928ce680() knlGS: [69595.420328] CS: 0010 DS: ES: CR0: 80050033 [69595.449136] CR2: 7efdd6e6b594 CR3: 000211943000 CR4: 06e0 [69595.449140] Call Trace: [69595.449207] btrfs_free_block_groups+0x164/0x420 [btrfs] [69595.464610] close_ctree+0x127/0x2f0 [btrfs] [69595.469046] generic_shutdown_super+0x69/0x110 [69595.469054] kill_anon_super+0xe/0x20 [69595.469102] btrfs_kill_super+0x13/0x100 [btrfs] [69595.469114] deactivate_locked_super+0x39/0x70 [69595.469122] cleanup_mnt+0x3b/0x70 [69595.469128] task_work_run+0x80/0xb0 [69595.469139] exit_to_usermode_loop+0x71/0xa0 [69595.469148] do_syscall_64+0x157/0x1a0 [69595.469158] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [69595.469164] RIP: 0033:0x7ff5e5f99d37 [69595.469167] RSP: 002b:7ffeb054adf8 EFLAGS: 0246 ORIG_RAX: 00a6 [69595.469172] RAX: RBX: 563360af5060 RCX: 7ff5e5f99d37 [69595.469175] RDX: 0001 RSI: RDI: 563360af52c0 [69595.469178] RBP: 563360af52c0 R08: 7ff5e628b97d R09: 0005 [69595.469181] R10: 563360af5280 R11: 0246 R12: 7ff5e649a1a4 [69595.469183] R13: R14: R15: [69595.469339] ---[ end trace 48a650f3d217fba6 ]--- [69595.530603] WARNING: CPU: 1 PID: 3659 at fs/btrfs/extent-tree.c:9945
[PATCH v2 2/2] btrfs: Validate child tree block's level and first key
We have several reports about node pointer points to incorrect child tree blocks, which could have even wrong owner and level but still with valid generation and checksum. Although btrfs check could handle it and print error message like: leaf parent key incorrect 60670574592 Kernel doesn't have enough check on this type of corruption correctly. At least add such check to read_tree_block() and btrfs_read_buffer(), where we need two new parameters @level and @first_key to verify the child tree block. The new @level check is mandatory and all call sites are already modified to extract expected level from its call chain. While @first_key is optional, the following call sites are skipping such check: 1) Root node/leaf As ROOT_ITEM doesn't contain the first key, skip @first_key check. 2) Direct backref Only parent bytenr and level is known and we need to resolve the key all by ourselves, skip @first_key check. Another note of this verification is, it needs extra info from nodeptr or ROOT_ITEM, so it can't fit into current tree-checker framework, which is limited to node/leaf boundary. Signed-off-by: Qu Wenruo--- changelog: v2: Make @level check mandatory, suggesed by Jeff and Nikolay. Change parameter order as @level is now mandatory, put it in front of @first_key. Change verify_parent_level() to verify_key_level() to avoid confusion on the @level parameter. Add btrfs_error() output for CONFIG_BTRFS_DEBUG to help debugging. --- fs/btrfs/backref.c | 6 ++-- fs/btrfs/ctree.c | 28 +++ fs/btrfs/disk-io.c | 95 +++--- fs/btrfs/disk-io.h | 8 +++-- fs/btrfs/extent-tree.c | 6 +++- fs/btrfs/print-tree.c | 10 -- fs/btrfs/qgroup.c | 7 ++-- fs/btrfs/ref-verify.c | 7 +++- fs/btrfs/relocation.c | 21 --- fs/btrfs/tree-log.c| 28 +-- 10 files changed, 170 insertions(+), 46 deletions(-) diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c index 26484648d090..ec71c7e66cfa 100644 --- a/fs/btrfs/backref.c +++ b/fs/btrfs/backref.c @@ -738,7 +738,8 @@ static int add_missing_keys(struct btrfs_fs_info *fs_info, BUG_ON(ref->key_for_search.type); BUG_ON(!ref->wanted_disk_byte); - eb = read_tree_block(fs_info, ref->wanted_disk_byte, 0); + eb = read_tree_block(fs_info, ref->wanted_disk_byte, 0, +ref->level - 1, NULL); if (IS_ERR(eb)) { free_pref(ref); return PTR_ERR(eb); @@ -1291,7 +1292,8 @@ static int find_parent_nodes(struct btrfs_trans_handle *trans, ref->level == 0) { struct extent_buffer *eb; - eb = read_tree_block(fs_info, ref->parent, 0); + eb = read_tree_block(fs_info, ref->parent, 0, +ref->level, NULL); if (IS_ERR(eb)) { ret = PTR_ERR(eb); goto out; diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index b88a79e69ddf..4f1aefccd34a 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1418,6 +1418,7 @@ get_old_root(struct btrfs_root *root, u64 time_seq) struct tree_mod_root *old_root = NULL; u64 old_generation = 0; u64 logical; + int level; eb_root = btrfs_read_lock_root_node(root); tm = __tree_mod_log_oldest_root(fs_info, eb_root, time_seq); @@ -1428,15 +1429,17 @@ get_old_root(struct btrfs_root *root, u64 time_seq) old_root = >old_root; old_generation = tm->generation; logical = old_root->logical; + level = old_root->level; } else { logical = eb_root->start; + level = btrfs_header_level(eb_root); } tm = tree_mod_log_search(fs_info, logical, time_seq); if (old_root && tm && tm->op != MOD_LOG_KEY_REMOVE_WHILE_FREEING) { btrfs_tree_read_unlock(eb_root); free_extent_buffer(eb_root); - old = read_tree_block(fs_info, logical, 0); + old = read_tree_block(fs_info, logical, 0, level, NULL); if (WARN_ON(IS_ERR(old) || !extent_buffer_uptodate(old))) { if (!IS_ERR(old)) free_extent_buffer(old); @@ -1656,6 +1659,7 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans, btrfs_set_lock_blocking(parent); for (i = start_slot; i <= end_slot; i++) { + struct btrfs_key first_key; int close = 1; btrfs_node_key(parent, _key, i); @@ -1665,6 +1669,7 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans, progress_passed = 1; blocknr =
[PATCH 1/2] btrfs: tests/qgroup: Fix wrong tree backref level
The extent tree of the test fs is like the following: BTRFS info (device (null)): leaf 16327509003777336587 total ptrs 1 free space 3919 item 0 key (4096 168 4096) itemoff 3944 itemsize 51 extent refs 1 gen 1 flags 2 tree block key (68719476736 0 0) level 1 ^^^ ref#0: tree block backref root 5 And it's using an empty tree for fs tree, so there is no way that its level can be 1. For REAL (created by mkfs) fs tree backref with no skinny metadata, the result should look like: item 3 key (30408704 EXTENT_ITEM 4096) itemoff 3845 itemsize 51 refs 1 gen 4 flags TREE_BLOCK tree block key (256 INODE_ITEM 0) level 0 ^^^ tree block backref root 5 Fix the level to 0, so it won't break later tree level checker. Fixes: faa2dbf004e8 ("Btrfs: add sanity tests for new qgroup accounting code") Signed-off-by: Qu Wenruo--- fs/btrfs/tests/qgroup-tests.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/tests/qgroup-tests.c b/fs/btrfs/tests/qgroup-tests.c index 90204b166643..160eb2fba726 100644 --- a/fs/btrfs/tests/qgroup-tests.c +++ b/fs/btrfs/tests/qgroup-tests.c @@ -63,7 +63,7 @@ static int insert_normal_tree_ref(struct btrfs_root *root, u64 bytenr, btrfs_set_extent_generation(leaf, item, 1); btrfs_set_extent_flags(leaf, item, BTRFS_EXTENT_FLAG_TREE_BLOCK); block_info = (struct btrfs_tree_block_info *)(item + 1); - btrfs_set_tree_block_level(leaf, block_info, 1); + btrfs_set_tree_block_level(leaf, block_info, 0); iref = (struct btrfs_extent_inline_ref *)(block_info + 1); if (parent > 0) { btrfs_set_extent_inline_ref_type(leaf, iref, -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/3] btrfs: Allow rmdir(2) to delete a subvolume
On 26.03.2018 11:30, Misono Tomohiro wrote: > This patch changes the behavior of rmdir(2) to allow it to delete > an empty subvolume by default, unless it is not a default subvolume > and send is not in progress. > > New function btrfs_delete_subvolume() is almost equal to the second half > of btrfs_ioctl_snap_destroy(). This function requires inode_lock for both > @dir and inode of @dentry. For rmdir(2) it is already acquired in vfs > layer before calling btrfs_rmdir(). > > Note that while a non-privileged user cannot delete a read-only subvolume > by "btrfs subvolume delete" when user_subvol_rm_allowd mount option is > enabled, rmdir(2) can delete an empty read-only subvolume. > (However, rm -r cannot use for read-only subvolume containing files.) > > Signed-off-by: Tomohiro Misono> --- > fs/btrfs/inode.c | 143 > ++- > 1 file changed, 142 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > index db66fa4fede6..b778776eee8e 100644 > --- a/fs/btrfs/inode.c > +++ b/fs/btrfs/inode.c > @@ -4387,6 +4387,147 @@ noinline int may_destroy_subvol(struct btrfs_root > *root) > return ret; > } > > +static int btrfs_delete_subvolume(struct inode *dir, struct dentry *dentry) > +{ > + struct btrfs_fs_info *fs_info = btrfs_sb(dentry->d_sb); > + struct btrfs_root *root = BTRFS_I(dir)->root; > + struct inode *inode = d_inode(dentry); > + struct btrfs_root *dest = BTRFS_I(inode)->root; > + struct btrfs_trans_handle *trans; > + struct btrfs_block_rsv block_rsv; > + u64 root_flags; > + u64 qgroup_reserved; > + int ret; > + int err; > + > + /* > + * Don't allow to delete a subvolume with send in progress. This is > + * inside the i_mutex so the error handling that has to drop the bit > + * again is not run concurrently. > + */ > + spin_lock(>root_item_lock); > + root_flags = btrfs_root_flags(>root_item); > + if (dest->send_in_progress == 0) { > + btrfs_set_root_flags(>root_item, > + root_flags | BTRFS_ROOT_SUBVOL_DEAD); > + spin_unlock(>root_item_lock); > + } else { > + spin_unlock(>root_item_lock); > + btrfs_warn(fs_info, > +"Attempt to delete subvolume %llu during send", > +dest->root_key.objectid); > + err = -EPERM; > + return err; > + } > + > + down_write(_info->subvol_sem); > + > + err = may_destroy_subvol(dest); > + if (err) > + goto out_up_write; > + > + btrfs_init_block_rsv(_rsv, BTRFS_BLOCK_RSV_TEMP); > + /* > + * One for dir inode, two for dir entries, two for root > + * ref/backref. > + */ > + err = btrfs_subvolume_reserve_metadata(root, _rsv, > +5, _reserved, true); > + if (err) > + goto out_up_write; > + > + trans = btrfs_start_transaction(root, 0); > + if (IS_ERR(trans)) { > + err = PTR_ERR(trans); > + goto out_release; > + } > + trans->block_rsv = _rsv; > + trans->bytes_reserved = block_rsv.size; > + > + btrfs_record_snapshot_destroy(trans, BTRFS_I(dir)); > + > + ret = btrfs_unlink_subvol(trans, root, dir, > + dest->root_key.objectid, > + dentry->d_name.name, > + dentry->d_name.len); > + if (ret) { > + err = ret; > + btrfs_abort_transaction(trans, ret); > + goto out_end_trans; > + } > + > + btrfs_record_root_in_trans(trans, dest); > + > + memset(>root_item.drop_progress, 0, > + sizeof(dest->root_item.drop_progress)); > + dest->root_item.drop_level = 0; > + btrfs_set_root_refs(>root_item, 0); > + > + if (!test_and_set_bit(BTRFS_ROOT_ORPHAN_ITEM_INSERTED, >state)) { > + ret = btrfs_insert_orphan_item(trans, > + fs_info->tree_root, > + dest->root_key.objectid); > + if (ret) { > + btrfs_abort_transaction(trans, ret); > + err = ret; > + goto out_end_trans; > + } > + } > + > + ret = btrfs_uuid_tree_rem(trans, fs_info, dest->root_item.uuid, > + BTRFS_UUID_KEY_SUBVOL, > + dest->root_key.objectid); > + if (ret && ret != -ENOENT) { > + btrfs_abort_transaction(trans, ret); > + err = ret; > + goto out_end_trans; > + } > + if (!btrfs_is_empty_uuid(dest->root_item.received_uuid)) { > + ret = btrfs_uuid_tree_rem(trans, fs_info, > + dest->root_item.received_uuid, > +
Re: [PATCH 7/8] btrfs: verify checksum for all devices in mount context
On 26.03.2018 11:27, Anand Jain wrote: > During mount context, we aren't verifying the superblock checksum > for all the devices, instead, we verify it only for the > struct btrfs_fs_device::latest_bdev. This patch fixes it > by moving the checksum verification code from the function open_ctree() > into the function btrfs_read_dev_one_super(). > > By doing this now we are verifying the superblock checksum > in the mount-context, device-replace and, device-delete context. Which call chain provides the device-replace and device-delete contexts? I think it is worth it documenting them in the changelog. > > Signed-off-by: Anand JainReviewed-by: Nikolay Borisov > --- > fs/btrfs/disk-io.c | 35 +-- > 1 file changed, 17 insertions(+), 18 deletions(-) > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index 6299ab18da5f..3cc50041c0b9 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -2565,24 +2565,6 @@ int open_ctree(struct super_block *sb, > } > > /* > - * We want to check superblock checksum, the type is stored inside. > - * Pass the whole disk block of size BTRFS_SUPER_INFO_SIZE (4k). > - */ > - err = btrfs_check_super_csum(bh->b_data); > - if (err) { > - if (err == -EINVAL) > - pr_err("BTRFS error (device %pg): "\ > - "unsupported checksum algorithm", > - fs_devices->latest_bdev); > - else > - pr_err("BTRFS error (device %pg): "\ > - "superblock checksum mismatch", > - fs_devices->latest_bdev); > - brelse(bh); > - goto fail_alloc; > - } > - > - /* >* super_copy is zeroed at allocation time and we never touch the >* following bytes up to INFO_SIZE, the checksum is calculated from >* the whole block of INFO_SIZE > @@ -3128,6 +3110,7 @@ int btrfs_read_dev_one_super(struct block_device *bdev, > int copy_num, > struct buffer_head *bh; > struct btrfs_super_block *super; > u64 bytenr; > + int err; > > bytenr = btrfs_sb_offset(copy_num); > if (bytenr + BTRFS_SUPER_INFO_SIZE >= i_size_read(bdev->bd_inode)) > @@ -3148,6 +3131,22 @@ int btrfs_read_dev_one_super(struct block_device > *bdev, int copy_num, > return -EINVAL; > } > > + /* > + * Check the superblock checksum, the type is stored inside. > + * Pass the whole disk block of size BTRFS_SUPER_INFO_SIZE (4k). > + */ > + err = btrfs_check_super_csum(bh->b_data); > + if (err) { > + if (err == -EINVAL) > + pr_err("BTRFS error (device %pg): unsupported checksum > algorithm", > + bdev); > + else if (err == -EUCLEAN) > + pr_err("BTRFS error (device %pg): superblock checksum > mismatch", > + bdev); > + brelse(bh); > + return err; > + } > + > *bh_ret = bh; > return 0; > } > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Corruption on Big Endian System
On Mon, Mar 26, 2018 at 10:00:04AM -0500, Ashu Tiwary wrote: > It appears my system may have hit the issue reverted here ( > https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg74621.html > ) ( [PATCH] Revert "btrfs: use proper endianness accessors for > super_copy" ); system is an IBM OpenPower 720 (Big Endian) running > Fedora 27; kernel was at 4.15.9; attempting to reboot after updating > system (including kernel to 4.15.10) caused system to not be able to > boot: > > = > Mounting /sysroot... > [ 34.644721] BTRFS warning (device dm-3): suspicious: generation < > chunk_root_generation: 15959351903540740096 < 17261735521269317632 > [ 34.644761] BTRFS info (device dm-3): disk space caching is enabled > [ 34.644771] BTRFS info (device dm-3): has skinny extents > [ 34.645925] BTRFS critical (device dm-3): unable to find logical > 71472550772736 length 65536 > [ 34.645941] BTRFS critical (device dm-3): unable to find logical > 71472550772736 length 65536 > [ 34.645952] BTRFS error (device dm-3): failed to read chunk root > [ 34.807156] BTRFS error (device dm-3): open_ctree failed > [FAILED] Failed to mount /sysroot. > = > > It appears there is a manual method available to repair the corrupted > superblock: can that be made available? Yes, details below. Quick check from your logs: chunk_root_generation: 15959351903540740096 = 0xdd7b0200 should be: 162781 = 0x0x27bdd tree root block pointer: 71472550772736 = 0x4101 should be: 21037056 = 0x141 The tool is available in the branch devel in my repos. Setup repository unless you already have one: $ git clone git://github.com/kdave/btrfs-progs $ cd btrfs-progs $ git checkout devel $ ./autogen.sh $ ./configure Build the tool: $ make btrfs-sb-mod Use it (replace the path with the real one) and save the output in case something goes unexpectedly wrong: device=/path/to/the/device ./btrfs-sb-mod $device root @0 ./btrfs-sb-mod $device generation @0 ./btrfs-sb-mod $device chunk_root @0 ./btrfs-sb-mod $device chunk_root_generation @0 ./btrfs-sb-mod $device cache_generation @0 ./btrfs-sb-mod $device uuid_tree_generation @0 It prints the current and new values so it's reversible, besides that the byteswap can be run twice to get back to the starting point. Then the filesystem should be mountable again. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/8] btrfs: make btrfs_check_super_csum() non-static
On 26.03.2018 11:27, Anand Jain wrote: > In preparation to add the superblock csum verification for the > scan context, make btrfs_check_super_csum() non-static. > > Signed-off-by: Anand JainReviewed-by: Nikolay Borisov > --- > fs/btrfs/disk-io.c | 2 +- > fs/btrfs/disk-io.h | 1 + > 2 files changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index 8b4e602ed60a..6299ab18da5f 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -395,7 +395,7 @@ static int verify_parent_transid(struct extent_io_tree > *io_tree, > * Otherwise: -EINVAL if csum type is not found > * -EUCLEAN if csum does not match > */ > -static int btrfs_check_super_csum(char *raw_disk_sb) > +int btrfs_check_super_csum(char *raw_disk_sb) > { > struct btrfs_super_block *disk_sb = > (struct btrfs_super_block *)raw_disk_sb; > diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h > index 70a88d61b547..c400cc68f913 100644 > --- a/fs/btrfs/disk-io.h > +++ b/fs/btrfs/disk-io.h > @@ -69,6 +69,7 @@ int write_all_supers(struct btrfs_fs_info *fs_info, int > max_mirrors); > struct buffer_head *btrfs_read_dev_super(struct block_device *bdev); > int btrfs_read_dev_one_super(struct block_device *bdev, int copy_num, > struct buffer_head **bh_ret); > +int btrfs_check_super_csum(char *raw_disk_sb); > int btrfs_commit_super(struct btrfs_fs_info *fs_info); > struct btrfs_root *btrfs_read_fs_root(struct btrfs_root *tree_root, > struct btrfs_key *location); > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/8] btrfs: verify checksum when superblock is read for scan
On 27.03.2018 14:30, Nikolay Borisov wrote: > > > On 26.03.2018 11:27, Anand Jain wrote: >> During the scan context, we aren't verifying the superblock- >> checksum when read. >> This patch fixes it by adding the checksum verification function >> btrfs_check_super_csum() in the function btrfs_read_disk_super(). >> And makes device scan to error fail if the primary superblock csum >> is wrong, whereas if the copy-superblock csum is wrong it will just >> just report missmatch and continue mount/scan as usual. When the >> mount is successful We anyway overwrite all superblocks upon unmount. >> >> The context in which this will be called is - device scan, device ready, >> and mount -o device option. >> >> Test script: >> >> Corrupt primary superblock and check if device scan and mount >> fails: >> mkfs.btrfs -fq /dev/sdc >> dd if=/dev/urandom of=/dev/sdc ibs=1 obs=1 count=1 seek=64K >> btrfs dev scan >> mount /dev/sdb /btrfs >> >> Corrupt secondary superblock and check if device scan and mount >> is succcessful, check for the dmesg for errors. >> mkfs.btrfs -fq /dev/sdc >> dd if=/dev/urandom of=/dev/sdc ibs=1 obs=1 count=1 seek=64K >> btrfs dev scan >> mount /dev/sdb /btrfs > > Have you considered adding fstests, it will be very easy to test for > this behavior? > >> >> Signed-off-by: Anand Jain>> --- >> fs/btrfs/volumes.c | 15 +++ >> 1 file changed, 15 insertions(+) >> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >> index 45dd0674571b..384e503c83ff 100644 >> --- a/fs/btrfs/volumes.c >> +++ b/fs/btrfs/volumes.c >> @@ -1149,6 +1149,7 @@ static int btrfs_read_disk_super(struct block_device >> *bdev, u64 bytenr, >> struct page **page, >> struct btrfs_super_block **disk_super) >> { >> +int err; >> void *p; >> pgoff_t index; >> >> @@ -1177,6 +1178,20 @@ static int btrfs_read_disk_super(struct block_device >> *bdev, u64 bytenr, >> /* align our pointer to the offset of the super block */ >> *disk_super = p + (bytenr & ~PAGE_MASK); >> >> +err = btrfs_check_super_csum((char *) *disk_super); >> +if (err) { > if (err < 0) >> +if (err == -EINVAL) >> +pr_err("BTRFS error (device %pg): "\ >> +"unsupported checksum type, bytenr=%llu", >> +bdev, bytenr); > > Same comment as before regarding string literals splitting across lines. > >> +else >> +pr_err("BTRFS error (device %pg): "\ >> +"superblock checksum failed, bytenr=%llu", >> +bdev, bytenr); >> +btrfs_release_disk_super(*page); >> +return err; >> +} Also it will be better to have the checksum check after we have verified some basic invariants - that bytenr and magic have sane values. >> + >> if (btrfs_super_bytenr(*disk_super) != bytenr || >> btrfs_super_magic(*disk_super) != BTRFS_MAGIC) { >> btrfs_release_disk_super(*page); >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] btrfs: cleanup btrfs_check_super_csum() for better code flow
On 26.03.2018 11:27, Anand Jain wrote: > We check the %csum_type twice. Drop one. Check for the csum_type and > then for the csum. Which also matches with the progs which have better > logic. This is preparatory patch to get proper error code from > btrfs_check_super_csum(). > > Signed-off-by: Anand JainReviewed-by: Nikolay Borisov > --- > fs/btrfs/disk-io.c | 38 +- > 1 file changed, 17 insertions(+), 21 deletions(-) > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index 1657d6aa4fa6..b9b435579527 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -399,32 +399,28 @@ static int btrfs_check_super_csum(struct btrfs_fs_info > *fs_info, > struct btrfs_super_block *disk_sb = > (struct btrfs_super_block *)raw_disk_sb; > u16 csum_type = btrfs_super_csum_type(disk_sb); > - int ret = 0; > - > - if (csum_type == BTRFS_CSUM_TYPE_CRC32) { > - u32 crc = ~(u32)0; > - char result[sizeof(crc)]; > - > - /* > - * The super_block structure does not span the whole > - * BTRFS_SUPER_INFO_SIZE range, we expect that the unused space > - * is filled with zeros and is included in the checksum. > - */ > - crc = btrfs_csum_data(raw_disk_sb + BTRFS_CSUM_SIZE, > - crc, BTRFS_SUPER_INFO_SIZE - BTRFS_CSUM_SIZE); > - btrfs_csum_final(crc, result); > - > - if (memcmp(raw_disk_sb, result, sizeof(result))) > - ret = 1; > - } > + u32 crc = ~(u32)0; > + char result[sizeof(crc)]; > > if (csum_type >= ARRAY_SIZE(btrfs_csum_sizes)) { > btrfs_err(fs_info, "unsupported checksum algorithm %u", > - csum_type); > - ret = 1; > + csum_type); > + return 1; > } > > - return ret; > + /* > + * The super_block structure does not span the whole > + * BTRFS_SUPER_INFO_SIZE range, we expect that the unused space > + * is filled with zeros and is included in the checksum. > + */ > + crc = btrfs_csum_data(raw_disk_sb + BTRFS_CSUM_SIZE, > + crc, BTRFS_SUPER_INFO_SIZE - BTRFS_CSUM_SIZE); > + btrfs_csum_final(crc, result); > + > + if (memcmp(raw_disk_sb, result, sizeof(result))) > + return 1; > + > + return 0; > } > > /* > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/8] btrfs: verify checksum when superblock is read for scan
On 26.03.2018 11:27, Anand Jain wrote: > During the scan context, we aren't verifying the superblock- > checksum when read. > This patch fixes it by adding the checksum verification function > btrfs_check_super_csum() in the function btrfs_read_disk_super(). > And makes device scan to error fail if the primary superblock csum > is wrong, whereas if the copy-superblock csum is wrong it will just > just report missmatch and continue mount/scan as usual. When the > mount is successful We anyway overwrite all superblocks upon unmount. > > The context in which this will be called is - device scan, device ready, > and mount -o device option. > > Test script: > > Corrupt primary superblock and check if device scan and mount > fails: > mkfs.btrfs -fq /dev/sdc > dd if=/dev/urandom of=/dev/sdc ibs=1 obs=1 count=1 seek=64K > btrfs dev scan > mount /dev/sdb /btrfs > > Corrupt secondary superblock and check if device scan and mount > is succcessful, check for the dmesg for errors. > mkfs.btrfs -fq /dev/sdc > dd if=/dev/urandom of=/dev/sdc ibs=1 obs=1 count=1 seek=64K > btrfs dev scan > mount /dev/sdb /btrfs Have you considered adding fstests, it will be very easy to test for this behavior? > > Signed-off-by: Anand Jain> --- > fs/btrfs/volumes.c | 15 +++ > 1 file changed, 15 insertions(+) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 45dd0674571b..384e503c83ff 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -1149,6 +1149,7 @@ static int btrfs_read_disk_super(struct block_device > *bdev, u64 bytenr, >struct page **page, >struct btrfs_super_block **disk_super) > { > + int err; > void *p; > pgoff_t index; > > @@ -1177,6 +1178,20 @@ static int btrfs_read_disk_super(struct block_device > *bdev, u64 bytenr, > /* align our pointer to the offset of the super block */ > *disk_super = p + (bytenr & ~PAGE_MASK); > > + err = btrfs_check_super_csum((char *) *disk_super); > + if (err) { if (err < 0) > + if (err == -EINVAL) > + pr_err("BTRFS error (device %pg): "\ > + "unsupported checksum type, bytenr=%llu", > + bdev, bytenr); Same comment as before regarding string literals splitting across lines. > + else > + pr_err("BTRFS error (device %pg): "\ > + "superblock checksum failed, bytenr=%llu", > + bdev, bytenr); > + btrfs_release_disk_super(*page); > + return err; > + } > + > if (btrfs_super_bytenr(*disk_super) != bytenr || > btrfs_super_magic(*disk_super) != BTRFS_MAGIC) { > btrfs_release_disk_super(*page); > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/8] btrfs: check if the fsid in the primary sb and copy sb are same
On 26.03.2018 11:27, Anand Jain wrote: > During the btrfs dev scan make sure that other copies of superblock > contain the same fsid as the primary SB. So that we bring to the > user notice if the superblock has been overwritten. > > mkfs.btrfs -fq /dev/sdc > mkfs.btrfs -fq /dev/sdb > dd if=/dev/sdb of=/dev/sdc count=4K skip=64K seek=64K obs=1 ibs=1 > mount /dev/sdc /btrfs > > Caveat: Pls note that older btrfs-progs do not wipe the non-overwriting > stale superblock like copy2 if a smaller mkfs.btrfs -bis created. > So this patch in the kernel will report error. The workaround is to wipe > the superblock manually, like > dd if=/dev/zero of= seek=274877906944 ibs=1 obs=1 count4K > OR apply btrfs-progs patch > btrfs-progs: wipe copies of the stale superblock beyond -b size > which shall find and wipe the non overwriting superblock. > > Signed-off-by: Anand Jain> --- > fs/btrfs/volumes.c | 60 > ++ > 1 file changed, 47 insertions(+), 13 deletions(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index ed22f0a3d239..45dd0674571b 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -1198,40 +1198,74 @@ static int btrfs_read_disk_super(struct block_device > *bdev, u64 bytenr, > int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, > struct btrfs_fs_devices **fs_devices_ret) > { > + struct btrfs_super_block *disk_super_primary; > struct btrfs_super_block *disk_super; > struct btrfs_device *device; > struct block_device *bdev; > struct page *page; > int ret = 0; > - u64 bytenr; > + int i; > > - /* > - * we would like to check all the supers, but that would make > - * a btrfs mount succeed after a mkfs from a different FS. > - * So, we need to add a special mount option to scan for > - * later supers, using BTRFS_SUPER_MIRROR_MAX instead > - */ > - bytenr = btrfs_sb_offset(0); > flags |= FMODE_EXCL; > > bdev = blkdev_get_by_path(path, flags, holder); > if (IS_ERR(bdev)) > return PTR_ERR(bdev); > > - ret = btrfs_read_disk_super(bdev, bytenr, , _super); > - if (ret) > + disk_super_primary = kzalloc(sizeof(*disk_super_primary), GFP_KERNEL); > + if (!disk_super_primary) { > + ret = -ENOMEM; > goto error_bdev_put; > + } > + > + /* > + * We would like to check all the supers and use one good copy, > + * but that would make a btrfs mount succeed after a mkfs from > + * a different FS. > + * So, we need to add a special mount option to scan for > + * later supers, using BTRFS_SUPER_MIRROR_MAX instead. > + * So, just validate if all copies of the superblocks are ok > + * and have the same fsid. > + */ > + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { > + u64 bytenr = btrfs_sb_offset(i); > + > + ret = btrfs_read_disk_super(bdev, bytenr, , _super); > + if (ret) { > + if (i == 0) > + goto error_kfree; > + /* copy2 is optional */ > + ret = 0; > + continue; > + } > + > + if (i == 0) { > + memcpy(disk_super_primary, disk_super, > +sizeof(*disk_super_primary)); > + btrfs_release_disk_super(page); > + continue; Doing the memcpy is enough here, the bottom of the loop already releases the disk page and continues on the next iteration. > + } else if (memcmp(disk_super_primary->fsid, disk_super->fsid, > +BTRFS_FSID_SIZE)) { > + pr_err("BTRFS (device %pg): superblock fsid missmatch "\ > + "primary %pU copy%d %pU", bdev, > + disk_super_primary->fsid, i, disk_super->fsid); > + ret = -EINVAL; > + btrfs_release_disk_super(page); > + goto error_kfree; > + } > + btrfs_release_disk_super(page); I'd say split the "read first sb" from the loop, because alway want to read it and return an error if it fails. And then have the loop begin at i = 1 and handle only the possible mirrors of the sb. That would clean up the nested 'if' in handling the ret. Also you could introduce another struct *page primary_page where you read the first super block. That way you save a memcpy + kzalloc but you'd have to always free it on function exit so I am not sure how much value it brings in terms of readability. > + } > > mutex_lock(_mutex); > - device = device_list_add(path, disk_super); > + device = device_list_add(path, disk_super_primary); > if (IS_ERR(device)) > ret = PTR_ERR(device); > else >
[PATCH] fstests: test btrfs fsync after hole punching with no-holes mode
From: Filipe MananaTest that when we have the no-holes mode enabled and a specific metadata layout, if we punch a hole and fsync the file, at replay time the whole hole was preserved. This issue is fixed by the following btrfs patch for the linux kernel: "Btrfs: fix fsync after hole punching when using no-holes feature" Signed-off-by: Filipe Manana --- tests/btrfs/159 | 100 tests/btrfs/159.out | 5 +++ tests/btrfs/group | 1 + 3 files changed, 106 insertions(+) create mode 100755 tests/btrfs/159 create mode 100644 tests/btrfs/159.out diff --git a/tests/btrfs/159 b/tests/btrfs/159 new file mode 100755 index ..6083975a --- /dev/null +++ b/tests/btrfs/159 @@ -0,0 +1,100 @@ +#! /bin/bash +# FSQA Test No. 159 +# +# Test that when we have the no-holes mode enabled and a specific metadata +# layout, if we punch a hole and fsync the file, at replay time the whole +# hole was preserved. +# +#--- +# +# Copyright (C) 2018 SUSE Linux Products GmbH. All Rights Reserved. +# Author: Filipe Manana +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License as +# published by the Free Software Foundation. +# +# This program is distributed in the hope that it would be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write the Free Software Foundation, +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +#--- +# + +seq=`basename $0` +seqres=$RESULT_DIR/$seq +echo "QA output created by $seq" +tmp=/tmp/$$ +status=1 # failure is the default! +trap "_cleanup; exit \$status" 0 1 2 3 15 + +_cleanup() +{ + _cleanup_flakey + cd / + rm -f $tmp.* +} + +# get standard environment, filters and checks +. ./common/rc +. ./common/filter +. ./common/dmflakey + +# real QA test starts here +_supported_fs generic +_supported_os Linux +_require_scratch +_require_dm_target flakey +_require_xfs_io_command "fpunch" + +rm -f $seqres.full + +# We create the filesystem with a node size of 64Kb because we need to create a +# specific metadata layout in order to trigger the bug we are testing. At the +# moment the node size can not be smaller then the system's page size, so given +# that the largest possible page size is 64Kb and by default the node size is +# set to the system's page size value, we explictly create a filesystem with a +# 64Kb node size. +_scratch_mkfs -O no-holes -n $((64 * 1024)) >>$seqres.full 2>&1 +_require_metadata_journaling $SCRATCH_DEV +_init_flakey +_mount_flakey + +# Create our test file with 831 extents of 256Kb each. Before each extent, there +# is a 256Kb hole (except for the first extent, which starts at offset 0). This +# creates two leafs in the filesystem tree, with the extent at offset 216530944 +# being the last item in the first leaf and the extent at offset 217055232 being +# the first item in the second leaf. +for ((i = 0; i <= 831; i++)); do + offset=$((i * 2 * 256 * 1024)) + $XFS_IO_PROG -f -c "pwrite -S 0xab -b 256K $offset 256K" \ + $SCRATCH_MNT/foobar >/dev/null +done + +# Now persist everything done so far. +sync + +# Now punch a hole that covers part of the extent at offset 216530944. +$XFS_IO_PROG -c "fpunch $((216530944 + 128 * 1024 - 4000)) 256K" \ +-c "fsync" \ +$SCRATCH_MNT/foobar + +echo "File digest before power failure:" +md5sum $SCRATCH_MNT/foobar | _filter_scratch + +# Simulate a power failure and mount the filesystem to check that replaying the +# fsync log/journal succeeds and our test file has the expected content. +_flakey_drop_and_remount + +echo "File digest after power failure and log replay:" +md5sum $SCRATCH_MNT/foobar | _filter_scratch + +_unmount_flakey +_cleanup_flakey + +status=0 +exit diff --git a/tests/btrfs/159.out b/tests/btrfs/159.out new file mode 100644 index ..3317e516 --- /dev/null +++ b/tests/btrfs/159.out @@ -0,0 +1,5 @@ +QA output created by 159 +File digest before power failure: +c5c0a13588a639529c979c57c336f441 SCRATCH_MNT/foobar +File digest after power failure and log replay: +c5c0a13588a639529c979c57c336f441 SCRATCH_MNT/foobar diff --git a/tests/btrfs/group b/tests/btrfs/group index 8007e07e..ba766f6b 100644 --- a/tests/btrfs/group +++ b/tests/btrfs/group @@ -161,3 +161,4 @@ 156 auto quick trim 157 auto quick raid 158 auto quick raid scrub +159 auto quick -- 2.11.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to
[PATCH 2/2] Btrfs: fix copy_items() return value when logging an inode
From: Filipe MananaWhen logging an inode, at tree-log.c:copy_items(), if we call btrfs_next_leaf() at the loop which checks for the need to log holes, we need to make sure copy_items() returns the value 1 to its caller and not 0 (on success). This is because the path the caller passed was released and is now different from what is was before, and the caller expects a return value of 0 to mean both success and that the path has not changed, while a return value of 1 means both success and signals the caller that it can not reuse the path, it has to perform another tree search. Even though this is a case that should not be triggered on normal circumstances or very rare at least, its consequences can be very unpredictable (especially when replaying a log tree). Fixes: 16e7549f045d ("Btrfs: incompatible format change to remove hole extents") Signed-off-by: Filipe Manana --- fs/btrfs/tree-log.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index 1d738ff5c41b..abd9fd81cd1c 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -3960,6 +3960,7 @@ static noinline int copy_items(struct btrfs_trans_handle *trans, ASSERT(ret == 0); src = src_path->nodes[0]; i = 0; + need_find_last_extent = true; } btrfs_item_key_to_cpu(src, , i); -- 2.11.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] Btrfs: fix fsync after hole punching when using no-holes feature
From: Filipe MananaWhen we have the no-holes mode enabled and fsync a file after punching a hole in it, we can end up not logging the whole hole range in the log tree. This happens if the file has extent items that span more than one leaf and we punch a hole that covers a range that starts in a leaf but does not go beyond the offset of the first extent in the next leaf. Example: $ mkfs.btrfs -f -O no-holes -n 65536 /dev/sdb $ mount /dev/sdb /mnt $ for ((i = 0; i <= 831; i++)); do offset=$((i * 2 * 256 * 1024)) xfs_io -f -c "pwrite -S 0xab -b 256K $offset 256K" \ /mnt/foobar >/dev/null done $ sync # We now have 2 leafs in our filesystem fs tree, the first leaf has an # item corresponding the extent at file offset 216530944 and the second # leaf has a first item corresponding to the extent at offset 217055232. # Now we punch a hole that partially covers the range of the extent at # offset 216530944 but does go beyond the offset 217055232. $ xfs_io -c "fpunch $((216530944 + 128 * 1024 - 4000)) 256K" /mnt/foobar $ xfs_io -c "fsync" /mnt/foobar # mount to replay the log $ mount /dev/sdb /mnt # Before this patch, only the subrange [216658016, 216662016[ (length of # 4000 bytes) was logged, leaving an incorrect file layout after log # replay. Fix this by checking if there is a hole between the last extent item that we processed and the first extent item in the next leaf, and if there is one, log an explicit hole extent item. Fixes: 16e7549f045d ("Btrfs: incompatible format change to remove hole extents") Signed-off-by: Filipe Manana --- fs/btrfs/tree-log.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index fd573816f461..1d738ff5c41b 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -3994,6 +3994,36 @@ static noinline int copy_items(struct btrfs_trans_handle *trans, break; *last_extent = extent_end; } + + /* +* Check if there is a hole between the last extent found in our leaf +* and the first extent in the next leaf. If there is one, we need to +* log an explicit hole so that at replay time we can punch the hole. +*/ + if (ret == 0 && + key.objectid == btrfs_ino(inode) && + key.type == BTRFS_EXTENT_DATA_KEY && + i == btrfs_header_nritems(src_path->nodes[0])) { + ret = btrfs_next_leaf(inode->root, src_path); + need_find_last_extent = true; + if (ret > 0) { + ret = 0; + } else if (ret == 0) { + btrfs_item_key_to_cpu(src_path->nodes[0], , + src_path->slots[0]); + if (key.objectid == btrfs_ino(inode) && + key.type == BTRFS_EXTENT_DATA_KEY && + *last_extent < key.offset) { + const u64 len = key.offset - *last_extent; + + ret = btrfs_insert_file_extent(trans, log, + btrfs_ino(inode), + *last_extent, 0, + 0, len, 0, len, + 0, 0, 0); + } + } + } /* * Need to let the callers know we dropped the path so they should * re-search. -- 2.11.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/8] btrfs: cleanup btrfs_read_disk_super() to return std error
On 26.03.2018 11:27, Anand Jain wrote: > The only caller btrfs_scan_one_device() sets -EINVAL for error from > btrfs_read_disk_super(), so this patch returns -EINVAL from the latter > function. > > A preparatory patch to add csum check in btrfs_read_disk_super(). > > Signed-off-by: Anand Jain> --- > fs/btrfs/volumes.c | 15 +++ > 1 file changed, 7 insertions(+), 8 deletions(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 87d183accdb2..ed22f0a3d239 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -1154,23 +1154,23 @@ static int btrfs_read_disk_super(struct block_device > *bdev, u64 bytenr, > > /* make sure our super fits in the device */ > if (bytenr + PAGE_SIZE >= i_size_read(bdev->bd_inode)) > - return 1; > + return -EINVAL; > > /* make sure our super fits in the page */ > if (sizeof(**disk_super) > PAGE_SIZE) > - return 1; > + return -EINVAL; > > /* make sure our super doesn't straddle pages on disk */ > index = bytenr >> PAGE_SHIFT; > if ((bytenr + sizeof(**disk_super) - 1) >> PAGE_SHIFT != index) > - return 1; > + return -EINVAL; > > /* pull in the page with our super */ > *page = read_cache_page_gfp(bdev->bd_inode->i_mapping, > index, GFP_KERNEL); > > if (IS_ERR_OR_NULL(*page)) > - return 1; > + return -EINVAL; > > p = kmap(*page); > > @@ -1180,7 +1180,7 @@ static int btrfs_read_disk_super(struct block_device > *bdev, u64 bytenr, > if (btrfs_super_bytenr(*disk_super) != bytenr || > btrfs_super_magic(*disk_super) != BTRFS_MAGIC) { > btrfs_release_disk_super(*page); > - return 1; > + return -EINVAL; > } > > if ((*disk_super)->label[0] && > @@ -1218,10 +1218,9 @@ int btrfs_scan_one_device(const char *path, fmode_t > flags, void *holder, > if (IS_ERR(bdev)) > return PTR_ERR(bdev); > > - if (btrfs_read_disk_super(bdev, bytenr, , _super)) { > - ret = -EINVAL; > + ret = btrfs_read_disk_super(bdev, bytenr, , _super); > + if (ret) I'd rather have explicit ret < 0 check. > goto error_bdev_put; > - } > > mutex_lock(_mutex); > device = device_list_add(path, disk_super); > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/8] btrfs: return required error from btrfs_check_super_csum
On 26.03.2018 11:27, Anand Jain wrote: > Return the required -EINVAL and -EUCLEAN from the function > btrfs_check_super_csum(). And more the error log into the > parent function. > > Signed-off-by: Anand Jain> --- > fs/btrfs/disk-io.c | 27 --- > 1 file changed, 16 insertions(+), 11 deletions(-) > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index b9b435579527..8b4e602ed60a 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -392,9 +392,10 @@ static int verify_parent_transid(struct extent_io_tree > *io_tree, > /* > * Return 0 if the superblock checksum type matches the checksum value of > that > * algorithm. Pass the raw disk superblock data. > + * Otherwise: -EINVAL if csum type is not found > + * -EUCLEAN if csum does not match > */ > -static int btrfs_check_super_csum(struct btrfs_fs_info *fs_info, > - char *raw_disk_sb) > +static int btrfs_check_super_csum(char *raw_disk_sb) > { > struct btrfs_super_block *disk_sb = > (struct btrfs_super_block *)raw_disk_sb; > @@ -402,11 +403,8 @@ static int btrfs_check_super_csum(struct btrfs_fs_info > *fs_info, > u32 crc = ~(u32)0; > char result[sizeof(crc)]; > > - if (csum_type >= ARRAY_SIZE(btrfs_csum_sizes)) { > - btrfs_err(fs_info, "unsupported checksum algorithm %u", > - csum_type); > - return 1; > - } > + if (csum_type >= ARRAY_SIZE(btrfs_csum_sizes)) > + return -EINVAL; > > /* >* The super_block structure does not span the whole > @@ -418,7 +416,7 @@ static int btrfs_check_super_csum(struct btrfs_fs_info > *fs_info, > btrfs_csum_final(crc, result); > > if (memcmp(raw_disk_sb, result, sizeof(result))) > - return 1; > + return -EUCLEAN; > > return 0; > } > @@ -2570,9 +2568,16 @@ int open_ctree(struct super_block *sb, >* We want to check superblock checksum, the type is stored inside. >* Pass the whole disk block of size BTRFS_SUPER_INFO_SIZE (4k). >*/ > - if (btrfs_check_super_csum(fs_info, bh->b_data)) { > - btrfs_err(fs_info, "superblock checksum mismatch"); > - err = -EINVAL; > + err = btrfs_check_super_csum(bh->b_data); > + if (err) { > + if (err == -EINVAL) > + pr_err("BTRFS error (device %pg): "\ > + "unsupported checksum algorithm", > + fs_devices->latest_bdev); I don't think strings should be idented. I.e the correct thing here should be to have only fs_devices->latest_bdev on a new line, even if the string goes above the char limit of 80. In any case the \ is not needed due to gcc's support for string literal concatenation: pr_err("BTRFS error (device %pg): " "unsupported checksum algorithm", fs_devices->latest_bdev) should work equally well. But as I said I don't think this is even needed. Let's wait and see what David says. > + else > + pr_err("BTRFS error (device %pg): "\ > + "superblock checksum mismatch", > + fs_devices->latest_bdev); > brelse(bh); > goto fail_alloc; > } > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/2] btrfs-progs: print-tree: Use macro to replace immediate number for readable flags string buffer length
On 27.03.2018 09:04, Qu Wenruo wrote: > In print-tree, we have a lot of parsers to convert numeric flags to > human readable string. > > For the buffer size we're using immediate numbers for all their callers. > Change this to macro so it will be much easier for us to expand the > buffer size. > > Signed-off-by: Qu WenruoReviewed-by: Nikolay Borisov > --- > changelog: > v2: > Move all buffer length macros to the header part. > Change comment to refer to the buffer length marco for later > modication. > --- > print-tree.c | 50 +++--- > 1 file changed, 39 insertions(+), 11 deletions(-) > > diff --git a/print-tree.c b/print-tree.c > index a2f6bfc027c9..064f1bb9ff8c 100644 > --- a/print-tree.c > +++ b/print-tree.c > @@ -26,6 +26,12 @@ > #include "print-tree.h" > #include "utils.h" > > +#define BG_FLAGS_BUF_LEN 64 /* for bg_flags_to_str() */ > +#define QGROUP_FLAGS_BUF_LEN 32 /* for qgroup_flags_to_str() */ > +#define EXTENT_FLAGS_BUF_LEN 32 /* for extent_flags_to_str() */ > +#define ROOT_FLAGS_BUF_LEN 16 /* for root_flags_to_str() */ > +#define INODE_FLAGS_BUF_LEN 128 /* for inode_flags_to_str() */ > +#define HEADER_FLAGS_BUF_LEN 32 /* for header_flags_to_str() */ > > static void print_dir_item_type(struct extent_buffer *eb, > struct btrfs_dir_item *di) > @@ -137,7 +143,13 @@ static void print_inode_ref_item(struct extent_buffer > *eb, u32 size, > } > } > > -/* Caller should ensure sizeof(*ret)>=21 "DATA|METADATA|RAID10" */ > +/* > + * Caller should ensure sizeof(*ret)>=21 "DATA|METADATA|RAID10" > + * Here we bump up to 64 bytes to handle the worst (and invalid) case: > + * "DATA|METADATA|SYSTEM|RAID0|RAID1|DUP|RAID10|RAID5|RAID6" > + * > + * BG_FLAGS_BUF_LEN is ensured to have enough space. > + */ > static void bg_flags_to_str(u64 flags, char *ret) > { > int empty = 1; > @@ -180,7 +192,11 @@ static void bg_flags_to_str(u64 flags, char *ret) > } > } > > -/* Caller should ensure sizeof(*ret)>= 26 "OFF|SCANNING|INCONSISTENT" */ > +/* > + * Caller should ensure sizeof(*ret)>= 26 "OFF|SCANNING|INCONSISTENT" > + * > + * QGROUP_FLAGS_BUF_LEN is ensured to have enough space. > + */ > static void qgroup_flags_to_str(u64 flags, char *ret) > { > if (flags & BTRFS_QGROUP_STATUS_FLAG_ON) > @@ -199,7 +215,7 @@ void print_chunk_item(struct extent_buffer *eb, struct > btrfs_chunk *chunk) > u16 num_stripes = btrfs_chunk_num_stripes(eb, chunk); > int i; > u32 chunk_item_size; > - char chunk_flags_str[32] = {0}; > + char chunk_flags_str[BG_FLAGS_BUF_LEN] = {0}; > > /* The chunk must contain at least one stripe */ > if (num_stripes < 1) { > @@ -385,7 +401,11 @@ static void print_file_extent_item(struct extent_buffer > *eb, > compress_str); > } > > -/* Caller should ensure sizeof(*ret) >= 16("DATA|TREE_BLOCK") */ > +/* > + * Caller should ensure sizeof(*ret) > strlen("DATA|TREE_BLOCK|FULL_BACKREF") > + * > + * EXTENT_FLAGS_BUF_LEN is ensured to have enough space. > + */ > static void extent_flags_to_str(u64 flags, char *ret) > { > int empty = 1; > @@ -420,7 +440,7 @@ void print_extent_item(struct extent_buffer *eb, int > slot, int metadata) > u32 item_size = btrfs_item_size_nr(eb, slot); > u64 flags; > u64 offset; > - char flags_str[32] = {0}; > + char flags_str[EXTENT_FLAGS_BUF_LEN] = {0}; > > if (item_size < sizeof(*ei)) { > #ifdef BTRFS_COMPAT_EXTENT_TREE_V0 > @@ -542,6 +562,8 @@ static int empty_uuid(const u8 *uuid) > > /* > * Caller must ensure sizeof(*ret) >= 7 "RDONLY" > + * > + * ROOT_FLAGS_BUF_LEN is ensured to have enough space > */ > static void root_flags_to_str(u64 flags, char *ret) > { > @@ -577,7 +599,7 @@ static void print_root_item(struct extent_buffer *leaf, > int slot) > struct btrfs_root_item root_item; > int len; > char uuid_str[BTRFS_UUID_UNPARSED_SIZE]; > - char flags_str[32] = {0}; > + char flags_str[ROOT_FLAGS_BUF_LEN] = {0}; > struct btrfs_key drop_key; > > ri = btrfs_item_ptr(leaf, slot, struct btrfs_root_item); > @@ -887,6 +909,8 @@ static void print_uuid_item(struct extent_buffer *l, > unsigned long offset, > /* > * Caller should ensure sizeof(*ret) >= 102: all charactors plus '|' of > * BTRFS_INODE_* flags > + * > + * INODE_FLAGS_BUF_LEN is ensured to have enough space > */ > static void inode_flags_to_str(u64 flags, char *ret) > { > @@ -911,7 +935,7 @@ static void inode_flags_to_str(u64 flags, char *ret) > static void print_inode_item(struct extent_buffer *eb, > struct btrfs_inode_item *ii) > { > - char flags_str[256]; > + char flags_str[INODE_FLAGS_BUF_LEN]; > > memset(flags_str, 0, sizeof(flags_str)); > inode_flags_to_str(btrfs_inode_flags(eb, ii), flags_str); > @@ -1002,7 +1026,7 @@
[PATCH v2 1/4] btrfs-progs: Remove btrfs-debug-tree command
There is already a replacement in the face of btrfs inspect-internal dump-tree. And this command is just a simple wrapper around it. Just remove it and adjust the show-blocks script to call the main btrfs binary to achieve the same effect. Signed-off-by: Nikolay Borisov--- V2 (analogical changes to other patches so changelog is present only in the first patch): * Remove tool reference from .gitignore and Documentation/Makefile.in * Document that the standalone tools are removed in the STANDALONE section in the documentation. .gitignore | 1 - Documentation/Makefile.in| 1 - Documentation/btrfs.asciidoc | 3 ++- Makefile | 4 +--- btrfs-debug-tree.c | 41 - show-blocks | 2 +- 6 files changed, 4 insertions(+), 48 deletions(-) delete mode 100644 btrfs-debug-tree.c diff --git a/.gitignore b/.gitignore index 4ec04cb6dcf6..237a45182e8e 100644 --- a/.gitignore +++ b/.gitignore @@ -10,7 +10,6 @@ Documentation/*.gz Documentation/*.html btrfs btrfs.static -btrfs-debug-tree btrfs-map-logical btrfs-fragments btrfsck diff --git a/Documentation/Makefile.in b/Documentation/Makefile.in index 64947afb3f6f..0a90f803c020 100644 --- a/Documentation/Makefile.in +++ b/Documentation/Makefile.in @@ -103,7 +103,6 @@ install-man: man $(INSTALL) -m 644 $(GZ_MAN8) $(DESTDIR)$(man8dir) $(LN_S) -f btrfs-check.8.gz $(DESTDIR)$(man8dir)/btrfsck.8.gz $(LN_S) -f btrfs-rescue.8.gz $(DESTDIR)$(man8dir)/btrfs-zero-log.8.gz - $(LN_S) -f btrfs-inspect-internal.8.gz $(DESTDIR)$(man8dir)/btrfs-debug-tree.8.gz $(LN_S) -f btrfs-inspect-internal.8.gz $(DESTDIR)$(man8dir)/btrfs-show-super.8.gz uninstall: diff --git a/Documentation/btrfs.asciidoc b/Documentation/btrfs.asciidoc index 8253023444b6..5b328cd18dbf 100644 --- a/Documentation/btrfs.asciidoc +++ b/Documentation/btrfs.asciidoc @@ -120,7 +120,8 @@ long (years) depreciation period. Deprecated and obsolete tools: -*btrfs-debug-tree*:: moved to *btrfs inspect-internal dump-tree* +*btrfs-debug-tree*:: moved to *btrfs inspect-internal dump-tree*. Removed from +source distribution. *btrfs-show-super*:: moved to *btrfs inspect-internal dump-super* *btrfs-zero-log*:: moved to *btrfs rescue zero-log* diff --git a/Makefile b/Makefile index 6065522a615f..a7726e98947d 100644 --- a/Makefile +++ b/Makefile @@ -210,8 +210,7 @@ MAKEOPTS = --no-print-directory Q=$(Q) progs = $(progs_install) btrfsck btrfs-corrupt-block # install only selected -progs_install = btrfs mkfs.btrfs btrfs-debug-tree \ - btrfs-map-logical btrfs-image btrfs-zero-log \ +progs_install = btrfs mkfs.btrfs btrfs-map-logical btrfs-image btrfs-zero-log \ btrfs-find-root btrfstune \ btrfs-select-super @@ -229,7 +228,6 @@ endif btrfs_convert_cflags = -DBTRFSCONVERT_EXT2=$(BTRFSCONVERT_EXT2) btrfs_convert_cflags += -DBTRFSCONVERT_REISERFS=$(BTRFSCONVERT_REISERFS) btrfs_fragments_libs = -lgd -lpng -ljpeg -lfreetype -btrfs_debug_tree_objects = cmds-inspect-dump-tree.o btrfs_show_super_objects = cmds-inspect-dump-super.o btrfs_calc_size_objects = cmds-inspect-tree-stats.o cmds_restore_cflags = -DBTRFSRESTORE_ZSTD=$(BTRFSRESTORE_ZSTD) diff --git a/btrfs-debug-tree.c b/btrfs-debug-tree.c deleted file mode 100644 index 7bee018f604e.. --- a/btrfs-debug-tree.c +++ /dev/null @@ -1,41 +0,0 @@ -/* - * Copyright (C) 2007 Oracle. All rights reserved. - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public - * License v2 as published by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License for more details. - * - * You should have received a copy of the GNU General Public - * License along with this program; if not, write to the - * Free Software Foundation, Inc., 59 Temple Place - Suite 330, - * Boston, MA 021110-1307, USA. - */ - -#include "disk-io.h" -#include "volumes.h" -#include "utils.h" -#include "commands.h" -#include "help.h" - -int main(int argc, char **argv) -{ - int ret; - - set_argv0(argv); - - if (argc > 1 && !strcmp(argv[1], "--help")) - usage(cmd_inspect_dump_tree_usage); - - radix_tree_init(); - - ret = cmd_inspect_dump_tree(argc, argv); - - btrfs_close_all_devices(); - - return ret; -} diff --git a/show-blocks b/show-blocks index 49e1770e5bd0..3876b2acfa74 100755 --- a/show-blocks +++ b/show-blocks @@ -108,7 +108,7 @@ from pylab import * return X def run_debug_tree(device): -p = os.popen('btrfs-debug-tree -e ' + device) +p = os.popen('btrfs inspect-internal dump-tree -e ' + device) data = loaddata(p) return data -- 2.7.4 -- To
[PATCH v2 3/4] btrfs-progs: Remove deprecated btrfs-show-super
Its function has been superseded by btrfs inspect-internal show-super. Furthermore the tools is currently not built by default. Just remove it. Signed-off-by: Nikolay Borisov--- .gitignore| 1 - Documentation/Makefile.in | 1 - Documentation/btrfs-select-super.asciidoc | 3 +-- Documentation/btrfs.asciidoc | 3 ++- Makefile | 2 +- btrfs-show-super.c| 39 --- 6 files changed, 4 insertions(+), 45 deletions(-) delete mode 100644 btrfs-show-super.c diff --git a/.gitignore b/.gitignore index c0f85fc448c8..06db51f39512 100644 --- a/.gitignore +++ b/.gitignore @@ -27,7 +27,6 @@ btrfs-convert btrfs-find-root btrfs-find-root.static btrfs-image -btrfs-show-super btrfs-corrupt-block btrfs-select-super btrfs-calc-size diff --git a/Documentation/Makefile.in b/Documentation/Makefile.in index ee32a50dc068..184647c41940 100644 --- a/Documentation/Makefile.in +++ b/Documentation/Makefile.in @@ -102,7 +102,6 @@ install-man: man $(MV) $(DESTDIR)$(man5dir)/btrfs-man5.5.gz $(DESTDIR)$(man5dir)/btrfs.5.gz $(INSTALL) -m 644 $(GZ_MAN8) $(DESTDIR)$(man8dir) $(LN_S) -f btrfs-check.8.gz $(DESTDIR)$(man8dir)/btrfsck.8.gz - $(LN_S) -f btrfs-inspect-internal.8.gz $(DESTDIR)$(man8dir)/btrfs-show-super.8.gz uninstall: cd $(DESTDIR)$(man8dir); rm -f btrfs-check.8.gz $(GZ_MAN8) diff --git a/Documentation/btrfs-select-super.asciidoc b/Documentation/btrfs-select-super.asciidoc index 6e94a034f482..e3bca98b4f6e 100644 --- a/Documentation/btrfs-select-super.asciidoc +++ b/Documentation/btrfs-select-super.asciidoc @@ -21,8 +21,7 @@ The filesystem specified by 'device' must not be mounted. NOTE: *Prior to overwriting the primary superblock, please make sure that the backup copies are valid!* -To dump a superblock use the *btrfs inspect-internal -dump-super* command, or the obsolete command *btrfs-show-super*. +To dump a superblock use the *btrfs inspect-internal dump-super* command. Then run the check (in the non-repair mode) using the command *btrfs check -s* where '-s' specifies the superblock copy to use. diff --git a/Documentation/btrfs.asciidoc b/Documentation/btrfs.asciidoc index e89400a2e5b5..0c08950b38fd 100644 --- a/Documentation/btrfs.asciidoc +++ b/Documentation/btrfs.asciidoc @@ -122,7 +122,8 @@ long (years) depreciation period. *btrfs-debug-tree*:: moved to *btrfs inspect-internal dump-tree*. Removed from source distribution. -*btrfs-show-super*:: moved to *btrfs inspect-internal dump-super* +*btrfs-show-super*:: moved to *btrfs inspect-internal dump-super*, standalone +removed. *btrfs-zero-log*:: moved to *btrfs rescue zero-log*, standalone removed. EXIT STATUS diff --git a/Makefile b/Makefile index 2a2a76855a26..087289b756ef 100644 --- a/Makefile +++ b/Makefile @@ -215,7 +215,7 @@ progs_install = btrfs mkfs.btrfs btrfs-map-logical btrfs-image \ btrfs-select-super # other tools, not built by default -progs_extra = btrfs-fragments btrfs-calc-size btrfs-show-super +progs_extra = btrfs-fragments btrfs-calc-size progs_static = $(foreach p,$(progs),$(p).static) diff --git a/btrfs-show-super.c b/btrfs-show-super.c deleted file mode 100644 index 4273e42dca3d.. --- a/btrfs-show-super.c +++ /dev/null @@ -1,39 +0,0 @@ -/* - * Copyright (C) 2012 STRATO AG. All rights reserved. - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public - * License v2 as published by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License for more details. - * - * You should have received a copy of the GNU General Public - * License along with this program; if not, write to the - * Free Software Foundation, Inc., 59 Temple Place - Suite 330, - * Boston, MA 021110-1307, USA. - */ - -#include "utils.h" -#include "commands.h" -#include "help.h" - -int main(int argc, char **argv) -{ - - int ret; - - set_argv0(argv); - - warning( -"\nthe tool has been deprecated, please use 'btrfs inspect-internal dump-super' instead\n"); - - if (argc > 1 && !strcmp(argv[1], "--help")) - usage(cmd_inspect_dump_super_usage); - - ret = cmd_inspect_dump_super(argc, argv); - - return ret; -} -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 4/4] btrfs-progs: Remove deprecated btrfs-calc-size tool
Its function has been superseded by btrfs inspect-internal tree-stats. Just remove it. Signed-off-by: Nikolay Borisov--- .gitignore| 1 - Makefile | 2 +- btrfs-calc-size.c | 39 --- 3 files changed, 1 insertion(+), 41 deletions(-) delete mode 100644 btrfs-calc-size.c diff --git a/.gitignore b/.gitignore index 06db51f39512..9e4b7520e709 100644 --- a/.gitignore +++ b/.gitignore @@ -29,7 +29,6 @@ btrfs-find-root.static btrfs-image btrfs-corrupt-block btrfs-select-super -btrfs-calc-size btrfs-crc btrfstune mktables diff --git a/Makefile b/Makefile index 087289b756ef..051739b8cbc9 100644 --- a/Makefile +++ b/Makefile @@ -215,7 +215,7 @@ progs_install = btrfs mkfs.btrfs btrfs-map-logical btrfs-image \ btrfs-select-super # other tools, not built by default -progs_extra = btrfs-fragments btrfs-calc-size +progs_extra = btrfs-fragments progs_static = $(foreach p,$(progs),$(p).static) diff --git a/btrfs-calc-size.c b/btrfs-calc-size.c deleted file mode 100644 index d2d68ab259eb.. --- a/btrfs-calc-size.c +++ /dev/null @@ -1,39 +0,0 @@ -/* - * Copyright (C) 2011 Red Hat. All rights reserved. - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public - * License v2 as published by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License for more details. - * - * You should have received a copy of the GNU General Public - * License along with this program; if not, write to the - * Free Software Foundation, Inc., 59 Temple Place - Suite 330, - * Boston, MA 021110-1307, USA. - */ - -#include "volumes.h" -#include "utils.h" -#include "commands.h" -#include "help.h" - -int main(int argc, char **argv) -{ - int ret; - - warning( -"\nthe tool has been deprecated, please use 'btrfs inspect-internal tree-stats' instead\n"); - - if (argc > 1 && !strcmp(argv[1], "--help")) - usage(cmd_inspect_tree_stats_usage); - - ret = cmd_inspect_tree_stats(argc, argv); - - btrfs_close_all_devices(); - - return ret; -} -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 2/4] btrfs-progs: Remove deprecated btrfs-zero-log standalone tool
Its function has been subsumed by "btrfs rescue zero-log". Remove its source file and adjust make/tests soruces accordingly. Signed-off-by: Nikolay Borisov--- .gitignore| 1 - Documentation/Makefile.in | 1 - Documentation/btrfs.asciidoc | 2 +- Makefile | 4 +- btrfs-zero-log.c | 78 --- tests/misc-tests.sh | 1 - tests/misc-tests/003-zero-log/test.sh | 6 +-- 7 files changed, 4 insertions(+), 89 deletions(-) delete mode 100644 btrfs-zero-log.c diff --git a/.gitignore b/.gitignore index 237a45182e8e..c0f85fc448c8 100644 --- a/.gitignore +++ b/.gitignore @@ -28,7 +28,6 @@ btrfs-find-root btrfs-find-root.static btrfs-image btrfs-show-super -btrfs-zero-log btrfs-corrupt-block btrfs-select-super btrfs-calc-size diff --git a/Documentation/Makefile.in b/Documentation/Makefile.in index 0a90f803c020..ee32a50dc068 100644 --- a/Documentation/Makefile.in +++ b/Documentation/Makefile.in @@ -102,7 +102,6 @@ install-man: man $(MV) $(DESTDIR)$(man5dir)/btrfs-man5.5.gz $(DESTDIR)$(man5dir)/btrfs.5.gz $(INSTALL) -m 644 $(GZ_MAN8) $(DESTDIR)$(man8dir) $(LN_S) -f btrfs-check.8.gz $(DESTDIR)$(man8dir)/btrfsck.8.gz - $(LN_S) -f btrfs-rescue.8.gz $(DESTDIR)$(man8dir)/btrfs-zero-log.8.gz $(LN_S) -f btrfs-inspect-internal.8.gz $(DESTDIR)$(man8dir)/btrfs-show-super.8.gz uninstall: diff --git a/Documentation/btrfs.asciidoc b/Documentation/btrfs.asciidoc index 5b328cd18dbf..e89400a2e5b5 100644 --- a/Documentation/btrfs.asciidoc +++ b/Documentation/btrfs.asciidoc @@ -123,7 +123,7 @@ long (years) depreciation period. *btrfs-debug-tree*:: moved to *btrfs inspect-internal dump-tree*. Removed from source distribution. *btrfs-show-super*:: moved to *btrfs inspect-internal dump-super* -*btrfs-zero-log*:: moved to *btrfs rescue zero-log* +*btrfs-zero-log*:: moved to *btrfs rescue zero-log*, standalone removed. EXIT STATUS --- diff --git a/Makefile b/Makefile index a7726e98947d..2a2a76855a26 100644 --- a/Makefile +++ b/Makefile @@ -210,7 +210,7 @@ MAKEOPTS = --no-print-directory Q=$(Q) progs = $(progs_install) btrfsck btrfs-corrupt-block # install only selected -progs_install = btrfs mkfs.btrfs btrfs-map-logical btrfs-image btrfs-zero-log \ +progs_install = btrfs mkfs.btrfs btrfs-map-logical btrfs-image \ btrfs-find-root btrfstune \ btrfs-select-super @@ -322,7 +322,7 @@ test-fsck: btrfs btrfs-image btrfs-corrupt-block mkfs.btrfs btrfstune $(Q)bash tests/fsck-tests.sh test-misc: btrfs btrfs-image btrfs-corrupt-block mkfs.btrfs btrfstune fssum \ - btrfs-zero-log btrfs-find-root btrfs-select-super btrfs-convert + btrfs-find-root btrfs-select-super btrfs-convert @echo "[TEST] misc-tests.sh" $(Q)bash tests/misc-tests.sh diff --git a/btrfs-zero-log.c b/btrfs-zero-log.c deleted file mode 100644 index 2fce59e98d6e.. --- a/btrfs-zero-log.c +++ /dev/null @@ -1,78 +0,0 @@ -/* - * Copyright (C) 2007 Oracle. All rights reserved. - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public - * License v2 as published by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License for more details. - * - * You should have received a copy of the GNU General Public - * License along with this program; if not, write to the - * Free Software Foundation, Inc., 59 Temple Place - Suite 330, - * Boston, MA 021110-1307, USA. - */ - -#include "kerncompat.h" - -#include -#include -#include "ctree.h" -#include "disk-io.h" -#include "transaction.h" -#include "utils.h" -#include "help.h" - -__attribute__((noreturn)) static void print_usage(void) -{ - printf("usage: btrfs-zero-log dev\n"); - exit(1); -} - -int main(int argc, char **argv) -{ - struct btrfs_root *root; - struct btrfs_trans_handle *trans; - struct btrfs_super_block *sb; - int ret; - - set_argv0(argv); - if (check_argc_exact(argc - optind, 1)) - print_usage(); - - radix_tree_init(); - - printf("WARNING: this utility is deprecated, please use 'btrfs rescue zero-log'\n\n"); - - if ((ret = check_mounted(argv[optind])) < 0) { - fprintf(stderr, "ERROR: could not check mount status: %s\n", strerror(-ret)); - goto out; - } else if (ret) { - fprintf(stderr, "ERROR: %s is currently mounted\n", argv[optind]); - ret = -EBUSY; - goto out; - } - - root = open_ctree(argv[optind], 0, OPEN_CTREE_WRITES | OPEN_CTREE_PARTIAL); - if (!root) { -
Re: [PATCH 00/10] Simplify function interfaces
On 03/27/2018 03:19 PM, Nikolay Borisov wrote: A bunch of functions in lowmem mode take an 'ext_ref' parameter only to pass it down the call chain where it eventually is consumed. Turns out the functions which actually check the parameter are find_inode_ref and check_inode_item, the are only passing it down to them. At the same time those functions can get a reference to fs_info and do the check in them at the appropriate time. This patchset cleanups the interface of various function by dropping the ext_ref argument and moving the actual query of the EXTREF feature closer to where it's being used. The final patch just changes signature of __btrfs_fs_incompat to better match the logic of the code and be identical to its kernel counterpart. All in all this series doesn't introduce any functional changes per-se. Nikolay Borisov (10): btrfs-progs: Drop ext_ref parameter from find_inode_ref btrfs-progs: Drop ext_ref param from check_dir_item btrfs-progs: Drop ext_ref argument from check_inode_item btrfs-progs: Drop unused ext_ref parameter from process_one_leaf btrfs-progs: Remove ext_ref param from check_fs_first_inode btrfs-progs: Remove ext_ref param from walk_down_tree btrfs-progs: Drop ext_ref param from check_fs_first_inode btrfs-progs: Drop ext_ref arument from check_fs_root btrfs-progs: Remove ext_ref local variable from check_fs_roots_lowmem btrfs-progs: Make __btrfs_fs_incompat return bool Nice work. For all, Reviewed-by: Su Yuecheck/mode-lowmem.c | 63 +++-- ctree.h | 2 +- 2 files changed, 28 insertions(+), 37 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 02/10] btrfs-progs: Drop ext_ref param from check_dir_item
This parameter is no longer used in this function, so drop it Signed-off-by: Nikolay Borisov--- check/mode-lowmem.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c index a8a2f76549a2..2c7b270a3ffa 100644 --- a/check/mode-lowmem.c +++ b/check/mode-lowmem.c @@ -1188,14 +1188,12 @@ static void print_dir_item_err(struct btrfs_root *root, struct btrfs_key *key, * @key: the key of the INODE_REF/INODE_EXTREF * @path: the path * @size: the st_size of the INODE_ITEM - * @ext_ref: the EXTENDED_IREF feature * * Return 0 if no error occurred. * Return DIR_COUNT_AGAIN if the isize of the inode should be recalculated. */ static int check_dir_item(struct btrfs_root *root, struct btrfs_key *di_key, - struct btrfs_path *path, u64 *size, - unsigned int ext_ref) + struct btrfs_path *path, u64 *size) { struct btrfs_dir_item *di; struct btrfs_inode_item *ii; @@ -2011,7 +2009,7 @@ static int check_inode_item(struct btrfs_root *root, struct btrfs_path *path, imode_to_type(mode), key.objectid, key.offset); } - ret = check_dir_item(root, , path, , ext_ref); + ret = check_dir_item(root, , path, ); err |= ret; break; case BTRFS_EXTENT_DATA_KEY: -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 03/10] btrfs-progs: Drop ext_ref argument from check_inode_item
We can derive this argument from root->fs_info and also make it local to the sole switch case it's used. Signed-off-by: Nikolay Borisov--- check/mode-lowmem.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c index 2c7b270a3ffa..e26e9bdf3bb3 100644 --- a/check/mode-lowmem.c +++ b/check/mode-lowmem.c @@ -1918,13 +1918,10 @@ static bool has_orphan_item(struct btrfs_root *root, u64 ino) * 2. check inode ref/extref * 3. check dir item/index * - * @ext_ref: the EXTENDED_IREF feature - * * Return 0 if no error occurred. * Return >0 for error or hit the traversal is done(by error bitmap) */ -static int check_inode_item(struct btrfs_root *root, struct btrfs_path *path, - unsigned int ext_ref) +static int check_inode_item(struct btrfs_root *root, struct btrfs_path *path) { struct extent_buffer *node; struct btrfs_inode_item *ii; @@ -1993,6 +1990,9 @@ static int check_inode_item(struct btrfs_root *root, struct btrfs_path *path, err |= ret; break; case BTRFS_INODE_EXTREF_KEY: + { + bool ext_ref = btrfs_fs_incompat(root->fs_info, +EXTENDED_IREF); if (key.type == BTRFS_INODE_EXTREF_KEY && !ext_ref) warning("root %llu EXTREF[%llu %llu] isn't supported", root->objectid, key.objectid, @@ -2001,6 +2001,7 @@ static int check_inode_item(struct btrfs_root *root, struct btrfs_path *path, mode); err |= ret; break; + } case BTRFS_DIR_ITEM_KEY: case BTRFS_DIR_INDEX_KEY: if (!dir) { @@ -2171,7 +2172,7 @@ static int process_one_leaf(struct btrfs_root *root, struct btrfs_path *path, path->slots[0] = i; again: - err |= check_inode_item(root, path, ext_ref); + err |= check_inode_item(root, path); /* modify cur since check_inode_item may change path */ cur = path->nodes[0]; -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 09/10] btrfs-progs: Remove ext_ref local variable from check_fs_roots_lowmem
All real consumers of that variable have inlined the checks since they are simple enough. So just remove it. Signed-off-by: Nikolay Borisov--- check/mode-lowmem.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c index eb6ca6453dc5..4f463d4f726f 100644 --- a/check/mode-lowmem.c +++ b/check/mode-lowmem.c @@ -4458,13 +4458,10 @@ int check_fs_roots_lowmem(struct btrfs_fs_info *fs_info) struct btrfs_path path; struct btrfs_key key; struct extent_buffer *node; - unsigned int ext_ref; int slot; int ret; int err = 0; - ext_ref = btrfs_fs_incompat(fs_info, EXTENDED_IREF); - btrfs_init_path(); key.objectid = BTRFS_FS_TREE_OBJECTID; key.offset = 0; -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 04/10] btrfs-progs: Drop unused ext_ref parameter from process_one_leaf
It's no longer used in the function so just remove it Signed-off-by: Nikolay Borisov--- check/mode-lowmem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c index e26e9bdf3bb3..34519f30a4aa 100644 --- a/check/mode-lowmem.c +++ b/check/mode-lowmem.c @@ -2141,7 +2141,7 @@ static int check_inode_item(struct btrfs_root *root, struct btrfs_path *path) * Returns 0 No errors found */ static int process_one_leaf(struct btrfs_root *root, struct btrfs_path *path, - struct node_refs *nrefs, int *level, int ext_ref) + struct node_refs *nrefs, int *level) { struct extent_buffer *cur = path->nodes[0]; struct btrfs_key key; @@ -4031,7 +4031,7 @@ static int walk_down_tree(struct btrfs_trans_handle *trans, ret = 0; if (!check_all) ret = process_one_leaf(root, path, nrefs, - level, ext_ref); + level); else ret = check_leaf_items(trans, root, path, nrefs, account_file_data); -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 01/10] btrfs-progs: Drop ext_ref parameter from find_inode_ref
This is a boolean parameter which signals whether the fs has the EXTENDED_IREF feature flag toggled or not. Since a reference to fs_info can be obtained there is no need to pollute the interface. Signed-off-by: Nikolay Borisov--- check/mode-lowmem.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c index dac3201b7d99..a8a2f76549a2 100644 --- a/check/mode-lowmem.c +++ b/check/mode-lowmem.c @@ -921,14 +921,13 @@ static int check_inode_extref(struct btrfs_root *root, * @namelen: the length of name in the INODE_REF/INODE_EXTREF * @index_ret: the index in the INODE_REF/INODE_EXTREF, * value (64)-1 means do not check index - * @ext_ref: the EXTENDED_IREF feature * * Return 0 if no error occurred. * Return >0 for error bitmap */ static int find_inode_ref(struct btrfs_root *root, struct btrfs_key *key, - char *name, int namelen, u64 *index_ret, - unsigned int ext_ref) + char *name, int namelen, u64 *index_ret) + { struct btrfs_path path; struct btrfs_inode_ref *ref; @@ -1001,8 +1000,9 @@ static int find_inode_ref(struct btrfs_root *root, struct btrfs_key *key, } extref: + /* Skip if not support EXTENDED_IREF feature */ - if (!ext_ref) + if (!btrfs_fs_incompat(root->fs_info, EXTENDED_IREF)) goto out; btrfs_release_path(); @@ -1311,8 +1311,7 @@ static int check_dir_item(struct btrfs_root *root, struct btrfs_key *di_key, key.objectid = location.objectid; key.type = BTRFS_INODE_REF_KEY; key.offset = di_key->objectid; - tmp_err |= find_inode_ref(root, , namebuf, len, - , ext_ref); + tmp_err |= find_inode_ref(root, , namebuf, len, ); /* check relative INDEX/ITEM */ key.objectid = di_key->objectid; @@ -4241,7 +4240,7 @@ static int check_fs_first_inode(struct btrfs_root *root, unsigned int ext_ref) /* special index value */ index = 0; - ret = find_inode_ref(root, , "..", strlen(".."), , ext_ref); + ret = find_inode_ref(root, , "..", strlen(".."), ); if (ret < 0) goto out; err |= ret; -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 07/10] btrfs-progs: Drop ext_ref param from check_fs_first_inode
It's no longer used in that function so can be dropped altogether. Signed-off-by: Nikolay Borisov--- check/mode-lowmem.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c index 76365a214e34..daa088c056b6 100644 --- a/check/mode-lowmem.c +++ b/check/mode-lowmem.c @@ -4263,7 +4263,6 @@ static int check_fs_first_inode(struct btrfs_root *root) * blocks and integrity of fs tree items. * * @root: the root of the tree to be checked. - * @ext_ref feature EXTENDED_IREF is enable or not. * @account if NOT 0 means check the tree (including tree)'s treeblocks. *otherwise means check fs tree(s) items relationship and * @root MUST be a fs tree root. @@ -4271,8 +4270,7 @@ static int check_fs_first_inode(struct btrfs_root *root) * Returns not 0 represents error. */ static int check_btrfs_root(struct btrfs_trans_handle *trans, - struct btrfs_root *root, unsigned int ext_ref, - int check_all) + struct btrfs_root *root, int check_all) { struct btrfs_path path; struct node_refs nrefs; @@ -4352,7 +4350,7 @@ static int check_btrfs_root(struct btrfs_trans_handle *trans, static int check_fs_root(struct btrfs_root *root, unsigned int ext_ref) { reset_cached_block_groups(root->fs_info); - return check_btrfs_root(NULL, root, ext_ref, 0); + return check_btrfs_root(NULL, root, 0); } /* @@ -4556,11 +4554,11 @@ int check_chunks_and_extents_lowmem(struct btrfs_fs_info *fs_info) } root1 = root->fs_info->chunk_root; - ret = check_btrfs_root(trans, root1, 0, 1); + ret = check_btrfs_root(trans, root1, 1); err |= ret; root1 = root->fs_info->tree_root; - ret = check_btrfs_root(trans, root1, 0, 1); + ret = check_btrfs_root(trans, root1, 1); err |= ret; btrfs_init_path(); @@ -4591,7 +4589,7 @@ int check_chunks_and_extents_lowmem(struct btrfs_fs_info *fs_info) goto next; } - ret = check_btrfs_root(trans, cur_root, 0, 1); + ret = check_btrfs_root(trans, cur_root, 1); err |= ret; if (key.objectid == BTRFS_TREE_RELOC_OBJECTID) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 05/10] btrfs-progs: Remove ext_ref param from check_fs_first_inode
It's no longer use and can be dropped. Signed-off-by: Nikolay Borisov--- check/mode-lowmem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c index 34519f30a4aa..29c55e202f2b 100644 --- a/check/mode-lowmem.c +++ b/check/mode-lowmem.c @@ -4198,7 +4198,7 @@ static int repair_fs_first_inode(struct btrfs_root *root, int err) * returns >0 means error * returns <0 means fatal error */ -static int check_fs_first_inode(struct btrfs_root *root, unsigned int ext_ref) +static int check_fs_first_inode(struct btrfs_root *root) { struct btrfs_path path; struct btrfs_key key; @@ -4290,7 +4290,7 @@ static int check_btrfs_root(struct btrfs_trans_handle *trans, * the first inode item in the leaf, if inode item (256) is * missing we will skip it forever. */ - ret = check_fs_first_inode(root, ext_ref); + ret = check_fs_first_inode(root); if (ret < 0) return ret; } -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 10/10] btrfs-progs: Make __btrfs_fs_incompat return bool
First this function does the '!!' trick to turn its value into a bool, yet the return type is int. Second, the kernel counterpart also returns bool. Signed-off-by: Nikolay Borisov--- ctree.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ctree.h b/ctree.h index fa861ba0b4c3..1fafe7f90837 100644 --- a/ctree.h +++ b/ctree.h @@ -2460,7 +2460,7 @@ static inline u32 btrfs_file_extent_inline_len(struct extent_buffer *eb, #define btrfs_fs_incompat(fs_info, opt) \ __btrfs_fs_incompat((fs_info), BTRFS_FEATURE_INCOMPAT_##opt) -static inline int __btrfs_fs_incompat(struct btrfs_fs_info *fs_info, u64 flag) +static inline bool __btrfs_fs_incompat(struct btrfs_fs_info *fs_info, u64 flag) { struct btrfs_super_block *disk_super; disk_super = fs_info->super_copy; -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 06/10] btrfs-progs: Remove ext_ref param from walk_down_tree
It's no longer used so just remove it. Signed-off-by: Nikolay Borisov--- check/mode-lowmem.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c index 29c55e202f2b..76365a214e34 100644 --- a/check/mode-lowmem.c +++ b/check/mode-lowmem.c @@ -3959,8 +3959,7 @@ static int check_leaf_items(struct btrfs_trans_handle *trans, */ static int walk_down_tree(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct btrfs_path *path, - int *level, struct node_refs *nrefs, int ext_ref, - int check_all) + int *level, struct node_refs *nrefs, int check_all) { enum btrfs_tree_block_status status; u64 bytenr; @@ -4318,7 +4317,7 @@ static int check_btrfs_root(struct btrfs_trans_handle *trans, while (1) { ret = walk_down_tree(trans, root, , , , -ext_ref, check_all); +check_all); err |= !!ret; -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 09/10] btrfs-progs: tests: add testcase for undelete-subvol
The testcase checks the functionality of "btrfs rescue undelete-subvol", including recovering an intact subvolume, and handling correctly incomplete subvolume. Signed-off-by: Lu Fengqi--- v2: add shell quoting to test script. .../030-undelete-subvol/deleted_subvolume.img | Bin 0 -> 4096 bytes .../030-undelete-subvol/drop_progress.raw.xz | Bin 0 -> 23452 bytes tests/misc-tests/030-undelete-subvol/test.sh | 34 + 3 files changed, 34 insertions(+) create mode 100644 tests/misc-tests/030-undelete-subvol/deleted_subvolume.img create mode 100644 tests/misc-tests/030-undelete-subvol/drop_progress.raw.xz create mode 100755 tests/misc-tests/030-undelete-subvol/test.sh diff --git a/tests/misc-tests/030-undelete-subvol/deleted_subvolume.img b/tests/misc-tests/030-undelete-subvol/deleted_subvolume.img new file mode 100644 index ..9168a8e33fbbb14f9ea78514721a52191b94d1a0 GIT binary patch literal 4096 zcmeH|do+|=AIB%x8p^0djl0}JQb@$O%TY$;lEx)vlF+ymV;WO45{D?%IqnhB$S}Fi zjN6EihKkC4NQhyKaT(V$2k-mO={@KF_pD{Fy}r-)`R-@^*504B_p@W+PlCQF!fF8j zZGs!m9b0|F@NPJOvGJw?_&4=#{gw~i3;@(EFS(@+TiUf5-~Erz<=ODYja#|MmTnGi zw`~I31pbc*g!Km3O1HeZD~KRV`{Pj~uzt3r2|UgTH}a zWDkSP-%;<0K6oq%LMk0=5)g}2I{;`k(RwtnGuI_hB01WRSAongy!#b`_p*d$38#+; zcYI;9gzxul-@Q0?=XZCRp1a_O(t&`~#!SFI<>VfKQp$%5D%q@qRO^IYiZQ11 z*LJ*O@akHeln2DGHANbna7)texSTKS8w%9X2EBg2OP_*;ZbHjccM=V#lO%} zATc9#siZ*izH<@v{m4ksm!$X!TLEok)EiGD1DIiBhF{t3@g30ydV%CQ75thpDR_Sz zkNpn7>SYJJu-@rDqHlw-Wb-Wdq8i)|MhOAQWLvnU+THRaIvD@>=47X$XKjKl#3c>c z+gVL~u-1qB(5Zq5RlSf2!q|mGkUWzvayXPkcZ+}`@>>iAb(MsWDCeg9_g?z@ z+UzN}@~(J8&7SGm+x+QKwvX*meP^nxg%)4Y+tC!c%#Li1uw)!=uIZLYL%h({>6 zI?6W8VFdIKTctfwulDqceyAL)1|1Zhpij6oC^|RGxbla6`S|q7kxq3u8hQMPc;I2Y zCWZ`c;*ciOjl@3Y^DqdF9BWsR_*&1bIYVLtMwmWM2N+tNJ-7j(O%B-L35XizYGSd1 zB~$y~k-c18T=w!oFaLH~OTx)ncn*m3#tpvpB({MHrvM%M=n2J!-i}LGsv{? z<~5bwAY@lJ)i%#ao Op^!R$6CJL^-x?kT(roP@)*Q8XnwB2j3pd-#ynd#jp7WCx#4WSfc|{y z#5H*01m+|Z94ps6)UYtvKCre(ZF$dTAxE?$+j2>nxqYpZ_Rkl)A)IQAwvH~^(SzZi z0p9u3gpBilS%_h>xu4tFT+N#=O{4QmFL|iRrJJjnU7WSMuKUyHtOvks7;`zGw0BVH zJQ}jPdQ!xy(s!Us?OIYAl7nvL$7V~c{(G8ho_TID+~?_UeBFyGcNvvctBnsrrf zR>%@OeHtPSWh}(@*~6Mo%wR1Xc{kY<0UPWj-p+;|;pI=V<}0+sd36pSJ9HL1 GOmQHYWmh`5IF7C{XLOE6~i$Hfktz9xZp$KS@J7FBer zU6XsQh2^|0*VWg^M;!+bUUhGGt|)c?CClbV{)#jL~#x7@Hgi$!??u)-rbQ^;oY zY!$-MsH;KQ%fzKtwafY8t$(=9mvBoO{a>gHD^*bc*g7T=1Gzb*{j2yz>ok& z7M6gGH4Q*~983Vsv#8?bD~SiDobt+CdWEqjraBk8ESoH7vV6%oI3Ejb*$ zuNVEyMF_j+Q>(9d(j#nI_W+HW?IqPn%^pKydSPdh>>Y3s5VIg;-RjrS*7CMI7{u~HVW_{Y*$se=Z}$CDIdvvKrhnca(1M-26(9HuWE?Ohji zuBpmHt(QTgQ-Yq{m73~S!#xX-w!hB|`k?xhx6>;m!B6zHz=2>h)6=4D_a-K3kJ6t; zWMpX8X0U`fZB*msQ+^4vT8hol-6nS`h35;Ds!prmWNZ?5xhs3K9QX}xIY9op0XE<2 zd-c9BbluK;C2_kTap|WlT=|N;-RvR trenZQU zPiVGB&=zwO#labtRhCASg?G_h$wLNDK8)U6ui@n~0zl=OVap45dP71-m (j?er4a{gMzE>X8BhF^HR&+yN z``L#iGzIqve?b}mDKJ3R?!Sj~hEkClMj*aQPHs-je`Hw)G=a^_UGlEFuf?-JGn zrt3Oz*O1f$_uqL2)~`vD^hS#j3mRtrVnrHk6fM)Az8Q5M9FVc;SjPt^dWinLHEw z(Yd%niJ8}8Uk@IFxs`OJKq)!4{LOe!xf }C zCCb1~=m9Lg$9ZHABR*mWv80B!dG3fdJeMV!XUBJIiz)c*3_^%D@4@C+zdY>o!34x- zf=~s)a%m<7<=N$mEk3g{K0aS)5Q~o%4 (9R>QL7W2uspH*KopFWYyiKEB0WdA*o`T(uUGBdW_*E zzTw9jeaVG?gtI%=*AAzOt@~-PogS@pC Rs>(P<0-IA0-=um{T^q#LWS2+;ZNGGoXTU+*0WG?kjb!RY9Rd@zE! zprWN%HWIB*2GqTt(OQjR79XdXGC|eojjqRj@Ntje7V=+!*iK3 ze=gkgvO#(Wb3wy1yH#b!yCt$!YIsJE^$1(}*0xh|G%dc{X8vwN4zHQ}bVmsR RrWRKCJQwJZ4r;|fIzaNqWP$xtDSB^E`4R#+kNfF#ZlImmLpgm-*(wek`j)@7COOkP^)r&;tzG%wF$uJ>jkzy^dj z*q1CnCUcRGf<3#$dF~&$F~=`u6%?zfjrjSgZ#=WTN%JruO@f4|;_Itq@hv`` znjt?-*%=Zcor6q8k0n~uoI_Sjmizo#?Qx)NGtnCjRYcZ6yKf~=j+^wl)tqQ;;UoIe z6dYQm`k}lBct=xCfMBv`>_%x+lk+Tb7FuRhD^!2Cr(cdZ=3Hg+ski3Y-op zFEWTlcBk^eKNL|WnH>4v24}=}Oi<&5mq8M6PBQuKPF}pW3#$g6 7be*!O|OI5jBm@GHp1l$d3pM0hxL^?iZNjay1ELxtIk(9q%;7k{=WOq zVcqtGQXM9N6owoF$=%Ja6A{~D@{W^FvI{VZ9!L (Uc`zt6i4mv(BPmaC9yTV{>?PC1dsJG z4AQ!27bo-Ofq5a~mLqp4hPsA?n;ise0$mkdvDwg^$n=EO81vd*1PIXPHN*)(Tyl%q zOt)_gg-@hrL ba(Mc9JL}fC;bOb?$U|cn^Ofk)*8B
[PATCH v2 04/10] btrfs-progs: undelete-subvol: introduce is_subvol_intact
The function is used to determine whether the subvolume is intact. Signed-off-by: Lu Fengqi--- Makefile | 3 ++- undelete-subvol.c | 53 + undelete-subvol.h | 17 + 3 files changed, 72 insertions(+), 1 deletion(-) create mode 100644 undelete-subvol.c create mode 100644 undelete-subvol.h diff --git a/Makefile b/Makefile index 6065522a615f..cb38984c7386 100644 --- a/Makefile +++ b/Makefile @@ -116,7 +116,8 @@ objects = ctree.o disk-io.o kernel-lib/radix-tree.o extent-tree.o print-tree.o \ qgroup.o free-space-cache.o kernel-lib/list_sort.o props.o \ kernel-shared/ulist.o qgroup-verify.o backref.o string-table.o task-utils.o \ inode.o file.o find-root.o free-space-tree.o help.o send-dump.o \ - fsfeatures.o kernel-lib/tables.o kernel-lib/raid56.o transaction.o + fsfeatures.o kernel-lib/tables.o kernel-lib/raid56.o transaction.o \ + undelete-subvol.o cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \ cmds-inspect.o cmds-balance.o cmds-send.o cmds-receive.o \ cmds-quota.o cmds-qgroup.o cmds-replace.o check/main.o \ diff --git a/undelete-subvol.c b/undelete-subvol.c new file mode 100644 index ..00fcc4895778 --- /dev/null +++ b/undelete-subvol.c @@ -0,0 +1,53 @@ +/* + * Copyright (C) 2018 Fujitsu. All rights reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public + * License v2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + */ + +#include "ctree.h" + +/* + * Determines whether the subvolume is intact, according to the drop_progress + * of the root item specified by subvol_id. + * + * Return true if the subvolume is intact, otherwise return false. + */ +static bool is_subvol_intact(struct btrfs_root *root, u64 subvol_id) +{ + struct btrfs_path path; + struct btrfs_key key; + struct extent_buffer *leaf; + struct btrfs_root_item root_item; + u64 offset; + int ret; + + key.objectid = subvol_id; + key.type = BTRFS_ROOT_ITEM_KEY; + key.offset = 0; + + btrfs_init_path(); + ret = btrfs_search_slot(NULL, root, , , 0, 0); + if (ret) { + ret = false; + goto out; + } + + leaf = path.nodes[0]; + offset = btrfs_item_ptr_offset(leaf, path.slots[0]); + + read_extent_buffer(leaf, _item, offset, sizeof(root_item)); + + /* the subvolume is intact if the objectid of drop_progress == 0 */ + ret = btrfs_disk_key_objectid(_item.drop_progress) ? false : true; + +out: + btrfs_release_path(); + return ret; +} diff --git a/undelete-subvol.h b/undelete-subvol.h new file mode 100644 index ..7cfd100cce37 --- /dev/null +++ b/undelete-subvol.h @@ -0,0 +1,17 @@ +/* + * Copyright (C) 2018 Fujitsu. All rights reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public + * License v2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + */ + +#ifndef __BTRFS_UNDELETE_SUBVOLUME_H__ +#define __BTRFS_UNDELETE_SUBVOLUME_H__ + +#endif -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 03/10] btrfs-progs: use btrfs_find_free_dir_index to find free inode index
Since we have an existing function to find free inode index, we can reuse it here. Signed-off-by: Lu Fengqi--- inode.c | 27 +++ 1 file changed, 7 insertions(+), 20 deletions(-) diff --git a/inode.c b/inode.c index 478036562652..86905365dfd8 100644 --- a/inode.c +++ b/inode.c @@ -628,7 +628,7 @@ int btrfs_link_subvol(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct btrfs_inode_item *inode_item; struct extent_buffer *leaf; struct btrfs_key key; - u64 index = 2; + u64 index; char buf[BTRFS_NAME_LEN + 1]; /* for snprintf null */ int len; int i; @@ -638,28 +638,15 @@ int btrfs_link_subvol(struct btrfs_trans_handle *trans, struct btrfs_root *root, if (len == 0 || len > BTRFS_NAME_LEN) return -EINVAL; - /* find the free dir_index */ - btrfs_init_path(); - key.objectid = dirid; - key.type = BTRFS_DIR_INDEX_KEY; - key.offset = (u64)-1; - - ret = btrfs_search_slot(NULL, root, , , 0, 0); - if (ret <= 0) { - error("search for DIR_INDEX dirid %llu failed: %d", - (unsigned long long)dirid, ret); + /* add the dir_item/dir_index */ + ret = btrfs_find_free_dir_index(root, dirid, ); + if (ret < 0) { + error("unable to find free dir index dirid %llu failed: %d", + (unsigned long long)dirid, ret); goto fail; } - if (path.slots[0] > 0) { - path.slots[0]--; - btrfs_item_key_to_cpu(path.nodes[0], , path.slots[0]); - if (key.objectid == dirid && key.type == BTRFS_DIR_INDEX_KEY) - index = key.offset + 1; - } - btrfs_release_path(); - - /* add the dir_item/dir_index */ + btrfs_init_path(); key.objectid = dirid; key.offset = 0; key.type = BTRFS_INODE_ITEM_KEY; -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 01/10] btrfs-progs: copy btrfs_del_orphan_item from kernel
Add btrfs_del_orphan_item for the later subvolume undelete. Signed-off-by: Lu Fengqi--- ctree.h | 2 ++ inode.c | 30 ++ 2 files changed, 32 insertions(+) diff --git a/ctree.h b/ctree.h index fa861ba0b4c3..0fc31dd8d998 100644 --- a/ctree.h +++ b/ctree.h @@ -2793,6 +2793,8 @@ int btrfs_unlink(struct btrfs_trans_handle *trans, struct btrfs_root *root, int btrfs_add_orphan_item(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct btrfs_path *path, u64 ino); +int btrfs_del_orphan_item(struct btrfs_trans_handle *trans, + struct btrfs_root *root, u64 offset); int btrfs_mkdir(struct btrfs_trans_handle *trans, struct btrfs_root *root, char *name, int namelen, u64 parent_ino, u64 *ino, int mode); struct btrfs_root *btrfs_mksubvol(struct btrfs_root *root, const char *base, diff --git a/inode.c b/inode.c index 2398bca4a109..8d0812c7cf50 100644 --- a/inode.c +++ b/inode.c @@ -262,6 +262,36 @@ int btrfs_add_orphan_item(struct btrfs_trans_handle *trans, return btrfs_insert_empty_item(trans, root, path, , 0); } +int btrfs_del_orphan_item(struct btrfs_trans_handle *trans, + struct btrfs_root *root, u64 offset) +{ + struct btrfs_path *path; + struct btrfs_key key; + int ret = 0; + + key.objectid = BTRFS_ORPHAN_OBJECTID; + key.type = BTRFS_ORPHAN_ITEM_KEY; + key.offset = offset; + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + ret = btrfs_search_slot(trans, root, , path, -1, 1); + if (ret < 0) + goto out; + if (ret) { + ret = -ENOENT; + goto out; + } + + ret = btrfs_del_item(trans, root, path); + +out: + btrfs_free_path(path); + return ret; +} + /* * Unlink an inode, which will remove its backref and corresponding dir_index/ * dir_item if any of them exists. -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 08/10] btrfs-progs: undelete-subvol: add undelete-subvol subcommand
Add the undelete-subvol subcommand for btrfs rescue. This subcommand is used to recover deleted subvolume left intact on the device. Signed-off-by: Lu Fengqi--- v2: add -s option to specify subvol_id. cmds-rescue.c | 70 +++ 1 file changed, 70 insertions(+) diff --git a/cmds-rescue.c b/cmds-rescue.c index c40088ad374e..c5132126e922 100644 --- a/cmds-rescue.c +++ b/cmds-rescue.c @@ -25,6 +25,7 @@ #include "disk-io.h" #include "commands.h" #include "utils.h" +#include "undelete-subvol.h" #include "help.h" static const char * const rescue_cmd_group_usage[] = { @@ -248,6 +249,73 @@ out: return !!ret; } +static const char * const cmd_rescue_undelete_subvol_usage[] = { + "btrfs rescue undelete-subvol [-s ] ", + "Undelete deleted subvolume", + "All deleted subvolume that still left intact on the device will be", + "recovered. If -s option is given, then just recover the", + "subvolume which specified by .", + "", + "-s specify the subvolume which will be recovered.", + NULL +}; + +static int cmd_rescue_undelete_subvol(int argc, char **argv) +{ + struct btrfs_fs_info *fs_info; + char *devname; + u64 subvol_id = 0; + int ret; + + while (1) { + int c = getopt(argc, argv, "s:"); + + if (c < 0) + break; + switch (c) { + case 's': + subvol_id = arg_strtou64(optarg); + if (!is_fstree(subvol_id)) { + error("%llu is not a valid subvolume id", + subvol_id); + ret = -EINVAL; + goto out; + } + break; + default: + usage(cmd_rescue_undelete_subvol_usage); + } + } + + if (check_argc_exact(argc - optind, 1)) + usage(cmd_rescue_undelete_subvol_usage); + + devname = argv[optind]; + ret = check_mounted(devname); + if (ret < 0) { + error("could not check mount status: %s", strerror(-ret)); + goto out; + } else if (ret) { + error("%s is currently mounted", devname); + ret = -EBUSY; + goto out; + } + + fs_info = open_ctree_fs_info(devname, 0, 0, 0, OPEN_CTREE_WRITES | +OPEN_CTREE_PARTIAL); + if (!fs_info) { + error("could not open btrfs"); + ret = -EIO; + goto out; + } + + ret = btrfs_undelete_intact_subvols(fs_info->tree_root, subvol_id); + + close_ctree(fs_info->tree_root); +out: + return ret; +} + static const char rescue_cmd_group_info[] = "toolbox for specific rescue operations"; @@ -260,6 +328,8 @@ const struct cmd_group rescue_cmd_group = { { "zero-log", cmd_rescue_zero_log, cmd_rescue_zero_log_usage, NULL, 0}, { "fix-device-size", cmd_rescue_fix_device_size, cmd_rescue_fix_device_size_usage, NULL, 0}, + { "undelete-subvol", cmd_rescue_undelete_subvol, + cmd_rescue_undelete_subvol_usage, NULL, 0}, NULL_CMD_STRUCT } }; -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 06/10] btrfs-progs: undelete-subvol: introduce link_subvol_to_lostfound
The function will create lost+found directory, link the deleted subvolume specified by the subvol_id to the directory, update the information of root_item and cleanup the associated orphan item. Signed-off-by: Lu Fengqi--- undelete-subvol.c | 76 +++ 1 file changed, 76 insertions(+) diff --git a/undelete-subvol.c b/undelete-subvol.c index 781057df2b84..9243e35545c5 100644 --- a/undelete-subvol.c +++ b/undelete-subvol.c @@ -106,3 +106,79 @@ out: btrfs_release_path(); return ret; } + +/* + * Recover a subvolume specified by subvol_id, and link it to the lost+found + * directory. + * + * @root: the root of the root tree. + * @subvol_id: specify the subvolume which will be linked, and also be the part + * of the subvolume name. + * + * Return 0 if no error occurred. + */ +static int link_subvol_to_lostfound(struct btrfs_root *root, u64 subvol_id) +{ + struct btrfs_trans_handle *trans; + struct btrfs_fs_info *fs_info = root->fs_info; + struct btrfs_root *fs_root = fs_info->fs_root; + char buf[BTRFS_NAME_LEN + 1] = {0}; /* 1 for snprintf null */ + char *dir_name = "lost+found"; + u64 lost_found_ino = 0; + u32 mode = 0700; + int ret; + + /* +* For link subvolume to lost+found, +* 2 for parent(256)'s dir_index and dir_item +* 2 for lost+found dir's inode_item and inode_ref +* 2 for lost+found dir's dir_index and dir_item for the subvolume +* 2 for the subvolume's root_ref and root_backref +*/ + trans = btrfs_start_transaction(fs_root, 8); + if (IS_ERR(trans)) { + error("unable to start transaction"); + ret = PTR_ERR(trans); + goto out; + } + + /* Create lost+found directory */ + ret = btrfs_mkdir(trans, fs_root, dir_name, strlen(dir_name), + BTRFS_FIRST_FREE_OBJECTID, _found_ino, + mode); + if (ret < 0) { + error("failed to create '%s' dir: %d", dir_name, ret); + goto out; + } + + /* Link the subvolume to lost+found directory */ + snprintf(buf, BTRFS_NAME_LEN + 1, "sub%llu", subvol_id); + ret = btrfs_link_subvol(trans, fs_root, buf, subvol_id, lost_found_ino, + false); + if (ret) { + error("failed to link the subvol %llu: %d", subvol_id, ret); + goto out; + } + + /* Clear root flags BTRFS_ROOT_SUBVOL_DEAD and increase root refs */ + ret = recover_dead_root(trans, root, subvol_id); + if (ret) + goto out; + + /* Delete the orphan item after undeletion is completed. */ + ret = btrfs_del_orphan_item(trans, root, subvol_id); + if (ret) { + error("failed to delete the orphan_item for %llu: %d", + subvol_id, ret); + goto out; + } + + ret = btrfs_commit_transaction(trans, fs_root); + if (ret) { + error("transaction commit failed: %d", ret); + goto out; + } + +out: + return ret; +} -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 07/10] btrfs-progs: undelete-subvol: introduce btrfs_undelete_intact_subvols
The function default will traverse the all orphan items on the tree root, and recover the all intact subvolumes. If subvol_id is specified, then only the corresponding subvolume will be recovered. Signed-off-by: Lu Fengqi--- v2: add subvol_id argumenta to specify subvol_id instead of recovering all subvolumes. undelete-subvol.c | 70 +++ undelete-subvol.h | 2 ++ 2 files changed, 72 insertions(+) diff --git a/undelete-subvol.c b/undelete-subvol.c index 9243e35545c5..5b494ca086ab 100644 --- a/undelete-subvol.c +++ b/undelete-subvol.c @@ -15,6 +15,7 @@ #include "transaction.h" #include "disk-io.h" #include "messages.h" +#include "undelete-subvol.h" /* * Determines whether the subvolume is intact, according to the drop_progress @@ -182,3 +183,72 @@ static int link_subvol_to_lostfound(struct btrfs_root *root, u64 subvol_id) out: return ret; } + +/* + * Traverse all orphan items on the root tree, restore them to the lost+found + * directory if the corresponding subvolumes are still intact left on the disk. + * + * @root the root of the root tree. + * @subvol_id if not set to 0, skip other subvolumes and only recover the + * subvolume specified by @subvol_id. + * + * Return 0 if no error occurred even if no subvolume was recovered. + */ +int btrfs_undelete_intact_subvols(struct btrfs_root *root, u64 subvol_id) +{ + struct btrfs_key key; + struct btrfs_path path; + u64 found_count = 0; + u64 recovered_count = 0; + int ret = 0; + + key.objectid = BTRFS_ORPHAN_OBJECTID; + key.type = BTRFS_ORPHAN_ITEM_KEY; + key.offset = subvol_id ? subvol_id + 1 : (u64)-1; + + btrfs_init_path(); + while (subvol_id != key.offset) { + ret = btrfs_search_slot(NULL, root, , , 0, 0); + if (ret < 0) { + error("search ORPHAN_ITEM for %llu failed.\n", + key.offset); + break; + } + + path.slots[0]--; + btrfs_item_key_to_cpu(path.nodes[0], , path.slots[0]); + + btrfs_release_path(); + + /* No more BTRFS_ORPHAN_ITEM, so we don't need to continue. */ + if (key.type != BTRFS_ORPHAN_ITEM_KEY) { + ret = 0; + break; + } + + /* If subvol_id is non-zero, skip other deleted subvolume. */ + if (subvol_id && subvol_id != key.offset) { + ret = -ENOENT; + break; + } + + if (!is_subvol_intact(root, key.offset)) + continue; + + /* Here we can confirm there is an intact subvolume. */ + found_count++; + ret = link_subvol_to_lostfound(root, key.offset); + if (ret == 0) { + recovered_count++; + printf( + "Recovered subvolume %llu to lost+found successfully.\n", + key.offset); + } + + } + + printf("Found %llu subvols left intact\n", found_count); + printf("Recovered %llu subvols\n", found_count); + + return ret; +} diff --git a/undelete-subvol.h b/undelete-subvol.h index 7cfd100cce37..f773210c46fe 100644 --- a/undelete-subvol.h +++ b/undelete-subvol.h @@ -14,4 +14,6 @@ #ifndef __BTRFS_UNDELETE_SUBVOLUME_H__ #define __BTRFS_UNDELETE_SUBVOLUME_H__ +int btrfs_undelete_intact_subvols(struct btrfs_root *root, u64 subvol_id); + #endif -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 02/10] btrfs-progs: extract btrfs_link_subvol from btrfs_mksubvol
The original btrfs_mksubvol is too specific to specify the directory that the subvolume will link to. Furthermore, in this transaction, we don't only need to create root_ref/dir-item, but also update the refs or flags of root_item. Extract a generic btrfs_link_subvol that allow the caller pass a trans argument for later subvolume undelete. No functional changes for the btrfs_mksubvol. Signed-off-by: Lu Fengqi--- convert/main.c | 57 + ctree.h| 5 +++-- inode.c| 46 +- 3 files changed, 81 insertions(+), 27 deletions(-) diff --git a/convert/main.c b/convert/main.c index 6bdfab40d0b0..dd6b42a1f2b1 100644 --- a/convert/main.c +++ b/convert/main.c @@ -1002,6 +1002,63 @@ err: return ret; } +/* + * Link the subvolume specified by @root_objectid to the root_dir of @root. + * + * @root the root of the file tree which the subvolume will + * be linked to. + * @subvol_namethe name of the subvolume which will be linked. + * @root_objectid specify the subvolume which will be linked. + * @convertthe flag to determine whether to try to resolve + * the name conflict. + * + * Return the root of the subvolume if success, otherwise return NULL. + */ +static struct btrfs_root *btrfs_mksubvol(struct btrfs_root *root, +const char *subvol_name, +u64 root_objectid, +bool convert) +{ + struct btrfs_trans_handle *trans; + struct btrfs_root *subvol_root = NULL; + struct btrfs_key key; + u64 dirid = btrfs_root_dirid(>root_item); + int ret; + + + trans = btrfs_start_transaction(root, 1); + if (IS_ERR(trans)) { + error("unable to start transaction"); + goto fail; + } + + ret = btrfs_link_subvol(trans, root, subvol_name, root_objectid, dirid, + true); + if (ret) { + error("unable to link subvolume %s", subvol_name); + goto fail; + } + + ret = btrfs_commit_transaction(trans, root); + if (ret) { + error("transaction commit failed: %d", ret); + goto fail; + } + + key.objectid = root_objectid; + key.offset = (u64)-1; + key.type = BTRFS_ROOT_ITEM_KEY; + + subvol_root = btrfs_read_fs_root(root->fs_info, ); + if (!subvol_root) { + error("unable to link subvolume %s", subvol_name); + goto fail; + } + +fail: + return subvol_root; +} + /* * Migrate super block to its default position and zero 0 ~ 16k */ diff --git a/ctree.h b/ctree.h index 0fc31dd8d998..4bff0b821472 100644 --- a/ctree.h +++ b/ctree.h @@ -2797,8 +2797,9 @@ int btrfs_del_orphan_item(struct btrfs_trans_handle *trans, struct btrfs_root *root, u64 offset); int btrfs_mkdir(struct btrfs_trans_handle *trans, struct btrfs_root *root, char *name, int namelen, u64 parent_ino, u64 *ino, int mode); -struct btrfs_root *btrfs_mksubvol(struct btrfs_root *root, const char *base, - u64 root_objectid, bool convert); +int btrfs_link_subvol(struct btrfs_trans_handle *trans, struct btrfs_root *root, + const char *base, u64 root_objectid, u64 dirid, + bool convert); /* file.c */ int btrfs_get_extent(struct btrfs_trans_handle *trans, diff --git a/inode.c b/inode.c index 8d0812c7cf50..478036562652 100644 --- a/inode.c +++ b/inode.c @@ -606,19 +606,28 @@ out: return ret; } -struct btrfs_root *btrfs_mksubvol(struct btrfs_root *root, - const char *base, u64 root_objectid, - bool convert) +/* + * Link the subvolume specified by @root_objectid to the directory specified by + * @dirid on the file tree specified by @root. + * + * @root the root of the file tree where the directory on. + * @base the name of the subvolume which will be linked. + * @root_objectid specify the subvolume which will be linked. + * @dirid specify the directory which the subvolume will be + * linked to. + * @convertthe flag to determine whether to try to resolve + * the name conflict. + */ +int btrfs_link_subvol(struct btrfs_trans_handle *trans, struct btrfs_root *root, + const char *base, u64 root_objectid, u64 dirid, + bool convert) { - struct btrfs_trans_handle *trans; struct btrfs_fs_info *fs_info = root->fs_info; struct btrfs_root *tree_root = fs_info->tree_root; - struct btrfs_root *new_root = NULL; struct btrfs_path path; struct
[PATCH v2 05/10] btrfs-progs: undelete-subvol: introduce recover_dead_root
The function will find the root_item specified by the subvol_id, clear the BTRFS_ROOT_SUBVOL_DEAD flag and set root_refs to one. Signed-off-by: Lu Fengqi--- ctree.h | 1 + undelete-subvol.c | 55 +++ 2 files changed, 56 insertions(+) diff --git a/ctree.h b/ctree.h index 4bff0b821472..7c0b8150bc4e 100644 --- a/ctree.h +++ b/ctree.h @@ -184,6 +184,7 @@ static int btrfs_csum_sizes[] = { 4 }; #define BTRFS_FT_MAX 9 #define BTRFS_ROOT_SUBVOL_RDONLY (1ULL << 0) +#define BTRFS_ROOT_SUBVOL_DEAD (1ULL << 48) /* * the key defines the order in the tree, and so it also defines (optimal) diff --git a/undelete-subvol.c b/undelete-subvol.c index 00fcc4895778..781057df2b84 100644 --- a/undelete-subvol.c +++ b/undelete-subvol.c @@ -12,6 +12,9 @@ */ #include "ctree.h" +#include "transaction.h" +#include "disk-io.h" +#include "messages.h" /* * Determines whether the subvolume is intact, according to the drop_progress @@ -51,3 +54,55 @@ out: btrfs_release_path(); return ret; } + +/* + * Clear BTRFS_ROOT_SUBVOL_DEAD flag and set the root_refs to one. + * + * @root the root of root tree. + * @subvol_id specify the root_item which will be modified. + * + * Return 0 if no error occurred. + */ +static int recover_dead_root(struct btrfs_trans_handle *trans, +struct btrfs_root *root, u64 subvol_id) +{ + struct btrfs_key key; + struct btrfs_path path; + struct extent_buffer *leaf; + struct btrfs_root_item root_item; + u64 root_flags; + u64 offset; + int ret; + + key.objectid = subvol_id; + key.type = BTRFS_ROOT_ITEM_KEY; + key.offset = 0; + + btrfs_init_path(); + ret = btrfs_search_slot(trans, root, , , 0, 0); + if (ret) { + error("couldn't find ROOT_ITEM for %llu failed: %d", + subvol_id, ret); + goto out; + } + + leaf = path.nodes[0]; + + offset = btrfs_item_ptr_offset(leaf, path.slots[0]); + read_extent_buffer(leaf, _item, offset, sizeof(root_item)); + + /* Clear BTRFS_ROOT_SUBVOL_DEAD */ + root_flags = btrfs_root_flags(_item); + btrfs_set_root_flags(_item, +root_flags & ~BTRFS_ROOT_SUBVOL_DEAD); + + /* Increase the refs */ + btrfs_set_root_refs(_item, 1); + + write_extent_buffer(leaf, _item, offset, sizeof(root_item)); + btrfs_mark_buffer_dirty(leaf); + +out: + btrfs_release_path(); + return ret; +} -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 00/10] undelete subvolume offline version
This patchset will add undelete-subvol subcommand for btrfs rescue. This enhancement allows undeleting a subvolume on an unmounted filesystem. Patchset can be fetched from github: https://github.com/littleroad/btrfs-progs.git undelete v1->v2: add -s option to allow user specify the subvolume which will be recovered. The first six patches are not modified. 7th-8th add the -s option to specify subvol_id instead of recovering all subvolumes. 9th add shell quoting to test script. 10th just add the -s option documentation. Lu Fengqi (10): btrfs-progs: copy btrfs_del_orphan_item from kernel btrfs-progs: extract btrfs_link_subvol from btrfs_mksubvol btrfs-progs: use btrfs_find_free_dir_index to find free inode index btrfs-progs: undelete-subvol: introduce is_subvol_intact btrfs-progs: undelete-subvol: introduce recover_dead_root btrfs-progs: undelete-subvol: introduce link_subvol_to_lostfound btrfs-progs: undelete-subvol: introduce btrfs_undelete_intact_subvols btrfs-progs: undelete-subvol: add undelete-subvol subcommand btrfs-progs: tests: add testcase for undelete-subvol btrfs-progs: undelete-subvol: update completion and documentation Documentation/btrfs-rescue.asciidoc| 12 + Makefile | 3 +- btrfs-completion | 2 +- cmds-rescue.c | 70 ++ convert/main.c | 57 + ctree.h| 8 +- inode.c| 99 .../030-undelete-subvol/deleted_subvolume.img | Bin 0 -> 4096 bytes .../030-undelete-subvol/drop_progress.raw.xz | Bin 0 -> 23452 bytes tests/misc-tests/030-undelete-subvol/test.sh | 34 +++ undelete-subvol.c | 254 + undelete-subvol.h | 19 ++ 12 files changed, 511 insertions(+), 47 deletions(-) create mode 100644 tests/misc-tests/030-undelete-subvol/deleted_subvolume.img create mode 100644 tests/misc-tests/030-undelete-subvol/drop_progress.raw.xz create mode 100755 tests/misc-tests/030-undelete-subvol/test.sh create mode 100644 undelete-subvol.c create mode 100644 undelete-subvol.h -- 2.16.2 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Persoonlijke en beleggingslening.
-- Goede dag, We zijn Funding Trusts Finance verstrekt leningen per postadvertentie. Wij bieden verschillende soorten leningen of projectleningen (korte en lange termijnleningen, persoonlijke leningen, leningen aan bedrijven enz.) Met een rentetarief van 3%. We verstrekken leningen aan mensen in nood, ongeacht hun locatie, geslacht, burgerlijke staat, opleiding of baan, maar moeten een legale manier van terugbetaling hebben. Onze leningen variren tussen 5.000,00 tot 20.000.000,00 US Dollar of Euro of Pond met een maximale duur van 15 jaar. Als u genteresseerd bent in meer informatie, hebben we investeerders die genteresseerd zijn in het financieren van projecten van groot volume. De procedures zijn als volgt: - 1-De klant moet een korte samenvatting van het project verzenden. Dit moet het totale bedrag omvatten dat vereist is voor het project, geschat rendement op investering, terugbetalingsperiode van de lening, dit mag niet meer dan 20 jaar zijn Neem contact met ons op: i...@fundingtrustsfinance.com INFORMATIE NODIG Jullie namen: Adres: ... Telefoon: ... Benodigde hoeveelheid: ... Looptijd: ... Beroep: ... Maandelijks inkomensniveau: .. Geslacht: .. Geboortedatum: ... Staat: ... Land: . Doel: . "Miljoenen mensen in de wereld hebben een kredietprobleem van een of andere vorm. Je bent niet de enige. We hebben een hele reeks leningopties die kunnen helpen. Ontdek nu je opties!" Met vriendelijke groet, Ronny Hens, i...@fundingtrustsfinance.com WEBSITE: www.fundingtrustfinance.com -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Persoonlijke en beleggingslening.
-- Goede dag, We zijn Funding Trusts Finance verstrekt leningen per postadvertentie. Wij bieden verschillende soorten leningen of projectleningen (korte en lange termijnleningen, persoonlijke leningen, leningen aan bedrijven enz.) Met een rentetarief van 3%. We verstrekken leningen aan mensen in nood, ongeacht hun locatie, geslacht, burgerlijke staat, opleiding of baan, maar moeten een legale manier van terugbetaling hebben. Onze leningen variren tussen 5.000,00 tot 20.000.000,00 US Dollar of Euro of Pond met een maximale duur van 15 jaar. Als u genteresseerd bent in meer informatie, hebben we investeerders die genteresseerd zijn in het financieren van projecten van groot volume. De procedures zijn als volgt: - 1-De klant moet een korte samenvatting van het project verzenden. Dit moet het totale bedrag omvatten dat vereist is voor het project, geschat rendement op investering, terugbetalingsperiode van de lening, dit mag niet meer dan 20 jaar zijn Neem contact met ons op: i...@fundingtrustsfinance.com INFORMATIE NODIG Jullie namen: Adres: ... Telefoon: ... Benodigde hoeveelheid: ... Looptijd: ... Beroep: ... Maandelijks inkomensniveau: .. Geslacht: .. Geboortedatum: ... Staat: ... Land: . Doel: . "Miljoenen mensen in de wereld hebben een kredietprobleem van een of andere vorm. Je bent niet de enige. We hebben een hele reeks leningopties die kunnen helpen. Ontdek nu je opties!" Met vriendelijke groet, Ronny Hens, i...@fundingtrustsfinance.com WEBSITE: www.fundingtrustfinance.com -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Persoonlijke en beleggingslening.
-- Goede dag, We zijn Funding Trusts Finance verstrekt leningen per postadvertentie. Wij bieden verschillende soorten leningen of projectleningen (korte en lange termijnleningen, persoonlijke leningen, leningen aan bedrijven enz.) Met een rentetarief van 3%. We verstrekken leningen aan mensen in nood, ongeacht hun locatie, geslacht, burgerlijke staat, opleiding of baan, maar moeten een legale manier van terugbetaling hebben. Onze leningen variren tussen 5.000,00 tot 20.000.000,00 US Dollar of Euro of Pond met een maximale duur van 15 jaar. Als u genteresseerd bent in meer informatie, hebben we investeerders die genteresseerd zijn in het financieren van projecten van groot volume. De procedures zijn als volgt: - 1-De klant moet een korte samenvatting van het project verzenden. Dit moet het totale bedrag omvatten dat vereist is voor het project, geschat rendement op investering, terugbetalingsperiode van de lening, dit mag niet meer dan 20 jaar zijn Neem contact met ons op: i...@fundingtrustsfinance.com INFORMATIE NODIG Jullie namen: Adres: ... Telefoon: ... Benodigde hoeveelheid: ... Looptijd: ... Beroep: ... Maandelijks inkomensniveau: .. Geslacht: .. Geboortedatum: ... Staat: ... Land: . Doel: . "Miljoenen mensen in de wereld hebben een kredietprobleem van een of andere vorm. Je bent niet de enige. We hebben een hele reeks leningopties die kunnen helpen. Ontdek nu je opties!" Met vriendelijke groet, Ronny Hens, i...@fundingtrustsfinance.com WEBSITE: www.fundingtrustfinance.com -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs-progs: wipe copies of the stale superblock beyond -b size
On 27.03.2018 02:50, Anand Jain wrote: > > >> I told you this code can be made a lot simpler, simply by modifying the >> last argument passed to zero_dev_clamped. I even posted the resulting >> diff which was just 3 lines changed. >> >> I agree that it's a good idea to wipe all available superblock when we >> use -b. However I don't agree with your approach - it adds a loop, it >> adds a bunch of checks and makes the complexity orders of magnitude >> higher than it could be. So I'm asking again - is there any inherent >> benefit which I'm missing in your newly added 35 lines of code against >> just passing the block device to zero_dev_clamped and letting the >> existing logic take care of everything? > > I had that idea for v1 as well, but I didn't do it because it would > zero bytenr_copy#2 even when there is no btrfs superblock, (which is > fine only with in block_count). > > Some might view it as corrupting the usr data (for which they have > specified -b option?)? I am just discussing I don't have any usecase > to prove it though. Do you have any idea? > If you think it should be ok, I shall go ahead and zero without > checking for the btrfs superblock beyond block_count. Right, so the pertinent question is - is anyone expecting to do mkfs on a volume irrespective of the options and expect to be able to recover data prior to the mkfs? I think the answer is 'no', but let's see what the community says. > > Thanks, Anand > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 1/2] btrfs-progs: print-tree: Use macro to replace immediate number for readable flags string buffer length
In print-tree, we have a lot of parsers to convert numeric flags to human readable string. For the buffer size we're using immediate numbers for all their callers. Change this to macro so it will be much easier for us to expand the buffer size. Signed-off-by: Qu Wenruo--- changelog: v2: Move all buffer length macros to the header part. Change comment to refer to the buffer length marco for later modication. --- print-tree.c | 50 +++--- 1 file changed, 39 insertions(+), 11 deletions(-) diff --git a/print-tree.c b/print-tree.c index a2f6bfc027c9..064f1bb9ff8c 100644 --- a/print-tree.c +++ b/print-tree.c @@ -26,6 +26,12 @@ #include "print-tree.h" #include "utils.h" +#define BG_FLAGS_BUF_LEN 64 /* for bg_flags_to_str() */ +#define QGROUP_FLAGS_BUF_LEN 32 /* for qgroup_flags_to_str() */ +#define EXTENT_FLAGS_BUF_LEN 32 /* for extent_flags_to_str() */ +#define ROOT_FLAGS_BUF_LEN 16 /* for root_flags_to_str() */ +#define INODE_FLAGS_BUF_LEN128 /* for inode_flags_to_str() */ +#define HEADER_FLAGS_BUF_LEN 32 /* for header_flags_to_str() */ static void print_dir_item_type(struct extent_buffer *eb, struct btrfs_dir_item *di) @@ -137,7 +143,13 @@ static void print_inode_ref_item(struct extent_buffer *eb, u32 size, } } -/* Caller should ensure sizeof(*ret)>=21 "DATA|METADATA|RAID10" */ +/* + * Caller should ensure sizeof(*ret)>=21 "DATA|METADATA|RAID10" + * Here we bump up to 64 bytes to handle the worst (and invalid) case: + * "DATA|METADATA|SYSTEM|RAID0|RAID1|DUP|RAID10|RAID5|RAID6" + * + * BG_FLAGS_BUF_LEN is ensured to have enough space. + */ static void bg_flags_to_str(u64 flags, char *ret) { int empty = 1; @@ -180,7 +192,11 @@ static void bg_flags_to_str(u64 flags, char *ret) } } -/* Caller should ensure sizeof(*ret)>= 26 "OFF|SCANNING|INCONSISTENT" */ +/* + * Caller should ensure sizeof(*ret)>= 26 "OFF|SCANNING|INCONSISTENT" + * + * QGROUP_FLAGS_BUF_LEN is ensured to have enough space. + */ static void qgroup_flags_to_str(u64 flags, char *ret) { if (flags & BTRFS_QGROUP_STATUS_FLAG_ON) @@ -199,7 +215,7 @@ void print_chunk_item(struct extent_buffer *eb, struct btrfs_chunk *chunk) u16 num_stripes = btrfs_chunk_num_stripes(eb, chunk); int i; u32 chunk_item_size; - char chunk_flags_str[32] = {0}; + char chunk_flags_str[BG_FLAGS_BUF_LEN] = {0}; /* The chunk must contain at least one stripe */ if (num_stripes < 1) { @@ -385,7 +401,11 @@ static void print_file_extent_item(struct extent_buffer *eb, compress_str); } -/* Caller should ensure sizeof(*ret) >= 16("DATA|TREE_BLOCK") */ +/* + * Caller should ensure sizeof(*ret) > strlen("DATA|TREE_BLOCK|FULL_BACKREF") + * + * EXTENT_FLAGS_BUF_LEN is ensured to have enough space. + */ static void extent_flags_to_str(u64 flags, char *ret) { int empty = 1; @@ -420,7 +440,7 @@ void print_extent_item(struct extent_buffer *eb, int slot, int metadata) u32 item_size = btrfs_item_size_nr(eb, slot); u64 flags; u64 offset; - char flags_str[32] = {0}; + char flags_str[EXTENT_FLAGS_BUF_LEN] = {0}; if (item_size < sizeof(*ei)) { #ifdef BTRFS_COMPAT_EXTENT_TREE_V0 @@ -542,6 +562,8 @@ static int empty_uuid(const u8 *uuid) /* * Caller must ensure sizeof(*ret) >= 7 "RDONLY" + * + * ROOT_FLAGS_BUF_LEN is ensured to have enough space */ static void root_flags_to_str(u64 flags, char *ret) { @@ -577,7 +599,7 @@ static void print_root_item(struct extent_buffer *leaf, int slot) struct btrfs_root_item root_item; int len; char uuid_str[BTRFS_UUID_UNPARSED_SIZE]; - char flags_str[32] = {0}; + char flags_str[ROOT_FLAGS_BUF_LEN] = {0}; struct btrfs_key drop_key; ri = btrfs_item_ptr(leaf, slot, struct btrfs_root_item); @@ -887,6 +909,8 @@ static void print_uuid_item(struct extent_buffer *l, unsigned long offset, /* * Caller should ensure sizeof(*ret) >= 102: all charactors plus '|' of * BTRFS_INODE_* flags + * + * INODE_FLAGS_BUF_LEN is ensured to have enough space */ static void inode_flags_to_str(u64 flags, char *ret) { @@ -911,7 +935,7 @@ static void inode_flags_to_str(u64 flags, char *ret) static void print_inode_item(struct extent_buffer *eb, struct btrfs_inode_item *ii) { - char flags_str[256]; + char flags_str[INODE_FLAGS_BUF_LEN]; memset(flags_str, 0, sizeof(flags_str)); inode_flags_to_str(btrfs_inode_flags(eb, ii), flags_str); @@ -1002,7 +1026,7 @@ static void print_block_group_item(struct extent_buffer *eb, struct btrfs_block_group_item *bgi) { struct btrfs_block_group_item bg_item; - char flags_str[256]; + char flags_str[BG_FLAGS_BUF_LEN]; read_extent_buffer(eb, _item, (unsigned long)bgi,