[PATCH v2 4/6] btrfs-progs: mkfs: Error out gracefully for --rootdir
The --rootdir option starts a transaction to fill the fs. However, if
something goes wrong, from ENOSPC to lack of permission, we never commit
the transaction, and the uncommitted transaction triggers a BUG_ON:
--
extent buffer leak: start 29392896 len 16384
extent_io.c:579: free_extent_buffer: BUG_ON `eb->flags & EXTENT_DIRTY` triggered, value 1
--

The root fix is to introduce btrfs_abort_transaction() in btrfs-progs.
However, in this particular case we can work around it by force
committing the transaction.

During mkfs the btrfs magic is set to an invalid value, so unless
fs_info->finalize_on_close is set the fs can never be mounted.
So even if we force a commit of a bad transaction, we won't make things
worse.

Signed-off-by: Qu Wenruo
Reviewed-by: Nikolay Borisov
---
 mkfs/main.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/mkfs/main.c b/mkfs/main.c
index 60250c011ac3..358a046f1cf2 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1073,6 +1073,19 @@ static int make_image(const char *source_dir, struct btrfs_root *root)
 	printf("Making image is completed.\n");
 	return 0;
 fail:
+	/*
+	 * Since we don't have btrfs_abort_transaction() yet, uncommitted trans
+	 * will trigger a BUG_ON().
+	 *
+	 * However before mkfs is fully finished, the magic number is invalid,
+	 * so even if we commit transaction here, the fs still can't be mounted.
+	 *
+	 * To do a graceful error out, here we commit transaction as a
+	 * workaround.
+	 * Since we have already hit some problem, the return value doesn't
+	 * matter now.
+	 */
+	btrfs_commit_transaction(trans, root);
 	while (!list_empty(&dir_head.list)) {
 		dir_entry = list_entry(dir_head.list.next,
 				       struct directory_name_entry, list);
-- 
2.14.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 3/6] btrfs-progs: mkfs: Fix overwritten return value for mkfs
For mkfs failures, especially --rootdir errors like EPERM/ENOSPC, the
out branch overwrites the return value, causing a wrong exit status
code.

Signed-off-by: Qu Wenruo
Reviewed-by: Nikolay Borisov
---
 mkfs/main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mkfs/main.c b/mkfs/main.c
index 9d53c6632b45..60250c011ac3 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1426,6 +1426,7 @@ int main(int argc, char **argv)
 	int zero_end = 1;
 	int fd = -1;
 	int ret;
+	int close_ret;
 	int i;
 	int mixed = 0;
 	int nodesize_forced = 0;
@@ -1944,9 +1945,9 @@ raid_groups:
 	 */
 	fs_info->finalize_on_close = 1;
 out:
-	ret = close_ctree(root);
+	close_ret = close_ctree(root);
 
-	if (!ret) {
+	if (!close_ret) {
 		optind = saved_optind;
 		dev_cnt = argc - optind;
 		while (dev_cnt-- > 0) {
-- 
2.14.2
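For readers following along, the pattern behind this fix can be sketched in isolation. This is only an illustration, not the mkfs code: close_resources() and run() are hypothetical stand-ins, and the choice to report a cleanup failure only when the work itself succeeded is an assumption of the sketch, not something the patch states.

```c
/* Hypothetical cleanup step (stand-in for something like close_ctree());
 * returns 0 on success, negative on failure. */
static int close_resources(int fail)
{
	return fail ? -5 : 0;	/* -EIO */
}

/*
 * work_ret is the status of the real work (e.g. -ENOSPC from filling
 * the image). Keeping the cleanup status in its own variable means the
 * cleanup call can no longer overwrite that first error.
 */
static int run(int work_ret, int close_fail)
{
	int ret = work_ret;
	int close_ret;

	/* ... on error the real code would jump to its cleanup label ... */

	close_ret = close_resources(close_fail);

	/* Assumption of this sketch: only surface the cleanup error
	 * when the work itself succeeded. */
	if (!ret)
		ret = close_ret;
	return ret;
}
```

With a single shared variable, run(-28, 0) would have returned 0 and the caller would never learn about the ENOSPC.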
[PATCH v2 2/6] btrfs-progs: mkfs: Avoid positive return value from cleanup_temp_chunks
btrfs_search_slot() can return a positive value. We pass that value
straight out of cleanup_temp_chunks(), so the function's return value
is not well defined. This can make mkfs return 1, which indicates that
something went wrong. Fix it.

Signed-off-by: Qu Wenruo
---
 mkfs/main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mkfs/main.c b/mkfs/main.c
index 80e6089c37a1..9d53c6632b45 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1350,6 +1350,9 @@ static int cleanup_temp_chunks(struct btrfs_fs_info *fs_info,
 		ret = btrfs_search_slot(trans, root, &key, &path, 0, 0);
 		if (ret < 0)
 			goto out;
+		/* Don't pollute ret for >0 case */
+		if (ret > 0)
+			ret = 0;
 
 		btrfs_item_key_to_cpu(path.nodes[0], &found_key,
 				      path.slots[0]);
-- 
2.14.2
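The tri-state return convention that bites here can be shown with a small standalone sketch. Nothing below is btrfs code: search_sorted() is a hypothetical stand-in that, like the lookup discussed above, returns a negative value on error, 0 on an exact match and 1 when the key is absent; find_slot() is an equally hypothetical caller that must only ever return 0 or a negative errno.

```c
#include <stddef.h>

/*
 * Hypothetical tri-state lookup over a sorted array:
 *   <0  error
 *    0  exact match, *slot is its index
 *    1  key absent, *slot is where it would be inserted
 */
static int search_sorted(const int *keys, size_t n, int key, size_t *slot)
{
	size_t lo = 0, hi = n;

	if (!keys && n)
		return -22;	/* -EINVAL */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (keys[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	*slot = lo;
	return (lo < n && keys[lo] == key) ? 0 : 1;
}

/* Caller whose contract is "0 or negative errno" */
static int find_slot(const int *keys, size_t n, int key, size_t *slot)
{
	int ret = search_sorted(keys, n, key, slot);

	if (ret < 0)
		return ret;
	/* "not found" is not an error for this caller: don't leak the 1 */
	if (ret > 0)
		ret = 0;
	return ret;
}
```

Without the `ret > 0` normalisation, a lookup that merely lands between two keys would propagate 1 up to the caller, which is the kind of leak this patch closes.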
[PATCH v2 6/6] btrfs-progs: mkfs: Move source dir size calculation to its own files
Also rename the function from size_sourcedir() to btrfs_mkfs_size_dir().

Signed-off-by: Qu Wenruo
Reviewed-by: Nikolay Borisov
---
 mkfs/main.c    | 66 ++
 mkfs/rootdir.c | 63 +++
 mkfs/rootdir.h |  2 ++
 3 files changed, 67 insertions(+), 64 deletions(-)

diff --git a/mkfs/main.c b/mkfs/main.c
index 7861e3075d6b..423b35579722 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -31,7 +31,6 @@
 #include
 #include
 #include
-#include
 #include "ctree.h"
 #include "disk-io.h"
 #include "volumes.h"
@@ -448,67 +447,6 @@ static int create_chunks(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
-/*
- * This ignores symlinks with unreadable targets and subdirs that can't
- * be read.  It's a best-effort to give a rough estimate of the size of
- * a subdir.  It doesn't guarantee that prepopulating btrfs from this
- * tree won't still run out of space.
- */
-static u64 global_total_size;
-static u64 fs_block_size;
-static int ftw_add_entry_size(const char *fpath, const struct stat *st,
-			      int type)
-{
-	if (type == FTW_F || type == FTW_D)
-		global_total_size += round_up(st->st_size, fs_block_size);
-
-	return 0;
-}
-
-static u64 size_sourcedir(const char *dir_name, u64 sectorsize,
-			  u64 *num_of_meta_chunks_ret, u64 *size_of_data_ret)
-{
-	u64 dir_size = 0;
-	u64 total_size = 0;
-	int ret;
-	u64 default_chunk_size = SZ_8M;
-	u64 allocated_meta_size = SZ_8M;
-	u64 allocated_total_size = 20 * SZ_1M;	/* 20MB */
-	u64 num_of_meta_chunks = 0;
-	u64 num_of_data_chunks = 0;
-	u64 num_of_allocated_meta_chunks =
-			allocated_meta_size / default_chunk_size;
-
-	global_total_size = 0;
-	fs_block_size = sectorsize;
-	ret = ftw(dir_name, ftw_add_entry_size, 10);
-	dir_size = global_total_size;
-	if (ret < 0) {
-		error("ftw subdir walk of %s failed: %s", dir_name,
-			strerror(errno));
-		exit(1);
-	}
-
-	num_of_data_chunks = (dir_size + default_chunk_size - 1) /
-		default_chunk_size;
-
-	num_of_meta_chunks = (dir_size / 2) / default_chunk_size;
-	if (((dir_size / 2) % default_chunk_size) != 0)
-		num_of_meta_chunks++;
-	if (num_of_meta_chunks <= num_of_allocated_meta_chunks)
-		num_of_meta_chunks = 0;
-	else
-		num_of_meta_chunks -= num_of_allocated_meta_chunks;
-
-	total_size = allocated_total_size +
-		     (num_of_data_chunks * default_chunk_size) +
-		     (num_of_meta_chunks * default_chunk_size);
-
-	*num_of_meta_chunks_ret = num_of_meta_chunks;
-	*size_of_data_ret = num_of_data_chunks * default_chunk_size;
-	return total_size;
-}
-
 static int zero_output_file(int out_fd, u64 size)
 {
 	int loop_num;
@@ -1085,8 +1023,8 @@ int main(int argc, char **argv)
 			goto error;
 		}
 
-		source_dir_size = size_sourcedir(source_dir, sectorsize,
-					     &num_of_meta_chunks, &size_of_data);
+		source_dir_size = btrfs_mkfs_size_dir(source_dir, sectorsize,
+				&num_of_meta_chunks, &size_of_data);
 		if(block_count < source_dir_size)
 			block_count = source_dir_size;
 		ret = zero_output_file(fd, block_count);
diff --git a/mkfs/rootdir.c b/mkfs/rootdir.c
index 2cc8a3ac06d8..83a3191d2bd7 100644
--- a/mkfs/rootdir.c
+++ b/mkfs/rootdir.c
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include
 #include "ctree.h"
 #include "internal.h"
 #include "disk-io.h"
@@ -33,6 +34,15 @@
 #include "mkfs/rootdir.h"
 #include "send-utils.h"
 
+/*
+ * This ignores symlinks with unreadable targets and subdirs that can't
+ * be read.  It's a best-effort to give a rough estimate of the size of
+ * a subdir.  It doesn't guarantee that prepopulating btrfs from this
+ * tree won't still run out of space.
+ */
+static u64 global_total_size;
+static u64 fs_block_size;
+
 static u64 index_cnt = 2;
 
 static int add_directory_items(struct btrfs_trans_handle *trans,
@@ -670,3 +680,56 @@ fail:
 out:
 	return ret;
 }
+
+static int ftw_add_entry_size(const char *fpath, const struct stat *st,
+			      int type)
+{
+	if (type == FTW_F || type == FTW_D)
+		global_total_size += round_up(st->st_size, fs_block_size);
+
+	return 0;
+}
+
+u64 btrfs_mkfs_size_dir(const char *dir_name, u64 sectorsize,
+			u64 *num_of_meta_chunks_ret, u64 *size_of_data_ret)
+{
+	u64 dir_size = 0;
+	u64 total_size = 0;
+	int ret;
+	u64 default_chunk_size = SZ_8M;
+	u64 allocated_meta_size = SZ_8M;
+	u64
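To make the arithmetic in the moved function easier to follow, here is a standalone sketch of just the chunk estimate, using the constants visible in the diff (8M chunks, one pre-allocated metadata chunk, 20M base). It is not the btrfs-progs function: estimate_image_size() is a hypothetical name, and the directory size is taken as an input instead of being gathered with ftw().

```c
#include <stdint.h>

#define SZ_1M (1024ULL * 1024)
#define SZ_8M (8 * SZ_1M)

/*
 * dir_size: sum of the sector-rounded sizes of the source tree
 * (assumed to have been computed elsewhere).
 */
static uint64_t estimate_image_size(uint64_t dir_size,
				    uint64_t *meta_chunks_ret,
				    uint64_t *data_size_ret)
{
	const uint64_t chunk = SZ_8M;
	const uint64_t base = 20 * SZ_1M;
	const uint64_t prealloc_meta_chunks = SZ_8M / chunk;	/* 1 */

	/* Data: round the directory size up to whole chunks */
	uint64_t data_chunks = (dir_size + chunk - 1) / chunk;

	/* Metadata: assume half the data size, rounded up to whole chunks */
	uint64_t meta_chunks = (dir_size / 2 + chunk - 1) / chunk;

	/* One metadata chunk already exists, so only count the extra ones */
	if (meta_chunks <= prealloc_meta_chunks)
		meta_chunks = 0;
	else
		meta_chunks -= prealloc_meta_chunks;

	*meta_chunks_ret = meta_chunks;
	*data_size_ret = data_chunks * chunk;
	return base + data_chunks * chunk + meta_chunks * chunk;
}
```

For a 100M source tree this gives 13 data chunks (104M), 7 metadata chunks of which 6 are new (48M), plus the 20M base, so 172M in total. As the comment in the patch says, this is only a rough estimate and does not guarantee the fill will fit.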
[PATCH v2 5/6] btrfs-progs: mkfs: Move image creation of rootdir to its own files
In fact, the --rootdir option is getting more and more independent from
the normal mkfs code.

So move the image creation function, make_image(), and its related code
to mkfs/rootdir.[ch], and rename the function to btrfs_mkfs_fill_dir().

Signed-off-by: Qu Wenruo
Reviewed-by: Nikolay Borisov
---
 Makefile       |   4 +-
 mkfs/main.c    | 652 +--
 mkfs/rootdir.c | 672 +
 mkfs/rootdir.h |  30 +++
 4 files changed, 706 insertions(+), 652 deletions(-)
 create mode 100644 mkfs/rootdir.c
 create mode 100644 mkfs/rootdir.h

diff --git a/Makefile b/Makefile
index d0657aaea0f5..12747547766f 100644
--- a/Makefile
+++ b/Makefile
@@ -113,7 +113,7 @@ cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
 	       cmds-restore.o cmds-rescue.o chunk-recover.o super-recover.o \
 	       cmds-property.o cmds-fi-usage.o cmds-inspect-dump-tree.o \
 	       cmds-inspect-dump-super.o cmds-inspect-tree-stats.o cmds-fi-du.o \
-	       mkfs/common.o
+	       mkfs/common.o mkfs/rootdir.o
 libbtrfs_objects = send-stream.o send-utils.o kernel-lib/rbtree.o btrfs-list.o \
 		   kernel-lib/crc32c.o messages.o \
 		   uuid-tree.o utils-lib.o rbtree-utils.o
@@ -123,7 +123,7 @@ libbtrfs_headers = send-stream.h send-utils.h send.h kernel-lib/rbtree.h btrfs-l
 	       extent-cache.h extent_io.h ioctl.h ctree.h btrfsck.h version.h
 convert_objects = convert/main.o convert/common.o convert/source-fs.o \
 		  convert/source-ext2.o convert/source-reiserfs.o
-mkfs_objects = mkfs/main.o mkfs/common.o
+mkfs_objects = mkfs/main.o mkfs/common.o mkfs/rootdir.o
 image_objects = image/main.o
 all_objects = $(objects) $(cmds_objects) $(libbtrfs_objects) $(convert_objects) \
 	      $(mkfs_objects) $(image_objects)
diff --git a/mkfs/main.c b/mkfs/main.c
index 358a046f1cf2..7861e3075d6b 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -24,17 +24,12 @@
 #include "ioctl.h"
 #include
 #include
-#include
-#include
 /* #include included via androidcompat.h */
 #include
 #include
 #include
 #include
 #include
-#include
-#include
-#include
 #include
 #include
 #include "ctree.h"
@@ -45,20 +40,11 @@
 #include "list_sort.h"
 #include "help.h"
 #include "mkfs/common.h"
+#include "mkfs/rootdir.h"
 #include "fsfeatures.h"
 
-int path_cat_out(char *out, const char *p1, const char *p2);
-
-static u64 index_cnt = 2;
 static int verbose = 1;
 
-struct directory_name_entry {
-	const char *dir_name;
-	const char *path;
-	ino_t inum;
-	struct list_head list;
-};
-
 struct mkfs_allocation {
 	u64 data;
 	u64 metadata;
@@ -415,583 +401,6 @@ static char *parse_label(const char *input)
 	return strdup(input);
 }
 
-static int add_directory_items(struct btrfs_trans_handle *trans,
-			       struct btrfs_root *root, u64 objectid,
-			       ino_t parent_inum, const char *name,
-			       struct stat *st, int *dir_index_cnt)
-{
-	int ret;
-	int name_len;
-	struct btrfs_key location;
-	u8 filetype = 0;
-
-	name_len = strlen(name);
-
-	location.objectid = objectid;
-	location.offset = 0;
-	location.type = BTRFS_INODE_ITEM_KEY;
-
-	if (S_ISDIR(st->st_mode))
-		filetype = BTRFS_FT_DIR;
-	if (S_ISREG(st->st_mode))
-		filetype = BTRFS_FT_REG_FILE;
-	if (S_ISLNK(st->st_mode))
-		filetype = BTRFS_FT_SYMLINK;
-	if (S_ISSOCK(st->st_mode))
-		filetype = BTRFS_FT_SOCK;
-	if (S_ISCHR(st->st_mode))
-		filetype = BTRFS_FT_CHRDEV;
-	if (S_ISBLK(st->st_mode))
-		filetype = BTRFS_FT_BLKDEV;
-	if (S_ISFIFO(st->st_mode))
-		filetype = BTRFS_FT_FIFO;
-
-	ret = btrfs_insert_dir_item(trans, root, name, name_len,
-				    parent_inum, &location,
-				    filetype, index_cnt);
-	if (ret)
-		return ret;
-	ret = btrfs_insert_inode_ref(trans, root, name, name_len,
-				     objectid, parent_inum, index_cnt);
-	*dir_index_cnt = index_cnt;
-	index_cnt++;
-
-	return ret;
-}
-
-static int fill_inode_item(struct btrfs_trans_handle *trans,
-			   struct btrfs_root *root,
-			   struct btrfs_inode_item *dst, struct stat *src)
-{
-	u64 blocks = 0;
-	u64 sectorsize = root->fs_info->sectorsize;
-
-	/*
-	 * btrfs_inode_item has some reserved fields
-	 * and represents on-disk inode entry, so
-	 * zero everything to prevent information leak
-	 */
-	memset(dst, 0, sizeof (*dst));
-
-	btrfs_set_stack_inode_generation(dst, trans->transid);
-	btrfs_set_stack_inode_size(dst,
[PATCH v2 1/6] btrfs-progs: Avoid BUG_ON for chunk allocation when ENOSPC happens
When passing a directory larger than the block device via the --rootdir
parameter, we get the following backtrace:
--
extent-tree.c:2693: btrfs_reserve_extent: BUG_ON `ret` triggered, value -28
./mkfs.btrfs(+0x1a05d)[0x557939e6b05d]
./mkfs.btrfs(btrfs_reserve_extent+0xb5a)[0x557939e710c8]
./mkfs.btrfs(+0xb0b6)[0x557939e5c0b6]
./mkfs.btrfs(main+0x15d5)[0x557939e5de04]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f83b101af6a]
./mkfs.btrfs(_start+0x2a)[0x557939e5af5a]
--

Nothing special, just BUG_ON() abuse in ancient code.
Fix it by returning the error properly.

Signed-off-by: Qu Wenruo
Reviewed-by: Nikolay Borisov
---
 extent-tree.c |  3 ++-
 volumes.c     | 18 ++
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/extent-tree.c b/extent-tree.c
index 525a237e5923..055582c36da6 100644
--- a/extent-tree.c
+++ b/extent-tree.c
@@ -2690,7 +2690,8 @@ int btrfs_reserve_extent(struct btrfs_trans_handle *trans,
 			       search_start, search_end, hint_byte, ins,
 			       trans->alloc_exclude_start,
 			       trans->alloc_exclude_nr, data);
-	BUG_ON(ret);
+	if (ret < 0)
+		return ret;
 	clear_extent_dirty(>free_space_cache,
 			   ins->objectid, ins->objectid + ins->offset - 1);
 	return ret;
diff --git a/volumes.c b/volumes.c
index 2209e5a9100b..e1ee27d5f3ce 100644
--- a/volumes.c
+++ b/volumes.c
@@ -1032,11 +1032,13 @@ again:
 			     info->chunk_root->root_key.objectid,
 			     BTRFS_FIRST_CHUNK_TREE_OBJECTID, key.offset,
 			     calc_size, &dev_offset, 0);
-		BUG_ON(ret);
+		if (ret < 0)
+			goto out_chunk_map;
 
 		device->bytes_used += calc_size;
 		ret = btrfs_update_device(trans, device);
-		BUG_ON(ret);
+		if (ret < 0)
+			goto out_chunk_map;
 
 		map->stripes[index].dev = device;
 		map->stripes[index].physical = dev_offset;
@@ -1075,16 +1077,24 @@ again:
 	map->ce.size = *num_bytes;
 
 	ret = insert_cache_extent(&info->mapping_tree.cache_tree, &map->ce);
-	BUG_ON(ret);
+	if (ret < 0)
+		goto out_chunk_map;
 
 	if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
 		ret = btrfs_add_system_chunk(info, &key, chunk,
 				    btrfs_chunk_item_size(num_stripes));
-		BUG_ON(ret);
+		if (ret < 0)
+			goto out_chunk;
 	}
 
 	kfree(chunk);
 	return ret;
+
+out_chunk_map:
+	kfree(map);
+out_chunk:
+	kfree(chunk);
+	return ret;
 }
 
 /*
-- 
2.14.2
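The shape of this change, replacing BUG_ON(ret) with an error return through stacked cleanup labels, can be sketched in plain userspace C. Everything here is hypothetical (struct chunk, struct chunk_map, do_step(), alloc_chunk_sketch()); it only mirrors the label ordering used in the patch, where a jump to the earlier label falls through and releases everything allocated before it.

```c
#include <stdlib.h>

struct chunk { int id; };
struct chunk_map { int stripes; };

/* Hypothetical step that may fail; stands in for the calls that used
 * to be followed by BUG_ON(ret). */
static int do_step(int fail)
{
	return fail ? -28 : 0;	/* -ENOSPC */
}

static int alloc_chunk_sketch(int fail_alloc, int fail_update,
			      struct chunk_map **map_ret)
{
	struct chunk *chunk;
	struct chunk_map *map;
	int ret;

	chunk = malloc(sizeof(*chunk));
	if (!chunk)
		return -12;	/* -ENOMEM */
	map = malloc(sizeof(*map));
	if (!map) {
		ret = -12;
		goto out_chunk;
	}

	ret = do_step(fail_alloc);
	if (ret < 0)
		goto out_chunk_map;	/* was: BUG_ON(ret) */

	ret = do_step(fail_update);
	if (ret < 0)
		goto out_chunk_map;	/* was: BUG_ON(ret) */

	*map_ret = map;		/* success: the caller now owns the map */
	free(chunk);
	return 0;

out_chunk_map:
	free(map);
out_chunk:
	free(chunk);
	return ret;
}
```

The caller gets -28 back instead of an abort, and nothing is leaked on either failure path.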
[PATCH v2 0/6] Rootdir refactor and small bug fixes
Sorry for the v2 patchset; it just adds a new 3-line patch.
But since appending it could break bisect, I'm resending the whole
patchset with the new patch placed just before the mkfs return value
fix, so bisect keeps working as before.

The first 4 patches are small bug fixes which can be applied even if we
don't touch the functionality of --rootdir.

The last two patches move the --rootdir related functions, mainly
size_sourcedir() and make_image(), to mkfs/rootdir.[ch], and rename
them to btrfs_mkfs_size_dir() and btrfs_mkfs_fill_dir() respectively.

Functionality is not changed at all, so it still shrinks the device and
still uses the first 1M reserved space.

This moves about 700 lines, which shrinks the original mkfs/main.c by
about 1/3.
And while moving this ancient code to its own files, I also fixed
several small nits exposed by the checkpatch script.

This provides a clean environment for the later rootdir rework.

changelog:
v2:
  Add a new fix, to avoid mkfs returning 1. The rest doesn't change.
  Add reviewed-by tag.

Qu Wenruo (6):
  btrfs-progs: Avoid BUG_ON for chunk allocation when ENOSPC happens
  btrfs-progs: mkfs: Avoid positive return value from
    cleanup_temp_chunks
  btrfs-progs: mkfs: Fix overwritten return value for mkfs
  btrfs-progs: mkfs: Error out gracefully for --rootdir
  btrfs-progs: mkfs: Move image creation of rootdir to its own files
  btrfs-progs: mkfs: Move source dir size calculation to its own files

 Makefile       |   4 +-
 extent-tree.c  |   3 +-
 mkfs/main.c    | 713 +--
 mkfs/rootdir.c | 735 +
 mkfs/rootdir.h |  32 +++
 volumes.c      |  18 +-
 6 files changed, 795 insertions(+), 710 deletions(-)
 create mode 100644 mkfs/rootdir.c
 create mode 100644 mkfs/rootdir.h

-- 
2.14.2
Re: [PATCH 01/21] Btrfs: rework outstanding_extents
just a few quick things for the changelog:

On 09/29/2017 01:43 PM, Josef Bacik wrote:
> Right now we do a lot of weird hoops around outstanding_extents in order
> to keep the extent count consistent.  This is because we logically
> transfer the outstanding_extent count from the initial reservation
> through the set_delalloc_bits.  This makes it pretty difficult to get a
> handle on how and when we need to mess with outstanding_extents.
>
> Fix this by revamping the rules of how we deal with outstanding_extents.
> Now instead everybody that is holding on to a delalloc extent is
> required to increase the outstanding extents count for itself.  This
> means we'll have something like this
>
> btrfs_dealloc_reserve_metadata- outstanding_extents = 1

s/dealloc/delalloc/

> btrfs_set_delalloc - outstanding_extents = 2

should be btrfs_set_extent_delalloc?

> btrfs_release_delalloc_extents- outstanding_extents = 1
>
> for an initial file write.  Now take the append write where we extend an
> existing delalloc range but still under the maximum extent size
>
> btrfs_delalloc_reserve_metadata - outstanding_extents = 2
> btrfs_set_delalloc

btrfs_set_extent_delalloc?

> btrfs_set_bit_hook- outstanding_extents = 3
> btrfs_merge_bit_hook - outstanding_extents = 2

should be btrfs_clear_bit_hook? (or btrfs_merge_extent_hook?)

> btrfs_release_delalloc_extents- outstanding_extnets = 1

btrfs_delalloc_release_metadata?

>
> In order to make the ordered extent transition we of course must now
> make ordered extents carry their own outstanding_extent reservation, so
> for cow_file_range we end up with
>
> btrfs_add_ordered_extent - outstanding_extents = 2
> clear_extent_bit - outstanding_extents = 1
> btrfs_remove_ordered_extent - outstanding_extents = 0
>
> This makes all manipulations of outstanding_extents much more explicit.
> Every successful call to btrfs_reserve_delalloc_metadata _must_ now be

                           ^ btrfs_delalloc_reserve_metadata?

Thanks,
Ed
[PATCH v2] btrfs: Fix bug for misused dev_t when lookup in dev state hash table.
From: Gu JinXiang

Fix a bug introduced by commit 74d46992e0d9 ("block: replace bi_bdev
with a gendisk pointer and partitions index").

That change uses bio_dev(bio) to look up the dev state in
__btrfsic_submit_bio(). But when a dev_state is added to the hash
table, the dev_t of the block_device is used.
bio_dev(bio) returns the dev_t of part0, which differs from the dev_t
in the block_device (bd_dev). bd_dev in block_device represents the
exact partition:

block_device.bd_dev =
bio->bi_partno (same as block_device.bd_partno) + bio_dev(bio).

A dev_state is added to the hash table using the exact partition's
dev_t, so the lookup must use the exact partition's dev_t as well.

To reproduce this bug:
Run btrfs/001 in xfstests with MOUNT_OPTIONS="-o check_int".
There will then be a WARNING like the one below.

WARNING:
btrfs: attempt to write superblock which references block M @29523968
(sda7 /654400/2) which is never written!

changelog:
v1->v2: Explain that bio_dev(bio) differs from block_device (bd_dev).

Signed-off-by: Gu JinXiang
---
 fs/btrfs/check-integrity.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c
index fb07e3c22b9a..02f9eb83173f 100644
--- a/fs/btrfs/check-integrity.c
+++ b/fs/btrfs/check-integrity.c
@@ -2803,7 +2803,7 @@ static void __btrfsic_submit_bio(struct bio *bio)
 	mutex_lock(&btrfsic_mutex);
 	/* since btrfsic_submit_bio() is also called before
 	 * btrfsic_mount(), this might return NULL */
-	dev_state = btrfsic_dev_state_lookup(bio_dev(bio));
+	dev_state = btrfsic_dev_state_lookup(bio_dev(bio) + bio->bi_partno);
 	if (NULL != dev_state &&
 	    (bio_op(bio) == REQ_OP_WRITE) && bio_has_data(bio)) {
 		unsigned int i = 0;
-- 
2.13.5
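The key mismatch described above can be shown with a small userspace sketch. It is not kernel code: struct fake_bio and both helper functions are hypothetical, and the MKDEV() macro is reproduced locally on the assumption of the kernel-internal layout (major number above a 20-bit minor). The numbers 8:0 and 8:7 are the conventional device numbers for sda and sda7, matching the sda7 in the warning.

```c
#include <stdint.h>

/* Assumed kernel-internal dev_t layout: major above 20 minor bits */
#define MINORBITS 20
#define MKDEV(ma, mi) (((uint32_t)(ma) << MINORBITS) | (uint32_t)(mi))

/* Hypothetical model of the two bio fields that matter here */
struct fake_bio {
	uint32_t disk_devt;	/* what bio_dev() yields: dev_t of part0 */
	uint8_t partno;		/* bi_partno: partition index on that disk */
};

/* Key used for the hash lookup before the fix: the whole disk */
static uint32_t lookup_key_before_fix(const struct fake_bio *bio)
{
	return bio->disk_devt;
}

/* Key used after the fix: the exact partition, matching what was
 * stored when the dev_state was added */
static uint32_t lookup_key_after_fix(const struct fake_bio *bio)
{
	return bio->disk_devt + bio->partno;
}
```

For a bio aimed at sda7, the old key is 8:0 while the state was stored under 8:7, so the lookup misses; adding the partition number makes the two agree.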
Re: SOLVED - 32-bit kernel 4.13 bug - Mount failing - unable to find logical
On 10-17-2017 10:10 PM, Roman Mamedov wrote:
On Wed, 18 Oct 2017 09:24:01 +0800 Qu Wenruo wrote:
On 2017-10-18 04:43, Cameron Kelley wrote:

Hey btrfs gurus,

I have a 4 disk btrfs filesystem that has suddenly stopped mounting
after a recent reboot. The data is in an odd configuration due to
originally being in a 3 disk RAID1 before adding a 4th disk and running
a balance to convert to RAID10. There wasn't enough free space to
completely convert, so about half the data is still in RAID1 while the
other half is in RAID10. Both metadata and system are RAID10. It has
been in this configuration for 6 months or so now since adding the 4th
disk. It just holds archived media and hasn't had any data added or
modified in quite some time. I feel pretty stupid now for not
correcting that sooner though.

I have tried mounting with different mount options for recovery, ro,
degraded, etc. Log shows errors about "unable to find logical
3746892939264 length 4096"

When I do a btrfs check, it doesn't find any issues. Running
btrfs-find-root comes up with a message about a block that the
generation doesn't match. If I specify that block on the btrfs check, I
get transid verify failures.

I ran a dry run of a recovery of the entire filesystem which runs
through every file with no errors. I would just restore the data and
start fresh, but unfortunately I don't have the free space at the
moment for the ~4.5TB of data.

I also ran full smart self tests on all 4 disks with no errors.

root@nas2:~# uname -a
Linux nas2 4.13.7-041307-generic #201710141430 SMP Sat Oct 14 14:39:06 UTC 2017 i686 i686 i686 GNU/Linux

I don't think i686 kernel will cause any difference, but considering
most of us are using x86_64 to develop/test, maybe it will be a good
idea to upgrade to x86_64 kernel?

Indeed a problem with mounting on 32-bit in 4.13 has been reported
recently:
https://www.spinics.net/lists/linux-btrfs/msg69734.html
with the same error message.

I believe it's this patchset that is supposed to fix that.
https://www.spinics.net/lists/linux-btrfs/msg70001.html

@Cameron maybe you didn't just reboot, but also upgraded your kernel at
the same time? In any case, try a 4.9 series kernel, or a 64-bit
machine if you want to stay with 4.13.

Just for reference to anyone else having this issue, it is indeed a bug
in the 32-bit release of the 4.13 kernel. The x64 kernel had no issues
mounting it.

An interesting thing to note is that I still had all the exact same
mount issues and errors when I booted the latest PartedMagic live image
with kernel 4.12.9 in 32-bit mode. The same PartedMagic image in 64-bit
mode had no issues, which is how I confirmed your suspicions.

Now for the part where I feel more stupid than I have in a long time.

1. Apparently I had updated the kernel on this NAS without realizing
it, since I was doing updates on multiple appliances at once a little
while ago and just hadn't rebooted it since. When I ran into issues, I
updated the kernel to the latest without looking at the kernel I was
on, just to see if that solved it.

2. And here's the real kicker. The processor in this NAS (Pentium
E5200) is actually x64 capable. I must have skimmed information too
quickly when I first built this years ago and thought it wasn't x64
capable.

I have rebuilt the NAS and I'm now running a scrub just to make sure
the steps I was taking to recover didn't cause any issues. Anything
else you would recommend to make sure there aren't any other issues
that could have been caused by my tinkering?

Thank you very much for your help, as I was banging my head against a
wall. This NAS does so little that I tend to get careless with it.
Lesson learned and embarrassment felt. The only solace is that this
might help someone else who runs into this with kernel 4.13 on a 32-bit
system.
Re: Is it safe to use btrfs on top of different types of devices?
On 2017-10-18 07:59, Adam Borowski wrote:
> On Wed, Oct 18, 2017 at 07:30:55AM -0400, Austin S. Hemmelgarn wrote:
>> On 2017-10-17 16:21, Adam Borowski wrote:
>>> It's a single-device filesystem, thus disconnects are obviously
>>> fatal. But, they never caused even a single bit of damage (as scrub
>>> goes), thus proving btrfs handles this kind of disconnects well.
>>> Unlike times past, the kernel doesn't get confused thus no reboot is
>>> needed, merely an unmount, "service nbd-client restart", mount,
>>> restart the rebuild jobs.
>>
>> That's expected behavior though. _Single_ device BTRFS has nothing to
>> get out of sync most of the time, the only time there's any
>> possibility of an issue is when you die after writing the first copy
>> of a block that's in a dup profile chunk, but even that is not very
>> likely to cause problems (you'll just lose at most the last worth of
>> data).
>
> How come? In a DUP profile, the writes are: chunk 1, chunk2, barrier,
> superblock. The two prior writes may be arbitrarily reordered -- both
> between each other or even individual sectors inside the chunks, but
> unless the disk lies about barriers, there's no way to have any
> corruption, thus running scrub is not needed.

If the device dies after writing chunk 1 but before the barrier, you
end up needing scrub. How much of a failure window is present is
largely a function of how fast the device is, but there is a failure
window there.

> CoW is there to ensure there is _no_ failure window. The new content
> doesn't matter until there are live pointers to it -- from the
> filesystem's point of view we merely scribbled something on an unused
> part of the block device. Only after all pieces are in place (as
> ensured by the barrier), the superblock is updated with a reference to
> the new metadata->data chain.

Even with CoW there _IS_ a failure window. At a bare minimum, when
updating the root of the tree which has multiple copies, you have a
failure window. This window could admittedly be significantly reduced
for multi-device setups if we actually parallelized writes properly,
but it would still be there.

> Thus, no matter when a disconnect happens, after a crash you get
> either uncorrupted old version or uncorrupted new version.
>
> No scrub is ever needed for this reason on single device or on RAID1
> that didn't run degraded.

The whole conversation started regarding a RAID1 array that's
functionally guaranteed to run degraded on a regular basis.
Re: Is it safe to use btrfs on top of different types of devices?
On 2017-10-18 09:53, Peter Grandi wrote:
>> I forget sometimes that people insist on storing large volumes of
>> data on unreliable storage...
>
> Here obviously "unreliable" is used on the sense of storage that can
> work incorrectly, not in the sense of storage that can fail.

Um, in what world is a device randomly dropping off the bus (this is
the primary issue with USB for storage) not a failure? Yes, it's not a
catastrophic failure (for BTRFS at least), and it's transient (the
kernel will re-enumerate the device when it resets the bus), but that
doesn't change the fact that the service that is supposed to be
provided by the device failed.

To clarify more concretely, when I say 'unreliable' in reference to
computer technology (and for that matter, almost anything else), I mean
something that encounters non-trivial error states, either correctable
or uncorrectable, at a frequency above that which is deemed reasonable
for the designed function of the device.

>> In my opinion the unreliability of the storage is the exact reason
>> for wanting to use raid1. And I think any problem one encounters with
>> an unreliable disk can likely happen with more reliable ones as well,
>> only less frequently, so if I don't feel comfortable using raid1 on
>> an unreliable medium then I wouldn't trust it on a more reliable one
>> either.
>
> Oh please, please a bit less silliness would be welcome here. In a
> previous comment on this tedious thread I had written:
>
> > If the block device abstraction layer and lower layers work
> > correctly, Btrfs does not have problems of that sort when
> > adding new devices; conversely if the block device layer and
> > lower layers do not work correctly, no mainline Linux
> > filesystem I know can cope with that.
>
> > Note: "work correctly" does not mean "work error-free".
>
> The last line is very important and I added it advisedly.
>
> You seem to be using "unreliable" in two completely different
> meanings, without realizing it, as both "working incorrectly" and
> "reporting a failure". They are really very different.

And you seem to be using the term 'failure' to only mean 'catastrophic
failure'. Strictly speaking, even that is 'working incorrectly', albeit
in a much more specific and permanent manner than just returning
errors.

Even looking at things that way though, Zoltan's assessment that
reliability is essentially a measure of error rate is correct. Internal
SATA devices absolutely can randomly drop off the bus just like many
USB storage devices do, but it almost never happens (it's a statistical
impossibility if there are no hardware or firmware issues), so they are
more reliable in that respect.

> The "working incorrectly" general case is the so called "bizantine
> generals problem" and (depending on assumptions) it is insoluble.
>
> Btrfs has some limited ability to detect (and sometimes recover from)
> "working incorrectly" storage layers, but don't expect too much from
> that.
Unmountable fs. No root for superblock generation
I am unable to mount one of my filesystems. The superblock thinks the latest generation is 2220927 but I can't seem to find a root with that number. I can find 2220926 and 2220928 but not 2220927. Is there anything that I can do to recover this FS?

# btrfs check /dev/Cached/Backups
checksum verify failed on 159057884594176 found 15284E33 wanted C8C5B54E
checksum verify failed on 159057884594176 found 15284E33 wanted C8C5B54E
checksum verify failed on 159057884594176 found 472037C9 wanted 9ACDCCB4
checksum verify failed on 159057884594176 found 472037C9 wanted 9ACDCCB4
Csum didn't match
Couldn't setup extent tree
Couldn't open file system

# btrfs-find-root -g 2220927 /dev/Cached/Backups
Couldn't setup extent tree
Couldn't setup device tree
Superblock thinks the generation is 2220927
Superblock thinks the level is 2
Found tree root at 159057884577792 gen 2220927 level 2
Well block 101489031790592(gen: 2220928 level: 2) seems good, but generation/level doesn't match, want gen: 2220927 level: 2

# btrfs check --tree-root 159057884577792 /dev/Cached/Backups
checksum verify failed on 159057884594176 found 15284E33 wanted C8C5B54E
checksum verify failed on 159057884594176 found 15284E33 wanted C8C5B54E
checksum verify failed on 159057884594176 found 472037C9 wanted 9ACDCCB4
checksum verify failed on 159057884594176 found 472037C9 wanted 9ACDCCB4
Csum didn't match
Couldn't setup extent tree
Couldn't open file system

# btrfs check --tree-root 101489031790592 /dev/Cached/Backups
parent transid verify failed on 101489031790592 wanted 2220927 found 2220928
parent transid verify failed on 101489031790592 wanted 2220927 found 2220928
parent transid verify failed on 101489031790592 wanted 2220927 found 2220928
parent transid verify failed on 101489031790592 wanted 2220927 found 2220928
Ignoring transid failure
parent transid verify failed on 159057595138048 wanted 2220927 found 2220920
parent transid verify failed on 159057595138048 wanted 2220927 found 2220920
parent transid verify failed on 159057595138048 wanted 2220927 found 2220920
parent transid verify failed on 159057595138048 wanted 2220927 found 2220920
Ignoring transid failure
parent transid verify failed on 158652658122752 wanted 2220927 found 2220911
parent transid verify failed on 158652658122752 wanted 2220927 found 2220911
parent transid verify failed on 158652658122752 wanted 2220927 found 2220911
parent transid verify failed on 158652658122752 wanted 2220927 found 2220911
Ignoring transid failure
Checking filesystem on /dev/Cached/Backups
UUID: 1b213dfd-6486-47d8-8459-bc5825882023
checking extents
parent transid verify failed on 116329711550464 wanted 2220928 found 2220921
parent transid verify failed on 116329711550464 wanted 2220928 found 2220921
parent transid verify failed on 116329711550464 wanted 2220928 found 2220921
parent transid verify failed on 116329711550464 wanted 2220928 found 2220921
Ignoring transid failure
parent transid verify failed on 116325928206336 wanted 2220928 found 2220921
parent transid verify failed on 116325928206336 wanted 2220928 found 2220921
parent transid verify failed on 116325928206336 wanted 2220928 found 2220921
parent transid verify failed on 116325928206336 wanted 2220928 found 2220921
Ignoring transid failure
parent transid verify failed on 116329892970496 wanted 2220928 found 2220921
parent transid verify failed on 116329892970496 wanted 2220928 found 2220921
parent transid verify failed on 116329892970496 wanted 2220928 found 2220921
parent transid verify failed on 116329892970496 wanted 2220928 found 2220921
Ignoring transid failure
parent transid verify failed on 116325929943040 wanted 2220928 found 2220921
parent transid verify failed on 116325929943040 wanted 2220928 found 2220921
parent transid verify failed on 116325929943040 wanted 2220928 found 2220921
parent transid verify failed on 116325929943040 wanted 2220928 found 2220921
Ignoring transid failure
parent transid verify failed on 116325932679168 wanted 2220928 found 2220921
parent transid verify failed on 116325932679168 wanted 2220928 found 2220921
parent transid verify failed on 116325932679168 wanted 2220928 found 2220921
parent transid verify failed on 116325932679168 wanted 2220928 found 2220921
Ignoring transid failure
parent transid verify failed on 116010673373184 wanted 2220928 found 2220921
parent transid verify failed on 116010673373184 wanted 2220928 found 2220921
parent transid verify failed on 116010673373184 wanted 2220928 found 2220921
parent transid verify failed on 116010673373184 wanted 2220928 found 2220921
Ignoring transid failure
parent transid verify failed on 116329479405568 wanted 2220928 found 2220921
parent transid verify failed on 116329479405568 wanted 2220928 found 2220921
parent transid verify failed on 116329479405568 wanted 2220928 found 2220921
parent transid verify failed on 116329479405568 wanted 2220928 found 2220921
Ignoring transid failure
parent transid verify failed on 116480660914176 wanted
Re: Is it safe to use btrfs on top of different types of devices?
> [ ... ] After all, btrfs would just have to discard one copy
> of each chunk. [ ... ] One more thing that is not clear to me
> is the replication profile of a volume. I see that balance can
> convert chunks between profiles, for example from single to
> raid1, but I don't see how the default profile for new chunks
> can be set or queried. [ ... ]

My impression is that the design rationale and aims for Btrfs two-level allocation (in other fields known as a "BIBOP" scheme) were not fully shared among Btrfs developers, that perhaps it could have benefited from some further reflection on its implications, and that its behaviour may have evolved "opportunistically", maybe without much worrying as to conceptual integrity. (I am trying to be euphemistic)

So while I am happy with the "Rodeh" core of Btrfs (COW, subvolumes, checksums), the RAID-profile functionality and especially the multi-device layer is not something I find particularly to my taste. (I am trying to be euphemistic)

So when it comes to allocation, RAID-profiles, multiple devices, I usually expect some random "surprising functionality". (I am trying to be euphemistic)

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Is it safe to use btrfs on top of different types of devices?
>> I forget sometimes that people insist on storing large
>> volumes of data on unreliable storage...

Here obviously "unreliable" is used in the sense of storage that can work incorrectly, not in the sense of storage that can fail.

> In my opinion the unreliability of the storage is the exact
> reason for wanting to use raid1. And I think any problem one
> encounters with an unreliable disk can likely happen with more
> reliable ones as well, only less frequently, so if I don't
> feel comfortable using raid1 on an unreliable medium then I
> wouldn't trust it on a more reliable one either.

Oh please, please a bit less silliness would be welcome here. In a previous comment on this tedious thread I had written:

> If the block device abstraction layer and lower layers work
> correctly, Btrfs does not have problems of that sort when
> adding new devices; conversely if the block device layer and
> lower layers do not work correctly, no mainline Linux
> filesystem I know can cope with that.

> Note: "work correctly" does not mean "work error-free".

The last line is very important and I added it advisedly.

You seem to be using "unreliable" in two completely different meanings, without realizing it, as both "working incorrectly" and "reporting a failure". They are really very different.

The "working incorrectly" general case is the so-called "Byzantine generals problem" and (depending on assumptions) it is insoluble.

Btrfs has some limited ability to detect (and sometimes recover from) "working incorrectly" storage layers, but don't expect too much from that.
Re: [PATCH] btrfs-progs: test: add new cli-test for subvol get/set-default
On Wed, Oct 18, 2017 at 11:00:43AM +0900, Misono, Tomohiro wrote:
> Add new test to check functionality of subvol get/set-default.
>
> Signed-off-by: Tomohiro Misono

Thanks, applied with the following diff to fix style and failures when the test is not run as root initially:

- no command shortcuts
- the subvolume id for set-default should be read from rootid
- add missing SUDO_HELPER
- prepare_test_dev without the device size (unless justified)

--- a/tests/cli-tests/008-subvolume-get-set-default/test.sh
+++ b/tests/cli-tests/008-subvolume-get-set-default/test.sh
@@ -3,7 +3,7 @@
 check_default_id()
 {
-	id=$(run_check_stdout $SUDO_HELPER "$TOP/btrfs" sub get-def .) \
+	id=$(run_check_stdout $SUDO_HELPER "$TOP/btrfs" subvolume get-default .) \
 		|| { echo "$id"; exit 1; }
 	if $(echo "$id" | grep -vq "ID $1"); then
 		_fail "subvolume get-default: default id is not $1, but $id"
@@ -16,7 +16,7 @@ check_prereq mkfs.btrfs
 check_prereq btrfs
 setup_root_helper
-prepare_test_dev 2g
+prepare_test_dev
 run_check "$TOP/mkfs.btrfs" -f "$TEST_DEV"
 run_check_mount_test_dev
@@ -25,21 +25,23 @@ cd "$TEST_MNT"
 check_default_id 5
 # check "subvol set-default "
-run_check "$TOP/btrfs" subvol create sub
-run_check $SUDO_HELPER "$TOP/btrfs" subvol set-default 257 .
-check_default_id 257
+run_check $SUDO_HELPER "$TOP/btrfs" subvolume create sub
+id=$(run_check_stdout "$TOP/btrfs" inspect-internal rootid sub)
+run_check $SUDO_HELPER "$TOP/btrfs" subvolume set-default "$id" .
+check_default_id "$id"
 run_mustfail "set-default to non existent id" \
-	$SUDO_HELPER "$TOP/btrfs" subvol set-default 100 .
+	$SUDO_HELPER "$TOP/btrfs" subvolume set-default 100 .
 # check "subvol set-default "
-run_check "$TOP/btrfs" subvol create sub2
-run_check $SUDO_HELPER "$TOP/btrfs" subvol set-default ./sub2
-check_default_id 258
+run_check $SUDO_HELPER "$TOP/btrfs" subvolume create sub2
+id=$(run_check_stdout "$TOP/btrfs" inspect-internal rootid sub2)
+run_check $SUDO_HELPER "$TOP/btrfs" subvolume set-default ./sub2
+check_default_id "$id"
-run_check mkdir sub2/dir
+run_check $SUDO_HELPER mkdir sub2/dir
 run_mustfail "set-default to normal directory" \
-	$SUDO_HELPER "$TOP/btrfs" subvol set-default ./sub2/dir
+	$SUDO_HELPER "$TOP/btrfs" subvolume set-default ./sub2/dir
 cd ..
 run_check_umount_test_dev
Re: [PATCH] Btrfs: ref-verify: Fix NULL vs IS_ERR() check in walk_down_tree()
On Wed, Oct 18, 2017 at 10:36:35AM +0300, Dan Carpenter wrote:
> read_tree_block() returns error pointers, and never NULL and so I have
> updated the error handling.
>
> Fixes: 74739121b4c7 ("Btrfs: add a extent ref verify tool")
> Signed-off-by: Dan Carpenter

Thanks, I've folded the fix into the original commit and added credits.
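For readers unfamiliar with the convention behind this fix, here is a minimal userspace sketch of why a NULL check misses an error pointer. The three helpers imitate the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() and are re-implemented here purely for illustration; read_block_demo() and the two check_* functions are hypothetical names, not the code touched by the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace imitation of the kernel helpers: errno values are encoded
 * in the top MAX_ERRNO addresses of the pointer range. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int demo_block = 42;

/* Hypothetical reader: returns a valid pointer or ERR_PTR(-EIO), never NULL. */
static void *read_block_demo(int fail)
{
	if (fail)
		return ERR_PTR(-EIO);
	return &demo_block;
}

/* Wrong check: an error pointer is non-NULL, so the failure goes unnoticed. */
static int check_with_null(int fail)
{
	void *eb = read_block_demo(fail);

	if (!eb)
		return -EIO;
	return 0;
}

/* Right check: IS_ERR() recognises the encoded errno. */
static int check_with_is_err(int fail)
{
	void *eb = read_block_demo(fail);

	if (IS_ERR(eb))
		return (int)PTR_ERR(eb);
	return 0;
}
```

With a failing read, check_with_null() reports success while check_with_is_err() returns -EIO, which is the class of bug the folded-in fix addresses.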
Re: [PATCH 0/5] Rootdir refactor and small bug fixes
On 18.10.2017 11:00, Qu Wenruo wrote:
> First 3 patches are small bug fixes which can be applied even we don't
> touch the functionality of --rootdir.
>
> The last two patches will refactor --rootdir related functions (mainly
> size_sourcedir and make_image) to mkfs/rootdir.[ch].
> And rename them to btrfs_mkfs_size_dir() and btrfs_mkfs_fill_dir()
> respectively.
> Functionality is not changed at all, so it will still shrink the device
> or using the first 1M reserved space.
>
> This moved about 700 lines, which reduced about 1/3 of original mkfs.c.
>
> And by moving this ancient code to its own files, I also fixed several
> small nits exposed by checkpatch script.
>
> This provides a clean environment for later rootdir rework.
>
> Qu Wenruo (5):
>   btrfs-progs: Avoid BUG_ON for chunk allocation when ENOSPC happens
>   btrfs-progs: mkfs: Fix overwritten return value for mkfs
>   btrfs-progs: mkfs: Error out gracefully for --rootdir
>   btrfs-progs: mkfs: Move image creation of rootdir to its own files
>   btrfs-progs: mkfs: Move source dir size calculation to its own files

Reviewed-by: Nikolay Borisov

>
>  Makefile       |   4 +-
>  extent-tree.c  |   3 +-
>  mkfs/main.c    | 710 +--
>  mkfs/rootdir.c | 735 +
>  mkfs/rootdir.h |  32 +++
>  volumes.c      |  18 +-
>  6 files changed, 792 insertions(+), 710 deletions(-)
>  create mode 100644 mkfs/rootdir.c
>  create mode 100644 mkfs/rootdir.h
>
Re: Is it safe to use btrfs on top of different types of devices?
On Wed, Oct 18, 2017 at 07:30:55AM -0400, Austin S. Hemmelgarn wrote:
> On 2017-10-17 16:21, Adam Borowski wrote:
> > > > It's a single-device filesystem, thus disconnects are obviously
> > > > fatal. But, they never caused even a single bit of damage (as
> > > > scrub goes), thus proving btrfs handles this kind of disconnects
> > > > well. Unlike times past, the kernel doesn't get confused thus no
> > > > reboot is needed, merely an unmount, "service nbd-client restart",
> > > > mount, restart the rebuild jobs.
> > > That's expected behavior though. _Single_ device BTRFS has nothing
> > > to get out of sync most of the time, the only time there's any
> > > possibility of an issue is when you die after writing the first
> > > copy of a block that's in a dup profile chunk, but even that is not
> > > very likely to cause problems (you'll just lose at most the last
> > > worth of data).
> >
> > How come? In a DUP profile, the writes are: chunk 1, chunk2, barrier,
> > superblock. The two prior writes may be arbitrarily reordered -- both
> > between each other or even individual sectors inside the chunks, but
> > unless the disk lies about barriers, there's no way to have any
> > corruption, thus running scrub is not needed.
> If the device dies after writing chunk 1 but before the barrier, you end up
> needing scrub. How much of a failure window is present is largely a
> function of how fast the device is, but there is a failure window there.

CoW is there to ensure there is _no_ failure window. The new content doesn't matter until there are live pointers to it -- from the filesystem's point of view we merely scribbled something on an unused part of the block device. Only after all pieces are in place (as ensured by the barrier), the superblock is updated with a reference to the new metadata->data chain.

Thus, no matter when a disconnect happens, after a crash you get either uncorrupted old version or uncorrupted new version.

No scrub is ever needed for this reason on single device or on RAID1 that didn't run degraded.

Meow!
--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁ Imagine there are bandits in your house, your kid is bleeding out,
⢿⡄⠘⠷⠚⠋⠀ the house is on fire, and seven big-ass trumpets are playing in the
⠈⠳⣄ sky. Your cat demands food. The priority should be obvious...
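The write ordering argued for above (new blocks first, barrier, then the superblock) can be modelled with a toy sketch. This is not btrfs code: struct toy_disk, cow_commit() and the other names are hypothetical, the "disk" has just two block slots and an in-memory superblock index, and the model assumes the device honours barriers. It only illustrates why a crash at any step leaves the live block either fully old or fully new.

```c
#include <string.h>

#define BLOCK_SIZE 8

/* Toy device: two block slots (one live, one free) and a superblock
 * that records which slot is live. */
struct toy_disk {
	char blocks[2][BLOCK_SIZE];
	int super;
};

/*
 * Commit `data` copy-on-write style, stopping after `crash_after` steps
 * to simulate a disconnect:
 *   step 1: write the first half of the new block into the free slot
 *   step 2: write the second half
 *   step 3: barrier has been reached, flip the superblock to the new slot
 */
static void cow_commit(struct toy_disk *d, const char *data, int crash_after)
{
	int free_slot = 1 - d->super;

	if (crash_after < 1)
		return;
	memcpy(d->blocks[free_slot], data, BLOCK_SIZE / 2);
	if (crash_after < 2)
		return;
	memcpy(d->blocks[free_slot] + BLOCK_SIZE / 2,
	       data + BLOCK_SIZE / 2, BLOCK_SIZE / 2);
	if (crash_after < 3)
		return;
	d->super = free_slot;
}

/* What a reader sees after remount: only the block the superblock names. */
static const char *live_block(const struct toy_disk *d)
{
	return d->blocks[d->super];
}

/* Returns 1 if, for every crash point, the live block is all-old or all-new. */
static int cow_is_crash_safe(void)
{
	const char old_data[BLOCK_SIZE] = "AAAAAAA";
	const char new_data[BLOCK_SIZE] = "BBBBBBB";
	int crash;

	for (crash = 0; crash <= 3; crash++) {
		struct toy_disk d;

		memset(&d, 0, sizeof(d));
		memcpy(d.blocks[0], old_data, BLOCK_SIZE);
		d.super = 0;
		cow_commit(&d, new_data, crash);
		if (memcmp(live_block(&d), old_data, BLOCK_SIZE) != 0 &&
		    memcmp(live_block(&d), new_data, BLOCK_SIZE) != 0)
			return 0;
	}
	return 1;
}
```

A crash after step 1 leaves a half-written block in the free slot, but nothing points at it, so the reader still sees the old contents. The model deliberately says nothing about a second DUP copy going stale, which is the case disputed in this thread.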
Re: Best strategie to remove devices from pool
On 2017-10-17 13:58, Cloud Admin wrote:
> Hi,
> I want to remove two devices from a BTRFS RAID 1 pool. There should be
> enough free space to do it, but what is the best strategy? Remove both
> devices in one call 'btrfs dev rem /dev/sda1 /dev/sdb1' (for example),
> or would it be better in two separate calls? What is faster? Are there
> other constraints to think about?

Ideally, delete them all with a single operation. Internally the delete command uses some numeric trickery together with a balance operation to migrate the data off of the devices being removed. If you do them one at a time, you will end up moving at least some data twice, and thus wasting time.

Other than that, there's not much to worry about, though keep in mind that deleting devices from an array can take a long time, especially if the array is mostly full.
Re: Is it safe to use btrfs on top of different types of devices?
On 2017-10-17 16:21, Adam Borowski wrote:
> On Tue, Oct 17, 2017 at 03:19:09PM -0400, Austin S. Hemmelgarn wrote:
>> On 2017-10-17 13:06, Adam Borowski wrote:
>>> The thing is, reliability guarantees required vary WILDLY depending
>>> on your particular use cases. On one hand, there's "even an
>>> one-minute downtime would cost us mucho $$$s, can't have that!" --
>>> on the other, "it died? Okay, we got backups, lemme restore it after
>>> the weekend".
>> Yes, but if you are in the second case, you arguably don't need
>> replication, and would be better served by improving the reliability
>> of your underlying storage stack than trying to work around its
>> problems. Even in that case, your overall reliability is still
>> constrained by the least reliable component (in more idiomatic terms
>> 'a chain is only as strong as its weakest link').
>
> MD can handle this case well, there's no reason btrfs shouldn't do that
> too. A RAID is not akin to serially connected chain, it's a parallel
> connected chain: while pieces of the broken second chain hanging down
> from the first don't make it strictly more resilient than having just a
> single chain, in general case it _is_ more reliable even if the other
> chain is weaker.

My chain analogy is supposed to be relating to the storage stack as a whole, RAID is a single link in the chain, with whatever filesystem above it, and whatever storage drivers and hardware below.

> Don't we have a patchset that deals with marking a device as failed at
> runtime floating on the mailing list? I did not look at those patches
> yet, but they are a step in this direction.

There were some disagreements on whether the device should be released (that is, the node closed) immediately when we know it's failed, or should be held open until remount.

>> Using replication with a reliable device and a questionable device is
>> essentially the same as trying to add redundancy to a machine by
>> adding an extra linkage that doesn't always work and can get in the
>> way of the main linkage it's supposed to be protecting from failure.
>> Yes, it will work most of the time, but the system is going to be less
>> reliable than it is without the 'redundancy'.
>
> That's the current state of btrfs, but the design is sound, and reaching
> more than parity with MD is a matter of implementation.

Indeed, however MD is still not perfectly reliable in this situation (though they are exponentially better than BTRFS at the moment).

>>> Thus, I switched the machine to NBD (albeit it sucks on 100Mbit eth).
>>> Alas, the network driver allocates memory with GFP_NOIO which causes
>>> NBD disconnects (somehow, this doesn't ever happen on swap where
>>> GFP_NOIO would be obvious but on regular filesystem where throwing
>>> out userspace memory is safe). The disconnects happen around once per
>>> week.
>> Somewhat off-topic, but you might try looking at ATAoE as an
>> alternative, it's more reliable in my experience (if you've got a
>> reliable network), gives better performance (there's less protocol
>> overhead than NBD, and it runs on top of layer 2 instead of layer 4)
>
> I've tested it -- not on the Odroid-U2 but on Pine64 (fully working
> GbE). NBD delivers 108MB/sec in a linear transfer, ATAoE is lucky to
> break 40MB/sec, same target (Qnap-253a, spinning rust), both in default
> configuration without further tuning. NBD is over IPv6 for that extra
> 20 bytes per packet overhead.

Interesting, I've seen the exact opposite in terms of performance.

> Also, NBD can be encrypted or arbitrarily routed.

Yes, though if you're on a local network, neither should matter :).

>>> It's a single-device filesystem, thus disconnects are obviously
>>> fatal. But, they never caused even a single bit of damage (as scrub
>>> goes), thus proving btrfs handles this kind of disconnects well.
>>> Unlike times past, the kernel doesn't get confused thus no reboot is
>>> needed, merely an unmount, "service nbd-client restart", mount,
>>> restart the rebuild jobs.
>> That's expected behavior though. _Single_ device BTRFS has nothing to
>> get out of sync most of the time, the only time there's any
>> possibility of an issue is when you die after writing the first copy
>> of a block that's in a dup profile chunk, but even that is not very
>> likely to cause problems (you'll just lose at most the last worth of
>> data).
>
> How come? In a DUP profile, the writes are: chunk 1, chunk2, barrier,
> superblock. The two prior writes may be arbitrarily reordered -- both
> between each other or even individual sectors inside the chunks, but
> unless the disk lies about barriers, there's no way to have any
> corruption, thus running scrub is not needed.

If the device dies after writing chunk 1 but before the barrier, you end up needing scrub. How much of a failure window is present is largely a function of how fast the device is, but there is a failure window there.

>> The moment you add another device though, that simplicity goes out the
>> window.
>
> RAID1 doesn't seem less simple to me: if the new superblock has been
> successfully written on at least one disk, barriers imply
[PATCH 0/5] Rootdir refactor and small bug fixes
First 3 patches are small bug fixes which can be applied even we don't touch the functionality of --rootdir.

The last two patches will refactor --rootdir related functions (mainly size_sourcedir and make_image) to mkfs/rootdir.[ch]. And rename them to btrfs_mkfs_size_dir() and btrfs_mkfs_fill_dir() respectively. Functionality is not changed at all, so it will still shrink the device or using the first 1M reserved space.

This moved about 700 lines, which reduced about 1/3 of original mkfs.c.

And by moving this ancient code to its own files, I also fixed several small nits exposed by checkpatch script.

This provides a clean environment for later rootdir rework.

Qu Wenruo (5):
  btrfs-progs: Avoid BUG_ON for chunk allocation when ENOSPC happens
  btrfs-progs: mkfs: Fix overwritten return value for mkfs
  btrfs-progs: mkfs: Error out gracefully for --rootdir
  btrfs-progs: mkfs: Move image creation of rootdir to its own files
  btrfs-progs: mkfs: Move source dir size calculation to its own files

 Makefile       |   4 +-
 extent-tree.c  |   3 +-
 mkfs/main.c    | 710 +--
 mkfs/rootdir.c | 735 +
 mkfs/rootdir.h |  32 +++
 volumes.c      |  18 +-
 6 files changed, 792 insertions(+), 710 deletions(-)
 create mode 100644 mkfs/rootdir.c
 create mode 100644 mkfs/rootdir.h

--
2.14.2
[PATCH 1/5] btrfs-progs: Avoid BUG_ON for chunk allocation when ENOSPC happens
When passing directory larger than block device using --rootdir parameter, we get the following backtrace:

--
extent-tree.c:2693: btrfs_reserve_extent: BUG_ON `ret` triggered, value -28
./mkfs.btrfs(+0x1a05d)[0x557939e6b05d]
./mkfs.btrfs(btrfs_reserve_extent+0xb5a)[0x557939e710c8]
./mkfs.btrfs(+0xb0b6)[0x557939e5c0b6]
./mkfs.btrfs(main+0x15d5)[0x557939e5de04]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f83b101af6a]
./mkfs.btrfs(_start+0x2a)[0x557939e5af5a]
--

Nothing special, just BUG_ON() abusing from ancient code. Fix them by using correct return.

Signed-off-by: Qu Wenruo
---
 extent-tree.c |  3 ++-
 volumes.c     | 18 ++
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/extent-tree.c b/extent-tree.c
index 525a237e5923..055582c36da6 100644
--- a/extent-tree.c
+++ b/extent-tree.c
@@ -2690,7 +2690,8 @@ int btrfs_reserve_extent(struct btrfs_trans_handle *trans,
 			       search_start, search_end, hint_byte, ins,
 			       trans->alloc_exclude_start,
 			       trans->alloc_exclude_nr, data);
-	BUG_ON(ret);
+	if (ret < 0)
+		return ret;
 	clear_extent_dirty(>free_space_cache,
 			   ins->objectid, ins->objectid + ins->offset - 1);
 	return ret;
diff --git a/volumes.c b/volumes.c
index 2209e5a9100b..e1ee27d5f3ce 100644
--- a/volumes.c
+++ b/volumes.c
@@ -1032,11 +1032,13 @@ again:
 				info->chunk_root->root_key.objectid,
 				BTRFS_FIRST_CHUNK_TREE_OBJECTID, key.offset,
 				calc_size, &dev_offset, 0);
-		BUG_ON(ret);
+		if (ret < 0)
+			goto out_chunk_map;

 		device->bytes_used += calc_size;
 		ret = btrfs_update_device(trans, device);
-		BUG_ON(ret);
+		if (ret < 0)
+			goto out_chunk_map;

 		map->stripes[index].dev = device;
 		map->stripes[index].physical = dev_offset;
@@ -1075,16 +1077,24 @@ again:
 	map->ce.size = *num_bytes;

 	ret = insert_cache_extent(&info->mapping_tree.cache_tree, &map->ce);
-	BUG_ON(ret);
+	if (ret < 0)
+		goto out_chunk_map;

 	if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
 		ret = btrfs_add_system_chunk(info, &key,
 				    chunk, btrfs_chunk_item_size(num_stripes));
-		BUG_ON(ret);
+		if (ret < 0)
+			goto out_chunk;
 	}

 	kfree(chunk);
 	return ret;
+
+out_chunk_map:
+	kfree(map);
+out_chunk:
+	kfree(chunk);
+	return ret;
 }

 /*
--
2.14.2
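The pattern this patch introduces in volumes.c, replacing BUG_ON(ret) with an error return that unwinds through cleanup labels, can be sketched in isolation. Everything below is hypothetical demo code (demo_step(), demo_alloc(), demo_alloc_chunk() and the live_allocs counter are made up); only the label names mirror the patch. One deliberate simplification: on success the demo also frees both objects, whereas the real function keeps `map` in the mapping tree.

```c
#include <errno.h>
#include <stdlib.h>

/* Counts outstanding allocations so leaks are observable in the demo. */
static int live_allocs;

static void *demo_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

/* Hypothetical fallible step, standing in for calls such as
 * btrfs_update_device(). */
static int demo_step(int should_fail)
{
	return should_fail ? -ENOSPC : 0;
}

/*
 * Allocate two objects, run two fallible steps, and unwind through
 * labels on failure instead of aborting the whole program.
 */
static int demo_alloc_chunk(int fail_first, int fail_second)
{
	void *chunk;
	void *map;
	int ret;

	chunk = demo_alloc(64);
	if (!chunk)
		return -ENOMEM;

	map = demo_alloc(64);
	if (!map) {
		ret = -ENOMEM;
		goto out_chunk;
	}

	ret = demo_step(fail_first);
	if (ret < 0)
		goto out_chunk_map;

	ret = demo_step(fail_second);
	if (ret < 0)
		goto out_chunk_map;

	/* Success falls through to the same cleanup with ret == 0. */
out_chunk_map:
	demo_free(map);
out_chunk:
	demo_free(chunk);
	return ret;
}
```

Ordering the labels in reverse order of allocation means each failure point jumps to exactly the cleanup it needs, and the caller sees -ENOSPC rather than a backtrace.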
[PATCH 3/5] btrfs-progs: mkfs: Error out gracefully for --rootdir
--rootdir option will start a transaction to fill the fs. However, if something goes wrong, from ENOSPC to lack of permission, we won't commit the transaction, and the uncommitted transaction triggers a BUG_ON:

--
extent buffer leak: start 29392896 len 16384
extent_io.c:579: free_extent_buffer: BUG_ON `eb->flags & EXTENT_DIRTY` triggered, value 1
--

The root fix is to introduce btrfs_abort_transaction() in btrfs-progs, however in this particular case we can work around it by force committing the transaction.

During mkfs the magic of btrfs is set to an invalid one, so without setting fs_info->finalize_on_close the fs can never be mounted. So even if we force commit a wrong transaction, we won't make things worse.

Signed-off-by: Qu Wenruo
---
 mkfs/main.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/mkfs/main.c b/mkfs/main.c
index 5817f114c1a1..8c332aa1e12a 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1073,6 +1073,19 @@ static int make_image(const char *source_dir, struct btrfs_root *root)
 	printf("Making image is completed.\n");
 	return 0;
 fail:
+	/*
+	 * Since we don't have btrfs_abort_transaction() yet, uncommitted trans
+	 * will trigger a BUG_ON().
+	 *
+	 * However before mkfs is fully finished, the magic number is invalid,
+	 * so even we commit transaction here, the fs still can't be mounted.
+	 *
+	 * To do a graceful error out, here we commit transaction as a
+	 * workaround.
+	 * Since we have already hit some problem, the return value doesn't
+	 * matter now.
+	 */
+	btrfs_commit_transaction(trans, root);
 	while (!list_empty(&dir_head.list)) {
 		dir_entry = list_entry(dir_head.list.next,
 				       struct directory_name_entry, list);
--
2.14.2
[PATCH 2/5] btrfs-progs: mkfs: Fix overwritten return value for mkfs
For mkfs failure, especially --rootdir errors like EPERM/ENOSPC, the out branch will overwrite return value, causing wrong status code.

Signed-off-by: Qu Wenruo
---
 mkfs/main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mkfs/main.c b/mkfs/main.c
index 1b4cabc1ef90..5817f114c1a1 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1423,6 +1423,7 @@ int main(int argc, char **argv)
 	int zero_end = 1;
 	int fd = -1;
 	int ret;
+	int close_ret;
 	int i;
 	int mixed = 0;
 	int nodesize_forced = 0;
@@ -1938,9 +1939,9 @@ raid_groups:
 	 */
 	fs_info->finalize_on_close = 1;
 out:
-	ret = close_ctree(root);
+	close_ret = close_ctree(root);

-	if (!ret) {
+	if (!close_ret) {
 		optind = saved_optind;
 		dev_cnt = argc - optind;
 		while (dev_cnt-- > 0) {
--
2.14.2
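The bug fixed here is easy to reproduce in miniature. In the sketch below, demo_close() is a hypothetical stand-in for close_ctree() that always succeeds, and the two run_* functions are made-up names contrasting the old and new shapes. Returning the first error and falling back to the close result is an illustrative choice of this sketch; the hunk above does not show what main() finally returns.

```c
#include <errno.h>

/* Hypothetical cleanup that succeeds, standing in for close_ctree(). */
static int demo_close(void)
{
	return 0;
}

/* Old shape: the cleanup result overwrites the earlier error. */
static int run_overwriting(int work_ret)
{
	int ret = work_ret;

	ret = demo_close();
	return ret;
}

/* New shape: the cleanup result lives in its own variable. */
static int run_preserving(int work_ret)
{
	int ret = work_ret;
	int close_ret;

	close_ret = demo_close();
	if (!close_ret) {
		/* follow-up work that only depends on the close succeeding */
	}
	/* Assumption of this sketch: report the first error, else the close result. */
	return ret ? ret : close_ret;
}
```

With an earlier -ENOSPC, run_overwriting() reports success because the close succeeded, while run_preserving() still reports -ENOSPC.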
[PATCH 4/5] btrfs-progs: mkfs: Move image creation of rootdir to its own files
In fact, --rootdir option is getting more and more independent from normal mkfs code. So move image creation function, make_image() and its related code to mkfs/rootdir.[ch], and rename the function to btrfs_mkfs_fill_dir(). Signed-off-by: Qu Wenruo--- Makefile | 4 +- mkfs/main.c| 652 +-- mkfs/rootdir.c | 672 + mkfs/rootdir.h | 30 +++ 4 files changed, 706 insertions(+), 652 deletions(-) create mode 100644 mkfs/rootdir.c create mode 100644 mkfs/rootdir.h diff --git a/Makefile b/Makefile index d0657aaea0f5..12747547766f 100644 --- a/Makefile +++ b/Makefile @@ -113,7 +113,7 @@ cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \ cmds-restore.o cmds-rescue.o chunk-recover.o super-recover.o \ cmds-property.o cmds-fi-usage.o cmds-inspect-dump-tree.o \ cmds-inspect-dump-super.o cmds-inspect-tree-stats.o cmds-fi-du.o \ - mkfs/common.o + mkfs/common.o mkfs/rootdir.o libbtrfs_objects = send-stream.o send-utils.o kernel-lib/rbtree.o btrfs-list.o \ kernel-lib/crc32c.o messages.o \ uuid-tree.o utils-lib.o rbtree-utils.o @@ -123,7 +123,7 @@ libbtrfs_headers = send-stream.h send-utils.h send.h kernel-lib/rbtree.h btrfs-l extent-cache.h extent_io.h ioctl.h ctree.h btrfsck.h version.h convert_objects = convert/main.o convert/common.o convert/source-fs.o \ convert/source-ext2.o convert/source-reiserfs.o -mkfs_objects = mkfs/main.o mkfs/common.o +mkfs_objects = mkfs/main.o mkfs/common.o mkfs/rootdir.o image_objects = image/main.o all_objects = $(objects) $(cmds_objects) $(libbtrfs_objects) $(convert_objects) \ $(mkfs_objects) $(image_objects) diff --git a/mkfs/main.c b/mkfs/main.c index 8c332aa1e12a..693a9d85f6b6 100644 --- a/mkfs/main.c +++ b/mkfs/main.c @@ -24,17 +24,12 @@ #include "ioctl.h" #include #include -#include -#include /* #include included via androidcompat.h */ #include #include #include #include #include -#include -#include -#include #include #include #include "ctree.h" @@ -45,20 +40,11 @@ #include "list_sort.h" #include "help.h" #include 
"mkfs/common.h" +#include "mkfs/rootdir.h" #include "fsfeatures.h" -int path_cat_out(char *out, const char *p1, const char *p2); - -static u64 index_cnt = 2; static int verbose = 1; -struct directory_name_entry { - const char *dir_name; - const char *path; - ino_t inum; - struct list_head list; -}; - struct mkfs_allocation { u64 data; u64 metadata; @@ -415,583 +401,6 @@ static char *parse_label(const char *input) return strdup(input); } -static int add_directory_items(struct btrfs_trans_handle *trans, - struct btrfs_root *root, u64 objectid, - ino_t parent_inum, const char *name, - struct stat *st, int *dir_index_cnt) -{ - int ret; - int name_len; - struct btrfs_key location; - u8 filetype = 0; - - name_len = strlen(name); - - location.objectid = objectid; - location.offset = 0; - location.type = BTRFS_INODE_ITEM_KEY; - - if (S_ISDIR(st->st_mode)) - filetype = BTRFS_FT_DIR; - if (S_ISREG(st->st_mode)) - filetype = BTRFS_FT_REG_FILE; - if (S_ISLNK(st->st_mode)) - filetype = BTRFS_FT_SYMLINK; - if (S_ISSOCK(st->st_mode)) - filetype = BTRFS_FT_SOCK; - if (S_ISCHR(st->st_mode)) - filetype = BTRFS_FT_CHRDEV; - if (S_ISBLK(st->st_mode)) - filetype = BTRFS_FT_BLKDEV; - if (S_ISFIFO(st->st_mode)) - filetype = BTRFS_FT_FIFO; - - ret = btrfs_insert_dir_item(trans, root, name, name_len, - parent_inum, , - filetype, index_cnt); - if (ret) - return ret; - ret = btrfs_insert_inode_ref(trans, root, name, name_len, -objectid, parent_inum, index_cnt); - *dir_index_cnt = index_cnt; - index_cnt++; - - return ret; -} - -static int fill_inode_item(struct btrfs_trans_handle *trans, - struct btrfs_root *root, - struct btrfs_inode_item *dst, struct stat *src) -{ - u64 blocks = 0; - u64 sectorsize = root->fs_info->sectorsize; - - /* -* btrfs_inode_item has some reserved fields -* and represents on-disk inode entry, so -* zero everything to prevent information leak -*/ - memset(dst, 0, sizeof (*dst)); - - btrfs_set_stack_inode_generation(dst, trans->transid); - 
btrfs_set_stack_inode_size(dst, src->st_size); -
[PATCH 5/5] btrfs-progs: mkfs: Move source dir size calculation to its own files
Also rename the function from size_sourcedir() to btrfs_mkfs_size_dir().

Signed-off-by: Qu Wenruo
---
 mkfs/main.c    | 66 ++
 mkfs/rootdir.c | 63 +++
 mkfs/rootdir.h |  2 ++
 3 files changed, 67 insertions(+), 64 deletions(-)

diff --git a/mkfs/main.c b/mkfs/main.c
index 693a9d85f6b6..e2ebe3ce069f 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -31,7 +31,6 @@
 #include
 #include
 #include
-#include <ftw.h>
 #include "ctree.h"
 #include "disk-io.h"
 #include "volumes.h"
@@ -448,67 +447,6 @@ static int create_chunks(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
-/*
- * This ignores symlinks with unreadable targets and subdirs that can't
- * be read. It's a best-effort to give a rough estimate of the size of
- * a subdir. It doesn't guarantee that prepopulating btrfs from this
- * tree won't still run out of space.
- */
-static u64 global_total_size;
-static u64 fs_block_size;
-static int ftw_add_entry_size(const char *fpath, const struct stat *st,
-			      int type)
-{
-	if (type == FTW_F || type == FTW_D)
-		global_total_size += round_up(st->st_size, fs_block_size);
-
-	return 0;
-}
-
-static u64 size_sourcedir(const char *dir_name, u64 sectorsize,
-			  u64 *num_of_meta_chunks_ret, u64 *size_of_data_ret)
-{
-	u64 dir_size = 0;
-	u64 total_size = 0;
-	int ret;
-	u64 default_chunk_size = SZ_8M;
-	u64 allocated_meta_size = SZ_8M;
-	u64 allocated_total_size = 20 * SZ_1M;	/* 20MB */
-	u64 num_of_meta_chunks = 0;
-	u64 num_of_data_chunks = 0;
-	u64 num_of_allocated_meta_chunks =
-			allocated_meta_size / default_chunk_size;
-
-	global_total_size = 0;
-	fs_block_size = sectorsize;
-	ret = ftw(dir_name, ftw_add_entry_size, 10);
-	dir_size = global_total_size;
-	if (ret < 0) {
-		error("ftw subdir walk of %s failed: %s", dir_name,
-			strerror(errno));
-		exit(1);
-	}
-
-	num_of_data_chunks = (dir_size + default_chunk_size - 1) /
-		default_chunk_size;
-
-	num_of_meta_chunks = (dir_size / 2) / default_chunk_size;
-	if (((dir_size / 2) % default_chunk_size) != 0)
-		num_of_meta_chunks++;
-	if (num_of_meta_chunks <= num_of_allocated_meta_chunks)
-		num_of_meta_chunks = 0;
-	else
-		num_of_meta_chunks -= num_of_allocated_meta_chunks;
-
-	total_size = allocated_total_size +
-		     (num_of_data_chunks * default_chunk_size) +
-		     (num_of_meta_chunks * default_chunk_size);
-
-	*num_of_meta_chunks_ret = num_of_meta_chunks;
-	*size_of_data_ret = num_of_data_chunks * default_chunk_size;
-	return total_size;
-}
-
 static int zero_output_file(int out_fd, u64 size)
 {
 	int loop_num;
@@ -1079,8 +1017,8 @@ int main(int argc, char **argv)
 			goto error;
 		}
 
-		source_dir_size = size_sourcedir(source_dir, sectorsize,
-				&num_of_meta_chunks, &size_of_data);
+		source_dir_size = btrfs_mkfs_size_dir(source_dir, sectorsize,
+				&num_of_meta_chunks, &size_of_data);
 		if(block_count < source_dir_size)
 			block_count = source_dir_size;
 		ret = zero_output_file(fd, block_count);
diff --git a/mkfs/rootdir.c b/mkfs/rootdir.c
index 2cc8a3ac06d8..83a3191d2bd7 100644
--- a/mkfs/rootdir.c
+++ b/mkfs/rootdir.c
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include <ftw.h>
 #include "ctree.h"
 #include "internal.h"
 #include "disk-io.h"
@@ -33,6 +34,15 @@
 #include "mkfs/rootdir.h"
 #include "send-utils.h"
 
+/*
+ * This ignores symlinks with unreadable targets and subdirs that can't
+ * be read. It's a best-effort to give a rough estimate of the size of
+ * a subdir. It doesn't guarantee that prepopulating btrfs from this
+ * tree won't still run out of space.
+ */
+static u64 global_total_size;
+static u64 fs_block_size;
+
 static u64 index_cnt = 2;
 
 static int add_directory_items(struct btrfs_trans_handle *trans,
@@ -670,3 +680,56 @@ fail:
 out:
 	return ret;
 }
+
+static int ftw_add_entry_size(const char *fpath, const struct stat *st,
+			      int type)
+{
+	if (type == FTW_F || type == FTW_D)
+		global_total_size += round_up(st->st_size, fs_block_size);
+
+	return 0;
+}
+
+u64 btrfs_mkfs_size_dir(const char *dir_name, u64 sectorsize,
+			u64 *num_of_meta_chunks_ret, u64 *size_of_data_ret)
+{
+	u64 dir_size = 0;
+	u64 total_size = 0;
+	int ret;
+	u64 default_chunk_size = SZ_8M;
+	u64 allocated_meta_size = SZ_8M;
+	u64 allocated_total_size = 20 * SZ_1M;	/* 20MB */
+	u64
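The size estimate in the moved function can be followed more easily with the directory walk taken out. The sketch below is not part of the patch: estimate_image_size() is a hypothetical helper that takes an already summed directory size and repeats only the chunk arithmetic (8MiB chunks, one metadata chunk assumed preallocated, 20MiB fixed overhead), using plain uint64_t in place of the u64 typedef.

```c
#include <stdint.h>

#define SZ_1M (1024ULL * 1024ULL)
#define SZ_8M (8ULL * SZ_1M)

/*
 * Hypothetical helper, for illustration only: same arithmetic as the
 * moved function, but without the ftw() walk. dir_size is the sum of
 * the block-rounded file and directory sizes.
 */
uint64_t estimate_image_size(uint64_t dir_size, uint64_t *meta_chunks_ret,
			     uint64_t *data_size_ret)
{
	const uint64_t chunk = SZ_8M;
	/* One 8MiB metadata chunk is treated as already allocated. */
	const uint64_t preallocated_meta_chunks = SZ_8M / chunk;
	/* Data: directory size rounded up to whole chunks. */
	uint64_t data_chunks = (dir_size + chunk - 1) / chunk;
	/* Metadata: half the directory size, rounded up to whole chunks. */
	uint64_t meta_chunks = (dir_size / 2 + chunk - 1) / chunk;

	if (meta_chunks <= preallocated_meta_chunks)
		meta_chunks = 0;
	else
		meta_chunks -= preallocated_meta_chunks;

	*meta_chunks_ret = meta_chunks;
	*data_size_ret = data_chunks * chunk;
	/* 20MiB fixed overhead plus the data and extra metadata chunks. */
	return 20 * SZ_1M + (data_chunks + meta_chunks) * chunk;
}
```

For a 100MiB source directory this gives 13 data chunks and 6 extra metadata chunks, so the estimate is 20 + 104 + 48 = 172MiB.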
[PATCH] Btrfs: ref-verify: Fix NULL vs IS_ERR() check in walk_down_tree()
read_tree_block() returns error pointers and never NULL, so I have updated the error handling.

Fixes: 74739121b4c7 ("Btrfs: add a extent ref verify tool")
Signed-off-by: Dan Carpenter

diff --git a/fs/btrfs/ref-verify.c b/fs/btrfs/ref-verify.c
index f65d78cf3c7e..34878699d363 100644
--- a/fs/btrfs/ref-verify.c
+++ b/fs/btrfs/ref-verify.c
@@ -584,7 +584,9 @@ static int walk_down_tree(struct btrfs_root *root, struct btrfs_path *path,
 			gen = btrfs_node_ptr_generation(path->nodes[level],
 							path->slots[level]);
 			eb = read_tree_block(fs_info, block_bytenr, gen);
-			if (!eb || !extent_buffer_uptodate(eb)) {
+			if (IS_ERR(eb))
+				return PTR_ERR(eb);
+			if (!extent_buffer_uptodate(eb)) {
 				free_extent_buffer(eb);
 				return -EIO;
 			}
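To see why a NULL check misses this kind of failure, here is a minimal userspace sketch of the error-pointer convention. The three helpers are simplified stand-ins for the kernel's <linux/err.h> versions, and read_block() and walk() are hypothetical names used only for illustration, not btrfs functions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error codes live in the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_block;

/* Hypothetical reader: reports failure as an error pointer, never as NULL. */
void *read_block(int fail)
{
	return fail ? ERR_PTR(-EIO) : &dummy_block;
}

/* Caller following the pattern of the fix: test IS_ERR(), not NULL. */
int walk(int fail)
{
	void *eb = read_block(fail);

	if (IS_ERR(eb))
		return (int)PTR_ERR(eb);
	return 0;
}
```

An error pointer is a non-NULL value, so a check like `!eb` would let it through and the caller would go on to dereference it.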
Re: [PATCH] btrfs: Fix bug for misused dev_t when lookup in dev state hash table.
On 18.10.2017 06:43, Gu, Jinxiang wrote:
> Hi,
>
>> -----Original Message-----
>> From: Nikolay Borisov [mailto:nbori...@suse.com]
>> Sent: Tuesday, October 17, 2017 9:36 PM
>> To: Gu, Jinxiang/顾 金香; linux-btrfs@vger.kernel.org;
>> h...@lst.de
>> Subject: Re: [PATCH] btrfs: Fix bug for misused dev_t when lookup in dev
>> state hash table.
>>
>>
>>
>> On 17.10.2017 14:34, Gu Jinxiang wrote:
>>> From: Gu JinXiang
>>>
>>> Fix bug of commit 74d46992e0d9
>>> ("block: replace bi_bdev with a gendisk pointer and partitions index").
>>>
>>> In this modify, use bio_dev(bio) to find dev state in function
>>> __btrfsic_submit_bio. But when dev_state added to hashtable, it is
>>> using dev_t of block_device.
>>
>> This is rather incomprehensible. So bio_dev(bio) actually returns the dev_t
>> of the device to which this bio is submitted
>> and the same dev_t should be used when btrfsic_dev_state_hashtable_add is
>> called? What am I missing in here?
>>
>
> bio_dev(bio) returns a dev_t of part0 which is different from dev_t in
> block_device(bd_dev).
> bd_dev in block_device represents the exact partition.
> block_device.bd_dev = bio->bi_partno (same as block_device.bd_partno) +
> bio_dev(bio).
>
> When add a dev_state into hashtable it is using the exact partition's dev_t.
> So when lookup it, it should also use the exact partition's dev_t.

Right, ok. Can you please put this explanation into the changelog of the
patch and resend?

>
>>
>>>
>>> Reproduce of this bug:
>>> Use MOUNT_OPTIONS="-o check_int" when run btrfs/001 in xfstest.
>>> Then there will be WARNING like below.
>>> WARNING:
>>> btrfs: attempt to write superblock which references block M @29523968 (sda7
>>> /654400/2) which is never written!
>>>
>>> Signed-off-by: Gu JinXiang
>>> ---
>>>  fs/btrfs/check-integrity.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c
>>> index fb07e3c22b9a..02f9eb83173f 100644
>>> --- a/fs/btrfs/check-integrity.c
>>> +++ b/fs/btrfs/check-integrity.c
>>> @@ -2803,7 +2803,7 @@ static void __btrfsic_submit_bio(struct bio *bio)
>>>  	mutex_lock(&btrfsic_mutex);
>>>  	/* since btrfsic_submit_bio() is also called before
>>>  	 * btrfsic_mount(), this might return NULL */
>>> -	dev_state = btrfsic_dev_state_lookup(bio_dev(bio));
>>> +	dev_state = btrfsic_dev_state_lookup(bio_dev(bio) + bio->bi_partno);
>>
>> So this function looks up in btrfsic_dev_state_hashtable. And stuff in this
>> hashtable is added via
>> btrfsic_dev_state_hashtable_add function which seems to be only using the
>> dev_t (after your other patch is applied):
>>
>> static void btrfsic_dev_state_hashtable_add(
>> 		struct btrfsic_dev_state *ds,
>> 		struct btrfsic_dev_state_hashtable *h)
>> {
>> 	const unsigned int hashval =
>> 	    (((unsigned int)((uintptr_t)ds->bdev->bd_dev)) &
>> 	     (BTRFSIC_DEV2STATE_HASHTABLE_SIZE - 1));
>>
>> 	list_add(&ds->collision_resolving_node, h->table + hashval);
>> }
>>
>>
>> So how come your change is correct since you are passing the dev_t +
>> partition number?
>>
>>>  	if (NULL != dev_state &&
>>>  	    (bio_op(bio) == REQ_OP_WRITE) && bio_has_data(bio)) {
>>>  		unsigned int i = 0;
>>>
>>
>
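The key mismatch discussed in this thread can be reproduced in a small userspace sketch. Nothing below is btrfs code: the table, struct dev_state, dev_state_add() and dev_state_lookup() are hypothetical, TABLE_SIZE is an assumed power-of-two size, and MKDEV/MINORBITS mirror the kernel's internal dev_t layout (major in the high bits, minor in the low 20 bits). Following the explanation quoted above, the partition's dev_t is taken to be the part0 dev_t plus the partition number.

```c
#include <stddef.h>
#include <stdint.h>

/* Kernel-style dev_t layout: major above a 20-bit minor. */
#define MINORBITS 20
#define MKDEV(ma, mi) (((uint32_t)(ma) << MINORBITS) | (uint32_t)(mi))
#define TABLE_SIZE 0x100	/* assumed power-of-two table size */

struct dev_state {
	uint32_t dev;		/* key: dev_t of the exact partition */
	struct dev_state *next;	/* collision chain */
};

static struct dev_state *table[TABLE_SIZE];

/* Insert using the partition's own dev_t as the key. */
void dev_state_add(struct dev_state *ds)
{
	unsigned int h = ds->dev & (TABLE_SIZE - 1);

	ds->next = table[h];
	table[h] = ds;
}

/* Lookup succeeds only when given the same dev_t used at insert time. */
struct dev_state *dev_state_lookup(uint32_t dev)
{
	struct dev_state *ds = table[dev & (TABLE_SIZE - 1)];

	while (ds && ds->dev != dev)
		ds = ds->next;
	return ds;
}
```

With an entry added for sda7 as MKDEV(8, 7), a lookup keyed on the whole-disk value MKDEV(8, 0) finds nothing, while MKDEV(8, 0) + 7 finds the entry. That is the situation the one-line change addresses.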