Re: How can btrfs take 23sec to stat 23K files from an SSD?
On Sun, Jul 22, 2012 at 11:42:03PM -0700, Marc MERLIN wrote:
> I just realized that the older thread got a bit confusing, so I'll keep
> problems separate and make things simpler :)

Since yesterday, I tried other kernels, including nopreempt, volpreempt and
preempt for 3.4.4. I also tried a default 3.2.0 kernel from Debian (all
amd64), but that did not help. I'm still seeing close to 25 seconds to scan
15K files.

How can it possibly be so slow?
More importantly, how can I provide useful debug information?

- I don't think it's a problem with the kernel, since I tried 4 kernels,
  including a default Debian one.

- Alignment seems OK, I made sure cylinders were divisible by 512:
/dev/sda2          502272    52930559    26214144   83  Linux

- I tried another brand new btrfs, and things are even slower now.
gandalfthegreat:/mnt/mnt2# mount -o ssd,discard,noatime /dev/sda2 /mnt/mnt2
gandalfthegreat:/mnt/mnt2# reset_cache
gandalfthegreat:/mnt/mnt2# time du -sh src/
514M    src/
real    0m29.584s
gandalfthegreat:/mnt/mnt2# find src/| wc -l
15261

This is bad enough that there ought to be a way to debug this, right?
Can you suggest something?

Thanks,
Marc

> On an _unencrypted_ partition on the SSD, running du -sh on a directory
> with 15K files, takes 23 seconds on unencrypted SSD and 4 secs on
> encrypted spinning drive, both with a similar btrfs filesystem, and
> the same kernel (3.4.4).
>
> Unencrypted btrfs on SSD:
> gandalfthegreat:~# mount -o compress=lzo,discard,nossd,space_cache,noatime /dev/sda2 /mnt/mnt2
> gandalfthegreat:/mnt/mnt2# echo 3 > /proc/sys/vm/drop_caches; time du -sh src
> 514M    src
> real    0m22.667s
>
> Encrypted btrfs on spinning drive of the same src directory:
> gandalfthegreat:/var/local# echo 3 > /proc/sys/vm/drop_caches; time du -sh src
> 514M    src
> real    0m3.881s
>
> I've run this many times and get the same numbers.
> I've tried deadline and noop on /dev/sda (the SSD) and du is just as slow.
>
> I also tried with:
> - space_cache and nospace_cache
> - ssd and nossd
> - noatime didn't seem to help even though I was hopeful on this one.
>
> In all cases, I get:
> gandalfthegreat:/mnt/mnt2# echo 3 > /proc/sys/vm/drop_caches; time du -sh src
> 514M    src
> real    0m22.537s
>
> I'm having the same slow speed on 2 btrfs filesystems on the same SSD.
> One is encrypted, the other one isn't:
> Label: 'btrfs_pool1'  uuid: d570c40a-4a0b-4d03-b1c9-cff319fc224d
>         Total devices 1 FS bytes used 144.74GB
>         devid    1 size 441.70GB used 195.04GB path /dev/dm-0
>
> Label: 'boot'  uuid: 84199644-3542-430a-8f18-a5aa58959662
>         Total devices 1 FS bytes used 2.33GB
>         devid    1 size 25.00GB used 5.04GB path /dev/sda2
>
> If instead of stating a bunch of files, I try reading a big file, I do get
> speeds that are quite fast (253MB/s and 423MB/s).
>
> 22 seconds for 15K files on an SSD is super slow, and 5 times slower
> than a spinning disk with the same data.
> What's going on?
>
> Thanks,
> Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
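To put the timings in this message on a per-file basis, here is a minimal C sketch that divides the wall-clock time reported by `time du -sh` by the file count from `find | wc -l`. The helper name `per_file_ms` is made up for illustration and is not part of any btrfs tool; the figures in the usage note are the ones quoted above.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: average wall-clock cost per file in milliseconds.
 * Returns -1.0 when the file count is not positive.
 */
double per_file_ms(double seconds, long files)
{
	if (files <= 0)
		return -1.0;
	return seconds * 1000.0 / (double)files;
}

/* Print one labelled line so two runs can be compared side by side. */
void report(const char *label, double seconds, long files)
{
	printf("%-32s %7.3f s total, %6.3f ms per file\n",
	       label, seconds, per_file_ms(seconds, files));
}
```

Calling `report("unencrypted btrfs on SSD", 22.667, 15261)` and `report("encrypted btrfs on spinning disk", 3.881, 15261)` gives roughly 1.5 ms versus 0.25 ms per file, which is the gap the message is asking about.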
Re: How can btrfs take 23sec to stat 23K files from an SSD?
On Monday, 23 July 2012, Marc MERLIN wrote:
> I just realized that the older thread got a bit confusing, so I'll keep
> problems separate and make things simpler :)
>
> On an _unencrypted_ partition on the SSD, running du -sh on a directory
> with 15K files, takes 23 seconds on unencrypted SSD and 4 secs on
> encrypted spinning drive, both with a similar btrfs filesystem, and
> the same kernel (3.4.4).
>
> Unencrypted btrfs on SSD:
> gandalfthegreat:~# mount -o compress=lzo,discard,nossd,space_cache,noatime /dev/sda2 /mnt/mnt2
> gandalfthegreat:/mnt/mnt2# echo 3 > /proc/sys/vm/drop_caches; time du -sh src
> 514M    src
> real    0m22.667s
>
> Encrypted btrfs on spinning drive of the same src directory:
> gandalfthegreat:/var/local# echo 3 > /proc/sys/vm/drop_caches; time du -sh src
> 514M    src
> real    0m3.881s

find is fast, du is much slower:

merkaba:~> echo 3 > /proc/sys/vm/drop_caches ; time ( find /usr | wc -l )
404166
( find /usr | wc -l; )  0,03s user 0,07s system 1% cpu 9,212 total
merkaba:~> echo 3 > /proc/sys/vm/drop_caches ; time ( du -sh /usr )
11G     /usr
( du -sh /usr; )  1,00s user 19,07s system 41% cpu 48,886 total

Now I try to find something with fewer files.

merkaba:~> find /usr/share/doc | wc -l
50715
merkaba:~> echo 3 > /proc/sys/vm/drop_caches ; time ( find /usr/share/doc | wc -l )
50715
( find /usr/share/doc | wc -l; )  0,00s user 0,02s system 1% cpu 1,398 total
merkaba:~> echo 3 > /proc/sys/vm/drop_caches ; time ( du -sh /usr/share/doc )
606M    /usr/share/doc
( du -sh /usr/share/doc; )  0,20s user 3,63s system 35% cpu 10,691 total

merkaba:~> echo 3 > /proc/sys/vm/drop_caches ; time du -sh /usr/share/doc
606M    /usr/share/doc
du -sh /usr/share/doc  0,19s user 3,54s system 35% cpu 10,386 total

Anyway, that's still much faster than your measurements.

merkaba:~> df -hT /usr
Filesystem     Type   Size  Used Avail Use% Mounted on
/dev/dm-0      btrfs   19G   11G  5,6G  67% /

merkaba:~> btrfs fi sh
failed to read /dev/sr0
Label: 'debian'  uuid: […]
        Total devices 1 FS bytes used 10.25GB
        devid    1 size 18.62GB used 18.62GB path /dev/dm-0

Btrfs Btrfs v0.19

merkaba:~> btrfs fi df /
Data: total=15.10GB, used=9.59GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.75GB, used=670.43MB
Metadata: total=8.00MB, used=0.00

merkaba:~> grep btrfs /proc/mounts
/dev/dm-0 / btrfs rw,noatime,compress=lzo,ssd,space_cache,inode_cache 0 0

Somewhat aged BTRFS filesystem on ThinkPad T520, Intel SSD 320, kernel 3.5.

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7
[PATCH 1/2] Btrfs-progs: search subvolumes with proper objectid
Btrfs's subvolume/snapshot is limited to
[BTRFS_FIRST_FREE_OBJECTID, BTRFS_LAST_FREE_OBJECTID], so just apply the range.

Signed-off-by: Liu Bo <liubo2...@cn.fujitsu.com>
---
 btrfs-list.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index c53d016..ac6507a 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -634,11 +634,13 @@ static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
 	sk->max_type = BTRFS_ROOT_BACKREF_KEY;
 	sk->min_type = BTRFS_ROOT_BACKREF_KEY;
 
+	sk->min_objectid = BTRFS_FIRST_FREE_OBJECTID;
+
 	/*
 	 * set all the other params to the max, we'll take any objectid
 	 * and any trans
 	 */
-	sk->max_objectid = (u64)-1;
+	sk->max_objectid = BTRFS_LAST_FREE_OBJECTID;
 	sk->max_offset = (u64)-1;
 	sk->max_transid = (u64)-1;
 
@@ -690,7 +692,7 @@ static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
 		if (sk->min_type < BTRFS_ROOT_BACKREF_KEY) {
 			sk->min_type = BTRFS_ROOT_BACKREF_KEY;
 			sk->min_offset = 0;
-		} else if (sk->min_objectid < (u64)-1) {
+		} else if (sk->min_objectid < BTRFS_LAST_FREE_OBJECTID) {
 			sk->min_objectid++;
 			sk->min_type = BTRFS_ROOT_BACKREF_KEY;
 			sk->min_offset = 0;
-- 
1.6.5.2
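As a rough illustration of the range this patch applies to the search key, the following C sketch checks whether an objectid falls inside [BTRFS_FIRST_FREE_OBJECTID, BTRFS_LAST_FREE_OBJECTID]. The two constants use the values from btrfs's ctree.h (256 and -256 as unsigned 64-bit); the function `is_subvol_objectid` is a hypothetical name used only here and does not exist in btrfs-progs.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in btrfs's ctree.h. */
#define BTRFS_FIRST_FREE_OBJECTID 256ULL
#define BTRFS_LAST_FREE_OBJECTID  (-256ULL)

/*
 * Hypothetical helper: returns 1 when objectid lies in the range that
 * the patch uses for min_objectid/max_objectid, 0 otherwise.
 */
int is_subvol_objectid(uint64_t objectid)
{
	return objectid >= BTRFS_FIRST_FREE_OBJECTID &&
	       objectid <= BTRFS_LAST_FREE_OBJECTID;
}
```

With these bounds, the reserved low objectids and the reserved values at the very top of the 64-bit space fall outside the search, which is what narrowing the key from `(u64)-1` is meant to achieve.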
[PATCH 2/2] Btrfs-progs: show generation in command btrfs subvol list
This adds the ability to show root's generation when we use
btrfs subvol list.

Signed-off-by: Liu Bo <liubo2...@cn.fujitsu.com>
---
 btrfs-list.c |   61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 files changed, 55 insertions(+), 6 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index ac6507a..05360dc 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -57,6 +57,9 @@ struct root_info {
 	/* the dir id we're in from ref_tree */
 	u64 dir_id;
 
+	/* generation when the root is created or last updated */
+	u64 gen;
+
 	/* path from the subvol we live in to this root, including the
 	 * root's name.  This is null until we do the extra lookup ioctl.
 	 */
@@ -194,6 +197,19 @@ static int add_root(struct root_lookup *root_lookup,
 	return 0;
 }
 
+static int update_root(struct root_lookup *root_lookup, u64 root_id, u64 gen)
+{
+	struct root_info *ri;
+
+	ri = tree_search(&root_lookup->root, root_id);
+	if (!ri || ri->root_id != root_id) {
+		fprintf(stderr, "could not find subvol %llu\n", root_id);
+		return -ENOENT;
+	}
+	ri->gen = gen;
+	return 0;
+}
+
 /*
  * for a given root_info, search through the root_lookup tree to construct
  * the full path name to it.
@@ -615,11 +631,15 @@ static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
 	struct btrfs_ioctl_search_key *sk = &args.key;
 	struct btrfs_ioctl_search_header *sh;
 	struct btrfs_root_ref *ref;
+	struct btrfs_root_item *ri;
 	unsigned long off = 0;
 	int name_len;
 	char *name;
 	u64 dir_id;
+	u8 type;
+	u64 gen = 0;
 	int i;
+	int get_gen = 0;
 
 	root_lookup_init(root_lookup);
 	memset(&args, 0, sizeof(args));
@@ -644,6 +664,7 @@ static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
 	sk->max_offset = (u64)-1;
 	sk->max_transid = (u64)-1;
 
+again:
 	/* just a big number, doesn't matter much */
 	sk->nr_items = 4096;
 
@@ -665,7 +686,7 @@ static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
 			sh = (struct btrfs_ioctl_search_header *)(args.buf +
 								  off);
 			off += sizeof(*sh);
-			if (sh->type == BTRFS_ROOT_BACKREF_KEY) {
+			if (!get_gen && sh->type == BTRFS_ROOT_BACKREF_KEY) {
 				ref = (struct btrfs_root_ref *)(args.buf + off);
 				name_len = btrfs_stack_root_ref_name_len(ref);
 				name = (char *)(ref + 1);
@@ -673,6 +694,11 @@ static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
 
 				add_root(root_lookup, sh->objectid, sh->offset,
 					 dir_id, name, name_len);
+			} else if (get_gen && sh->type == BTRFS_ROOT_ITEM_KEY) {
+				ri = (struct btrfs_root_item *)(args.buf + off);
+				gen = btrfs_root_generation(ri);
+
+				update_root(root_lookup, sh->objectid, gen);
 			}
 
 			off += sh->len;
@@ -689,17 +715,38 @@ static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
 		/* this iteration is done, step forward one root for the next
 		 * ioctl
 		 */
-		if (sk->min_type < BTRFS_ROOT_BACKREF_KEY) {
-			sk->min_type = BTRFS_ROOT_BACKREF_KEY;
+		if (get_gen)
+			type = BTRFS_ROOT_ITEM_KEY;
+		else
+			type = BTRFS_ROOT_BACKREF_KEY;
+
+		if (sk->min_type < type) {
+			sk->min_type = type;
 			sk->min_offset = 0;
 		} else if (sk->min_objectid < BTRFS_LAST_FREE_OBJECTID) {
 			sk->min_objectid++;
-			sk->min_type = BTRFS_ROOT_BACKREF_KEY;
+			sk->min_type = type;
 			sk->min_offset = 0;
 		} else
 			break;
 	}
+	if (!get_gen) {
+		memset(&args, 0, sizeof(args));
+
+		sk->tree_id = 1;
+		sk->max_type = BTRFS_ROOT_ITEM_KEY;
+		sk->min_type = BTRFS_ROOT_ITEM_KEY;
+
+		sk->min_objectid = BTRFS_FIRST_FREE_OBJECTID;
+
+		sk->max_objectid = BTRFS_LAST_FREE_OBJECTID;
+		sk->max_offset = (u64)-1;
+		sk->max_transid = (u64)-1;
+
+		get_gen = 1;
+		goto again;
+	}
 	return 0;
 }
 
@@ -781,13 +828,15 @@ int list_subvols(int fd, int print_parent, int get_default)
 		resolve_root(&root_lookup, entry, &parent_id, &level, &path);
 		if (print_parent) {
-
Re: [PATCH v3 1/1] Btrfs: Check INCOMPAT flags on remount and add helper function
We don't need a helper for every incompatibility bit, let's do it in a
more generic way as suggested below [modulo syntax errors]:

On Fri, Jul 20, 2012 at 05:16:41PM -0500, Mitch Harder wrote:
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -3103,6 +3103,19 @@ void __btrfs_abort_transaction(struct btrfs_trans_handle *trans,
>  				struct btrfs_root *root, const char *function,
>  				unsigned int line, int errno);
>  
> +static inline void btrfs_chk_lzo_incompat(struct btrfs_root *root)
> +{

btrfs_set_fs_incompat(struct btrfs_fs_info *fs_info, u64 flag)
{

> +	struct btrfs_super_block *disk_super;
> +	u64 features;
> +
> +	disk_super = root->fs_info->super_copy;

	disk_super = fs_info->super_copy;

> +	features = btrfs_super_incompat_flags(disk_super);
> +	if (!(features & BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO)) {
> +		features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO;

	if (!(features & flag)) {
		features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO;

> +		btrfs_set_super_incompat_flags(disk_super, features);
> +	}
> +}
> +
>  #define btrfs_abort_transaction(trans, root, errno)		\
>  do {								\
>  	__btrfs_abort_transaction(trans, root, __func__,	\
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 17facea..d5fd69e 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1042,11 +1042,9 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
>  		      u64 newer_than, unsigned long max_to_defrag)
>  {
>  	struct btrfs_root *root = BTRFS_I(inode)->root;
> -	struct btrfs_super_block *disk_super;
>  	struct file_ra_state *ra = NULL;
>  	unsigned long last_index;
>  	u64 isize = i_size_read(inode);
> -	u64 features;
>  	u64 last_len = 0;
>  	u64 skip = 0;
>  	u64 defrag_end = 0;
> @@ -1233,11 +1231,8 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
>  		mutex_unlock(&inode->i_mutex);
>  	}
>  
> -	disk_super = root->fs_info->super_copy;
> -	features = btrfs_super_incompat_flags(disk_super);
>  	if (range->compress_type == BTRFS_COMPRESS_LZO) {
> -		features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO;
> -		btrfs_set_super_incompat_flags(disk_super, features);
> +		btrfs_chk_lzo_incompat(root);

		btrfs_set_fs_incompat(fs_info, BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO);

>  	}
>  
>  	ret = defrag_count;
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 26da344..32c2bd9 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -401,6 +401,7 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
>  				compress_type = "lzo";
>  				info->compress_type = BTRFS_COMPRESS_LZO;
>  				btrfs_set_opt(info->mount_opt, COMPRESS);
> +				btrfs_chk_lzo_incompat(root);

				btrfs_set_fs_incompat(fs_info, BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO);

>  			} else if (strncmp(args[0].from, "no", 2) == 0) {
>  				compress_type = "no";
>  				info->compress_type = BTRFS_COMPRESS_NONE;
[PATCH v4] Btrfs: Check INCOMPAT flags on remount and add helper function
In support of the recently added capability to remount with lzo
compression, provide a helper function to check the compression INCOMPAT
flags when remounting with lzo compression, and set the flags if necessary.

Also, implement the new helper function when defragmenting with explicit
lzo compression and when setting the default subvolume.

Signed-off-by: Mitch Harder <mitch.har...@sabayonlinux.org>
---

v1->v2
- Remove extraneous formatting change.

v2->v3
- Consolidate into a single patch
- Convert helper function to a static inline function.

v3->v4
- Per feedback from Li Zefan, change function name from _chk_ to _set_
- Per feedback from David Sterba, make the helper function more generic.
- The more generic function can also be implemented in the INCOMPAT
  check made for setting the default subvolume.

 fs/btrfs/ctree.h |   17 +++++++++++++++++
 fs/btrfs/ioctl.c |   16 ++--------------
 fs/btrfs/super.c |    1 +
 3 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index a0ee2f8..5422e54 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3103,6 +3103,23 @@ void __btrfs_abort_transaction(struct btrfs_trans_handle *trans,
 				struct btrfs_root *root, const char *function,
 				unsigned int line, int errno);
 
+#define btrfs_set_fs_incompat(__fs_info, opt) \
+	__btrfs_set_fs_incompat((__fs_info), BTRFS_FEATURE_INCOMPAT_##opt)
+
+static inline void __btrfs_set_fs_incompat(struct btrfs_fs_info *fs_info,
+					   u64 flag)
+{
+	struct btrfs_super_block *disk_super;
+	u64 features;
+
+	disk_super = fs_info->super_copy;
+	features = btrfs_super_incompat_flags(disk_super);
+	if (!(features & flag)) {
+		features |= flag;
+		btrfs_set_super_incompat_flags(disk_super, features);
+	}
+}
+
 #define btrfs_abort_transaction(trans, root, errno)		\
 do {								\
 	__btrfs_abort_transaction(trans, root, __func__,	\
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 17facea..0d5d079 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1042,11 +1042,9 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
 		      u64 newer_than, unsigned long max_to_defrag)
 {
 	struct btrfs_root *root = BTRFS_I(inode)->root;
-	struct btrfs_super_block *disk_super;
 	struct file_ra_state *ra = NULL;
 	unsigned long last_index;
 	u64 isize = i_size_read(inode);
-	u64 features;
 	u64 last_len = 0;
 	u64 skip = 0;
 	u64 defrag_end = 0;
@@ -1233,11 +1231,8 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
 		mutex_unlock(&inode->i_mutex);
 	}
 
-	disk_super = root->fs_info->super_copy;
-	features = btrfs_super_incompat_flags(disk_super);
 	if (range->compress_type == BTRFS_COMPRESS_LZO) {
-		features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO;
-		btrfs_set_super_incompat_flags(disk_super, features);
+		btrfs_set_fs_incompat(root->fs_info, COMPRESS_LZO);
 	}
 
 	ret = defrag_count;
@@ -2761,8 +2756,6 @@ static long btrfs_ioctl_default_subvol(struct file *file, void __user *argp)
 	struct btrfs_path *path;
 	struct btrfs_key location;
 	struct btrfs_disk_key disk_key;
-	struct btrfs_super_block *disk_super;
-	u64 features;
 	u64 objectid = 0;
 	u64 dir_id;
 
@@ -2813,12 +2806,7 @@ static long btrfs_ioctl_default_subvol(struct file *file, void __user *argp)
 	btrfs_mark_buffer_dirty(path->nodes[0]);
 	btrfs_free_path(path);
 
-	disk_super = root->fs_info->super_copy;
-	features = btrfs_super_incompat_flags(disk_super);
-	if (!(features & BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL)) {
-		features |= BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL;
-		btrfs_set_super_incompat_flags(disk_super, features);
-	}
+	btrfs_set_fs_incompat(root->fs_info, DEFAULT_SUBVOL);
 	btrfs_end_transaction(trans, root);
 
 	return 0;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 26da344..75ee2c7 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -401,6 +401,7 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 				compress_type = "lzo";
 				info->compress_type = BTRFS_COMPRESS_LZO;
 				btrfs_set_opt(info->mount_opt, COMPRESS);
+				btrfs_set_fs_incompat(info, COMPRESS_LZO);
 			} else if (strncmp(args[0].from, "no", 2) == 0) {
 				compress_type = "no";
 				info->compress_type = BTRFS_COMPRESS_NONE;
-- 
1.7.8.6
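The generic helper in this patch combines two ideas: a token-pasting macro that expands a short option name to its BTRFS_FEATURE_INCOMPAT_ constant, and a test-then-set so the superblock copy is only rewritten when the bit is missing. The following user-space C sketch shows the same pattern outside the kernel. `struct mock_super`, `mock_set_incompat_flag` and `mock_set_incompat` are hypothetical stand-ins invented for this example; the two flag values are assumed to match ctree.h (bit 1 and bit 3).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match btrfs's ctree.h; only the two flags used in the patch. */
#define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL (1ULL << 1)
#define BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO   (1ULL << 3)

/* Hypothetical stand-in for the superblock copy kept in btrfs_fs_info. */
struct mock_super {
	uint64_t incompat_flags;
	unsigned int writes;	/* how many times the flags were stored */
};

/* Store the flag only when it is not already set. */
void mock_set_incompat_flag(struct mock_super *sb, uint64_t flag)
{
	uint64_t features = sb->incompat_flags;

	if (!(features & flag)) {
		features |= flag;
		sb->incompat_flags = features;
		sb->writes++;
	}
}

/* Token pasting lets callers write the short name, as in the patch. */
#define mock_set_incompat(sb, opt) \
	mock_set_incompat_flag((sb), BTRFS_FEATURE_INCOMPAT_##opt)
```

A caller would write `mock_set_incompat(&sb, COMPRESS_LZO);`. Repeating the call leaves `writes` unchanged, which mirrors why the patch checks the bit before calling btrfs_set_super_incompat_flags().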
Re: [PATCH] Xfstests: add btrfs snapshot function test
On Sat, Jul 21, 2012 at 11:46:00AM +0800, Liu Bo wrote:
> From: Zhou Bo <zhoub-f...@cn.fujitsu.com>
>
> This patch adds btrfs snapshot function test to xfstests.
>
> Signed-off-by: Zhou Bo <zhoub-f...@cn.fujitsu.com>
> ---
>  285     |  365 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  285.out |    2 +
>  group   |    1 +
>  3 files changed, 368 insertions(+), 0 deletions(-)
>  create mode 100755 285
>  create mode 100644 285.out
>
> diff --git a/285 b/285
> new file mode 100755
> index 0000000..d247af3
> --- /dev/null
> +++ b/285
> @@ -0,0 +1,365 @@
> +#! /bin/bash
> +# FS QA Test No. 285
> +#
> +# Test btrfs's subvolume and snapshot function

There already is one subvolume/snapshot test which is simple and basic.
The new one is much more extensive and this needs a more verbose
description.

> +#
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2012 Fujitsu.  All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#
> +#-----------------------------------------------------------------------
> +#
> +# creator
> +owner=zhoub-f...@cn.fujitsu.com
> +
> +n=0
> +seq=`basename $0`
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=0	# success is the default!
> +
> +_cleanup()
> +{
> +    rm -f $tmp.*
> +}
> +
> +trap "_cleanup ; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +. ./common.rc
> +. ./common.filter
> +
> +# real QA test starts here
> +_supported_fs btrfs
> +_supported_os Linux
> +_require_scratch
> +
> +_scratch_mkfs_sized `expr 1024 \* 1024 \* 1024` > /dev/null 2>&1

Just curious, is there a reason why you create a 1G filesystem? This
would imply a --mixed type of fs.

> +_scratch_mount
> +
> +_prepare_snapshot()
> +{
> +	_scratch_remount > /dev/null
> +	btrfs sub snap $SCRATCH_MNT $SCRATCH_MNT/basesnapshot > /dev/null 2>>$here/$seq.full
> +	btrfs sub snap -r $SCRATCH_MNT $SCRATCH_MNT/readonlysnapshot > /dev/null 2>>$here/$seq.full

here and below: please type full subcommands, i.e.

  btrfs subvolume snapshot

Although the short ones are allowed, using full names is future-proof
when there could be another subcommand with the same short prefix.

> +	_scratch_unmount > /dev/null 2>>$here/$seq.full
> +	VALID_SUBVOLUME=basesnapshot
> +	VALID_RO_SUBVOLUME=readonlysnapshot
> +	SNAPSHOTSTR=snapshot
> +	FILE1=file1-
> +	FILE2=file2-
> +	MVFILE2=newfile2-
> +	DIR1=dir1-
> +	DIR2=dir2-
> +	MVDIR2=newdir2-
> +	MVSNAPSHOT=mvsnapshot-
> +	SRCSUBVOL=srcsubvol-
> +}
> +
> +_parse_options()
> +{
> +	SOURCE_TARGET=$1
> +	case $SOURCE_TARGET in
> +	1)
> +		SOURCE_SUBVOLUME=$VALID_SUBVOLUME
> +		;;
> +	esac
> +	SOURCE_READ=$2
> +	case $SOURCE_READ in
> +	1)
> +		SOURCE_SUBVOLUME=$VALID_RO_SUBVOLUME
> +		;;
> +	esac
> +	DESTINATION_TARGET=$3
> +	case $DESTINATION_TARGET in
> +	1)
> +		DESTINATION_SUBVOLUME=$SNAPSHOTSTR$n
> +		;;
> +	esac
> +	DESTINATION_READ=$4
> +	case $DESTINATION_READ in
> +	1)
> +		SNAPSHOTOPT_STR=-r

not that it matters much, SNAPSHOT_OPT_STR would look more consistent with
other variable names, like MOUNT_OPT_STR

> +		;;
> +	2)
> +		SNAPSHOTOPT_STR=
> +		;;
> +	esac
> +	MOUNT_OPT=$5
> +	case $MOUNT_OPT in
> +	1)
> +		MOUNT_OPT_STR=
> +		;;
> +	2)
> +		MOUNT_OPT_STR=-r
> +		;;
> +	3)
> +		MOUNT_OPT_STR="-o nodatacow"
> +		;;
> +	esac
> +	FILE_OPERATION_OPT=$6
> +	SNAPSHOT_ACTION_OPT=$7
> +	TEST_DIR1=$DIR1$n
> +	TEST_DIR2=$DIR2$n
> +	TEST_MVDIR2=$MVDIR2$n
> +	TEST_FILE1=$FILE1$n
> +	TEST_FILE2=$FILE2$n
> +	TEST_MVFILE2=$MVFILE2$n
> +	TEST_MVSNAPSHOT=$MVSNAPSHOT$n
> +	SRC_SUBVOLUME=$SRCSUBVOL$n
> +	n=$[n+1]
> +}
> +
> +_create_file()
> +{
> +	mkdir $SRC_SUBVOLUME/$TEST_DIR1 $SRC_SUBVOLUME/$TEST_DIR2 > /dev/null
> +	touch $SRC_SUBVOLUME/$TEST_FILE1 $SRC_SUBVOLUME/$TEST_FILE2 > /dev/null
> +}
> +
> +_do_file_operation()
> +{
> +	btrfs filesystem balance $SCRATCH_MNT > /dev/null 2>&1

although 'btrfs filesystem balance /mnt' works, please use
Re: [PATCH v4] Btrfs: Check INCOMPAT flags on remount and add helper function
On Tue, Jul 24, 2012 at 12:58:43PM -0500, Mitch Harder wrote:
> In support of the recently added capability to remount with lzo
> compression, provide a helper function to check the compression INCOMPAT
> flags when remounting with lzo compression, and set the flags if necessary.
>
> Also, implement the new helper function when defragmenting with explicit
> lzo compression and when setting the default subvolume.
>
> Signed-off-by: Mitch Harder <mitch.har...@sabayonlinux.org>

Thanks!

Reviewed-by: David Sterba <dste...@suse.cz>
Re: btrfs send/receive: if new inode ino is less than its new directory ino, incorrect path is sent
On Wed, Jul 18, 2012 at 7:45 PM, Alex Lyakas <alex.bolshoy.bt...@gmail.com> wrote:
> Hi Alexander,
> I am testing different scenarios in order to better understand the
> non-trivial magic of
> get_cur_path()/will_overwrite_ref()/did_overwrite_ref()/did_overwrite_first_ref().
>
> I hit the following issue, when testing full-send:
>
> This is my source subvolume (inode numbers are written):
> tree -A --inodes --noreport /mnt/src/tmp/
> /mnt/src/tmp/
> └── [270]  dir2
>     └── [268]  file1_nod
>
> As you see, the ino(file1_nod) < ino(dir2). It is very easy to
> achieve: first create the file, then the dir, and then move the file
> to dir.
>
> During send the following happens (I augmented the send code with many prints):
> file1_nod is sent first. Since it's a new inode, it is sent as an
> orphan. When recording its reference, __record_new_ref() calls
> get_cur_path() for its parent (270). Then __get_cur_name_and_parent()
> is called on 270, which calls is_inode_existent(), which calls
> get_cur_inode_state(), and the state of the parent is will_create.
> So __get_cur_name_and_parent() creates an orphan name for it, and
> finally the new reference for 268 is recorded as:
> o270-136-0/file1_nod:
>
> [changed_cb:4102] key(256 INODE_ITEM 0) : NEW
> [changed_cb:4102] key(256 INODE_REF 256) : NEW
> [changed_cb:4102] key(268 INODE_ITEM 0) : NEW
> [send_create_inode:2407] NEW ino(268,135) type=010, path=[o268-135-0]
> [changed_cb:4102] key(268 INODE_REF 270) : NEW
> [get_cur_inode_state:1475] (270,136): L(EX,136) R(NE,18446744072099047770) sp=268 == will_create
> [is_inode_existent:1498] (270,136): NOT existent
> [__get_cur_name_and_parent:1918] ino(270,136) not existent = unique name [o270-136-0]
> [get_cur_path:2051] ino(0,0) cur_path=[o270-136-0]
> [__record_new_ref:2911] record new ref [o270-136-0/file1_nod]
>
> Then process_recorded_refs() sees that 268 is still orphan, so it
> sends rename to its valid place, but the problem is that its parent
> dir was not sent yet (and its parent dir is also an orphan):
>
> [process_recorded_refs:2601] ino(268,135): start with refs
> [28118.347602] [process_recorded_refs:2651] ino(268,135): new=1, did_overwrite_first_ref=0, is_orphan=1, valid_path=[o268-135-0]
> [28118.347605] [process_recorded_refs:2701] ino(268,135): is orphan, move it: [o268-135-0]=[o270-136-0/file1_nod]
> [28118.347610] [process_recorded_refs:2837] checking dir(270,136)
> [28118.347612] [process_recorded_refs:2869] ino(268,135) done with refs
>
> Now the parent dir is processed:
> [changed_cb:4102] key(270 INODE_ITEM 0) : NEW
> [send_create_inode:2407] NEW ino(270,136) type=04, path=[o270-136-0]
> [changed_cb:4102] key(270 INODE_REF 256) : NEW
> [get_cur_path:2051] ino(256,133) cur_path=[]
> [__record_new_ref:2911] record new ref [dir2]
> [process_recorded_refs:2601] ino(270,136): start with refs
> [process_recorded_refs:2651] ino(270,136): new=1, did_overwrite_first_ref=0, is_orphan=1, valid_path=[o270-136-0]
> [process_recorded_refs:2701] ino(270,136): is orphan, move it: [o270-136-0]=[dir2]
> [process_recorded_refs:2837] checking dir(256,133)
> [get_cur_inode_state:1475] (256,133): L(EX,133) R(NE,18446612135413283512) sp=270 == did_create
> [process_recorded_refs:2869] ino(270,136) done with refs
>
> Nothing special here, the parent is first sent as an orphan, and then
> renamed to its valid name, but it's too late.
>
> During receive:
> ERROR: rename o268-135-0 -> o270-136-0/file1_nod failed. No such file
> or directory
>
> I am not yet sure where the proper place to fix this is, I just
> wanted to report it first. Basically, I think that when sending any
> kind of A_PATH, it is needed to ensure that path components exist,
> either as orphan or real path (by sending them out-of-order if
> needed?). But I am not yet sure where the core place is that should
> ensure this.
>
> Thanks,
> Alex.

I have pushed a fix for this case. Basically, the solution is to postpone
the processing of refs in not-created dirs until the dir is created.

Big thanks for investigating this one.
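The fix described in the reply (postpone refs that point into a directory that has not been created yet) can be pictured with a toy model in C. This is only a sketch of the ordering idea, not the code that was pushed to the kernel: `struct new_inode`, `plan_renames` and `MAX_ENTRIES` are invented for the example, inodes are assumed to arrive in ascending inode order as in the report, and the root directory (256 here) is assumed to exist already.

```c
#include <assert.h>

#define MAX_ENTRIES 64

/* Hypothetical model of one new inode and the directory it belongs in. */
struct new_inode {
	unsigned long ino;
	unsigned long parent;
};

/* Has this inode already been moved to its final place? */
static int is_placed(const unsigned long *order, int count, unsigned long ino)
{
	for (int i = 0; i < count; i++)
		if (order[i] == ino)
			return 1;
	return 0;
}

/*
 * Walk the inodes in the order given and write to order[] the sequence
 * in which their final renames can be issued. An inode whose parent
 * directory is not yet in place is deferred until that directory is.
 * Returns the number of renames planned, or -1 if n is too large.
 */
int plan_renames(const struct new_inode *e, int n, unsigned long root,
		 unsigned long *order)
{
	int deferred[MAX_ENTRIES] = { 0 };
	int count = 0;

	if (n > MAX_ENTRIES)
		return -1;

	for (int i = 0; i < n; i++) {
		if (e[i].parent != root &&
		    !is_placed(order, count, e[i].parent)) {
			deferred[i] = 1;	/* parent missing: postpone */
			continue;
		}
		order[count++] = e[i].ino;

		/* A newly placed directory may unblock deferred children. */
		for (int progress = 1; progress; ) {
			progress = 0;
			for (int j = 0; j < i; j++) {
				if (deferred[j] &&
				    is_placed(order, count, e[j].parent)) {
					order[count++] = e[j].ino;
					deferred[j] = 0;
					progress = 1;
				}
			}
		}
	}
	return count;
}
```

For the case in the report, with file 268 inside directory 270 under root 256, the planned order is 270 first and 268 second, so the rename into `dir2` is only issued once `dir2` exists.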
Re: [RFC PATCH 6/6] Btrfs-progs: add btrfs send/receive commands
On Thu, Jul 19, 2012 at 3:25 PM, Alex Lyakas <alex.bolshoy.bt...@gmail.com> wrote:
>> +static int process_link(const char *path, const char *lnk, void *user)
>> +{
>> +       int ret;
>> +       struct btrfs_receive *r = user;
>> +       char *full_path = path_cat(r->full_subvol_path, path);
>> +
>> +       if (g_verbose >= 1)
>> +               fprintf(stderr, "link %s -> %s\n", path, lnk);
>> +
>> +       ret = link(lnk, full_path);
>> +       if (ret < 0) {
>> +               ret = -errno;
>> +               fprintf(stderr, "ERROR: link %s -> %s failed. %s\n", path,
>> +                               lnk, strerror(-ret));
>> +       }
>
> Actually it has to be:
> char *full_link_path = path_cat(r->full_subvol_path, lnk);
> ...
> ret = link(full_path/*oldpath*/, full_link_path/*newpath*/);
> ...
> free(full_link_path);
>
> Thanks,
> Alex.

Actually, the paths got mixed up in-kernel. You'll find a pushed fix in
the kernel repo. I also pushed a fix to btrfs-progs containing the
full_link_path.

Thanks again :)
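Both points raised in this exchange, the argument order of link(2) and the need to prefix both names with the subvolume path, can be seen in a small standalone C sketch. `link_in_subvol` is a hypothetical helper written for this example and is not the btrfs-progs function; it builds the paths with snprintf() instead of the patch's path_cat().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create new_rel as a hard link to existing_rel,
 * both given relative to the subvolume root. Returns 0 or -errno.
 */
int link_in_subvol(const char *subvol, const char *existing_rel,
		   const char *new_rel)
{
	char oldpath[PATH_MAX];
	char newpath[PATH_MAX];
	int n;

	n = snprintf(oldpath, sizeof(oldpath), "%s/%s", subvol, existing_rel);
	if (n < 0 || (size_t)n >= sizeof(oldpath))
		return -ENAMETOOLONG;
	n = snprintf(newpath, sizeof(newpath), "%s/%s", subvol, new_rel);
	if (n < 0 || (size_t)n >= sizeof(newpath))
		return -ENAMETOOLONG;

	/* link(2) takes the existing name first, the name to create second. */
	if (link(oldpath, newpath) < 0)
		return -errno;
	return 0;
}
```

Swapping the two arguments, or passing one of them without the subvolume prefix, typically shows up as ENOENT or EEXIST at receive time, depending on which names already exist.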
No/bad auto-detection of fs type for small volumes (related to mixed metadata/data?)
When I create a btrfs volume smaller than 256 MiB and then run

  mount /dev/sdb1 /mnt/test

the kernel tries unsuccessfully to mount it as many other file systems
before finally succeeding with btrfs. For volumes of 256 MiB or larger it
just mounts the volume without doing that. Why is there this discrepancy?

Another, possibly related, symptom is that the volume does not appear in
/dev/disk/by-label and /dev/disk/by-uuid at all. This means that it is
impossible to mount the volume by UUID or label.

To make sure that this isn't a udev bug, I booted my system with
init=/bin/bash on the kernel command line and then tried again to mount
the volume. This time it would not mount at all unless I explicitly
specified the fs type. On the other hand, it could mount larger volumes
without any issues.

All the experiments were done on an initially zeroed-out disk. I am using
a 3.4.6 kernel with btrfs from 3.5 and the latest btrfs-progs from git.
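One way to narrow this down might be to check by hand whether the btrfs signature is where a prober would look for it. The sketch below assumes the usual on-disk layout (primary superblock at 64 KiB, 8-byte magic "_BHRfS_M" at offset 0x40 inside it) and only inspects a buffer that the caller has already read from the start of the device. `has_btrfs_magic` is a hypothetical helper, not part of blkid, udev or btrfs-progs, and it says nothing about why a prober might skip small volumes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BTRFS_SUPER_INFO_OFFSET (64 * 1024)	/* primary superblock */
#define BTRFS_MAGIC_OFFSET      0x40		/* offset inside the superblock */
#define BTRFS_MAGIC             "_BHRfS_M"
#define BTRFS_MAGIC_LEN         8

/*
 * Hypothetical helper: image points at the first len bytes of the
 * device. Returns 1 if the btrfs magic is present at the expected
 * place, 0 if it is absent or the buffer is too short to tell.
 */
int has_btrfs_magic(const unsigned char *image, size_t len)
{
	size_t pos = BTRFS_SUPER_INFO_OFFSET + BTRFS_MAGIC_OFFSET;

	if (len < pos + BTRFS_MAGIC_LEN)
		return 0;
	return memcmp(image + pos, BTRFS_MAGIC, BTRFS_MAGIC_LEN) == 0;
}
```

If the magic is present on the small volume as well, the difference is more likely in the probing tool than in what mkfs wrote.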
Upgrading from 2.6.38, how?
Firstly, I know what I've been doing has been less than 100% safe, but
I've been prepared to live with it.

For about 2 years now (you know, from around the time btrfs looked like
RAID5/6 was just around the corner) I've had a server with a 5-disk
RAID10 btrfs array. I realise there has been quite some change to the
btrfs implementation since 2.6.38, but I'm hoping that there shouldn't be
anything blocking me from moving to a much more modern kernel.

My proposed upgrade method is:
Boot from a live CD with the latest kernel I can find so I can do a few tests:
 A - run the fsck in read only mode to confirm things look good
 B - mount read only, confirm that I can read files well
 C - mount read write, confirm working
Install latest OS, upgrade to latest kernel, then repeat above steps.

Any likely hiccups with the above procedure and suggested alternatives?

-- 
Gareth Pye
Level 2 Judge, Melbourne, Australia
Australian MTG Forum: mtgau.com
gar...@cerberos.id.au - www.rockpaperdynamite.wordpress.com
"Dear God, I would like to file a bug report"
Re: Upgrading from 2.6.38, how?
On Wed, Jul 25, 2012 at 11:39 AM, Gareth Pye <gar...@cerberos.id.au> wrote:
> My proposed upgrade method is:
> Boot from a live CD with the latest kernel I can find so I can do a few tests:
>  A - run the fsck in read only mode to confirm things look good
>  B - mount read only, confirm that I can read files well
>  C - mount read write, confirm working
> Install latest OS, upgrade to latest kernel, then repeat above steps.
>
> Any likely hiccups with the above procedure and suggested alternatives?

I'd simply install the new OS on a new partition/subvol. This is what
I did when upgrading from natty -> oneiric -> precise.

IIRC there are some incompatibilities (e.g. space/inode cache disk
format?) but newer kernels will just do the right thing, drop the old
cache and create a new one.

-- 
Fajar