Re: Set nodatacow per file?
On 02/13/2012 03:34 PM, Chester wrote:
> On Mon, Feb 13, 2012 at 1:17 AM, Ralf-Peter Rohbeck rohb...@yahoo.com wrote:
>> Hello,
>> is it possible to set nodatacow on a per-file basis? I couldn't find anything.
>> If not, wouldn't that be a great feature to get around the performance issues with VM and database storage? Of course cloning should still cause COW.
>
> IIRC this is already a feature in btrfs. I didn't catch the whole talk, but Chris mentioned something like this at Scale 10x. I also remember seeing a patch for it a while back (I think it was from liu bo) that does this.

You're right, and I've made the progs patches, which are against util-linux (chattr/lsattr):

http://www.spinics.net/lists/linux-btrfs/msg09604.html
http://www.spinics.net/lists/linux-btrfs/msg09605.html

but they are not merged yet.

thanks,
liubo

>> Thanks,
>> Ralf-Peter

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Set nodatacow per file?
On 02/13/2012 05:09 PM, Roman Mamedov wrote:
> On Mon, 13 Feb 2012 16:40:03 +0900 dima dole...@parallels.com wrote:
>> Hello Ralf-Peter,
>> Actually it is possible. Check out David's response to my question from some time ago:
>> http://permalink.gmane.org/gmane.comp.file-systems.btrfs/14227
>>
>> The nocow.c script he attached does just the thing you want. The script is really working. I needed nocow for a different purpose, but it did not occur to me to try it on a VM image and see if the performance would improve. Sounds like a great idea. If you get around to trying it, please post your impressions here.
>
> Thanks for the link, this is indeed interesting.
> I made a couple of small changes, i.e. I wanted a way to unset nocow and to check that changing flags really worked.
>
> gcc -o /usr/local/bin/nocow nocow.c
> ln -sf /usr/local/bin/nocow /usr/local/bin/cow
>
> Perhaps the support for setting this flag should be added to the 'btrfs' utility.

Cool. Thanks Roma!
I really wanted the feature to 'unset' the nocow and check the current state of flags. Will check it out today.

I also think that it should definitely be included in the userspace btrfs utilities.

best
~dima
Re: [Cluster-devel] [PATCH 3/4] gfs2: Use generic handlers of O_SYNC AIO DIO
Hi,

Acked-by: Steven Whitehouse swhit...@redhat.com

That looks ok to me,

Steve.

On Fri, 2012-02-10 at 17:04 +0100, Jan Kara wrote:
> Use generic handlers to queue fsync() when AIO DIO is completed for O_SYNC file.
>
> Signed-off-by: Jan Kara j...@suse.cz
> ---
>  fs/gfs2/aops.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
> index 501e5cb..9c381ff 100644
> --- a/fs/gfs2/aops.c
> +++ b/fs/gfs2/aops.c
> @@ -1034,7 +1034,7 @@ static ssize_t gfs2_direct_IO(int rw, struct kiocb *iocb,
>  
>  	rv = __blockdev_direct_IO(rw, iocb, inode, inode->i_sb->s_bdev, iov,
>  				  offset, nr_segs, gfs2_get_block_direct,
> -				  NULL, NULL, 0);
> +				  NULL, NULL, DIO_SYNC_WRITES);
>  out:
>  	gfs2_glock_dq_m(1, &gh);
>  	gfs2_holder_uninit(&gh);
Re: can't read superblock (but could mount)
On Sat, Feb 11, 2012 at 07:27:25AM +0100, bt...@nentwig.biz wrote:
> Quoting Chris Mason chris.ma...@oracle.com:
>> On Fri, Feb 10, 2012 at 05:18:42PM -0500, Chris Mason wrote:
>>> Ok, step one: Pull down the dangerdonteveruse branch of btrfs-progs:
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git dangerdonteveruse
>>>
>>> Run btrfs-debug-tree -r /dev/sda1 and send the output here please.
>>
>> Sorry, that's btrfs-debug-tree -R /dev/sda1
>
> # ./btrfs-debug-tree -R /dev/sda1
> root tree: 10229936128 level 1
> chunk tree: 10364125184 level 1
> extent tree key (EXTENT_TREE ROOT_ITEM 0) 10229944320 level 3
> device tree key (DEV_TREE ROOT_ITEM 0) 10192654336 level 0
> fs tree key (FS_TREE ROOT_ITEM 0) 10103791616 level 3
> checksum tree key (CSUM_TREE ROOT_ITEM 0) 10103156736 level 2
> data reloc tree key (DATA_RELOC_TREE ROOT_ITEM 0) 10027970560 level 0

Ok, I'm surprised our bad block wasn't one of the roots, we're looking for 9872289792. Could you please do:

btrfs-debug-tree -b 9872289792 /dev/xxx

Then please run the new fsck (without any args) on the device and send the output here. Before we mount with -o recovery or try to repair things, I just want to double check where this bad block lives.

> But why is it that the bootloader apparently is able to load (at least) the kernel (and initrd) from the partition?

The btrfs-debug-tree command above should answer this, but I'd guess that syslinux either isn't checking the parent trans id or it just doesn't need to read this block to find the files.

That's a good sign ;)

-chris
Re: Set nodatacow per file?
On 02/13/2012 05:09 PM, Roman Mamedov wrote:
> On Mon, 13 Feb 2012 16:40:03 +0900 dima dole...@parallels.com wrote:
>> Hello Ralf-Peter,
>> Actually it is possible. Check out David's response to my question from some time ago:
>> http://permalink.gmane.org/gmane.comp.file-systems.btrfs/14227
>>
>> The nocow.c script he attached does just the thing you want. The script is really working. I needed nocow for a different purpose, but it did not occur to me to try it on a VM image and see if the performance would improve. Sounds like a great idea. If you get around to trying it, please post your impressions here.
>
> Thanks for the link, this is indeed interesting.
> I made a couple of small changes, i.e. I wanted a way to unset nocow and to check that changing flags really worked.
>
> gcc -o /usr/local/bin/nocow nocow.c
> ln -sf /usr/local/bin/nocow /usr/local/bin/cow
>
> Perhaps the support for setting this flag should be added to the 'btrfs' utility.

Hello Roman,

I don't seem to be able to 'unset' the NOCOW flag. Looking at the code I would guess that it is supposed to alternate between 'cow' and 'nocow' states, but the message from

printf("Remove NOCOW flag for %s\n", argv[1]);

never shows for me. What should I do to make it work?

Maybe it would be nice to have a switch to just check if nocow is set on a file without actually changing the flag.

thanks
~dima
Re: Several unhappy btrfs's after RAID meltdown
On Sun, Feb 12, 2012 at 10:31:34AM -0600, Ryan C. Underwood wrote:
> So, I examined the below filesystem, the one of the two that I would really like to restore. There is basically nothing but zeros, and very occasionally a sparse string of data, until exactly 0x20 offset,

This matches the start of an allocation cluster.

> ... at which point the data is suddenly very packed and looks like usual compressed data should. Is there a way one could de-LZO the data chunkwise and dump to another device so I could even get an idea what I am looking at?

If the blocks are in the right order, you can decompress the raw data from the format

[4B total length] [4B compressed chunk length][chunk data] [another chunk]

There is no signature at the compressed extent boundaries, but the lengths stored are always smaller than 128K, so you get hex values like

23 04 00 00 | 34 01 00 00 | lzo data...

which should be detectable in the block sequence.

> What about a 'superblock' signature I can scan for?

_BHRfS_M at offset 0x40 in a 4kb aligned block

david
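The two scanning hints above could be sketched in C roughly as follows. This is only an illustration, not a recovery tool: the function names are made up for this example, the image is assumed to be fully loaded into memory, and the extra check that the chunk length does not exceed the total length is an assumption rather than something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE   4096u          /* superblock candidates sit in 4 KiB aligned blocks */
#define MAGIC_OFFSET 0x40u          /* offset of the signature inside such a block */
#define MAX_LEN      (128u * 1024u) /* stored lengths are always below 128K */

/* The 8-byte signature "_BHRfS_M" (no terminating NUL needed). */
static const char BTRFS_MAGIC[8] = { '_', 'B', 'H', 'R', 'f', 'S', '_', 'M' };

/* Read a 32-bit little-endian value byte by byte, independent of host endianness. */
uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Return the byte offset of the first 4 KiB aligned block at or after 'from'
 * that carries the signature at MAGIC_OFFSET, or -1 if there is none.
 */
long find_superblock(const unsigned char *buf, size_t len, size_t from)
{
	size_t off = (from + BLOCK_SIZE - 1) / BLOCK_SIZE * BLOCK_SIZE;

	assert(buf != NULL);
	for (; off + MAGIC_OFFSET + sizeof(BTRFS_MAGIC) <= len; off += BLOCK_SIZE) {
		if (memcmp(buf + off + MAGIC_OFFSET, BTRFS_MAGIC,
			   sizeof(BTRFS_MAGIC)) == 0)
			return (long)off;
	}
	return -1;
}

/*
 * Plausibility check for the start of a compressed extent:
 * [4B total length][4B compressed chunk length], both non-zero and
 * below 128K. The "chunk <= total" test is an assumption of this sketch.
 */
int looks_like_lzo_header(const unsigned char *p, size_t avail)
{
	uint32_t total, chunk;

	if (avail < 8)
		return 0;
	total = read_le32(p);
	chunk = read_le32(p + 4);
	return total != 0 && chunk != 0 &&
	       total < MAX_LEN && chunk < MAX_LEN && chunk <= total;
}
```

A caller would walk the device image block by block, calling looks_like_lzo_header() at each candidate offset, and hand any plausible chunk to an LZO decompressor. False positives are to be expected, since two small little-endian integers are a weak signature.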
Re: Set nodatacow per file?
On Mon, 13 Feb 2012 22:42:23 +0900 dima dole...@parallels.com wrote:
>> gcc -o /usr/local/bin/nocow nocow.c
>> ln -sf /usr/local/bin/nocow /usr/local/bin/cow
>
> I don't seem to be able to 'unset' the NOCOW flag. Looking at the code I would guess that it is supposed to alternate between 'cow' and 'nocow' states, but the condition
>
> printf("Remove NOCOW flag for %s\n", argv[1]);
>
> never shows for me. What should I do to make it working?
>
> Maybe it would be nice to have a switch to just check if nocow is set on file without actually changing the flag.

If you place it in /usr/local/bin and also make the 'cow' symlink pointing to 'nocow', as described in the quoted part above, you can then just run:

# cow ./filename

and the program will instead unset the NOCOW flag.

Of course it remains a very basic program. I'm not sure if it's worth developing it further, or adding this to 'btrfs', as Liu Bo said in the adjacent thread that there are patches to chattr/lsattr for using the COW attribute.

--
With respect,
Roman

~~~
Stallman had a printer, with code he could not see.
So he began to tinker, and set the software free.
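Roman's modified nocow.c is not included in this thread, so the following is only a sketch of how a program can change its behaviour depending on the name it was invoked under, which is what the cow/nocow symlink relies on. The function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Return the last path component of argv[0] without modifying the string. */
const char *invoked_name(const char *argv0)
{
	const char *slash;

	assert(argv0 != NULL);
	slash = strrchr(argv0, '/');
	return slash ? slash + 1 : argv0;
}

/*
 * Return 1 when the binary was started as "cow" (clear the NOCOW flag),
 * 0 for any other name such as "nocow" (set the NOCOW flag).
 */
int wants_unset(const char *argv0)
{
	return strcmp(invoked_name(argv0), "cow") == 0;
}
```

In main() this would be called as wants_unset(argv[0]). Because the decision depends only on the name, running the binary as nocow can never reach the "remove" branch, which matches the behaviour dima observed before creating the symlink.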
Re: Set nodatacow per file?
Hi,

On Mon, Feb 13, 2012 at 04:40:03PM +0900, dima wrote:
> Actually it is possible. Check out David's response to my question from some time ago:
> http://permalink.gmane.org/gmane.comp.file-systems.btrfs/14227

this was a quick aid, please see the attached file for an updated tool to set the file flags. It now adds 'z' for the NOCOMPRESS flag, and supports chattr syntax plus all of the standard file flags.

Setting and unsetting nocow is done like 'fileflags +C file' or -C for unsetting. Without any + or - options it prints the current state.

david

#include <errno.h>
#include <fcntl.h>
#include <inttypes.h>
#include <linux/fs.h>
#include <linux/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>	/* close() */

#ifndef FS_NOCOMP_FL
#define FS_NOCOMP_FL	0x00000400 /* Don't compress */
#endif
#ifndef FS_NOCOW_FL
#define FS_NOCOW_FL	0x00800000 /* Do not cow file */
#endif

struct flags_name {
	unsigned long flag;
	const char short_name;
	const char *long_name;
};

static const struct flags_name flags[] = {
	/* new */
	{ FS_NOCOW_FL, 'C', "NOCOW" },
	{ FS_NOCOMP_FL, 'z', "Not_Compressed" },
	/* current */
	{ FS_SECRM_FL, 's', "Secure_Deletion" },
	{ FS_UNRM_FL, 'u', "Undelete" },
	{ FS_SYNC_FL, 'S', "Synchronous_Updates" },
	{ FS_DIRSYNC_FL, 'D', "Synchronous_Directory_Updates" },
	{ FS_IMMUTABLE_FL, 'i', "Immutable" },
	{ FS_APPEND_FL, 'a', "Append_Only" },
	{ FS_NODUMP_FL, 'd', "No_Dump" },
	{ FS_NOATIME_FL, 'A', "No_Atime" },
	{ FS_COMPR_FL, 'c', "Compression_Requested" },
	{ FS_COMPRBLK_FL, 'B', "Compressed_File" },
	{ FS_DIRTY_FL, 'Z', "Compressed_Dirty_File" },
	{ FS_ECOMPR_FL, 'E', "Compression_Error" },
	{ FS_JOURNAL_DATA_FL, 'j', "Journaled_Data" },
	{ FS_INDEX_FL, 'I', "Indexed_directory" },
	{ FS_NOTAIL_FL, 't', "No_Tailmerging" },
	{ FS_TOPDIR_FL, 'T', "Top_of_Directory_Hierarchies" },
	/* { EXT4_EXTENTS_FL, 'e', "Extents" }, */
	/* { EXT4_HUGE_FILE_FL, 'h', "Huge_file" }, */
};

/* Flag bits collected from the +X and -X command line options. */
static unsigned long to_set, to_unset;

/* Map a chattr-style letter to its flag bit; returns 0 if unknown. */
unsigned long c2val(char c)
{
	size_t i;

	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		if (flags[i].short_name == c)
			return flags[i].flag;
	}
	printf("Warning: flag '%c' not found\n", c);
	return 0;
}

/* Print every known flag that is set in fflags. */
void list_flags(unsigned long fflags)
{
	size_t i;

	printf("Flags:\n");
	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		if (fflags & flags[i].flag) {
			printf("  %c %s\n", flags[i].short_name,
			       flags[i].long_name);
		}
	}
}

/* Queue all letters of a "+..." option for setting. */
void set_flag(const char *in)
{
	size_t i;

	for (i = 0; i < strlen(in); i++) {
		to_set |= c2val(in[i]);
		printf(" set %c\n", in[i]);
	}
}

/* Queue all letters of a "-..." option for clearing. */
void unset_flag(const char *in)
{
	size_t i;

	for (i = 0; i < strlen(in); i++) {
		to_unset |= c2val(in[i]);
		printf(" unset %c\n", in[i]);
	}
}

int main(int argc, char **argv)
{
	int ret;
	int optind;
	int modify = 0;

	if (argc < 2) {
		printf("usage: %s [options] [--] file...\n", argv[0]);
		exit(1);
	}
	to_set = 0;
	to_unset = 0;
	/* Parse leading +flags / -flags options; "--" ends option parsing. */
	for (optind = 1; optind < argc; optind++) {
		char *o = argv[optind];

		if (o[0] == '-' && o[1] == '-') {
			optind++;
			break;
		} else if (o[0] == '-') {
			unset_flag(o + 1);
			modify = 1;
		} else if (o[0] == '+') {
			set_flag(o + 1);
			modify = 1;
		} else
			break;
	}
	/* The remaining arguments are files to inspect or modify. */
	for (; optind < argc; optind++) {
		/* The flags ioctls transfer an int-sized value. */
		unsigned int fflags = 0;
		int fd;

		fd = open(argv[optind], O_RDONLY);
		if (fd == -1) {
			perror("open()");
			continue;
		}
		ret = ioctl(fd, FS_IOC_GETFLAGS, &fflags);
		if (ret == -1) {
			perror("ioctl(GETFLAGS)");
			close(fd);
			continue;
		}
		if (modify) {
			fflags |= to_set;
			fflags &= ~to_unset;
			ret = ioctl(fd, FS_IOC_SETFLAGS, &fflags);
			if (ret == -1) {
				perror("ioctl(SETFLAGS)");
				close(fd);
				continue;
			}
		}
		printf("File: %s\n", argv[optind]);
		list_flags(fflags);
		putchar('\n');
		close(fd);
	}
	return 0;
}
Re: Set nodatacow per file?
The fileflags utility works great! Thanks!

On 13.02.2012 15:10, David Sterba wrote:
> Hi,
>
> On Mon, Feb 13, 2012 at 04:40:03PM +0900, dima wrote:
>> Actually it is possible. Check out David's response to my question from some time ago:
>> http://permalink.gmane.org/gmane.comp.file-systems.btrfs/14227
>
> this was a quick aid, please see attached file for an updated tool to set the file flags, now added 'z' for NOCOMPRESS flag, and supports chattr syntax plus all of the standard file flags.
>
> Setting and unsetting nocow is done like 'fileflags +C file' or -C for unsetting. Without any + or - options it prints current state.
>
> david
Re: Set nodatacow per file?
On 02/13/2012 10:51 PM, Roman Mamedov wrote:
> On Mon, 13 Feb 2012 22:42:23 +0900 dima dole...@parallels.com wrote:
>>> gcc -o /usr/local/bin/nocow nocow.c
>>> ln -sf /usr/local/bin/nocow /usr/local/bin/cow
>>
>> I don't seem to be able to 'unset' the NOCOW flag. Looking at the code I would guess that it is supposed to alternate between 'cow' and 'nocow' states, but the condition
>>
>> printf("Remove NOCOW flag for %s\n", argv[1]);
>>
>> never shows for me. What should I do to make it working?
>>
>> Maybe it would be nice to have a switch to just check if nocow is set on file without actually changing the flag.
>
> If you place it in /usr/local/bin and also make the 'cow' symlink pointing to 'nocow', as described in the quoted part above, you can then just run:
>
> # cow ./filename
>
> and the program will instead unset the NOCOW flag.
>
> Of course it remains a very basic program. I'm not sure if it's worth developing it further, or adding this to 'btrfs', as Liu Bo said in the adjacent thread that there are patches to chattr/lsattr for using the COW attribute.

Thanks Roma!
I overlooked the symlink. Works!

~dima
Re: BTRFS crash during mount
On Sat, Feb 11, 2012 at 03:44:08PM +0100, Daniel Kuhn wrote:
> The mount option -o recovery doesn't change anything, the segmentation fault still occurs. Any ideas?

Sorry for the hassle, you should be able to get by this by zeroing the log root. Run

btrfs-zero-log /dev/xxx

-chris

> Daniel
>
> cwillu wrote:
>> On Wed, Feb 8, 2012 at 4:19 PM, Daniel Kuhn che...@swissonline.ch wrote:
>>> After a forced power turn-off the filesystem of my primary boot partition cannot be mounted anymore, btrfs crashes during the mount process. I'm using OpenSuse 12.1 but I've also tried mounting with a newer kernel 3.2.2 (systemrescue cd) and with a usb-converter connected to another PC without success. The kernel log seems pretty specific about the crash location, see below.
>>>
>>> Best regards,
>>> Daniel Kuhn
>>>
>>> [   66.476674] ------------[ cut here ]------------
>>> [   66.476684] kernel BUG at fs/btrfs/free-space-cache.c:1515!
>>> [   66.476691] invalid opcode: 0000 [#1] SMP
>>> [   66.476699] Modules linked in: tpm_tis tpm tpm_bios i2c_nforce2 serio_raw pcspkr floppy k10temp asus_atk0110 raid10 raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid1 raid0 multipath linear ata_generic nouveau ttm drm_kms_helper drm i2c_algo_bit firewire_ohci i2c_core pata_acpi mxm_wmi forcedeth pata_marvell firewire_core pata_amd video wmi
>>> [   66.476752]
>>> [   66.476759] Pid: 1844, comm: mount Not tainted 3.2.2-alt250-i586 #2 System manufacturer System Product Name/M3N-HT DELUXE
>>> [   66.476772] EIP: 0060:[<c06f7b6f>] EFLAGS: 00010206 CPU: 2
>>> [   66.476785] EIP is at remove_from_bitmap+0xa8/0x285
>>> [   66.476792] EAX: 6a92c000 EBX: 00000000 ECX: 0005c000 EDX: 00000002
>>> [   66.476799] ESI: f2f5baa8 EDI: f2f5ba8c EBP: f2f5ba48 ESP: f2f5b9ec
>>> [   66.476805] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>>> [   66.476813] Process mount (pid: 1844, ti=f2f5a000 task=f2ff7080 task.ti=f2f5a000)
>>> [   66.476818] Stack:
>>> [   66.476822] f2f5ba2c 0385 f2f5ba58 f2750370 f2f5ba48 f2f5ba44 f2f5ba40
>>> [   66.476837] 0019 71bf 0002 71c0 0002 f3159600 073ba000
>>> [   66.476851] 0005c000 6a92c000 0002 f2f5baa8 f2750370 f2f5baa0
>>> [   66.476865] Call Trace:
>>> [   66.476877] [<c06f9bf4>] btrfs_remove_free_space+0x34c/0x370
>>> [   66.476889] [<c06bcfa3>] btrfs_alloc_logged_file_extent+0x114/0x211
>>> [   66.476900] [<c06af00c>] ? btrfs_free_path+0x1b/0x1e
>>> [   66.476909] [<c06af00c>] ? btrfs_free_path+0x1b/0x1e
>>> [   66.476919] [<c06f5afd>] replay_one_extent+0x470/0x5f2
>>> [   66.476929] [<c050ef9a>] ? __fsnotify_inode_delete+0x8/0xa
>>> [   66.476941] [<c06f6f55>] replay_one_buffer+0x1d6/0x229
>>> [   66.476950] [<c06f2cfe>] walk_down_log_tree+0x15b/0x2cd
>>> [   66.476959] [<c06f3062>] walk_log_tree+0x71/0x188
>>> [   66.476968] [<c06f5011>] btrfs_recover_log_trees+0x24a/0x257
>>
>> [snip]
>>
>> -o recovery under 3.2 or later should fix it up. You'll want to remain on 3.2 at that point, and then switch to 3.3 when that's released, and so on.
Re: Please help me solving BTRFS failure
On Mon, Feb 13, 2012 at 04:58:01PM +0300, Private Inf wrote:
> Hello Dave,
>
> According to this thread http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg11548.html you were able to fix your faulty BTRFS. Looks like I have the same problem as you do. It is described in this thread (user DeeKey): https://bugs.launchpad.net/ubuntu/+source/btrfs-tools/+bug/793410
>
> Could you please share your knowledge how to solve the problem? Your help will be highly appreciated!
>
> Regards,
> Denis Kulandin

Hey Denis,

btrfs failures are a bitch! I've had them happen to me on multiple occasions and am much wiser about backups now. I'm cross posting this to the mailing list so it will be on record in case anyone else has this issue (and so smarter people may chime in).

In the email you're referencing, I describe getting an older kernel (2.6.32 series) and patching btrfs to ignore checksums, since that tree is what's broken, then mounting read-only and copying the data to another partition. I can't get to kerneltrap for some reason, so below is the patch I created. I've never used Ubuntu so I can't give you any advice on custom kernels for that platform, but as a btrfs user, you're probably used to compiling kernels every other week ;)

The above might be purely academic though. There seems to be some on-the-fly repair magic built into recent kernels. If all else fails, you should grab a recent git of progs; I hear there's some tool that will copy data from a busted btrfs to another volume.

Good luck!

diff -ur linux-2.6.32.orig/fs/btrfs/compression.c linux-2.6.32/fs/btrfs/compression.c
--- linux-2.6.32.orig/fs/btrfs/compression.c	2011-08-02 16:11:53.514986277 -0400
+++ linux-2.6.32/fs/btrfs/compression.c	2011-08-02 16:12:58.621654825 -0400
@@ -140,8 +140,7 @@
 			       "wanted %u mirror %d\n", inode->i_ino,
 			       (unsigned long long)disk_start,
 			       csum, *cb_sum, cb->mirror_num);
-			ret = -EIO;
-			goto fail;
+			printk("btrfs ignoring compressed csum mismatch");
 		}
 		cb_sum++;
 
diff -ur linux-2.6.32.orig/fs/btrfs/inode.c linux-2.6.32/fs/btrfs/inode.c
--- linux-2.6.32.orig/fs/btrfs/inode.c	2011-08-02 16:11:53.514986277 -0400
+++ linux-2.6.32/fs/btrfs/inode.c	2011-08-02 16:13:32.821655813 -0400
@@ -1982,8 +1982,8 @@
 
 	csum = btrfs_csum_data(root, kaddr + offset, csum, end - start + 1);
 	btrfs_csum_final(csum, (char *)&csum);
-	if (csum != private)
-		goto zeroit;
+	if (csum != private && printk_ratelimit())
+		printk(KERN_INFO "btrfs ignoring csum mismatch");
 
 	kunmap_atomic(kaddr, KM_USER0);
 good:

--
-=[dave]=-

Entropy isn't what it used to be.
[RFB] add LZ4 compression method to btrfs
Hi,

so here it is, LZ4 compression method inside btrfs. The patchset is based on top of current Chris' for-linus + Andi's snappy implementation + the fixes from Li Zefan. Passes xfstests and stresstests.

I haven't measured performance on a wide range of hardware or workloads; rather, I wanted to publish the patches before I get distracted again. I'd like to ask anyone willing and able to test this to share your results with us.

At least an example from standalone benchmarks of snappy-c/snappy(upstream)/lzo/lz4:

Silesia corpus (avg of 10 runs), AMD bulldozer box, 12G ram, 1Ghz cpu:

lz4           =  739860 us ( 286 MB/s)  195930 us (1081 MB/s)  211957760 -> 101630873  47.9%
snappy 1.0.4  =    1050 ms ( 201 MB/s)     248 ms ( 853 MB/s)  211957760 -> 104739310  49.4%
snappy-c      =  940111 us ( 225 MB/s)  299690 us ( 707 MB/s)  211957760 -> 131060567  61.8%
lzo 2.06 1x_1 =  739421 us ( 286 MB/s)  436542 us ( 485 MB/s)  211957760 -> 100576151  47.5%

Silesia corpus (avg of 10 runs), Nehalem X7560, 2.3Ghz cpu:

lz4           =  624170 us ( 339 MB/s)  200622 us (1056 MB/s)  211957760 -> 101630873  47.9%
snappy 1.0.4  =    1047 ms ( 202 MB/s)     265 ms ( 797 MB/s)  211957760 -> 104739310  49.4%
snappy-c      =  836415 us ( 253 MB/s)  300567 us ( 705 MB/s)  211957760 -> 131060567  61.8%
lzo 2.06 1x_1 =  639305 us ( 331 MB/s)  470840 us ( 450 MB/s)  211957760 -> 100576151  47.5%

* snappy 1.0.4 svn r58
* snappy-c as Andi sent it to the mailing list
* lzo 2.0.6 1x_1 variant
* lz4 r55 (r54 + bugfix in the hash table entry type)
* compiled by gcc 4.7, -O2

pullable from:

git://repo.or.cz/linux-2.6/btrfs-unstable.git dev/compression-squad

david
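For anyone re-deriving the figures, here is a small sketch of the arithmetic behind one benchmark row. It assumes that 1 MB means 10^6 bytes, that the speed is truncated rather than rounded, and that the first time/speed pair is compression and the second decompression; none of this is stated in the post, but it reproduces the printed lz4 numbers. The function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Throughput in MB/s with 1 MB = 10^6 bytes. Bytes per microsecond is
 * numerically the same as MB per second; integer division truncates.
 */
uint64_t mb_per_s(uint64_t bytes, uint64_t usec)
{
	assert(usec != 0);
	return bytes / usec;
}

/*
 * Compressed size relative to the input size, in tenths of a percent,
 * rounded to nearest (479 means 47.9%).
 */
uint64_t ratio_tenths_percent(uint64_t in_bytes, uint64_t out_bytes)
{
	assert(in_bytes != 0);
	return (out_bytes * 1000 + in_bytes / 2) / in_bytes;
}
```

For the lz4 row on the Bulldozer box, mb_per_s(211957760, 739860) gives 286 and mb_per_s(211957760, 195930) gives 1081, while ratio_tenths_percent(211957760, 101630873) gives 479, i.e. 47.9%.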
[PATCH 2/4] btrfs: add LZ4 compression method
Adjusted a few defines to compile. The on-stack context allocation is disabled and either LZ4_compress64kCtx or LZ4_compressCtx must be used.

Origin: http://lz4.googlecode.com/svn/trunk
Revision: 55

Signed-off-by: David Sterba dste...@suse.cz
---
 fs/btrfs/Makefile |    2 +-
 fs/btrfs/lz4.c    |  810 +
 fs/btrfs/lz4.h    |  107 +++
 3 files changed, 918 insertions(+), 1 deletions(-)
 create mode 100644 fs/btrfs/lz4.c
 create mode 100644 fs/btrfs/lz4.h

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index f22fe03..11f8c4e 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -8,7 +8,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \
 	   export.o tree-log.o free-space-cache.o zlib.o lzo.o \
 	   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
-	   reada.o backref.o ulist.o snappy.o
+	   reada.o backref.o ulist.o snappy.o lz4.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/lz4.c b/fs/btrfs/lz4.c
new file mode 100644
index 0000000..e41d0cf
--- /dev/null
+++ b/fs/btrfs/lz4.c
@@ -0,0 +1,810 @@
+/*
+   LZ4 - Fast LZ compression algorithm
+   Copyright (C) 2011-2012, Yann Collet.
+   BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions are
+   met:
+
+       * Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer.
+       * Redistributions in binary form must reproduce the above
+   copyright notice, this list of conditions and the following disclaimer
+   in the documentation and/or other materials provided with the
+   distribution.
+
+   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+/*
+ * With authors permission dual licensed as BSD/GPL for linux kernel
+ *
+ * Origin: http://lz4.googlecode.com/svn/trunk
+ * Revision: 55
+ */
+
+//**
+// Tuning parameters
+//**
+// COMPRESSIONLEVEL :
+// Increasing this value improves compression ratio
+// Lowering this value reduces memory usage
+// Reduced memory usage typically improves speed, due to cache effect (ex : L1 32KB for Intel, L1 64KB for AMD)
+// Memory usage formula : N->2^(N+2) Bytes (examples : 12 -> 16KB ; 17 -> 512KB)
+#define COMPRESSIONLEVEL 12
+
+// NONCOMPRESSIBLE_CONFIRMATION :
+// Decreasing this value will make the algorithm skip faster data segments considered incompressible
+// This may decrease compression ratio dramatically, but will be faster on incompressible data
+// Increasing this value will make the algorithm search more before declaring a segment incompressible
+// This could improve compression a bit, but will be slower on incompressible data
+// The default value (6) is recommended
+#define NONCOMPRESSIBLE_CONFIRMATION 6
+
+// BIG_ENDIAN_NATIVE_BUT_INCOMPATIBLE :
+// This will provide a boost to performance for big endian cpu, but the resulting compressed stream will be incompatible with little-endian CPU.
+// You can set this option to 1 in situations where data will stay within closed environment
+// This option is useless on Little_Endian CPU (such as x86)
+//#define BIG_ENDIAN_NATIVE_BUT_INCOMPATIBLE 1
+
+
+
+//**
+// CPU Feature Detection
+//**
+// 32 or 64 bits ?
+#if (__x86_64__ || __x86_64 || __amd64__ || __amd64 || __ppc64__ || _WIN64 || __LP64__ || _LP64) // Detects 64 bits mode
+#define ARCH64 1
+#else
+#define ARCH64 0
+#endif
+
+// Little Endian or Big Endian ?
+#if (__BIG_ENDIAN__ || __BIG_ENDIAN || _BIG_ENDIAN || _ARCH_PPC || __PPC__ || __PPC || PPC || __powerpc__ || __powerpc || powerpc || ((defined(__BYTE_ORDER__)&&(__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__))) )
+#define CPU_BIG_ENDIAN 1
+#else
+// Little Endian assumed. PDP Endian and other very rare endian format are unsupported.
+#endif
+
+//
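The memory usage formula from the COMPRESSIONLEVEL comment can be checked with a few lines of C. This is a sketch only: the function name is hypothetical, and reading 2^(N+2) as 2^N hash table entries of 4 bytes each is an assumption, not something the quoted part of the patch states.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Working memory in bytes for COMPRESSIONLEVEL N, following the comment
 * "N->2^(N+2) Bytes": shift 1 left by N+2 bit positions.
 */
size_t lz4_table_bytes(unsigned int level)
{
	/* Guard against shifting past the width of size_t. */
	assert(level + 2 < sizeof(size_t) * 8);
	return (size_t)1 << (level + 2);
}
```

With the default of 12 this yields 16384 bytes (16KB), and 17 yields 524288 bytes (512KB), matching the two examples given in the comment.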
[PATCH 3/4] btrfs: lz4: add wrapper for context size estimation
Currently 16K for 32bit and 64bit.

Signed-off-by: David Sterba dste...@suse.cz
---
 fs/btrfs/lz4.c |    9 +
 fs/btrfs/lz4.h |    3 +++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/lz4.c b/fs/btrfs/lz4.c
index e41d0cf..105ea9c 100644
--- a/fs/btrfs/lz4.c
+++ b/fs/btrfs/lz4.c
@@ -808,3 +808,12 @@ _output_error:
 	return (int) (-(((char*)ip)-source));
 }
 
+int LZ4_context_size(void)
+{
+	return sizeof(struct refTables);
+}
+int LZ4_context64k_size(void)
+{
+	return sizeof(struct refTables);
+}
+
diff --git a/fs/btrfs/lz4.h b/fs/btrfs/lz4.h
index bbd5e12..b0f8cc7 100644
--- a/fs/btrfs/lz4.h
+++ b/fs/btrfs/lz4.h
@@ -102,6 +102,9 @@ int LZ4_compress64kCtx(void** ctx,
 		char* dest,
 		int isize);
 
+int LZ4_context_size(void);
+int LZ4_context64k_size(void);
+
 #if defined (__cplusplus)
 }
 #endif
--
1.7.8
[PATCH 4/4] btrfs: lz4: add wrapper functions and enable it
Signed-off-by: David Sterba dste...@suse.cz
---
 fs/btrfs/Makefile      |    2 +-
 fs/btrfs/compression.c |    1 +
 fs/btrfs/compression.h |    1 +
 fs/btrfs/lz4_wrapper.c |  429
 fs/btrfs/lzo.c         |    2 +
 fs/btrfs/super.c       |    3 -
 6 files changed, 434 insertions(+), 4 deletions(-)
 create mode 100644 fs/btrfs/lz4_wrapper.c

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 11f8c4e..7bb7497 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -8,7 +8,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \
 	   export.o tree-log.o free-space-cache.o zlib.o lzo.o \
 	   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
-	   reada.o backref.o ulist.o snappy.o lz4.o
+	   reada.o backref.o ulist.o snappy.o lz4.o lz4_wrapper.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index f85f7fd..93b481e 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -731,6 +731,7 @@ struct btrfs_compress_op *btrfs_compress_op[] = {
 	&btrfs_zlib_compress,
 	&btrfs_lzo_compress,
 	&btrfs_snappy_compress,
+	&btrfs_lz4_compress,
 };
 
 int __init btrfs_init_compress(void)
diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
index 971a425..d8e8e73 100644
--- a/fs/btrfs/compression.h
+++ b/fs/btrfs/compression.h
@@ -80,5 +80,6 @@ struct btrfs_compress_op {
 extern struct btrfs_compress_op btrfs_zlib_compress;
 extern struct btrfs_compress_op btrfs_lzo_compress;
 extern struct btrfs_compress_op btrfs_snappy_compress;
+extern struct btrfs_compress_op btrfs_lz4_compress;
 
 #endif
diff --git a/fs/btrfs/lz4_wrapper.c b/fs/btrfs/lz4_wrapper.c
new file mode 100644
index 0000000..bff9b1b
--- /dev/null
+++ b/fs/btrfs/lz4_wrapper.c
@@ -0,0 +1,429 @@
+/*
+ * Copyright (C) 2008 Oracle.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/init.h>
+#include <linux/err.h>
+#include <linux/sched.h>
+#include <linux/pagemap.h>
+#include <linux/bio.h>
+#include "lz4.h"
+#include "compression.h"
+
+#define LZ4_LEN		4
+#define LZ4_CHUNK_SIZE	(4096)
+#define LZ4_MAX_WORKBUF	2*LZ4_CHUNK_SIZE
+
+struct workspace {
+	void *mem;	/* work memory for compression */
+	void *buf;	/* where compressed data goes */
+	void *cbuf;	/* where decompressed data goes */
+	struct list_head list;
+};
+
+static void lz4_free_workspace(struct list_head *ws)
+{
+	struct workspace *workspace = list_entry(ws, struct workspace, list);
+
+	vfree(workspace->buf);
+	vfree(workspace->cbuf);
+	vfree(workspace->mem);
+	kfree(workspace);
+}
+
+static struct list_head *lz4_alloc_workspace(void)
+{
+	struct workspace *workspace;
+
+	workspace = kzalloc(sizeof(*workspace), GFP_NOFS);
+	if (!workspace)
+		return ERR_PTR(-ENOMEM);
+
+	workspace->mem = vmalloc(LZ4_context64k_size());
+	workspace->buf = vmalloc(LZ4_MAX_WORKBUF);
+	workspace->cbuf = vmalloc(LZ4_MAX_WORKBUF);
+	if (!workspace->mem || !workspace->buf || !workspace->cbuf)
+		goto fail;
+
+	INIT_LIST_HEAD(&workspace->list);
+
+	return &workspace->list;
+fail:
+	lz4_free_workspace(&workspace->list);
+	return ERR_PTR(-ENOMEM);
+}
+
+static inline void write_compress_length(char *buf, size_t len)
+{
+	__le32 dlen;
+
+	dlen = cpu_to_le32(len);
+	memcpy(buf, &dlen, LZ4_LEN);
+}
+
+static inline size_t read_compress_length(char *buf)
+{
+	__le32 dlen;
+
+	memcpy(&dlen, buf, LZ4_LEN);
+	return le32_to_cpu(dlen);
+}
+
+static int lz4_compress_pages(struct list_head *ws,
+			      struct address_space *mapping,
+			      u64 start, unsigned long len,
+			      struct page **pages,
+			      unsigned long nr_dest_pages,
+			      unsigned long *out_pages,
+			      unsigned long *total_in,
+			      unsigned long *total_out,
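The write_compress_length() and read_compress_length() helpers in the patch store each compressed chunk's length as a 4-byte little-endian header. Below is a minimal userspace sketch of the same idea, assuming byte-wise encoding in place of the kernel's cpu_to_le32()/le32_to_cpu(). The names put_len_le32, get_len_le32 and LEN_HDR_SIZE are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LEN_HDR_SIZE 4	/* hypothetical name; plays the role of LZ4_LEN */

/*
 * Store len as a 4-byte little-endian header. Writing byte by byte
 * gives the same on-disk layout on big- and little-endian hosts.
 */
static inline void put_len_le32(unsigned char *buf, size_t len)
{
	uint32_t v;

	assert(buf != NULL);
	assert(len <= UINT32_MAX);	/* the header cannot hold more */

	v = (uint32_t)len;
	buf[0] = (unsigned char)(v & 0xff);
	buf[1] = (unsigned char)((v >> 8) & 0xff);
	buf[2] = (unsigned char)((v >> 16) & 0xff);
	buf[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Read the 4-byte little-endian header back into a host-order length. */
static inline size_t get_len_le32(const unsigned char *buf)
{
	uint32_t v;

	assert(buf != NULL);

	v = (uint32_t)buf[0] |
	    ((uint32_t)buf[1] << 8) |
	    ((uint32_t)buf[2] << 16) |
	    ((uint32_t)buf[3] << 24);
	return (size_t)v;
}
```

Fixing the byte order in the header is what lets a filesystem written on one architecture be read on another; the kernel helpers achieve the same thing with a memcpy of an already converted __le32 value.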
Re: [3.2.1] BUG at fs/btrfs/inode.c:1588
2012/2/1 Kai Krakow hurikhan77+bt...@gmail.com:
Just happened while writing a huge avi file to my usb3 backup disk:

Same problem here. I'll try to give the filesystem history:

a) three days ago I formatted a 219GB partition:
   1) latest Linus' git kernel tree;
   2) two nested subvolumes;
   3) options: defaults,noatime,nobarrier,ssd,noacl,compress,subvol=root,autodefrag
b) I copied ~90GB of data;
c) I mounted it with the same options as above, but without compress;
d) I copied more data, up to ~140GB;
e) I ran a balance; after a while I had to power off;
f) two days ago I booted and the balance finished;
g) I set up a cron job that takes a snapshot every hour;
h) today (after a lot of clean suspend/resume-to-RAM cycles) I ran VirtualBox and started an Ubuntu 12.04 install in the guest;
i) near the end of the installation VirtualBox got stuck (everything else kept working) and the kernel complained:

[16661.706465] ------------[ cut here ]------------
[16661.706514] kernel BUG at fs/btrfs/inode.c:1588!
[16661.706556] invalid opcode: [#1] SMP
[16661.706597] CPU 0
[16661.706615] Modules linked in: zram(C) xfs exportfs usbhid hid binfmt_misc pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) rfcomm bnep joydev snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep uvcvideo r852 sm_common nand videobuf2_core snd_pcm videodev nand_ids btusb v4l2_compat_ioctl32 bluetooth mtd videobuf2_vmalloc videobuf2_memops nand_bch bch option usb_wwan nand_ecc usbserial psmouse snd_timer iwl3945 snd_page_alloc iwlegacy raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid0 linear dm_mirror dm_region_hash dm_log usb_storage sdhci_pci sdhci i915 drm_kms_helper mmc_core drm intel_agp intel_gtt sky2 agpgart
[16661.707016]
[16661.707016] Pid: 710, comm: btrfs-fixup-1 Tainted: G C O 3.3.0-rc3g+ #13 SAMSUNG ELECTRONICS CO., LTD. SQ45S70S/SQ45S70S
[16661.707016] RIP: 0010:[811aedc5] [811aedc5] btrfs_writepage_fixup_worker+0x145/0x150
[16661.707016] RSP: :8800b9bb3df0 EFLAGS: 00010246
[16661.707016] RAX: RBX: ea0002185900 RCX: 02488000
[16661.707016] RDX: 8800897ae2a8 RSI: RDI: 8800897ae428
[16661.707016] RBP: 02488000 R08: 0008 R09: 8800b9bb3da8
[16661.707016] R10: 1000 R11: R12: 8800b75f5ff0
[16661.707016] R13: R14: 02488fff R15: 8800b75f5e70
[16661.707016] FS: () GS:8800bf40() knlGS:
[16661.707016] CS: 0010 DS: ES: CR0: 8005003b
[16661.707016] CR2: 7fa504fd40e0 CR3: ba515000 CR4: 06f0
[16661.707016] DR0: DR1: DR2:
[16661.707016] DR3: DR6: 0ff0 DR7: 0400
[16661.707016] Process btrfs-fixup-1 (pid: 710, threadinfo 8800b9bb2000, task 8800373edcd0)
[16661.707016] Stack:
[16661.707016] 81046a30 880085a0d960 88008e3602a0
[16661.707016] 8800bf410b40 8800b9ce3840 880085a0d968 8800b9ce3858
[16661.707016] 8800b9ce3890 8800b9bb3e90 8800b9ce3858 811d6972
[16661.707016] Call Trace:
[16661.707016] [81046a30] ? run_timer_softirq+0x220/0x220
[16661.707016] [811d6972] ? worker_loop+0xa2/0x500
[16661.707016] [811d68d0] ? btrfs_queue_worker+0x340/0x340
[16661.707016] [81056966] ? kthread+0x96/0xa0
[16661.707016] [8150b534] ? kernel_thread_helper+0x4/0x10
[16661.707016] [810568d0] ? kthread_freezable_should_stop+0x60/0x60
[16661.707016] [8150b530] ? gs_change+0xb/0xb
[16661.707016] Code: 41 5f c3 0f 1f 00 41 b8 50 00 00 00 48 8d 4c 24 18 4c 89 f2 48 89 ee 4c 89 ff e8 d7 a4 01 00 eb b8 48 89 df e8 3d 01 ef ff eb 9c 0f 0b 66 0f 1f 84 00 00 00 00 00 41 55 4c 8d 6e 40 41 54 55 48
[16661.707016] RIP [811aedc5] btrfs_writepage_fixup_worker+0x145/0x150
[16661.707016] RSP 8800b9bb3df0
[16661.731968] ---[ end trace 9b36ae9483fc03e3 ]---

l) I can't remove ~/VirtualBox VMs/Ubuntu (rm -rf never returns), but I can cleanly close other apps;
m) after booting from the recovery partition I can mount the btrfs filesystem and rm the directory above;
n) running btrfsck gives:

fs tree 454 refs 12
	unresolved ref root 454 dir 844801 index 5 namelen 9 name snap-0212 error 600
	unresolved ref root 455 dir 844801 index 5 namelen 9 name snap-0212 error 600
	unresolved ref root 458 dir 844801 index 5 namelen 9 name snap-0212 error 600
	unresolved ref root 459 dir 844801 index 5 namelen 9 name snap-0212 error 600
	unresolved ref root 466 dir 844801 index 5 namelen 9 name snap-0212 error 600
	unresolved ref root 498 dir 844801 index 5 namelen 9 name snap-0212 error 600
	unresolved ref root 500 dir 844801 index 5 namelen 9 name snap-0212 error 600
	unresolved ref root 501 dir 844801 index 5 namelen 9 name snap-0212 error 600
	unresolved ref root 503
[PATCH] btrfs-progs: convert: set label or copy from origin
Signed-off-by: David Sterba dste...@suse.cz
---
 convert.c |   46 ++
 1 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/convert.c b/convert.c
index 13f3ece..3e74108 100644
--- a/convert.c
+++ b/convert.c
@@ -2332,7 +2332,8 @@ err:
 	return ret;
 }
 
-int do_convert(const char *devname, int datacsum, int packing, int noxattr)
+int do_convert(const char *devname, int datacsum, int packing, int noxattr,
+	       int copylabel, const char *fslabel)
 {
 	int i, fd, ret;
 	u32 blocksize;
@@ -2424,6 +2425,17 @@ int do_convert(const char *devname, int datacsum, int packing, int noxattr)
 		fprintf(stderr, "error during create_ext2_image %d\n", ret);
 		goto fail;
 	}
+	memset(root->fs_info->super_copy.label, 0, BTRFS_LABEL_SIZE);
+	if (copylabel == 1) {
+		strncpy(root->fs_info->super_copy.label,
+				ext2_fs->super->s_volume_name, 16);
+		fprintf(stderr, "copy label '%s'\n",
+				root->fs_info->super_copy.label);
+	} else if (copylabel == -1) {
+		strncpy(root->fs_info->super_copy.label, fslabel, BTRFS_LABEL_SIZE);
+		fprintf(stderr, "set label to '%s'\n", fslabel);
+	}
+
 	printf("cleaning up system chunk.\n");
 	ret = cleanup_sys_chunk(root, ext2_root);
 	if (ret) {
@@ -2812,11 +2824,13 @@ fail:
 
 static void print_usage(void)
 {
-	printf("usage: btrfs-convert [-d] [-i] [-n] [-r] device\n");
-	printf("\t-d disable data checksum\n");
-	printf("\t-i ignore xattrs and ACLs\n");
-	printf("\t-n disable packing of small files\n");
-	printf("\t-r roll back to ext2fs\n");
+	printf("usage: btrfs-convert [-d] [-i] [-n] [-r] [-l label] [-L] device\n");
+	printf("\t-d disable data checksum\n");
+	printf("\t-i ignore xattrs and ACLs\n");
+	printf("\t-n disable packing of small files\n");
+	printf("\t-r roll back to ext2fs\n");
+	printf("\t-l LABEL set filesystem label\n");
+	printf("\t-L use label from converted fs\n");
 }
 
 int main(int argc, char *argv[])
@@ -2826,9 +2840,12 @@ int main(int argc, char *argv[])
 	int noxattr = 0;
 	int datacsum = 1;
 	int rollback = 0;
+	int copylabel = 0;
 	char *file;
+	char *fslabel = NULL;
+
 	while(1) {
-		int c = getopt(argc, argv, "dinr");
+		int c = getopt(argc, argv, "dinrl:L");
 		if (c < 0)
 			break;
 		switch(c) {
@@ -2844,6 +2861,19 @@ int main(int argc, char *argv[])
 		case 'r':
 			rollback = 1;
 			break;
+		case 'l':
+			copylabel = -1;
+			fslabel = strdup(optarg);
+			if (strlen(fslabel) > BTRFS_LABEL_SIZE) {
+				fprintf(stderr,
+					"warning: label too long, trimmed to %d bytes\n",
+					BTRFS_LABEL_SIZE);
+				fslabel[BTRFS_LABEL_SIZE]=0;
+			}
+			break;
+		case 'L':
+			copylabel = 1;
+			break;
 		default:
 			print_usage();
 			return 1;
@@ -2864,7 +2894,7 @@ int main(int argc, char *argv[])
 	if (rollback) {
 		ret = do_rollback(file, 0);
 	} else {
-		ret = do_convert(file, datacsum, packing, noxattr);
+		ret = do_convert(file, datacsum, packing, noxattr, copylabel, fslabel);
 	}
 	if (ret)
 		return 1;
--
1.7.6.233.gd79bc
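One detail of the -l path may be worth a second look: the label is trimmed to BTRFS_LABEL_SIZE characters and then copied with strncpy() into a field of BTRFS_LABEL_SIZE bytes, which, as far as I can tell, leaves no room for the terminating NUL when the label is exactly that long. The sketch below shows one way to copy a label so that the destination is always NUL-terminated. It is a standalone illustration, not a proposed replacement for the patch; copy_label and LABEL_SIZE are hypothetical names, and the value 256 is my assumption of what BTRFS_LABEL_SIZE is defined to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LABEL_SIZE 256	/* assumption: same value as BTRFS_LABEL_SIZE */

/*
 * Copy src into a fixed-size label field, keeping at most
 * LABEL_SIZE - 1 characters so the last byte is always NUL.
 * Returns 1 if the label had to be trimmed, 0 otherwise.
 */
static inline int copy_label(char dst[LABEL_SIZE], const char *src)
{
	size_t n;
	int trimmed;

	assert(dst != NULL && src != NULL);

	n = strlen(src);
	trimmed = n > LABEL_SIZE - 1;
	if (trimmed)
		n = LABEL_SIZE - 1;

	memset(dst, 0, LABEL_SIZE);	/* zero padding, as the patch does */
	memcpy(dst, src, n);
	return trimmed;
}
```

A caller could print the "label too long" warning whenever copy_label() returns 1, which keeps the length check and the copy in one place.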
Cross-subvolume reflink copy (BTRFS_IOC_CLONE over subvolume boundaries)
It's been nearly a year since the patches needed to implement a reflinked copy between subvolumes were posted (http://permalink.gmane.org/gmane.comp.file-systems.btrfs/9865), and I still get an "Invalid cross-device link" error with Linux 3.2.4 when I try to do a cp --reflink between subvolumes.

This is a *very* useful feature to have (think offline file-level deduplication, for one thing).

From what I was able to find in the archives, the only objection (a userland operation crossing subvolume boundaries) was rebutted by Chris Mason. Is there something else that I missed?

Regards,
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl
Re: [PATCH 3/3 v2] xfstests: add btrfs online defragments QA test
On 02/14/2012 01:53 AM, Christoph Hellwig wrote:
This still needs a bit more work:

+test_path=`pwd`
+progs_dir=$test_path/src/btrfs_online_defragment/

this isn't actually used.

+tmp=tmp/$$
+defrag_args=$test_path/${seq}.args

Just hardcode the arguments, preferably without the args file indirection.

+_create_file()
+{
+	CNT=11999
+	FILESIZE=48000
+	if [ $DEFRAG_TARGET = 1 ];then
+		for i in `seq $CNT -1 0`; do
+			dd if=/dev/zero of=$SCRATCH_MNT/tmp_file bs=4k count=1 \
+			 conv=notrunc seek=$i oflag=sync &>/dev/null
+		done
+		# get md5sum
+		md5sum $SCRATCH_MNT/tmp_file > /tmp/checksum
+	elif [ $DEFRAG_TARGET = 2 ];then
+		mkdir $SCRATCH_MNT/tmp_dir
+		for i in `seq $CNT -1 0`; do
+			dd if=/dev/zero of=$SCRATCH_MNT/tmp_dir/tmp_file bs=4k \
+			count=1 conv=notrunc seek=$i oflag=sync &>/dev/null
+		done
+		# get md5sum
+		md5sum $SCRATCH_MNT/tmp_dir/tmp_file > /tmp/checksum
+	elif [ $DEFRAG_TARGET = 3 ];then
+		for i in `seq $CNT -1 0`; do
+			dd if=/dev/zero of=$SCRATCH_MNT/tmp_file bs=4k count=1 \
+			conv=notrunc seek=$i oflag=sync &>/dev/null
+		done
+		# get md5sum
+		md5sum $SCRATCH_MNT/tmp_file > /tmp/checksum
+	fi
+}

It seems like each of these cases should be a different function.

+_btrfs_online_defrag()
+{
+	str=""
+	if [ $FILE_RANGE = 2 ];then
+		str="$str -s -1 -l $((FILESIZE / 2))"
+	elif [ $FILE_RANGE = 3 ];then
+		str="$str -s $((FILESIZE + 1)) -l $((FILESIZE / 2))"
+		HAVE_DEFRAG=1
+	elif [ $FILE_RANGE = 4 ];then
+		str="$str -l -1"
+	elif [ $FILE_RANGE = 5 ];then
+		str="$str -l $((FILESIZE + 1))"
+	elif [ $FILE_RANGE = 6 ];then
+		str="$str -l $((FILESIZE / 2))"
+	fi
+
+	if [ $DEFRAG_COMPRESS = 2 ];then
+		str="$str -c"
+	fi
+
+	if [ $FLUSH = 2 ];then
+		str="$str -f"
+	fi
+
+	if [ $THRESH = 2 ];then
+		str="$str -t -1"
+	elif [ $THRESH = 3 ];then
+		str="$str -t $PAGESIZE"
+	fi
+
+	if [ "$str" != "" ]; then
+		btrfs filesystem defragment $str $SCRATCH_MNT/tmp_file
+	else
+		if [ $DEFRAG_TARGET = 1 ];then
+			btrfs filesystem defragment $SCRATCH_MNT/tmp_file
+		elif [ $DEFRAG_TARGET = 2 ];then
+			btrfs filesystem defragment $SCRATCH_MNT/tmp_dir
+		elif [ $DEFRAG_TARGET = 3 ];then
+			btrfs filesystem defragment $SCRATCH_MNT
+		fi
+	fi
+	ret_val=$?
+	sync
+	if [ $ret_val -ne 20 ];then
+		echo "btrfs filesystem defragment failed! err is $ret_val"
+	fi
+}
+_fsck()
+{
+	btrfsck $SCRATCH_DEV > /dev/null 2>&1
+	ret_val=$?
+	if [ $ret_val -ne 0 ]; then
+		echo "btrfsck _FAIL_! err is $ret_val"
+	fi
+}

This should use the generic xfstests fsck invocation wrappers.

+_parse_options()

Please don't use an option parser but just call the low-level file creation functions directly.

+_cleanup_defrag()
+{
+	rm -fr $SCRATCH_MNT/*
+	umount $SCRATCH_MNT > /dev/null 2>&1
+}

No need to remove everything as the scratch filesystem gets recreated every time.

OK, I'll update this more carefully ;) and thanks for your time!

thanks,
liubo
Re: [PATCH 00/21] Btrfs: restriper
On Fri, Jan 6, 2012 at 9:30 AM, Ilya Dryomov idryo...@gmail.com wrote:
This is a respin of restriper patch series which adds an initial implementation of restriper (it's a clever name for relocation framework that allows to do selective profile changing and selective balancing with some goodies like pausing/resuming and reporting progress to the user). See userspace cover patch for usage examples.

I just tried merging this into my 3.2.1 kernel and it seems to work nicely. I compiled your btrfs-progs, made an LVM snapshot and launched a balance on my 300 GB filesystem, which was converted from ext4. It converted my metadata from single to dup, but it did not convert my system chunks from single to dup.

Here is the command I used:

root /usr/src/kernel/btrfs-progs # ./btrfs balance start -f -v -sconvert=dup -mconvert=dup /mnt/btrfs/
Dumping filters: flags 0xe, state 0x0, force is on
  METADATA (flags 0x100): converting, target=32, soft is off
  SYSTEM (flags 0x100): converting, target=32, soft is off
Done, had to relocate 28 out of 220 chunks

Then I tried:

root /usr/src/kernel/btrfs-progs # ./btrfs balance start -f -v -sconvert=dup /mnt/btrfs/
Dumping filters: flags 0xa, state 0x0, force is on
  SYSTEM (flags 0x100): converting, target=32, soft is off
Done, had to relocate 0 out of 239 chunks

Before:

root /usr/src/kernel/btrfs-progs # ./btrfs fi df /mnt/btrfs/
Data: total=217.33GB, used=211.68GB
System: total=32.00MB, used=28.00KB
Metadata: total=19.39GB, used=14.18GB

And here is the result after the balance:

root /usr/src/kernel/btrfs-progs # ./btrfs fi df /mnt/btrfs/
Data: total=226.33GB, used=225.56GB
System: total=32.00MB, used=36.00KB
Metadata, DUP: total=4.75GB, used=297.77MB

I also used the status command, which worked correctly.
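For readers decoding the "Dumping filters" lines, the numbers appear to be plain bit masks. The sketch below is a small userspace decoder, assuming the bit assignments I believe the restriper series uses (DATA = bit 0, SYSTEM = bit 1, METADATA = bit 2, FORCE = bit 3 for the top-level flags, bit 8 for "convert" in the per-type flags, and bit 5 for the DUP profile). These values are consistent with the output in this message (0xe, 0xa, 0x100, target=32), but they are written out here by hand rather than taken from the kernel headers, and every identifier below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed bit layout, see the note above; not copied from a header. */
struct flag_name {
	uint64_t bit;
	const char *name;
};

static const struct flag_name balance_flag_names[] = {
	{ 1ULL << 0, "DATA" },
	{ 1ULL << 1, "SYSTEM" },
	{ 1ULL << 2, "METADATA" },
	{ 1ULL << 3, "FORCE" },
};

#define ARGS_CONVERT	(1ULL << 8)	/* per-type "flags 0x100" */
#define PROFILE_DUP	(1ULL << 5)	/* "target=32" */

/* Render the top-level balance flags as "A|B|C" into out. */
static inline const char *describe_balance_flags(uint64_t flags,
						 char *out, size_t outlen)
{
	size_t used = 0;
	size_t i;

	assert(out != NULL && outlen > 0);
	out[0] = '\0';

	for (i = 0; i < sizeof(balance_flag_names) /
			sizeof(balance_flag_names[0]); i++) {
		int n;

		if (!(flags & balance_flag_names[i].bit))
			continue;
		n = snprintf(out + used, outlen - used, "%s%s",
			     used ? "|" : "", balance_flag_names[i].name);
		if (n < 0 || (size_t)n >= outlen - used)
			break;	/* buffer too small, stop appending */
		used += (size_t)n;
	}
	return out;
}

/* True if the per-type flags request a profile conversion. */
static inline int args_is_convert(uint64_t args_flags)
{
	return (args_flags & ARGS_CONVERT) != 0;
}

/* True if the conversion target is the DUP profile. */
static inline int target_is_dup(uint64_t target)
{
	return target == PROFILE_DUP;
}
```

Read this way, the first command asked for SYSTEM and METADATA conversion with force, and the second for SYSTEM only, which matches the -sconvert/-mconvert options on the command lines.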
Re: Cross-subvolume reflink copy (BTRFS_IOC_CLONE over subvolume boundaries)
On Mon, Feb 13, 2012 at 6:52 PM, Hubert Kario h...@qbs.com.pl wrote:
It's been nearly a year since the patches needed to implement a reflinked copy between subvolumes have been posted (http://permalink.gmane.org/gmane.comp.file-systems.btrfs/9865 ) and I still get Invalid cross-device link error with Linux 3.2.4 while I try to do a cp --reflink between subvolumes.

I am still keeping this patch up-to-date in my personal kernel repo. Here is the diff from current for-linus:

BTRFS-Allow-cross-subvolume-BTRFS_IOC_CLONE.patch

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 0b06a5c..05dc644 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2203,6 +2203,7 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
 {
 	struct inode *inode = fdentry(file)->d_inode;
 	struct btrfs_root *root = BTRFS_I(inode)->root;
+	struct btrfs_root *srcroot;
 	struct file *src_file;
 	struct inode *src;
 	struct btrfs_trans_handle *trans;
@@ -2245,6 +2246,7 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
 	}
 
 	src = src_file->f_dentry->d_inode;
+	srcroot = BTRFS_I(src)->root;
 
 	ret = -EINVAL;
 	if (src == inode)
@@ -2264,11 +2266,11 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
 		goto out_fput;
 
 	ret = -EXDEV;
-	if (src->i_sb != inode->i_sb || BTRFS_I(src)->root != root)
+	if (src->i_sb != inode->i_sb)
 		goto out_fput;
 
 	ret = -ENOMEM;
-	buf = vmalloc(btrfs_level_size(root, 0));
+	buf = vmalloc(btrfs_level_size(srcroot, 0));
 	if (!buf)
 		goto out_fput;
 
@@ -2338,13 +2340,13 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
 		 * note the key will change type as we walk through the
 		 * tree.
 		 */
-		ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+		ret = btrfs_search_slot(NULL, srcroot, &key, path, 0, 0);
 		if (ret < 0)
 			goto out;
 
 		nritems = btrfs_header_nritems(path->nodes[0]);
 		if (path->slots[0] >= nritems) {
-			ret = btrfs_next_leaf(root, path);
+			ret = btrfs_next_leaf(srcroot, path);
 			if (ret < 0)
 				goto out;
 			if (ret > 0)
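For anyone who wants to test the patch without cp --reflink, the clone ioctl can be issued directly from a small userspace helper. The sketch below assumes that BTRFS_IOC_CLONE is defined as _IOW(0x94, 9, int), which is my understanding of the btrfs ioctl header, and spells it out locally so that no btrfs headers are needed. reflink_file and CLONE_IOCTL are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Assumed to match BTRFS_IOC_CLONE: magic 0x94, number 9, int argument. */
#define CLONE_IOCTL _IOW(0x94, 9, int)

/*
 * Make dst_path a reflink copy of src_path.
 * Returns 0 on success or a negative errno value on failure. On a
 * kernel without the patch above, a cross-subvolume clone is expected
 * to return -EXDEV ("Invalid cross-device link").
 * Note: on failure an empty dst_path may be left behind.
 */
static inline int reflink_file(const char *src_path, const char *dst_path)
{
	int src, dst, saved;

	assert(src_path != NULL && dst_path != NULL);

	src = open(src_path, O_RDONLY);
	if (src < 0)
		return -errno;

	dst = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (dst < 0) {
		saved = errno;
		close(src);
		return -saved;
	}

	/* The ioctl is issued on the destination; the argument is the source fd. */
	saved = ioctl(dst, CLONE_IOCTL, src) < 0 ? errno : 0;

	close(src);
	close(dst);
	return -saved;
}
```

Calling reflink_file() with a source in one subvolume and a destination in another should distinguish an unpatched kernel (-EXDEV) from a patched one (0).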
Re: [PATCH 1/3] Add the snappy-c compressor to lib v2
On Thu, Jan 12, 2012 at 6:28 PM, Andi Kleen a...@firstfloor.org wrote:
From: Andi Kleen a...@linux.intel.com

This is a C port of the google snappy compressor.

It has roughly comparable compression to LZO, but is significantly faster on many file types. For example it beats all other compressors on already compressed data.

I ported the original C++ code over to C and did some changes to make it better fit into the kernel. It preallocates the worst case memory consumption now.

While the code is larger than lzo, it is still reasonable (about 5K on x86).

Decompression needs very little memory. Compression currently needs 192K on 64bit and 128K on 32bit. For comparison, LZO compression needs 128K on 64bit and 64K on 32bit.

[This could be lowered significantly by not preallocating for most use cases, typically the footprint is much lower. The original C++ version only allocated most of this when (rarely) needed, but this is more problematic in the kernel]

There are some minor divergences from the Linux coding standards: in particular I kept the C++/C99 style mixed statement/declarations. This was mainly to not diverge too much from the reference C++ source, so that improvements from there can be easily ported. There are some other leftovers from the google style, but very little now.

Performance:

The compressor performs best on 64bit-LE systems, but is also quite good on 32bit. I haven't tested BE, but I don't expect that to add a lot of overhead.

Here is some performance data (32bit, Nehalem):

c/b = cycles/byte; lower numbers are better.

x86-64 executable (compression minimally slower than qlz, but much better at decompression, lzo is left in the dust):

snappy: emacs-gtk: 11007968 b: ratio 0.38: comp  8.13 uncomp  2.65 c/b
lzo   : emacs-gtk: 11007968 b: ratio 0.33: comp 12.74 uncomp  4.70 c/b
zlib1 : emacs-gtk: 11007968 b: ratio 0.27: comp 49.96 uncomp 13.14 c/b
zlib3 : emacs-gtk: 11007968 b: ratio 0.26: comp 64.17 uncomp 12.33 c/b
lzf   : emacs-gtk: 11007968 b: ratio 0.37: comp  9.85 uncomp  4.33 c/b
qlz   : emacs-gtk: 11007968 b: ratio 0.34: comp  7.51 uncomp  6.28 c/b
fastlz: emacs-gtk: 11007968 b: ratio 0.37: comp 10.73 uncomp  4.97 c/b

Compressed data (beats everything else):

snappy: udev-151.tar.gz: 634842 b: ratio 1.00: comp   0.99 uncomp 0.33 c/b
lzo   : udev-151.tar.gz: 634842 b: ratio 1.00: comp  41.44 uncomp 0.66 c/b
zlib1 : udev-151.tar.gz: 634842 b: ratio 1.00: comp 116.99 uncomp 3.94 c/b
zlib3 : udev-151.tar.gz: 634842 b: ratio 1.00: comp 117.68 uncomp 3.94 c/b
lzf   : udev-151.tar.gz: 634842 b: ratio 1.03: comp  16.32 uncomp 1.14 c/b
qlz   : udev-151.tar.gz: 634842 b: ratio 1.00: comp  10.42 uncomp 0.42 c/b
fastlz: udev-151.tar.gz: 634842 b: ratio 1.03: comp  19.35 uncomp 2.07 c/b

Text file (compression somewhat slower than qlz, but decompression much better, lzo is much worse):

snappy: manual.txt: 445343 b: ratio 0.47: comp 12.01 uncomp  3.12 c/b
lzo   : manual.txt: 445343 b: ratio 0.44: comp 16.32 uncomp  7.53 c/b
zlib1 : manual.txt: 445343 b: ratio 0.35: comp 56.37 uncomp 15.59 c/b
zlib3 : manual.txt: 445343 b: ratio 0.31: comp 73.45 uncomp 13.99 c/b
lzf   : manual.txt: 445343 b: ratio 0.46: comp 13.43 uncomp  5.47 c/b
qlz   : manual.txt: 445343 b: ratio 0.44: comp  9.16 uncomp  8.19 c/b
fastlz: manual.txt: 445343 b: ratio 0.46: comp 14.22 uncomp  7.28 c/b

As you can see snappy is a good all-around compressor.

On 64bit the compression is even faster and beats everything else easily:

snappy: emacs-gtk: 11007968 b: ratio 0.38: comp  4.90 uncomp  2.65 c/b
lzo   : emacs-gtk: 11007968 b: ratio 0.33: comp 11.24 uncomp  4.46 c/b
zlib1 : emacs-gtk: 11007968 b: ratio 0.27: comp 41.67 uncomp 11.13 c/b
zlib3 : emacs-gtk: 11007968 b: ratio 0.26: comp 51.80 uncomp 10.54 c/b
lzf   : emacs-gtk: 11007968 b: ratio 0.37: comp  8.79 uncomp  4.05 c/b
qlz   : emacs-gtk: 11007968 b: ratio 0.34: comp  5.44 uncomp  5.46 c/b
fastlz: emacs-gtk: 11007968 b: ratio 0.37: comp  9.91 uncomp  4.77 c/b

On 64bit it's now nearly as fast as qlz on the text file too:

snappy: manual.txt: 445343 b: ratio 0.47: comp  7.79 uncomp  3.47 c/b
lzo   : manual.txt: 445343 b: ratio 0.44: comp 15.46 uncomp  7.27 c/b
zlib1 : manual.txt: 445343 b: ratio 0.35: comp 45.79 uncomp 12.78 c/b
zlib3 : manual.txt: 445343 b: ratio 0.31: comp 60.52 uncomp 11.72 c/b
lzf   : manual.txt: 445343 b: ratio 0.46: comp 12.62 uncomp  5.30 c/b
qlz   : manual.txt: 445343 b: ratio 0.44: comp  6.81 uncomp  7.65 c/b
fastlz: manual.txt: 445343 b: ratio 0.46: comp 13.75 uncomp  6.52 c/b

Overall it's a good alternative to lzo, with the only drawback being the somewhat higher memory use.

v2: Some minor performance improvements and cleanups. 32bit compression should be a few percent faster now.

Signed-off-by: Andi Kleen a...@linux.intel.com
---
 include/linux/snappy.h |   26 +
 lib/Kconfig            |    6 +
 lib/Makefile           |    4 +
 lib/snappy.c           | 1300