Re: Feedback on RAID1 feature of Btrfs
On Tue, Dec 18, 2012 at 6:13 AM, Hugo Mills h...@carfax.org.uk wrote:
> Yeah, we did have a naming scheme proposed, with combinations of nCmSpP,
> where n is the number of copies held, m the number of stripes, and p the
> number of parity stripes. So btrfs RAID-1 is 2C, RAID-5 on 5 disks would
> be 4S1P, and RAID-10 on 4 disks would be 2C2S.

...yes. Something like this not only reflects reality better, it actually transfers information in a consistent way (vs. RAID-XYZ... a meaningless enum!). You could maybe do something like:

2C2S : -1S : 0

...or similar, showing: {normal} {OFFSET max degraded [rel boundary]} {OFFSET current} ... which instantly makes the useful boundaries known, along with the active panic level I should be experiencing :)

> I'd prefer to see that than some non-standard RAID-18KTHXBYE formulation.

^^^ This. The term RAID conjures expectations that run afoul of btrfs's reality, and should thus simply be avoided altogether IMO. Unless you wish/must explicitly correlate some similarity X, there is no need to even mention the word RAID, because it carries no information.

--
C Anthony
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
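The status-string idea above could be rendered mechanically. The sketch below is purely illustrative: the format was only floated in this thread, the function and field names are hypothetical, and it tracks lost copies (the thing that bounds survivability) rather than the stripe offset shown in the "-1S" example.

```python
def degraded_status(copies, stripes, copies_lost):
    """Render a '{normal} : {max degraded} : {current}' status string
    for an nCmS profile. Hypothetical sketch of the idea floated in
    this thread, not a real btrfs interface."""
    normal = "%dC" % copies + ("%dS" % stripes if stripes > 1 else "")
    # With n copies, up to n-1 of them may be lost before data is gone.
    max_degraded = "-%dC" % (copies - 1)
    current = "-%dC" % copies_lost if copies_lost else "0"
    return "%s : %s : %s" % (normal, max_degraded, current)

print(degraded_status(2, 2, 0))  # healthy:  "2C2S : -1C : 0"
print(degraded_status(2, 2, 1))  # degraded: "2C2S : -1C : -1C" -- at the boundary
```

When the middle and right fields match, the array is at its redundancy boundary and the next failure loses data, which is exactly the "panic level" signal asked for above.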
Re: Feedback on RAID1 feature of Btrfs
On 2012/12/17 06:23 PM, Hugo Mills wrote:
> On Mon, Dec 17, 2012 at 04:51:33PM +0100, Sebastien Luttringer wrote:
>> I get the feeling that RAID1 only allows one disk removal. Which is
>> more a RAID5 feature.
> The RAID-1 support in btrfs makes exactly two copies of each item of
> data, so you can lose at most one disk from the array safely. Lose any
> more, and you're likely to have lost data, as you've found out.
> Now, you can argue that RAID-1 isn't a good name to use here, but
> there's no good name in RAID terminology to describe what we actually
> have here.

Technically, btrfs's RAID1 implementation is much closer to RAID1E than traditional RAID1. See http://en.wikipedia.org/wiki/Non-standard_RAID_levels#RAID_1E or http://pic.dhe.ibm.com/infocenter/director/v5r2/index.jsp?topic=/serveraid_9.00/fqy0_craid1e.html

Perhaps a new name, as with ZFS, might be appropriate. RAID-Z and RAID-Z2, for example, could not adequately be described by any existing RAID terminology and, technically, RAID-Z still isn't RAID in the classical sense.

--
Brendan Hide
083 448 3867
http://swiftspirit.co.za/
Re: Feedback on RAID1 feature of Btrfs
On Tue, Dec 18, 2012 at 01:20:20PM +0200, Brendan Hide wrote:
> Technically, btrfs's RAID1 implementation is much closer to RAID1E than
> traditional RAID1.
> Perhaps a new name, as with ZFS, might be appropriate. RAID-Z and
> RAID-Z2, for example, could not adequately be described by any existing
> RAID terminology and, technically, RAID-Z still isn't RAID in the
> classical sense.

Yeah, we did have a naming scheme proposed, with combinations of nCmSpP, where n is the number of copies held, m the number of stripes, and p the number of parity stripes. So btrfs RAID-1 is 2C, RAID-5 on 5 disks would be 4S1P, and RAID-10 on 4 disks would be 2C2S.

I'd prefer to see that than some non-standard RAID-18KTHXBYE formulation. Plenty of room for shed-painting here, though.

   Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
       --- I believe that it's closely correlated with the ---
                        aeroswine coefficient.
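The nCmSpP semantics can be sanity-checked with a few lines of code. This is a hedged sketch, not anything from btrfs: it assumes n copies tolerate n-1 device failures and each parity stripe tolerates one more, and the defaults for omitted parts (1 copy, 1 stripe, 0 parity) are my assumption, not spelled out in the thread.

```python
import re

def parse_profile(profile):
    """Split an nCmSpP string (e.g. '2C2S', '4S1P') into
    (copies, stripes, parity). Omitted parts default to 1 copy,
    1 stripe, 0 parity -- an assumption, not part of the proposal."""
    fields = {letter: int(num)
              for num, letter in re.findall(r"(\d+)([CSP])", profile)}
    return fields.get("C", 1), fields.get("S", 1), fields.get("P", 0)

def max_lost_devices(profile):
    # Devices losable without data loss: one fewer than the number of
    # copies, plus one per parity stripe.
    copies, _stripes, parity = parse_profile(profile)
    return (copies - 1) + parity

print(max_lost_devices("2C"))    # btrfs raid1
print(max_lost_devices("4S1P"))  # raid5 on 5 disks
print(max_lost_devices("2C2S"))  # raid10 on 4 disks
```

All three profiles yield 1, which is the point being made in the thread: the notation carries the fault-tolerance information directly, where "RAID1" led the original poster to expect more.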
Feedback on RAID1 feature of Btrfs
Hello,

I'm testing the Btrfs RAID1 feature on 3 disks of ~10GB. The last one is not exactly 10GB (that would be too easy). About the test machine: it's a kvm vm running an up-to-date archlinux with linux 3.7 and btrfs-progs 0.19.20121005.

# uname -a
Linux seblu-btrfs-1 3.7.0-1-ARCH #1 SMP PREEMPT Tue Dec 11 15:05:50 CET 2012 x86_64 GNU/Linux

The filesystem was created with:

# mkfs.btrfs -L test -d raid1 -m raid1 /dev/vd[bcd]

I downloaded a lot of linux kernel tarballs and untarred them into this filesystem until it told me "enough":

drwxr-xr-x 1 root root 330 2007-10-09 20:31 linux-2.6.23
-rw-r--r-- 1 root root 44M 2007-10-09 20:48 linux-2.6.23.tar.bz2
drwxr-xr-x 1 root root 344 2008-01-24 22:58 linux-2.6.24
-rw-r--r-- 1 root root 45M 2008-01-24 23:16 linux-2.6.24.tar.bz2
drwxr-xr-x 1 root root 352 2008-04-17 02:49 linux-2.6.25

Some output of the btrfs tools:

# btrfs fi sh
Label: 'test' uuid: 7d72c625-4dd7-4db0-b4a2-075e26572b99
	Total devices 3 FS bytes used 11.57GB
	devid 3 size 10.00GB used 9.76GB path /dev/vdd
	devid 2 size 10.00GB used 10.00GB path /dev/vdc
	devid 1 size 9.77GB used 9.77GB path /dev/vdb

# btrfs fi df .
Data, RAID1: total=13.50GB, used=10.56GB
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=1.25GB, used=1.01GB
Metadata: total=8.00MB, used=0.00

I have chosen a _raid1_ filesystem, so I expect any disk to be able to die with the fs still usable and without data loss. I used cksfv[1] to make checksums of the tarballs on disk and verify my data are safe.

I killed the first disk via libvirt:

# virsh detach-disk seblu-btrfs-1 vdb

Btrfs detects the missing disk \o/

# btrfs fi sh
Label: 'test' uuid: 7d72c625-4dd7-4db0-b4a2-075e26572b99
	Total devices 3 FS bytes used 11.57GB
	devid 3 size 10.00GB used 8.01GB path /dev/vdd
	devid 2 size 10.00GB used 9.26GB path /dev/vdc
	*** Some devices missing

Listing all files works and cksums on the tarballs are good.
# cksfv -f ~/checksum
Everything OK

We have raid1, so we can expect to kill one more disk (the smaller). Let's go.

# virsh detach-disk seblu-btrfs-1 vdd
# btrfs fi sh
Label: 'test' uuid: 7d72c625-4dd7-4db0-b4a2-075e26572b99
	Total devices 3 FS bytes used 11.57GB
	devid 3 size 10.00GB used 8.01GB path /dev/vdd
	devid 2 size 10.00GB used 9.26GB path /dev/vdc
	*** Some devices missing

It doesn't see the drive go away... Listing doesn't work:

# find /mnt/test
.
./pkg
^C

# cksfv -f ~/checksum
linux-2.6.30.tar.bz2 OK
linux-2.6.31.tar.bz2 Input/output error
linux-2.6.32.tar.bz2 Input/output error
linux-2.6.33.tar.bz2 Input/output error

So data and metadata seem broken. The filesystem has become read-only:

# touch séb
touch: cannot touch ‘séb’: Read-only file system

and finally I got a traceback in dmesg:

[  804.910405] [ cut here ]
[  804.910491] WARNING: at fs/btrfs/super.c:246 __btrfs_abort_transaction+0xdf/0x100 [btrfs]()
[  804.910499] Hardware name: Bochs
[  804.910501] btrfs: Transaction aborted
[  804.910502] Modules linked in: cirrus ttm i2c_piix4 drm_kms_helper psmouse intel_agp drm evdev intel_gtt serio_raw syscopyarea sysfillrect sysimgblt i2c_core processor button pcspkr microcode btrfs crc32c libcrc32c zlib_deflate ata_generic pata_acpi virtio_balloon virtio_blk virtio_net ata_piix libata uhci_hcd virtio_pci virtio_ring virtio usbcore usb_common scsi_mod floppy
[  804.910575] Pid: 2077, comm: touch Not tainted 3.7.0-1-ARCH #1
[  804.910577] Call Trace:
[  804.910606] [8105742f] warn_slowpath_common+0x7f/0xc0
[  804.910610] [81057526] warn_slowpath_fmt+0x46/0x50
[  804.910619] [a015ff6f] __btrfs_abort_transaction+0xdf/0x100 [btrfs]
[  804.910635] [a01802c0] ? verify_parent_transid+0x170/0x170 [btrfs]
[  804.910648] [a016402a] __btrfs_cow_block+0x48a/0x510 [btrfs]
[  804.910661] [a0164227] btrfs_cow_block+0xf7/0x230 [btrfs]
[  804.910672] [a016752b] push_leaf_right+0x11b/0x190 [btrfs]
[  804.910681] [a0167cb1] split_leaf+0x621/0x740 [btrfs]
[  804.910690] [a0160d52] ? leaf_space_used+0xd2/0x110 [btrfs]
[  804.910710] [a01b9455] ? btrfs_set_lock_blocking_rw+0xb5/0x120 [btrfs]
[  804.910720] [a016894c] btrfs_search_slot+0x89c/0x900 [btrfs]
[  804.910730] [a0169fcc] btrfs_insert_empty_items+0x7c/0xe0 [btrfs]
[  804.910742] [a017c313] insert_with_overflow+0x43/0x120 [btrfs]
[  804.910753] [a017c4ac] btrfs_insert_dir_item+0xbc/0x200 [btrfs]
[  804.910767] [a0195c6b] btrfs_add_link+0xeb/0x300 [btrfs]
[  804.910781] [a0196f64] btrfs_create+0x1a4/0x210 [btrfs]
[  804.910799] [81217a0c] ? security_inode_permission+0x1c/0x30
[  804.910813] [8118fd56] vfs_create+0xb6/0x120
[  804.910833] [8119281d]
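The uneven per-device "used" figures in the `btrfs fi sh` output follow from how btrfs raid1 allocates chunks: each chunk gets two copies, placed on the two devices with the most free space. The toy model below (my simplification; it ignores metadata/system chunks and real chunk sizing) shows why usable capacity on unequal devices is not simply total/2:

```python
def raid1_usable(sizes_gb, chunk_gb=1.0):
    """Toy model of btrfs raid1 chunk allocation: each chunk is written
    to the two devices with the most remaining free space. Simplified
    sketch, not actual btrfs allocator code."""
    free = list(sizes_gb)
    usable = 0.0
    while True:
        # Pick the two devices with the most remaining space.
        a, b = sorted(range(len(free)), key=free.__getitem__, reverse=True)[:2]
        if free[b] < chunk_gb:  # second-best device is full: stop
            break
        free[a] -= chunk_gb
        free[b] -= chunk_gb
        usable += chunk_gb
    return usable

# The three ~10GB test disks from the report above:
print(raid1_usable([9.77, 10.0, 10.0]))  # 14.0 -- close to 29.77/2, not 10
```

With three roughly equal devices the allocator rotates the pairs, so all three fill up together, which is why even the 9.77GB device shows as fully used before the filesystem reports ENOSPC.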
Re: Feedback on RAID1 feature of Btrfs
Hello Sebastien,

with btrfs raid1 you get two copies of each extent, on separate drives. That means you can lose only one drive, no matter how many drives are in the set. It's not traditional raid1, which is probably what you are confusing it with. Raid1 with more than 2-way redundancy is not currently possible with btrfs, though rumour has it this capability may land in 3.8.

Regards,
Bart

On Mon, Dec 17, 2012 at 4:51 PM, Sebastien Luttringer sebastien.luttrin...@smartjog.com wrote:
> Hello,
> I'm testing the Btrfs RAID1 feature on 3 disks of ~10GB. [...]
> I have chosen a _raid1_ filesystem, so I expect any disk to be able to
> die with the fs still usable and without data loss. [...]
> So data and metadata seem broken. The filesystem has become read-only,
> and finally I got a traceback in dmesg. [snip]
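Bart's rule, two copies of each extent on separate drives, can be checked combinatorially. The toy model below (not btrfs code; the function name is mine) shows why any single-drive failure is survivable but a second failure can destroy both copies of some extents:

```python
from itertools import combinations

def lost_placements(n_drives, failed):
    """Each extent lives on exactly two of n drives; it is lost only
    when BOTH of its drives have failed. Toy model of the btrfs raid1
    rule described above, not actual allocation code."""
    dead = set(failed)
    return [pair for pair in combinations(range(n_drives), 2)
            if set(pair) <= dead]

# One failure on a 3-drive array: every two-drive placement keeps a copy.
print(lost_placements(3, [0]))     # []
# Two failures: extents mirrored on exactly that pair lose both copies.
print(lost_placements(3, [0, 2]))  # [(0, 2)]
```

This matches the test in the original report: removing vdb left everything readable, while removing vdd as well produced I/O errors on roughly the share of data whose two copies happened to live on the failed pair.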
Re: Feedback on RAID1 feature of Btrfs
On Mon, Dec 17, 2012 at 04:51:33PM +0100, Sebastien Luttringer wrote:
> Hello,
> I'm testing the Btrfs RAID1 feature on 3 disks of ~10GB. The last one
> is not exactly 10GB (that would be too easy). [...]
> I have chosen a _raid1_ filesystem, so I expect any disk to be able to
> die with the fs still usable and without data loss.

This is where your expectations and reality have their differences.

> I used cksfv[1] to make checksums of the tarballs on disk and verify my
> data are safe. I killed the first disk via libvirt. [...]
> Listing all files works and cksums on the tarballs are good. We have
> raid1, so we can expect to kill one more disk (the smaller). Let's go.
> [snip]

Yes, this is the expected behaviour.

> I get the feeling that RAID1 only allows one disk removal. Which is
> more a RAID5 feature.

The RAID-1 support in btrfs makes exactly two copies of each item of data, so you can lose at most one disk from the array safely. Lose any more, and you're likely to have lost data, as you've found out.

> I'm afraid Btrfs raid1 will not be working before the end of the world.

It does work (as you demonstrated with the first disk being removed) -- but just not as you thought it should. Now, you can argue that RAID-1 isn't a good name to use here, but there's no good name in RAID terminology to describe what we actually have here.

I believe that Chris has done some work on allowing (tunable) multiple copies of data in RAID-1 mode, but that's not been published yet. It was part of the extended RAID work that he was working on. I think we're hoping for RAID-5 and RAID-6 to arrive in this merge window, but that's been said before (and it hasn't been stable enough for release before). I think Chris was going to leave the n-copies RAID-1 feature for a later release, and just get RAID-5/6 out first.

   Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
     --- Le Corbusier's plan for improving Paris involved the ---
         assassination of the city, and its rebirth as tower blocks.