> -----Original Message-----
> From: [email protected] <linux-btrfs-
> [email protected]> On Behalf Of Chris Murphy
> Sent: Thursday, 17 January 2019 5:15 AM
> To: Stefan K <[email protected]>
> Cc: Linux Btrfs <[email protected]>
> Subject: Re: question about creating a raid10
> 
> On Wed, Jan 16, 2019 at 7:58 AM Stefan K <[email protected]> wrote:
> >
> >  :(
> > that means when one jbod fails there is no guarantee that it keeps
> > working fine, like in zfs? well, that sucks. Didn't anyone think to
> > program it that way?
> 
> The mirroring is a function of the block group, not the block device.
> And yes, that's part of the intentional design and why it's so flexible. A
> real raid10 isn't as flexible, so enforcing the allocation of specific block
> group stripes to specific block devices would add complexity to the
> allocator while reducing flexibility. It's not impossible, it would just
> come with caveats: no three-device raid10 like you can have now; and you'd
> have to figure out what to do if the user adds one new device instead of two
> at a time, or if a new device isn't the same size as the existing devices,
> or if the two added devices aren't the same size as each other. Do you
> refuse to add such devices? What limitations do we run into when
> rebalancing? It's way more complicated.
> 
> Btrfs raid10 really should not be called raid10; it sets up entirely the
> wrong user expectation. It's more like raid0+1, except even that is
> deceptive, because with a legit raid0+1 you can in theory lose multiple
> drives on one side of the mirror (though not on both sides), but with Btrfs
> raid10 you really can't lose more than one drive. And therefore it does not
> scale: the probability of downtime increases as drives are added, whereas
> with a real raid10 it doesn't.
> 
> In your case you're better off raid0'ing the two drives in each enclosure
> (whether as a feature of the enclosure, or with mdadm or LVM) and then using
> Btrfs raid1 on top of the resulting virtual block devices. Or do mdadm/LVM
> raid10 and format it Btrfs. Or, yeah, use ZFS.
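
For reference, the raid0-per-enclosure plus Btrfs raid1 layout Chris describes
might look roughly like this with mdadm (device names here are made up, not
taken from my system):

# one raid0 array per enclosure (hypothetical member devices)
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdX /dev/sdY
mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sdZ /dev/sdW
# btrfs raid1 across the two resulting virtual block devices
mkfs.btrfs -L Storage -m raid1 -d raid1 /dev/md0 /dev/md1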


What I've done instead is create separate LVM logical volumes and volume
groups, and assign each volume group to separate physical storage, so I don't
accidentally end up with two btrfs mirrors on the same device.
It's a bit complicated (especially since I'm using LVM caching) but it works
very well.
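
Roughly, each half was set up along these lines (sizes and exact device
placement below are illustrative rather than the precise commands I ran, and
volume group "b" is built the same way from the other enclosure's disks):

# slow data disks plus a fast SSD partition for volume group "a"
pvcreate /dev/sdc1 /dev/sdf1 /dev/sda4
vgcreate a /dev/sdc1 /dev/sdf1 /dev/sda4
# big LV on the slow disk, cache pool on the SSD, then attach the cache
lvcreate -n storage-a -L 3.6T a /dev/sdf1
lvcreate --type cache-pool -n cache-storage-a -L 300G a /dev/sda4
lvconvert --type cache --cachepool a/cache-storage-a a/storage-a
# finally, btrfs raid1 across the cached LVs from the two volume groups
mkfs.btrfs -L Storage -m raid1 -d raid1 /dev/mapper/a-storage--a /dev/mapper/b-storage--b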

vm-server ~ # lvs
  LV        VG Attr       LSize  Pool              Origin            Data%  Meta%  Move Log Cpy%Sync Convert
  backup-a  a  Cwi-aoC--- <2.73t [cache-backup-a]  [backup-a_corig]  99.99  24.63           0.00
  lvol0     a  -wi-a----- 44.00m
  lvol1     a  -wi-a----- 44.00m
  storage-a a  Cwi-aoC--- <3.64t [cache-storage-a] [storage-a_corig] 99.98  24.76           0.00
  backup-b  b  Cwi-aoC--- <2.73t [cache-backup-b]  [backup-b_corig]  99.99  24.75           0.00
  storage-b b  Cwi-aoC--- <3.64t [cache-storage-b] [storage-b_corig] 99.99  24.66           0.00
  storage-c c  -wi-a----- <3.64t

vm-server ~ # vgs
  VG #PV #LV #SN Attr   VSize  VFree
  a    3   4   0 wz--n-  6.70t    0
  b    3   2   0 wz--n-  6.70t 8.00m
  c    1   1   0 wz--n- <3.64t    0

vm-server ~ # pvs
  PV         VG Fmt  Attr PSize    PFree
  /dev/sda4  a  lvm2 a--  <342.00g    0
  /dev/sdb4  b  lvm2 a--  <342.00g 8.00m
  /dev/sdc1  a  lvm2 a--    <2.73t    0
  /dev/sdd1  b  lvm2 a--    <2.73t    0
  /dev/sdf1  a  lvm2 a--    <3.64t    0
  /dev/sdh1  c  lvm2 a--    <3.64t    0
  /dev/sdi1  b  lvm2 a--    <3.64t    0

vm-server ~ # btrfs fi sh
Label: 'Root'  uuid: 58d27dbd-7c1e-4ef7-8d43-e93df1537b08
        Total devices 2 FS bytes used 55.26GiB
        devid   13 size 100.00GiB used 65.03GiB path /dev/sdb1
        devid   14 size 100.00GiB used 65.03GiB path /dev/sda1

Label: 'Boot'  uuid: 8f63cd03-67b2-47cd-85ce-ca355769c123
        Total devices 2 FS bytes used 66.11MiB
        devid    1 size 1.00GiB used 356.00MiB path /dev/sdb6
        devid    2 size 1.00GiB used 0.00B path /dev/sda6

Label: 'Storage'  uuid: 1438fdc5-8b2a-47b3-8a5b-eb74cde3df42
        Total devices 4 FS bytes used 2.85TiB
        devid    1 size 3.61TiB used 3.19TiB path /dev/mapper/b-storage--b
        devid    2 size 3.42TiB used 3.02TiB path /dev/mapper/a-storage--a
        devid    3 size 279.40GiB used 173.00GiB path /dev/sdg1
        devid    4 size 279.40GiB used 172.00GiB path /dev/sde1

Label: 'Backup'  uuid: 21e59d66-3e88-4fc9-806f-69bde58be6a3
        Total devices 2 FS bytes used 1.31TiB
        devid    1 size 2.73TiB used 1.31TiB path /dev/mapper/a-backup--a
        devid    2 size 2.73TiB used 1.31TiB path /dev/mapper/b-backup--b
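
As a sanity check that no two btrfs mirror devices end up sharing a physical
disk, the LVM-to-disk mapping and the btrfs profile can be inspected with
something like this (the mount point is just an example):

# which physical disks back each logical volume
lvs -o lv_name,vg_name,devices
# per-device allocation and the data/metadata profiles of the filesystem
btrfs filesystem usage /mnt/storage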
