Re: [PATCH 0/4] 3- and 4- copy RAID1
On 2018-07-20 14:41, Hugo Mills wrote:
> On Fri, Jul 20, 2018 at 09:38:14PM +0300, Andrei Borzenkov wrote:
>> 20.07.2018 20:16, Goffredo Baroncelli wrote: [snip]
>>> Limiting the number of disks per raid would, in BTRFS, be quite simple to implement in the "chunk allocator".
>>
>> You mean that currently the RAID5 stripe size is equal to the number of disks? Well, I suppose nobody is using btrfs with disk pools of two- or three-digit sizes.
>
> But they are (even if not very many of them) -- we've seen at least one person with something like 40 or 50 devices in the array. They'd definitely got into /dev/sdac territory. I don't recall what RAID level they were using; I think it was either RAID-1 or -10. That's the largest I can recall seeing mention of, though.

I've talked to at least two people using it on 100+ disks in a SAN situation. In both cases, however, BTRFS itself was only seeing about 20 devices and running in raid0 mode on them, with each of those being a RAID6 volume configured on the SAN node holding the disks for it. From what I understood when talking to them, they actually got rather good performance in this setup, though maintenance was a bit of a pain.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/4] 3- and 4- copy RAID1
On Fri, Jul 20, 2018 at 09:38:14PM +0300, Andrei Borzenkov wrote:
> 20.07.2018 20:16, Goffredo Baroncelli wrote: [snip]
>> Limiting the number of disks per raid would, in BTRFS, be quite simple to implement in the "chunk allocator".
>
> You mean that currently the RAID5 stripe size is equal to the number of disks? Well, I suppose nobody is using btrfs with disk pools of two- or three-digit sizes.

But they are (even if not very many of them) -- we've seen at least one person with something like 40 or 50 devices in the array. They'd definitely got into /dev/sdac territory. I don't recall what RAID level they were using; I think it was either RAID-1 or -10. That's the largest I can recall seeing mention of, though.

Hugo.

--
Hugo Mills             | Have found Lost City of Atlantis. High Priest is
hugo@... carfax.org.uk | winning at quoits.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                    Terry Pratchett
Re: [PATCH 0/4] 3- and 4- copy RAID1
20.07.2018 20:16, Goffredo Baroncelli wrote:
> On 07/20/2018 07:17 AM, Andrei Borzenkov wrote:
>> 18.07.2018 22:42, Goffredo Baroncelli wrote:
>>> On 07/18/2018 09:20 AM, Duncan wrote:
>>>> Goffredo Baroncelli posted on Wed, 18 Jul 2018 07:59:52 +0200 as excerpted:
>>>>> On 07/17/2018 11:12 PM, Duncan wrote:
>>>>>> Goffredo Baroncelli posted on Mon, 16 Jul 2018 20:29:46 +0200 as excerpted:
>>>>>>> On 07/15/2018 04:37 PM, waxhead wrote:
>>>>>>> Striping and mirroring/pairing are orthogonal properties; mirror and parity are mutually exclusive.
>>>>>>
>>>>>> I can't agree. I don't know whether you meant that in the global sense, or purely in the btrfs context (which I suspect), but either way I can't agree.
>>>>>>
>>>>>> In the pure btrfs context, while striping and mirroring/pairing are orthogonal today, Hugo's whole point was that btrfs is theoretically flexible enough to allow both together and the feature may at some point be added, so it makes sense to have a layout notation format flexible enough to allow it as well.
>>>>>
>>>>> When I say orthogonal, I mean that these can be combined, i.e. you can have:
>>>>> - striping (RAID0)
>>>>> - parity (?)
>>>>> - striping + parity (e.g. RAID5/6)
>>>>> - mirroring (RAID1)
>>>>> - mirroring + striping (RAID10)
>>>>>
>>>>> However you can't have mirroring+parity; this means that a notation where both 'C' (= number of copies) and 'P' (= number of parities) appear is too verbose.
>>>>
>>>> Yes, you can have mirroring+parity: conceptually it's simply raid5/6 on top of mirroring, or mirroring on top of raid5/6, much as raid10 is conceptually just raid0 on top of raid1, and raid01 is conceptually raid1 on top of raid0.
>>>
>>> And what about raid 615156156 (raid 6 on top of raid 1 on top of raid 5 on top of ...)???
>>>
>>> Seriously, of course you can combine a lot of different profiles; however the only ones that make sense are the ones above.
>>
>> RAID50 (striping across RAID5) is common.
>
> Yeah, someone else reported that.
> But other than reducing the number of disks per raid5 (increasing the ratio of parity disks to data disks), what other advantages does it have?

It allows distributing IO across a virtually unlimited number of disks while confining the failure domain to a manageable size.

> Limiting the number of disks per raid would, in BTRFS, be quite simple to implement in the "chunk allocator".

You mean that currently the RAID5 stripe size is equal to the number of disks? Well, I suppose nobody is using btrfs with disk pools of two- or three-digit sizes.
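To make the "limit the number of disks per raid in the chunk allocator" idea concrete, here is a toy sketch (hypothetical, not btrfs code; the function name and policy are assumptions): a chunk allocator that picks the devices with the most free space, btrfs-style, but caps the stripe width so each RAID5 chunk spans at most `max_width` devices, confining the failure domain of any one chunk.

```python
def allocate_raid5_chunk(free_space, max_width=None):
    """Pick devices for one RAID5 chunk.

    free_space: dict of device id -> bytes free.
    Without a cap, the chunk stripes across every device with room;
    capping the width confines the failure domain of any one chunk
    (and the set of devices a rebuild must read) to max_width devices.
    """
    # Most-free-space-first device selection, similar to btrfs's policy.
    candidates = sorted(free_space, key=free_space.get, reverse=True)
    width = len(candidates) if max_width is None else min(max_width, len(candidates))
    if width < 3:  # RAID5 needs at least 2 data disks + 1 parity disk
        raise ValueError("not enough devices for RAID5")
    devices = candidates[:width]
    return {"devices": devices, "data": width - 1, "parity": 1}
```

With 10 devices and `max_width=4`, each chunk touches only 4 devices, so losing one device means rebuilding from 3 others rather than 9, at the cost of a higher parity-to-data ratio per chunk.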
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 2018-07-20 13:13, Goffredo Baroncelli wrote:
> On 07/19/2018 09:10 PM, Austin S. Hemmelgarn wrote:
>> On 2018-07-19 13:29, Goffredo Baroncelli wrote: [...]
>>> So until now you are repeating what I said: the only useful raid profiles are
>>> - striping
>>> - mirroring
>>> - striping+parity (even limiting the number of disks involved)
>>> - striping+mirroring
>>
>> No, not quite. At least, not in the combinations you're saying make sense, if you are using standard terminology. RAID05 and RAID06 are not the same thing as 'striping+parity' as BTRFS implements that case, and can be significantly more optimized than the trivial implementation of just limiting the number of disks involved in each chunk (by, you know, actually striping, just like what we currently call raid10 mode in BTRFS does).
>
> Could you provide more information?

Just parity by itself is functionally equivalent to a really stupid implementation of 2 or more copies of the data. Setups with only one disk more than the number of parities in RAID5 and RAID6 are called degenerate for this very reason. All sane RAID5/6 implementations do striping across multiple devices internally, and that's almost always what people mean when talking about striping plus parity.

What I'm referring to is different, though. Just like RAID10 used to be implemented as RAID0 on top of RAID1, RAID05 is RAID0 on top of RAID5. That is, you're striping your data across multiple RAID5 arrays instead of using one big RAID5 array to store it all. As I mentioned, this mitigates the scaling issues inherent in RAID5 when it comes to rebuilds (namely, the fact that device failure rates go up faster for larger arrays than rebuild times do). Functionally, such a setup can be implemented in BTRFS by limiting the RAID5/6 stripe width, but that will have all kinds of performance limitations compared to actually striping across all of the underlying RAID5 chunks. In fact, it will have the exact same performance limitations you're calling out BTRFS single mode for below.
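A toy sketch of the RAID05 layout being described (hypothetical layout math, not btrfs or md code; the rotation scheme is an assumption): the RAID0 layer round-robins stripe units across sub-arrays, and each sub-array then does its own RAID5 placement with rotating parity. A rebuild after a device loss only involves the disks of one sub-array.

```python
def raid05_locate(unit, num_subarrays, disks_per_subarray):
    """Map logical stripe unit -> (subarray, disk, row, parity_disk).

    RAID0 layer: units alternate across sub-arrays.
    RAID5 layer: within a sub-array, (disks_per_subarray - 1) data
    units fill one stripe row; the parity slot rotates per row.
    """
    sub = unit % num_subarrays           # RAID0 layer
    sub_unit = unit // num_subarrays     # position inside the sub-array
    data_per_row = disks_per_subarray - 1
    row = sub_unit // data_per_row
    pos = sub_unit % data_per_row
    parity_disk = row % disks_per_subarray   # simple rotating parity
    disk = pos if pos < parity_disk else pos + 1  # skip the parity slot
    return sub, disk, row, parity_disk
```

For example, with 2 sub-arrays of 3 disks each, consecutive units alternate sub-arrays, and within sub-array 0 the parity slot moves from disk 0 (row 0) to disk 1 (row 1), so reads fan out across all 6 disks while any rebuild reads only 2 survivors.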
>> RAID15 and RAID16 are a similar case to RAID51 and RAID61, except they might actually make sense in BTRFS to provide a backup means of rebuilding blocks that fail checksum validation if both copies fail.
>
> If you need further redundancy, it is easy to implement parity3 and parity4 raid profiles instead of stacking raid6+raid1.

I think you're misunderstanding what I mean here.

RAID15/16 consist of two layers:
* The top layer is regular RAID1, usually limited to two copies.
* The lower layer is RAID5 or RAID6.

This means that the lower layer can validate which of the two copies in the upper layer is correct when they don't agree.

> This happens only because there is redundancy greater than 1. Anyway, BTRFS has the checksum, which helps a lot in this area.

The checksum helps, but what do you do when all copies fail the checksum? Or, worse yet, what do you do when both copies have the 'right' checksum but different data? Yes, you could have one more copy, but that just reduces the chances of those cases happening; it doesn't eliminate them.

Note that I'm not necessarily saying it makes sense to have support for this in BTRFS, just that it's a real-world counter-example to your statement that only those combinations make sense. In the case of BTRFS, these would make more sense than RAID51 and RAID61, but they still aren't particularly practical. For classic RAID, though, they're really important, because you don't have checksumming (unless you have T10 DIF capable hardware and a RAID implementation that understands how to work with it, but that's rare and expensive) and it makes it easier to resize an array than having three copies (you only need 2 new disks for RAID15 or RAID16 to increase the size of the array, but you need 3 for 3-copy RAID1 or RAID10).
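A minimal sketch of the arbitration logic being debated (hypothetical, assuming a crc32-style block checksum; not the btrfs implementation): with mirroring alone you can only pick a copy whose checksum matches, while a parity lower layer, RAID15/16-style, gives one more way to reconstruct the block when every copy fails its checksum.

```python
import zlib

def pick_copy(copies, expected_csum, rebuild_from_parity=None):
    """Return a good block, preferring copies that pass the checksum.

    copies: list of candidate byte strings (the mirrored copies).
    rebuild_from_parity: optional zero-arg callable reconstructing the
    block from a parity lower layer (the RAID15/16-style fallback).
    """
    for data in copies:
        if zlib.crc32(data) == expected_csum:
            return data  # this copy matches the stored checksum
    # All copies failed the checksum: mirroring alone is out of
    # options, but a parity lower layer can still reconstruct.
    if rebuild_from_parity is not None:
        data = rebuild_from_parity()
        if zlib.crc32(data) == expected_csum:
            return data
    raise IOError("unrecoverable block: no copy matches checksum")
```

Note that this sketch does not solve the harder case raised above, where two copies both pass the checksum but differ; that needs something beyond per-block checksums (e.g. a generation number, as discussed later in the thread).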
>> It doesn't really provide significantly better redundancy (they can technically sustain more disk failures without failing completely than simple two-copy RAID1 can, but just like BTRFS raid10, they can't reliably survive more than one (or two, if you're using RAID6 as the lower layer) disk failure), so it does not do the same thing that higher-order parity does.
>
> The fact that you can combine striping and mirroring (or pairing) makes sense because you could have a speed gain (see below). []
>
>>>> As someone else pointed out, md/lvm-raid10 already work like this. What btrfs calls raid10 is somewhat different, but btrfs raid1 pretty much works this way except with huge (gig size) chunks.
>>>
>>> As implemented in BTRFS, raid1 doesn't have striping.
>>
>> The argument is that because there's only two copies, on multi-device btrfs raid1 with 4+ devices of equal size (so chunk allocations tend to alternate device pairs), it's effectively striped at the macro level, with the 1 GiB device-level chunks effectively being huge individual device strips of 1 GiB.
>
> The striping concept is based on the fact that if the "stripe size" is small
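To make the stripe-size side of this argument concrete, here is a toy calculation (hypothetical, not btrfs code) of how many devices a single read touches under a RAID0-style layout: with a 64 KiB stripe unit a 1 MiB read fans out across all devices, while with a 1 GiB "stripe" it almost always lands on a single device, which is the macro-level-striping point being debated.

```python
def devices_touched(offset, length, stripe_unit, num_devices):
    """Devices involved in reading [offset, offset+length) from a
    simple round-robin striped layout (RAID0-style)."""
    first = offset // stripe_unit
    last = (offset + length - 1) // stripe_unit
    return {unit % num_devices for unit in range(first, last + 1)}

KiB, MiB, GiB = 2**10, 2**20, 2**30
# 1 MiB read, 64 KiB stripe unit, 4 devices: spread over all 4 devices,
# so the read can proceed in parallel.
par = devices_touched(0, 1 * MiB, 64 * KiB, 4)
# Same read with a 1 GiB unit: it stays on one device.
ser = devices_touched(0, 1 * MiB, 1 * GiB, 4)
```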
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 07/20/2018 07:17 AM, Andrei Borzenkov wrote:
> 18.07.2018 22:42, Goffredo Baroncelli wrote:
>> On 07/18/2018 09:20 AM, Duncan wrote:
>>> Goffredo Baroncelli posted on Wed, 18 Jul 2018 07:59:52 +0200 as excerpted:
>>>> On 07/17/2018 11:12 PM, Duncan wrote:
>>>>> Goffredo Baroncelli posted on Mon, 16 Jul 2018 20:29:46 +0200 as excerpted:
>>>>>> On 07/15/2018 04:37 PM, waxhead wrote:
>>>>>> Striping and mirroring/pairing are orthogonal properties; mirror and parity are mutually exclusive.
>>>>>
>>>>> I can't agree. I don't know whether you meant that in the global sense, or purely in the btrfs context (which I suspect), but either way I can't agree.
>>>>>
>>>>> In the pure btrfs context, while striping and mirroring/pairing are orthogonal today, Hugo's whole point was that btrfs is theoretically flexible enough to allow both together and the feature may at some point be added, so it makes sense to have a layout notation format flexible enough to allow it as well.
>>>>
>>>> When I say orthogonal, I mean that these can be combined, i.e. you can have:
>>>> - striping (RAID0)
>>>> - parity (?)
>>>> - striping + parity (e.g. RAID5/6)
>>>> - mirroring (RAID1)
>>>> - mirroring + striping (RAID10)
>>>>
>>>> However you can't have mirroring+parity; this means that a notation where both 'C' (= number of copies) and 'P' (= number of parities) appear is too verbose.
>>>
>>> Yes, you can have mirroring+parity: conceptually it's simply raid5/6 on top of mirroring, or mirroring on top of raid5/6, much as raid10 is conceptually just raid0 on top of raid1, and raid01 is conceptually raid1 on top of raid0.
>>
>> And what about raid 615156156 (raid 6 on top of raid 1 on top of raid 5 on top of ...)???
>>
>> Seriously, of course you can combine a lot of different profiles; however the only ones that make sense are the ones above.
>
> RAID50 (striping across RAID5) is common.

Yeah, someone else reported that.
But other than reducing the number of disks per raid5 (increasing the ratio of parity disks to data disks), what other advantages does it have? Limiting the number of disks per raid would, in BTRFS, be quite simple to implement in the "chunk allocator".

--
gpg @keyserver.linux.it: Goffredo Baroncelli
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 07/19/2018 09:10 PM, Austin S. Hemmelgarn wrote:
> On 2018-07-19 13:29, Goffredo Baroncelli wrote: [...]
>> So until now you are repeating what I said: the only useful raid profiles are
>> - striping
>> - mirroring
>> - striping+parity (even limiting the number of disks involved)
>> - striping+mirroring
>
> No, not quite. At least, not in the combinations you're saying make sense, if you are using standard terminology. RAID05 and RAID06 are not the same thing as 'striping+parity' as BTRFS implements that case, and can be significantly more optimized than the trivial implementation of just limiting the number of disks involved in each chunk (by, you know, actually striping, just like what we currently call raid10 mode in BTRFS does).

Could you provide more information?

> RAID15 and RAID16 are a similar case to RAID51 and RAID61, except they might actually make sense in BTRFS to provide a backup means of rebuilding blocks that fail checksum validation if both copies fail.

If you need further redundancy, it is easy to implement parity3 and parity4 raid profiles instead of stacking raid6+raid1.

> I think you're misunderstanding what I mean here.
>
> RAID15/16 consist of two layers:
> * The top layer is regular RAID1, usually limited to two copies.
> * The lower layer is RAID5 or RAID6.
>
> This means that the lower layer can validate which of the two copies in the upper layer is correct when they don't agree.

This happens only because there is redundancy greater than 1. Anyway, BTRFS has the checksum, which helps a lot in this area.

> It doesn't really provide significantly better redundancy (they can technically sustain more disk failures without failing completely than simple two-copy RAID1 can, but just like BTRFS raid10, they can't reliably survive more than one (or two, if you're using RAID6 as the lower layer) disk failure), so it does not do the same thing that higher-order parity does.
The fact that you can combine striping and mirroring (or pairing) makes sense because you could have a speed gain (see below). []

>>> As someone else pointed out, md/lvm-raid10 already work like this. What btrfs calls raid10 is somewhat different, but btrfs raid1 pretty much works this way except with huge (gig size) chunks.
>>
>> As implemented in BTRFS, raid1 doesn't have striping.
>
> The argument is that because there's only two copies, on multi-device btrfs raid1 with 4+ devices of equal size (so chunk allocations tend to alternate device pairs), it's effectively striped at the macro level, with the 1 GiB device-level chunks effectively being huge individual device strips of 1 GiB.

The striping concept is based on the fact that if the "stripe size" is small enough, you have a speed benefit because the reads may be performed in parallel from different disks.

> That's not the only benefit of striping, though. The other big one is that you now have one volume that's the combined size of both of the original devices. Striping is arguably better for this even if you're using a large stripe size, because it better balances the wear across the devices than simple concatenation.

Striping means that the data is interleaved between the disks with a reasonable "block unit". Otherwise, what would be the difference between btrfs-raid0 and btrfs-single?

> Single mode guarantees that any file less than the chunk size in length will either be completely present or completely absent if one of the devices fails. BTRFS raid0 mode does not provide any such guarantee, and in fact guarantees that all files larger than the stripe unit size (however much gets put on one disk before moving to the next) will lose data if a device fails.
>
> Stupid as it sounds, this matters for some people.

I think that even better would be having different filesystems.
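A toy model of the single-vs-raid0 failure guarantee described above (hypothetical, not btrfs code; the round-robin placement for single mode is an assumption for illustration): in single mode a small file lives entirely on one device, so a device loss loses whole files; in raid0, every file larger than the stripe unit spans devices, so a device loss damages nearly all such files.

```python
def files_damaged(files, failed_dev, layout, num_devices, stripe_unit):
    """Count files touched by one device failure under two layouts.

    files: list of (start_offset, length) pairs.
    'single': each whole file sits on one device (toy round-robin
    by file index); 'raid0': files are striped in stripe_unit pieces.
    """
    damaged = 0
    for i, (start, length) in enumerate(files):
        if layout == "single":
            dev_set = {i % num_devices}  # whole file on one device
        else:  # raid0
            first = start // stripe_unit
            last = (start + length - 1) // stripe_unit
            dev_set = {u % num_devices for u in range(first, last + 1)}
        damaged += failed_dev in dev_set
    return damaged
```

With eight contiguous 256 KiB files on four devices and a 64 KiB stripe unit, raid0 damages all eight files when one device dies, while single mode damages only the two files that happened to live on that device.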
>> With a "stripe size" of 1GB, it is very unlikely that this would happen.
>
> That's a pretty big assumption. There are all kinds of access patterns that will still distribute the load reasonably evenly across the constituent devices, even if they don't parallelize things.
>
> If, for example, all your files are 64k or less, and you only read whole files, there's no functional difference between RAID0 with 1GB blocks and RAID0 with 64k blocks. Such a workload is not unusual on a very busy mail server.

I fully agree that 64K may be too much for some workloads; however, I have to point out that I still find it difficult to imagine that you can take advantage of parallel reads from multiple disks with a 1GB stripe unit for a *common workload*. Note that btrfs inlines small files in the metadata, so even if the file is smaller than 64k, a 64k read (or more) will be
Re: [PATCH 0/4] 3- and 4- copy RAID1
On Thu, Jul 19, 2018 at 07:47:23AM -0400, Austin S. Hemmelgarn wrote:
> > So this special level will be used for RAID56 for now?
> > Or will it also be possible for metadata usage, just like current RAID1?
> >
> > If the latter, the metadata scrub problem will need to be considered more.
> >
> > For more-copies RAID1, it will have a higher possibility of one or two devices missing and then being scrubbed.
> > For metadata scrub, the inlined csum can't ensure it's the latest one.
> >
> > So for such RAID1 scrub, we need to read out all copies and compare their generations to find out the correct copy.
> > At least from the changeset, it doesn't look like it's addressed yet.
> >
> > And this also reminds me that current scrub is not as flexible as balance; I'd really like it if we could filter block groups to scrub just like balance, and do scrub on a block group basis rather than a devid basis.
> > That's to say, for a block group scrub, we don't really care which device we're scrubbing; we just need to ensure every device in this block group is storing correct data.
>
> This would actually be rather useful for non-parity cases too. Being able to scrub only metadata when the data chunks are using a profile that provides no rebuild support would be great for performance.
>
> On the same note, it would be _really_ nice to be able to scrub a subset of the volume's directory tree, even if it were only per-subvolume.

https://github.com/kdave/drafts/blob/master/btrfs/scrub-subvolume.txt
https://github.com/kdave/drafts/blob/master/btrfs/scrub-custom.txt

The idea is to build an in-memory tree of the block ranges that span the given subvolume or files and run scrub only there. The selective scrub on block groups of a given type would be a special case of the above.
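A sketch of the idea in those draft notes (hypothetical data structures, not the real design): collect the chosen subvolume's extents into an in-memory, sorted, merged set of block ranges, then scrub only the block groups that intersect it.

```python
import bisect

class RangeSet:
    """Sorted, merged set of half-open (start, end) byte ranges."""

    def __init__(self):
        self.ranges = []

    def add(self, start, end):
        # Insert, then merge overlapping/adjacent ranges.
        self.ranges.append((start, end))
        self.ranges.sort()
        merged = []
        for s, e in self.ranges:
            if merged and s <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], e))
            else:
                merged.append((s, e))
        self.ranges = merged

    def intersects(self, start, end):
        i = bisect.bisect_right(self.ranges, (start, float("inf")))
        # Only the range just before `start` and the one after can overlap.
        for s, e in self.ranges[max(0, i - 1):i + 1]:
            if s < end and start < e:
                return True
        return False

def scrub_targets(extents, block_groups):
    """extents: (start, length) of the subvolume's data.
    block_groups: (start, length) of each block group.
    Returns the block groups worth scrubbing."""
    rs = RangeSet()
    for s, l in extents:
        rs.add(s, s + l)
    return [bg for bg in block_groups if rs.intersects(bg[0], bg[0] + bg[1])]
```

A scrub restricted to block groups of a given type (e.g. metadata only) then falls out as a trivial filter on the `block_groups` list before the intersection step.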
Re: [PATCH 0/4] 3- and 4- copy RAID1
On Thu, Jul 19, 2018 at 03:27:17PM +0800, Qu Wenruo wrote:
> On 14 Jul 2018 at 02:46, David Sterba wrote:
> > Hi,
> >
> > I have some goodies that go into the RAID56 problem; although not implementing all the remaining features, they can be useful independently.
> >
> > This time my hackweek project
> >
> > https://hackweek.suse.com/17/projects/do-something-about-btrfs-and-raid56
> >
> > aimed to implement the fix for the write hole problem, but I spent more time on analysis and design of the solution and don't have a working prototype for that yet.
> >
> > This patchset brings a feature that will be used by the raid56 log; the log has to be on the same redundancy level, and thus we need 3-copy replication for raid6. As it was easy to extend to higher replication, I've added 4-copy replication, which would allow triple-copy raid (which does not have a standardized name).
>
> So this special level will be used for RAID56 for now?
> Or will it also be possible for metadata usage, just like current RAID1?

It's a new profile usable in the same way as raid1, i.e. for data or metadata. The patch that adds support to btrfs-progs has an mkfs example. The raid56 code will use it to store the log: essentially data forcibly stored on the n-copy raid1 chunk and used only for logging.

> If the latter, the metadata scrub problem will need to be considered more.
>
> For more-copies RAID1, it will have a higher possibility of one or two devices missing and then being scrubbed.
> For metadata scrub, the inlined csum can't ensure it's the latest one.
>
> So for such RAID1 scrub, we need to read out all copies and compare their generations to find out the correct copy.
> At least from the changeset, it doesn't look like it's addressed yet.

Nothing like this is implemented in the patches, but I don't understand how this differs from current raid1 and one missing device.
Sure, we can't have 2 missing devices, so the existing copy is automatically considered correct and up to date. There are more corner-case recovery scenarios where there could be 3 copies slightly out of date due to device loss and a scrub attempt, so yes, this would need to be addressed.

> And this also reminds me that current scrub is not as flexible as balance; I'd really like it if we could filter block groups to scrub just like balance, and do scrub on a block group basis rather than a devid basis.
> That's to say, for a block group scrub, we don't really care which device we're scrubbing; we just need to ensure every device in this block group is storing correct data.

Right, a subset of the balance filters would be nice.
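A minimal sketch of the generation arbitration Qu describes (hypothetical structures with a crc32-style checksum; not kernel code): when several copies of a metadata block pass their checksum but may be stale after a device came back, prefer the checksum-valid copy with the highest transaction generation.

```python
import zlib
from dataclasses import dataclass

@dataclass
class CopyBlock:
    data: bytes
    generation: int  # transaction id that wrote this copy
    csum: int        # stored checksum for this copy

def latest_valid_copy(copies):
    """Among checksum-valid copies, pick the highest generation.

    A stale but intact copy passes its own checksum, so the checksum
    alone cannot identify the latest copy; the generation can.
    """
    valid = [c for c in copies if zlib.crc32(c.data) == c.csum]
    if not valid:
        return None  # nothing trustworthy; must rebuild from elsewhere
    return max(valid, key=lambda c: c.generation)
```

This is also why a per-block-group scrub, which sees all copies at once, is a more natural fit for the check than the current per-device scrub.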
Re: [PATCH 0/4] 3- and 4- copy RAID1
18.07.2018 22:42, Goffredo Baroncelli wrote:
> On 07/18/2018 09:20 AM, Duncan wrote:
>> Goffredo Baroncelli posted on Wed, 18 Jul 2018 07:59:52 +0200 as excerpted:
>>> On 07/17/2018 11:12 PM, Duncan wrote:
>>>> Goffredo Baroncelli posted on Mon, 16 Jul 2018 20:29:46 +0200 as excerpted:
>>>>> On 07/15/2018 04:37 PM, waxhead wrote:
>>>>> Striping and mirroring/pairing are orthogonal properties; mirror and parity are mutually exclusive.
>>>>
>>>> I can't agree. I don't know whether you meant that in the global sense, or purely in the btrfs context (which I suspect), but either way I can't agree.
>>>>
>>>> In the pure btrfs context, while striping and mirroring/pairing are orthogonal today, Hugo's whole point was that btrfs is theoretically flexible enough to allow both together and the feature may at some point be added, so it makes sense to have a layout notation format flexible enough to allow it as well.
>>>
>>> When I say orthogonal, I mean that these can be combined, i.e. you can have:
>>> - striping (RAID0)
>>> - parity (?)
>>> - striping + parity (e.g. RAID5/6)
>>> - mirroring (RAID1)
>>> - mirroring + striping (RAID10)
>>>
>>> However you can't have mirroring+parity; this means that a notation where both 'C' (= number of copies) and 'P' (= number of parities) appear is too verbose.
>>
>> Yes, you can have mirroring+parity: conceptually it's simply raid5/6 on top of mirroring, or mirroring on top of raid5/6, much as raid10 is conceptually just raid0 on top of raid1, and raid01 is conceptually raid1 on top of raid0.
>
> And what about raid 615156156 (raid 6 on top of raid 1 on top of raid 5 on top of ...)???
>
> Seriously, of course you can combine a lot of different profiles; however the only ones that make sense are the ones above.

RAID50 (striping across RAID5) is common.
Re: [PATCH 0/4] 3- and 4- copy RAID1
Hugo Mills wrote:
> On Wed, Jul 18, 2018 at 08:39:48AM +, Duncan wrote:
>> Duncan posted on Wed, 18 Jul 2018 07:20:09 + as excerpted:
>> Perhaps it's a case of coder's view (no code doing it that way, it's just a coincidental oddity conditional on equal sizes), vs. sysadmin's view (code or not, accidental or not, it's a reasonably accurate high-level description of how it ends up working most of the time with equivalent sized devices).
>
> Well, it's an *accurate* observation. It's just not a particularly *useful* one. :)
>
> Hugo.

A bit off topic perhaps, but I've got to give it a go: pretty please with sugar, nuts, a cherry and chocolate sprinkles, dipped in syrup and coated with ice cream on top, would it not be about time to update your online btrfs-usage calculator (which is insanely useful in so many ways) to support the new modes!? In fact it would be great, perhaps even better, as a CLI tool. And yes, a while ago I toyed with porting it to C, mostly for my own use, but never got that far.
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 2018-07-19 13:29, Goffredo Baroncelli wrote:
> On 07/19/2018 01:43 PM, Austin S. Hemmelgarn wrote:
>> On 2018-07-18 15:42, Goffredo Baroncelli wrote:
>>> On 07/18/2018 09:20 AM, Duncan wrote:
>>>> Goffredo Baroncelli posted on Wed, 18 Jul 2018 07:59:52 +0200 as excerpted:
>>>>> On 07/17/2018 11:12 PM, Duncan wrote:
>>>>>> Goffredo Baroncelli posted on Mon, 16 Jul 2018 20:29:46 +0200 as excerpted: [...]
>>> When I say orthogonal, I mean that these can be combined, i.e. you can have:
>>> - striping (RAID0)
>>> - parity (?)
>>> - striping + parity (e.g. RAID5/6)
>>> - mirroring (RAID1)
>>> - mirroring + striping (RAID10)
>>>
>>> However you can't have mirroring+parity; this means that a notation where both 'C' (= number of copies) and 'P' (= number of parities) appear is too verbose.
>>>> Yes, you can have mirroring+parity: conceptually it's simply raid5/6 on top of mirroring, or mirroring on top of raid5/6, much as raid10 is conceptually just raid0 on top of raid1, and raid01 is conceptually raid1 on top of raid0.
>>> And what about raid 615156156 (raid 6 on top of raid 1 on top of raid 5 on top of ...)???
>>>
>>> Seriously, of course you can combine a lot of different profiles; however the only ones that make sense are the ones above.
>> No, there are cases where other configurations make sense. RAID05 and RAID06 are very widely used, especially on NAS systems where you have lots of disks. The RAID5/6 lower layer mitigates the data loss risk of RAID0, and the RAID0 upper layer mitigates the rebuild scalability issues of RAID5/6. In fact, this is pretty much the standard recommended configuration for large ZFS arrays that want to use parity RAID. This could reasonably easily be supported to a rudimentary degree in BTRFS by providing the ability to limit the stripe width for the parity profiles. Some people use RAID50 or RAID60, although they are, strictly speaking, inferior in almost all respects to RAID05 and RAID06.
>> RAID01 is also used on occasion; it ends up having the same storage capacity as RAID10, but for some RAID implementations it has a different performance envelope and different rebuild characteristics. Usually when it is used, though, it's software RAID0 on top of hardware RAID1. RAID51 and RAID61 used to be used, but aren't much now. They provided an easy way to have proper data verification without always having the rebuild overhead of RAID5/6 and without needing to do checksumming. They are pretty much useless for BTRFS, as it can already tell which copy is correct.
>
> So until now you are repeating what I said: the only useful raid profiles are
> - striping
> - mirroring
> - striping+parity (even limiting the number of disks involved)
> - striping+mirroring

No, not quite. At least, not in the combinations you're saying make sense, if you are using standard terminology. RAID05 and RAID06 are not the same thing as 'striping+parity' as BTRFS implements that case, and can be significantly more optimized than the trivial implementation of just limiting the number of disks involved in each chunk (by, you know, actually striping, just like what we currently call raid10 mode in BTRFS does).

RAID15 and RAID16 are a similar case to RAID51 and RAID61, except they might actually make sense in BTRFS to provide a backup means of rebuilding blocks that fail checksum validation if both copies fail.

> If you need further redundancy, it is easy to implement parity3 and parity4 raid profiles instead of stacking raid6+raid1.

I think you're misunderstanding what I mean here.

RAID15/16 consist of two layers:
* The top layer is regular RAID1, usually limited to two copies.
* The lower layer is RAID5 or RAID6.

This means that the lower layer can validate which of the two copies in the upper layer is correct when they don't agree.
It doesn't really provide significantly better redundancy (they can technically sustain more disk failures without failing completely than simple two-copy RAID1 can, but just like BTRFS raid10, they can't reliably survive more than one (or two, if you're using RAID6 as the lower layer) disk failure), so it does not do the same thing that higher-order parity does.

> The fact that you can combine striping and mirroring (or pairing) makes sense because you could have a speed gain (see below). []
>
>> As someone else pointed out, md/lvm-raid10 already work like this. What btrfs calls raid10 is somewhat different, but btrfs raid1 pretty much works this way except with huge (gig size) chunks.
>
> As implemented in BTRFS, raid1 doesn't have striping.

The argument is that because there's only two copies, on multi-device btrfs raid1 with 4+ devices of equal size (so chunk allocations tend to alternate device pairs), it's effectively striped at the macro level, with the 1 GiB device-level chunks effectively being huge individual device strips of 1 GiB.

> The striping concept is based on the fact that if the "stripe size" is small enough, you have a speed benefit because the reads may be performed in parallel from
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 2018-07-19 03:27, Qu Wenruo wrote:
> On 14 Jul 2018 at 02:46, David Sterba wrote:
>> Hi,
>>
>> I have some goodies that go into the RAID56 problem; although not implementing all the remaining features, they can be useful independently.
>>
>> This time my hackweek project
>>
>> https://hackweek.suse.com/17/projects/do-something-about-btrfs-and-raid56
>>
>> aimed to implement the fix for the write hole problem, but I spent more time on analysis and design of the solution and don't have a working prototype for that yet.
>>
>> This patchset brings a feature that will be used by the raid56 log; the log has to be on the same redundancy level, and thus we need 3-copy replication for raid6. As it was easy to extend to higher replication, I've added 4-copy replication, which would allow triple-copy raid (which does not have a standardized name).
>
> So this special level will be used for RAID56 for now?
> Or will it also be possible for metadata usage, just like current RAID1?
>
> If the latter, the metadata scrub problem will need to be considered more.
>
> For more-copies RAID1, it will have a higher possibility of one or two devices missing and then being scrubbed.
> For metadata scrub, the inlined csum can't ensure it's the latest one.
>
> So for such RAID1 scrub, we need to read out all copies and compare their generations to find out the correct copy.
> At least from the changeset, it doesn't look like it's addressed yet.
>
> And this also reminds me that current scrub is not as flexible as balance; I'd really like it if we could filter block groups to scrub just like balance, and do scrub on a block group basis rather than a devid basis.
> That's to say, for a block group scrub, we don't really care which device we're scrubbing; we just need to ensure every device in this block group is storing correct data.

This would actually be rather useful for non-parity cases too. Being able to scrub only metadata when the data chunks are using a profile that provides no rebuild support would be great for performance.
On the same note, it would be _really_ nice to be able to scrub a subset of the volume's directory tree, even if it were only per-subvolume. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 2018-07-18 15:42, Goffredo Baroncelli wrote: On 07/18/2018 09:20 AM, Duncan wrote: Goffredo Baroncelli posted on Wed, 18 Jul 2018 07:59:52 +0200 as excerpted: On 07/17/2018 11:12 PM, Duncan wrote: Goffredo Baroncelli posted on Mon, 16 Jul 2018 20:29:46 +0200 as excerpted: On 07/15/2018 04:37 PM, waxhead wrote: Striping and mirroring/pairing are orthogonal properties; mirror and parity are mutually exclusive. I can't agree. I don't know whether you meant that in the global sense, or purely in the btrfs context (which I suspect), but either way I can't agree. In the pure btrfs context, while striping and mirroring/pairing are orthogonal today, Hugo's whole point was that btrfs is theoretically flexible enough to allow both together and the feature may at some point be added, so it makes sense to have a layout notation format flexible enough to allow it as well. When I say orthogonal, It means that these can be combined: i.e. you can have - striping (RAID0) - parity (?) - striping + parity (e.g. RAID5/6) - mirroring (RAID1) - mirroring + striping (RAID10) However you can't have mirroring+parity; this means that a notation where both 'C' ( = number of copy) and 'P' ( = number of parities) is too verbose. Yes, you can have mirroring+parity, conceptually it's simply raid5/6 on top of mirroring or mirroring on top of raid5/6, much as raid10 is conceptually just raid0 on top of raid1, and raid01 is conceptually raid1 on top of raid0. And what about raid 615156156 (raid 6 on top of raid 1 on top of raid 5 on top of) ??? Seriously, of course you can combine a lot of different profile; however the only ones that make sense are the ones above. No, there are cases where other configurations make sense. RAID05 and RAID06 are very widely used, especially on NAS systems where you have lots of disks. The RAID5/6 lower layer mitigates the data loss risk of RAID0, and the RAID0 upper-layer mitigates the rebuild scalability issues of RAID5/6. 
In fact, this is pretty much the standard recommended configuration for large ZFS arrays that want to use parity RAID. This could be reasonably easily supported to a rudimentary degree in BTRFS by providing the ability to limit the stripe width for the parity profiles. Some people use RAID50 or RAID60, although they are strictly speaking inferior in almost all respects to RAID05 and RAID06. RAID01 is also used on occasion, it ends up having the same storage capacity as RAID10, but for some RAID implementations it has a different performance envelope and different rebuild characteristics. Usually, when it is used though, it's software RAID0 on top of hardware RAID1. RAID51 and RAID61 used to be used, but aren't much now. They provided an easy way to have proper data verification without always having the rebuild overhead of RAID5/6 and without needing to do checksumming. They are pretty much useless for BTRFS, as it can already tell which copy is correct. RAID15 and RAID16 are a similar case to RAID51 and RAID61, except they might actually make sense in BTRFS to provide a backup means of rebuilding blocks that fail checksum validation if both copies fail. The fact that you can combine striping and mirroring (or pairing) makes sense because you could have a speed gain (see below). [] As someone else pointed out, md/lvm-raid10 already work like this. What btrfs calls raid10 is somewhat different, but btrfs raid1 pretty much works this way except with huge (gig size) chunks. As implemented in BTRFS, raid1 doesn't have striping. The argument is that because there's only two copies, on multi-device btrfs raid1 with 4+ devices of equal size so chunk allocations tend to alternate device pairs, it's effectively striped at the macro level, with the 1 GiB device-level chunks effectively being huge individual device strips of 1 GiB. 
The striping concept is based on the fact that if the "stripe size" is small enough, you have a speed benefit because the reads may be performed in parallel from different disks. That's not the only benefit of striping though. The other big one is that you now have one volume that's the combined size of both of the original devices. Striping is arguably better for this even if you're using a large stripe size, because it balances the wear across the devices better than simple concatenation. With a "stripe size" of 1 GB, it is very unlikely that this would happen. That's a pretty big assumption. There are all kinds of access patterns that will still distribute the load reasonably evenly across the constituent devices, even if they don't parallelize things. If, for example, all your files are 64k or less, and you only read whole files, there's no functional difference between RAID0 with 1 GB blocks and RAID0 with 64k blocks. Such a workload is not unusual on a very busy mail server. At 1 GiB strip size it doesn't have the typical performance advantage of striping, but conceptually it's equivalent to raid10 with huge 1 GiB strips/chunks.
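The parallelism argument above reduces to simple arithmetic. This sketch (a hypothetical helper for illustration, not btrfs code) maps a logical offset to a disk for plain RAID0, and shows that a 1 MiB sequential read touches all four disks with 64 KiB strips but stays on a single disk with 1 GiB strips:

```python
KIB, GIB = 1024, 1024 ** 3

def raid0_disk(offset, n_disks, strip_size):
    """Which disk a logical byte offset lands on in plain RAID0:
    strips rotate round-robin over the disks."""
    return (offset // strip_size) % n_disks

# A 1 MiB sequential read, issued 64 KiB at a time.
read = range(0, 1024 * KIB, 64 * KIB)

# 64 KiB strips: the read is spread over all four disks, so it can parallelize.
print(sorted({raid0_disk(o, 4, 64 * KIB) for o in read}))  # [0, 1, 2, 3]

# 1 GiB strips: the whole read sits inside one strip on one disk.
print(sorted({raid0_disk(o, 4, 1 * GIB) for o in read}))   # [0]
```

The 64k-files-only workload in the text is the degenerate case where both layouts behave identically per read, since no single read ever crosses a strip boundary.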
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 14 Jul 2018 02:46, David Sterba wrote:
> Hi,
>
> I have some goodies that go into the RAID56 problem, although not
> implementing all the remaining features, it can be useful independently.
>
> This time my hackweek project
>
> https://hackweek.suse.com/17/projects/do-something-about-btrfs-and-raid56
>
> aimed to implement the fix for the write hole problem but I spent more
> time with analysis and design of the solution and don't have a working
> prototype for that yet.
>
> This patchset brings a feature that will be used by the raid56 log, the
> log has to be on the same redundancy level and thus we need a 3-copy
> replication for raid6. As it was easy to extend to higher replication,
> I've added a 4-copy replication, that would allow triple copy raid (that
> does not have a standardized name).

So will this special level be used only for RAID56 for now? Or will it also be possible for metadata usage just like current RAID1?

If the latter, the metadata scrub problem will need to be considered more. For RAID1 with more copies, there is a higher chance of one or two devices going missing and then being scrubbed. For metadata scrub, the inlined csum can't ensure a copy is the latest one. So for such RAID1 scrub, we need to read out all copies and compare their generations to find the correct copy. At least from the changeset, it doesn't look like this is addressed yet.

And this also reminds me that current scrub is not as flexible as balance. I'd really like to be able to filter block groups to scrub just like balance, and to scrub on a block-group basis rather than a per-device basis. That's to say, for a block-group scrub we don't really care which device we're scrubbing; we just need to ensure every device in the block group is storing correct data.

Thanks,
Qu

> The number of copies is fixed, so it's not N-copy for an arbitrary N.
> This would complicate the implementation too much, though I'd be willing
> to add a 5-copy replication for a small bribe. 
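For a rough sense of what the patchset's first patch ("refactor block group replication factor calculation to a helper") boils down to for the mirrored profiles: the real helper is kernel C, and this table-and-function sketch only captures the arithmetic, with `dup` treated specially since both of its copies live on one device.

```python
# Copies per profile, including the new raid1c3/raid1c4 from this patchset.
NCOPIES = {"single": 1, "dup": 2, "raid1": 2, "raid1c3": 3, "raid1c4": 4}

def tolerated_device_losses(profile):
    """Devices a pure-mirror profile can lose: all but one copy.
    dup keeps both copies on a single device, so it tolerates none."""
    return 0 if profile == "dup" else NCOPIES[profile] - 1

for p in ("raid1", "raid1c3", "raid1c4"):
    print(p, tolerated_device_losses(p))  # raid1 1, raid1c3 2, raid1c4 3
```

This is exactly why Qu's scrub concern grows with the copy count: raid1c4 can run degraded through three separate device losses, each followed by a scrub against possibly stale copies.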
> The new raid profiles are covered by an incompatibility bit, called
> extended_raid; the (idealistic) plan is to stuff as many new
> raid-related features under it as possible. The patch 4/4 mentions the
> 3- and 4-copy raid1, configurable stripe length, write hole log and
> triple parity. If the plan turns out to be too ambitious, the ready and
> implemented features will be split and merged.
>
> An interesting question is the naming of the extended profiles. I picked
> something that can be easily understood but it's not a final proposal.
> Years ago, Hugo proposed a naming scheme that described the
> non-standard raid varieties of the btrfs flavor:
>
> https://marc.info/?l=linux-btrfs=136286324417767
>
> Switching to this naming would be a good addition to the extended raid.
>
> Regarding the missing raid56 features, I'll continue working on them as
> time permits in the following weeks/months, as I'm not aware of anybody
> working on that actively enough, so to speak.
>
> Anyway, git branches with the patches:
>
> kernel: git://github.com/kdave/btrfs-devel dev/extended-raid-ncopies
> progs: git://github.com/kdave/btrfs-progs dev/extended-raid-ncopies
>
> David Sterba (4):
>   btrfs: refactor block group replication factor calculation to a helper
>   btrfs: add support for 3-copy replication (raid1c3)
>   btrfs: add support for 4-copy replication (raid1c4)
>   btrfs: add incompatibility bit for extended raid features
>
>  fs/btrfs/ctree.h                |  1 +
>  fs/btrfs/extent-tree.c          | 45 +++---
>  fs/btrfs/relocation.c           |  1 +
>  fs/btrfs/scrub.c                |  4 +-
>  fs/btrfs/super.c                | 17 +++
>  fs/btrfs/sysfs.c                |  2 +
>  fs/btrfs/volumes.c              | 84 ++---
>  fs/btrfs/volumes.h              |  6 +++
>  include/uapi/linux/btrfs.h      | 12 -
>  include/uapi/linux/btrfs_tree.h |  6 +++
>  10 files changed, 134 insertions(+), 44 deletions(-)
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 07/18/2018 09:20 AM, Duncan wrote: > Goffredo Baroncelli posted on Wed, 18 Jul 2018 07:59:52 +0200 as > excerpted: > >> On 07/17/2018 11:12 PM, Duncan wrote: >>> Goffredo Baroncelli posted on Mon, 16 Jul 2018 20:29:46 +0200 as >>> excerpted: >>> On 07/15/2018 04:37 PM, waxhead wrote: >>> Striping and mirroring/pairing are orthogonal properties; mirror and parity are mutually exclusive. >>> >>> I can't agree. I don't know whether you meant that in the global >>> sense, >>> or purely in the btrfs context (which I suspect), but either way I >>> can't agree. >>> >>> In the pure btrfs context, while striping and mirroring/pairing are >>> orthogonal today, Hugo's whole point was that btrfs is theoretically >>> flexible enough to allow both together and the feature may at some >>> point be added, so it makes sense to have a layout notation format >>> flexible enough to allow it as well. >> >> When I say orthogonal, It means that these can be combined: i.e. you can >> have - striping (RAID0) >> - parity (?) >> - striping + parity (e.g. RAID5/6) >> - mirroring (RAID1) >> - mirroring + striping (RAID10) >> >> However you can't have mirroring+parity; this means that a notation >> where both 'C' ( = number of copy) and 'P' ( = number of parities) is >> too verbose. > > Yes, you can have mirroring+parity, conceptually it's simply raid5/6 on > top of mirroring or mirroring on top of raid5/6, much as raid10 is > conceptually just raid0 on top of raid1, and raid01 is conceptually raid1 > on top of raid0. And what about raid 615156156 (raid 6 on top of raid 1 on top of raid 5 on top of) ??? Seriously, of course you can combine a lot of different profile; however the only ones that make sense are the ones above. The fact that you can combine striping and mirroring (or pairing) makes sense because you could have a speed gain (see below). [] >>> >>> As someone else pointed out, md/lvm-raid10 already work like this. 
>>> What btrfs calls raid10 is somewhat different, but btrfs raid1 pretty
>>> much works this way except with huge (gig size) chunks.
>>
>> As implemented in BTRFS, raid1 doesn't have striping.
>
> The argument is that because there's only two copies, on multi-device
> btrfs raid1 with 4+ devices of equal size so chunk allocations tend to
> alternate device pairs, it's effectively striped at the macro level, with
> the 1 GiB device-level chunks effectively being huge individual device
> strips of 1 GiB.

The striping concept is based on the fact that if the "stripe size" is small enough, you have a speed benefit because the reads may be performed in parallel from different disks. With a "stripe size" of 1 GB, it is very unlikely that this would happen.

> At 1 GiB strip size it doesn't have the typical performance advantage of
> striping, but conceptually, it's equivalent to raid10 with huge 1 GiB
> strips/chunks.

-- gpg @keyserver.linux.it: Goffredo Baroncelli Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
Re: [PATCH 0/4] 3- and 4- copy RAID1
On Wed, Jul 18, 2018 at 08:39:48AM +, Duncan wrote: > Duncan posted on Wed, 18 Jul 2018 07:20:09 + as excerpted: > > >> As implemented in BTRFS, raid1 doesn't have striping. > > > > The argument is that because there's only two copies, on multi-device > > btrfs raid1 with 4+ devices of equal size so chunk allocations tend to > > alternate device pairs, it's effectively striped at the macro level, > > with the 1 GiB device-level chunks effectively being huge individual > > device strips of 1 GiB. > > > > At 1 GiB strip size it doesn't have the typical performance advantage of > > striping, but conceptually, it's equivalent to raid10 with huge 1 GiB > > strips/chunks. > > I forgot this bit... > > Similarly, multi-device single is regarded by some to be conceptually > equivalent to raid0 with really huge GiB strips/chunks. > > (As you may note, "the argument is" and "regarded by some" are distancing > phrases. I've seen the argument made on-list, but while I understand the > argument and agree with it to some extent, I'm still a bit uncomfortable > with it and don't normally make it myself, this thread being a noted > exception tho originally I simply repeated what someone else already said > in-thread, because I too agree it's stretching things a bit. But it does > appear to be a useful conceptual equivalency for some, and I do see the > similarity. > > Perhaps it's a case of coder's view (no code doing it that way, it's just > a coincidental oddity conditional on equal sizes), vs. sysadmin's view > (code or not, accidental or not, it's a reasonably accurate high-level > description of how it ends up working most of the time with equivalent > sized devices).) Well, it's an *accurate* observation. It's just not a particularly *useful* one. :) Hugo. -- Hugo Mills | I gave up smoking, drinking and sex once. It was the hugo@... carfax.org.uk | scariest 20 minutes of my life. http://carfax.org.uk/ | PGP: E2AB1DE4 |
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 2018-07-18 03:20, Duncan wrote: Goffredo Baroncelli posted on Wed, 18 Jul 2018 07:59:52 +0200 as excerpted: On 07/17/2018 11:12 PM, Duncan wrote: Goffredo Baroncelli posted on Mon, 16 Jul 2018 20:29:46 +0200 as excerpted: On 07/15/2018 04:37 PM, waxhead wrote: Striping and mirroring/pairing are orthogonal properties; mirror and parity are mutually exclusive. I can't agree. I don't know whether you meant that in the global sense, or purely in the btrfs context (which I suspect), but either way I can't agree. In the pure btrfs context, while striping and mirroring/pairing are orthogonal today, Hugo's whole point was that btrfs is theoretically flexible enough to allow both together and the feature may at some point be added, so it makes sense to have a layout notation format flexible enough to allow it as well. When I say orthogonal, It means that these can be combined: i.e. you can have - striping (RAID0) - parity (?) - striping + parity (e.g. RAID5/6) - mirroring (RAID1) - mirroring + striping (RAID10) However you can't have mirroring+parity; this means that a notation where both 'C' ( = number of copy) and 'P' ( = number of parities) is too verbose. Yes, you can have mirroring+parity, conceptually it's simply raid5/6 on top of mirroring or mirroring on top of raid5/6, much as raid10 is conceptually just raid0 on top of raid1, and raid01 is conceptually raid1 on top of raid0. While it's not possible today on (pure) btrfs (it's possible today with md/dm-raid or hardware-raid handling one layer), it's theoretically possible both for btrfs and in general, and it could be added to btrfs in the future, so a notation with the flexibility to allow parity and mirroring together does make sense, and having just that sort of flexibility is exactly why Hugo made the notation proposal he did. Tho a sensible use-case for mirroring+parity is a different question. 
I can see a case being made for it if one layer is hardware/firmware raid, but I'm not entirely sure what the use-case for pure-btrfs raid16 or 61 (or 15 or 51) might be, where pure mirroring or pure parity wouldn't arguably be a at least as good a match to the use-case. Perhaps one of the other experts in such things here might help with that. Question #2: historically RAID10 is requires 4 disks. However I am guessing if the stripe could be done on a different number of disks: What about RAID1+Striping on 3 (or 5 disks) ? The key of striping is that every 64k, the data are stored on a different disk As someone else pointed out, md/lvm-raid10 already work like this. What btrfs calls raid10 is somewhat different, but btrfs raid1 pretty much works this way except with huge (gig size) chunks. As implemented in BTRFS, raid1 doesn't have striping. The argument is that because there's only two copies, on multi-device btrfs raid1 with 4+ devices of equal size so chunk allocations tend to alternate device pairs, it's effectively striped at the macro level, with the 1 GiB device-level chunks effectively being huge individual device strips of 1 GiB. Actually, it also behaves like LVM and MD RAID10 for any number of devices greater than 2, though the exact placement may diverge because of BTRFS's concept of different chunk types. In LVM and MD RAID10, each block is stored as two copies, and what disks it ends up on is dependent on the block number modulo the number of disks (so, for 3 disks A, B, and C, block 0 is on A and B, block 1 is on C and A, and block 2 is on B and C, with subsequent blocks following the same pattern). In an idealized model of BTRFS with only one chunk type, you get exactly the same behavior (because BTRFS allocates chunks based on disk utilization, and prefers lower numbered disks to higher ones in the event of a tie). 
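Austin's placement description can be reproduced with a toy allocator. The assumptions here (equal-size devices, 1 GiB chunks, two devices with the most free space chosen per chunk, ties broken by lower device id) are a simplification of what btrfs actually does, but they are enough to show the md-raid10-like rotation emerging:

```python
# Toy model of btrfs raid1 chunk allocation on three equal-size devices.
free = {"A": 12, "B": 12, "C": 12}  # GiB free on each device

placements = []
for _ in range(6):
    # Pick the two devices with the most free space; break ties by id.
    order = sorted(free, key=lambda d: (-free[d], d))
    pair = order[:2]
    for d in pair:
        free[d] -= 1  # one 1 GiB chunk copy lands on each chosen device
    placements.append(tuple(pair))

print(placements)
# [('A', 'B'), ('C', 'A'), ('B', 'C'), ('A', 'B'), ('C', 'A'), ('B', 'C')]
```

The pairs rotate exactly as in the 3-disk md/lvm RAID10 example in the text: block 0 on A and B, block 1 on C and A, block 2 on B and C.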
At 1 GiB strip size it doesn't have the typical performance advantage of striping, but conceptually, it's equivalent to raid10 with huge 1 GiB strips/chunks.
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 2018-07-18 04:39, Duncan wrote: Duncan posted on Wed, 18 Jul 2018 07:20:09 + as excerpted: As implemented in BTRFS, raid1 doesn't have striping. The argument is that because there's only two copies, on multi-device btrfs raid1 with 4+ devices of equal size so chunk allocations tend to alternate device pairs, it's effectively striped at the macro level, with the 1 GiB device-level chunks effectively being huge individual device strips of 1 GiB. At 1 GiB strip size it doesn't have the typical performance advantage of striping, but conceptually, it's equivalent to raid10 with huge 1 GiB strips/chunks. I forgot this bit... Similarly, multi-device single is regarded by some to be conceptually equivalent to raid0 with really huge GiB strips/chunks. (As you may note, "the argument is" and "regarded by some" are distancing phrases. I've seen the argument made on-list, but while I understand the argument and agree with it to some extent, I'm still a bit uncomfortable with it and don't normally make it myself, this thread being a noted exception tho originally I simply repeated what someone else already said in-thread, because I too agree it's stretching things a bit. But it does appear to be a useful conceptual equivalency for some, and I do see the similarity. If the file is larger than the data chunk size, it _is_ striped, because it spans multiple chunks which are on separate devices. Otherwise, it's more similar to what in GlusterFS is called a 'distributed volume'. In such a Gluster volume, each file is entirely stored on one node (or you have a complete copy on N nodes where N is the number of replicas), with the selection of what node is used for the next file created being based on which node has the most free space. That said, the main reason I explain single and raid1 the way I do is that I've found it's a much simpler way to explain generically how they work to people who already have storage background but may not care about the specifics. 
Perhaps it's a case of coder's view (no code doing it that way, it's just a coincidental oddity conditional on equal sizes), vs. sysadmin's view (code or not, accidental or not, it's a reasonably accurate high-level description of how it ends up working most of the time with equivalent sized devices).)
Re: [PATCH 0/4] 3- and 4- copy RAID1
Duncan posted on Wed, 18 Jul 2018 07:20:09 + as excerpted: >> As implemented in BTRFS, raid1 doesn't have striping. > > The argument is that because there's only two copies, on multi-device > btrfs raid1 with 4+ devices of equal size so chunk allocations tend to > alternate device pairs, it's effectively striped at the macro level, > with the 1 GiB device-level chunks effectively being huge individual > device strips of 1 GiB. > > At 1 GiB strip size it doesn't have the typical performance advantage of > striping, but conceptually, it's equivalent to raid10 with huge 1 GiB > strips/chunks. I forgot this bit... Similarly, multi-device single is regarded by some to be conceptually equivalent to raid0 with really huge GiB strips/chunks. (As you may note, "the argument is" and "regarded by some" are distancing phrases. I've seen the argument made on-list, but while I understand the argument and agree with it to some extent, I'm still a bit uncomfortable with it and don't normally make it myself, this thread being a noted exception tho originally I simply repeated what someone else already said in-thread, because I too agree it's stretching things a bit. But it does appear to be a useful conceptual equivalency for some, and I do see the similarity. Perhaps it's a case of coder's view (no code doing it that way, it's just a coincidental oddity conditional on equal sizes), vs. sysadmin's view (code or not, accidental or not, it's a reasonably accurate high-level description of how it ends up working most of the time with equivalent sized devices).) -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/4] 3- and 4- copy RAID1
Goffredo Baroncelli posted on Wed, 18 Jul 2018 07:59:52 +0200 as excerpted: > On 07/17/2018 11:12 PM, Duncan wrote: >> Goffredo Baroncelli posted on Mon, 16 Jul 2018 20:29:46 +0200 as >> excerpted: >> >>> On 07/15/2018 04:37 PM, waxhead wrote: >> >>> Striping and mirroring/pairing are orthogonal properties; mirror and >>> parity are mutually exclusive. >> >> I can't agree. I don't know whether you meant that in the global >> sense, >> or purely in the btrfs context (which I suspect), but either way I >> can't agree. >> >> In the pure btrfs context, while striping and mirroring/pairing are >> orthogonal today, Hugo's whole point was that btrfs is theoretically >> flexible enough to allow both together and the feature may at some >> point be added, so it makes sense to have a layout notation format >> flexible enough to allow it as well. > > When I say orthogonal, It means that these can be combined: i.e. you can > have - striping (RAID0) > - parity (?) > - striping + parity (e.g. RAID5/6) > - mirroring (RAID1) > - mirroring + striping (RAID10) > > However you can't have mirroring+parity; this means that a notation > where both 'C' ( = number of copy) and 'P' ( = number of parities) is > too verbose. Yes, you can have mirroring+parity, conceptually it's simply raid5/6 on top of mirroring or mirroring on top of raid5/6, much as raid10 is conceptually just raid0 on top of raid1, and raid01 is conceptually raid1 on top of raid0. While it's not possible today on (pure) btrfs (it's possible today with md/dm-raid or hardware-raid handling one layer), it's theoretically possible both for btrfs and in general, and it could be added to btrfs in the future, so a notation with the flexibility to allow parity and mirroring together does make sense, and having just that sort of flexibility is exactly why Hugo made the notation proposal he did. Tho a sensible use-case for mirroring+parity is a different question. 
I can see a case being made for it if one layer is hardware/firmware raid, but I'm not entirely sure what the use-case for pure-btrfs raid16 or 61 (or 15 or 51) might be, where pure mirroring or pure parity wouldn't arguably be a at least as good a match to the use-case. Perhaps one of the other experts in such things here might help with that. >>> Question #2: historically RAID10 is requires 4 disks. However I am >>> guessing if the stripe could be done on a different number of disks: >>> What about RAID1+Striping on 3 (or 5 disks) ? The key of striping is >>> that every 64k, the data are stored on a different disk >> >> As someone else pointed out, md/lvm-raid10 already work like this. >> What btrfs calls raid10 is somewhat different, but btrfs raid1 pretty >> much works this way except with huge (gig size) chunks. > > As implemented in BTRFS, raid1 doesn't have striping. The argument is that because there's only two copies, on multi-device btrfs raid1 with 4+ devices of equal size so chunk allocations tend to alternate device pairs, it's effectively striped at the macro level, with the 1 GiB device-level chunks effectively being huge individual device strips of 1 GiB. At 1 GiB strip size it doesn't have the typical performance advantage of striping, but conceptually, it's equivalent to raid10 with huge 1 GiB strips/chunks. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 07/17/2018 11:12 PM, Duncan wrote:
> Goffredo Baroncelli posted on Mon, 16 Jul 2018 20:29:46 +0200 as
> excerpted:
>
>> On 07/15/2018 04:37 PM, waxhead wrote:
>
>> Striping and mirroring/pairing are orthogonal properties; mirror and
>> parity are mutually exclusive.
>
> I can't agree. I don't know whether you meant that in the global sense,
> or purely in the btrfs context (which I suspect), but either way I can't
> agree.
>
> In the pure btrfs context, while striping and mirroring/pairing are
> orthogonal today, Hugo's whole point was that btrfs is theoretically
> flexible enough to allow both together and the feature may at some point
> be added, so it makes sense to have a layout notation format flexible
> enough to allow it as well.

When I say orthogonal, I mean that these can be combined, i.e. you can have:
- striping (RAID0)
- parity (?)
- striping + parity (e.g. RAID5/6)
- mirroring (RAID1)
- mirroring + striping (RAID10)

However you can't have mirroring+parity; this means that a notation carrying both 'C' (= number of copies) and 'P' (= number of parities) is too verbose.

[...]

>> Question #2: historically RAID10 requires 4 disks. However I am
>> guessing the striping could be done on a different number of disks:
>> what about RAID1+striping on 3 (or 5) disks? The key of striping is
>> that every 64k, the data are stored on a different disk.
>
> As someone else pointed out, md/lvm-raid10 already work like this. What
> btrfs calls raid10 is somewhat different, but btrfs raid1 pretty much
> works this way except with huge (gig size) chunks.

As implemented in BTRFS, raid1 doesn't have striping.

-- gpg @keyserver.linux.it: Goffredo Baroncelli Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
Re: [PATCH 0/4] 3- and 4- copy RAID1
Goffredo Baroncelli posted on Mon, 16 Jul 2018 20:29:46 +0200 as excerpted: > On 07/15/2018 04:37 PM, waxhead wrote: > Striping and mirroring/pairing are orthogonal properties; mirror and > parity are mutually exclusive. I can't agree. I don't know whether you meant that in the global sense, or purely in the btrfs context (which I suspect), but either way I can't agree. In the pure btrfs context, while striping and mirroring/pairing are orthogonal today, Hugo's whole point was that btrfs is theoretically flexible enough to allow both together and the feature may at some point be added, so it makes sense to have a layout notation format flexible enough to allow it as well. In the global context, just to complete things and mostly for others reading as I feel a bit like a simpleton explaining to the expert here, just as raid10 is shorthand for raid1+0, aka raid0 layered on top of raid1 (normally preferred to raid01 due to rebuild characteristics, and as opposed to raid01, aka raid0+1, aka raid1 on top of raid0, sometimes recommended as btrfs raid1 on top of whatever raid0 here due to btrfs' data integrity characteristics and less optimized performance), so there's also raid51 and raid15, raid61 and raid16, etc, with or without the + symbols, involving mirroring and parity conceptually at two different levels altho they can be combined in a single implementation just as raid10 and raid01 commonly are. These additional layered-raid levels can be used for higher reliability, with differing rebuild and performance characteristics between the two forms depending on which is the top layer. > Question #1: for "parity" profiles, does make sense to limit the maximum > disks number where the data may be spread ? If the answer is not, we > could omit the last S. IMHO it should. 
As someone else already replied, btrfs doesn't currently have the ability to specify a spread limit, but the idea, if we're going to change the notation, is to allow for that flexibility in the new notation so the feature can be added later without further notation changes. Why might it make sense to specify spread? At least two possible reasons: a) (stealing an already posted example) Consider a multi-device layout with two or more device sizes. Someone may want to limit the spread in order to keep performance and risk consistent as the smaller devices fill up, limiting further usage to a lower number of devices. If that lower number is specified as the spread originally, it'll make things more consistent between the room-on-all-devices case and the room-on-only-some-devices case. b) Limiting spread can change the risk and rebuild performance profiles. Stripes of full width mean all stripes have a strip on each device, so knock a device out and (assuming parity or mirroring) replace it, and all stripes are degraded and must be rebuilt. With less than maximum spread, some stripes won't be striped onto the replaced device, and won't be degraded or need rebuilding, tho assuming the same overall fill, a larger percentage of the stripes that /do/ need rebuilding will be on the replaced device. So the risk profile is more "objects" (stripes/chunks/files) affected but less of each object, or less of the total affected, but more of each affected object. > Question #2: historically RAID10 requires 4 disks. However I am > guessing the striping could be done on a different number of disks: > what about RAID1+striping on 3 (or 5) disks? The key of striping is > that every 64k, the data are stored on a different disk As someone else pointed out, md/lvm-raid10 already work like this. What btrfs calls raid10 is somewhat different, but btrfs raid1 pretty much works this way except with huge (gig size) chunks. -- Duncan - List replies preferred. No HTML msgs. 
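Reason (b) can be quantified with a quick simulation. This is a toy model that places each stripe on a uniformly random subset of devices, which is not how btrfs would actually place stripes, but it shows the full-width vs. limited-spread difference:

```python
import random

def degraded_fraction(n_disks, spread, n_stripes, failed=0, seed=0):
    """Fraction of stripes that include a given (failed) disk, for stripes
    of width `spread` placed uniformly at random over `n_disks` devices."""
    rng = random.Random(seed)
    hit = sum(failed in rng.sample(range(n_disks), spread)
              for _ in range(n_stripes))
    return hit / n_stripes

# Full-width stripes: every stripe touches every disk, so replacing one
# disk degrades all stripes and forces a full rebuild.
print(degraded_fraction(10, 10, 10_000))  # 1.0

# Spread limited to 4 of 10 disks: only ~40% of stripes touch the replaced
# disk, so ~60% need no rebuilding at all.
print(degraded_fraction(10, 4, 10_000))   # ~0.4
```

The expected fraction is simply spread/n_disks under this model, which matches Duncan's "less of the total affected, but more of each affected object" trade-off.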
"Every nonfree program has a lord, a master -- and if you use the program, he is your master." -- Richard Stallman
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
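Duncan's point (b) above can be put in rough numbers. A minimal sketch, assuming chunks are placed uniformly across the array (a simplification of the real allocator, and purely illustrative):

```python
def rebuild_profile(n_devices, spread):
    """For one failed device: what fraction of chunks are degraded, and
    what share of each degraded chunk lived on that device?
    Assumes chunks land uniformly across all n_devices."""
    affected = spread / n_devices   # chunks touching the failed device
    share = 1.0 / spread            # portion of each such chunk on it
    return affected, share

# Full-width stripes on 10 devices: every chunk degraded, 10% of each.
assert rebuild_profile(10, 10) == (1.0, 0.1)
# Spread capped at 4: only 40% of chunks degraded, but 25% of each.
assert rebuild_profile(10, 4) == (0.4, 0.25)
```

Note that the total data to rebuild per failed device (affected x share = 1/n_devices of the fill) is the same either way; the spread limit only trades "many chunks, a little each" against "fewer chunks, more of each", exactly the trade-off described above.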
Re: [PATCH 0/4] 3- and 4- copy RAID1
waxhead wrote:
> David Sterba wrote:
>> An interesting question is the naming of the extended profiles. I
>> picked something that can be easily understood but it's not a final
>> proposal. Years ago, Hugo proposed a naming scheme that described the
>> non-standard raid varieties of the btrfs flavor:
>>
>> https://marc.info/?l=linux-btrfs=136286324417767
>>
>> Switching to this naming would be a good addition to the extended raid.
>
> As just a humble BTRFS user I agree, and really think it is about time
> to move far away from the RAID terminology. However, adding some more
> descriptive profile names (or at least some aliases) would be much
> better for the commoners (such as myself).

...snip...

> Which would make the above table look like so:
>
> Old format / My Format / My suggested alias
> SINGLE     / R0.S0.P0  / SINGLE
> DUP        / R1.S1.P0  / DUP (or even MIRRORLOCAL1)
> RAID0      / R0.Sm.P0  / STRIPE
> RAID1      / R1.S0.P0  / MIRROR1
> RAID1c3    / R2.S0.P0  / MIRROR2
> RAID1c4    / R3.S0.P0  / MIRROR3
> RAID10     / R1.Sm.P0  / STRIPE.MIRROR1
> RAID5      / R1.Sm.P1  / STRIPE.PARITY1
> RAID6      / R1.Sm.P2  / STRIPE.PARITY2
>
> And I think this is much more readable, but others may disagree. And as
> a side note... from a (hobby) coder's perspective this is probably
> simpler to parse as well.

...snap...

...and before someone else points out that my suggestion has an ugly flaw: I got a bit copy/paste happy and messed up the RAID5- and RAID6-like profiles.
The table below is corrected, and hopefully it makes the point that using the word 'replicas' is easier to understand than 'copies', even if I messed it up :)

Old format / My Format / My suggested alias
SINGLE     / R0.S0.P0  / SINGLE
DUP        / R1.S1.P0  / DUP (or even MIRRORLOCAL1)
RAID0      / R0.Sm.P0  / STRIPE
RAID1      / R1.S0.P0  / MIRROR1
RAID1c3    / R2.S0.P0  / MIRROR2
RAID1c4    / R3.S0.P0  / MIRROR3
RAID10     / R1.Sm.P0  / STRIPE.MIRROR1
RAID5      / R0.Sm.P1  / STRIPE.PARITY1
RAID6      / R0.Sm.P2  / STRIPE.PARITY2
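The dotted form really is easy to parse mechanically, as claimed. A quick sketch of a parser for this (hypothetical) notation -- nothing in btrfs-progs accepts it today:

```python
import re

def parse_profile(profile):
    """Parse the proposed REPLICASnum.STRIPESnum.PARITYnum short form,
    e.g. 'R0.Sm.P2' for a RAID6-like layout.  'Sm' means 'stripe
    across the maximum number of devices'."""
    m = re.fullmatch(r"R(\d+)\.S(\d+|m)\.P(\d+)", profile)
    if m is None:
        raise ValueError(f"not a valid profile: {profile!r}")
    r, s, p = m.groups()
    return {"replicas": int(r),
            "stripes": "max" if s == "m" else int(s),
            "parity": int(p)}

# RAID6 from the corrected table: no extra replicas, max stripes, 2 parity.
assert parse_profile("R0.Sm.P2") == {"replicas": 0, "stripes": "max", "parity": 2}
```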
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 2018-07-16 14:29, Goffredo Baroncelli wrote:
> On 07/15/2018 04:37 PM, waxhead wrote:
>> David Sterba wrote:
>>> An interesting question is the naming of the extended profiles. I
>>> picked something that can be easily understood but it's not a final
>>> proposal. Years ago, Hugo proposed a naming scheme that described the
>>> non-standard raid varieties of the btrfs flavor:
>>>
>>> https://marc.info/?l=linux-btrfs=136286324417767
>>>
>>> Switching to this naming would be a good addition to the extended raid.
>>
>> As just a humble BTRFS user I agree and really think it is about time
>> to move far away from the RAID terminology. However adding some more
>> descriptive profile names (or at least some aliases) would be much
>> better for the commoners (such as myself).
>>
>> For example:
>>
>> Old format / New Format / My suggested alias
>> SINGLE     / 1C         / SINGLE
>> DUP        / 2CD        / DUP (or even MIRRORLOCAL1)
>> RAID0      / 1CmS       / STRIPE
>> RAID1      / 2C         / MIRROR1
>> RAID1c3    / 3C         / MIRROR2
>> RAID1c4    / 4C         / MIRROR3
>> RAID10     / 2CmS       / STRIPE.MIRROR1
>
> Striping and mirroring/pairing are orthogonal properties; mirror and
> parity are mutually exclusive. What about:
>
> RAID1            -> MIRROR1
> RAID10           -> MIRROR1S
> RAID1c3          -> MIRROR2
> RAID1c3+striping -> MIRROR2S
>
> and so on...
>
>> RAID5      / 1CmS1P     / STRIPE.PARITY1
>> RAID6      / 1CmS2P     / STRIPE.PARITY2
>
> To me these should be called something like:
>
> RAID5 -> PARITY1S
> RAID6 -> PARITY2S
>
> The final S is due to the fact that usually RAID5/6 spread the data on
> all available disks.
>
> Question #1: for "parity" profiles, does it make sense to limit the
> maximum number of disks the data may be spread over? If the answer is
> no, we could omit the last S. IMHO it should.

Currently, there is no ability to cap the number of disks that striping can happen across. Ideally that will change in the future, in which case not only the S will be needed, but also a number indicating how wide the stripe is.

> Question #2: historically, RAID10 requires 4 disks.
> However, I am wondering if the stripe could be done on a different
> number of disks: what about RAID1+striping on 3 (or 5) disks? The key
> of striping is that every 64k, the data are stored on a different disk.

This is what MD and LVM RAID10 do. They work somewhat differently from what BTRFS calls raid10 (actually, what we currently call raid1 works almost identically to MD and LVM RAID10 when more than 3 disks are involved, except that the chunk size is 1G or larger). Short of drastic internal changes to how that profile works, this isn't likely to happen.

In spite of both of these points, there is a practical need for indicating the stripe width. Depending on the configuration of the underlying storage, it's entirely possible (and sometimes even certain) that you will see chunks with differing stripe widths, so properly reporting the stripe width (in devices, not bytes) is useful for monitoring purposes. Consider for example a 6-device array using what's currently called the raid10 profile, where 2 of the disks are smaller than the other 4. On such an array, chunks will span all six disks (resulting in 2 copies, each striped across 3 disks) until those two smaller disks are full, at which point new chunks will span only the remaining four disks (resulting in 2 copies, each striped across 2 disks).
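The six-device example above can be simulated with a toy allocator. This is only a sketch under simplified assumptions -- the real chunk allocator in fs/btrfs/volumes.c is considerably more involved -- but it shows how the stripe width drops as the small devices fill:

```python
def allocate_chunks(free_space, copies=2, chunk_size=1):
    """Greedily allocate fixed-size chunk stripes, always spanning every
    device that still has room (a rough stand-in for the btrfs chunk
    allocator's preference for the devices with the most free space).
    Returns the stripe width (device count) of each chunk allocated."""
    free = list(free_space)
    widths = []
    while True:
        usable = [i for i, f in enumerate(free) if f >= chunk_size]
        if len(usable) < copies:   # need at least one device per copy
            break
        widths.append(len(usable))
        for i in usable:           # each chunk consumes space on every
            free[i] -= chunk_size  # device it spans
    return widths

# Six devices, two of them smaller: width drops from 6 to 4 as they fill.
assert allocate_chunks([4, 4, 4, 4, 2, 2]) == [6, 6, 4, 4]
```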
Re: [PATCH 0/4] 3- and 4- copy RAID1
On 07/15/2018 04:37 PM, waxhead wrote:
> David Sterba wrote:
>> An interesting question is the naming of the extended profiles. I
>> picked something that can be easily understood but it's not a final
>> proposal. Years ago, Hugo proposed a naming scheme that described the
>> non-standard raid varieties of the btrfs flavor:
>>
>> https://marc.info/?l=linux-btrfs=136286324417767
>>
>> Switching to this naming would be a good addition to the extended raid.
>
> As just a humble BTRFS user I agree and really think it is about time
> to move far away from the RAID terminology. However adding some more
> descriptive profile names (or at least some aliases) would be much
> better for the commoners (such as myself).
>
> For example:
>
> Old format / New Format / My suggested alias
> SINGLE     / 1C         / SINGLE
> DUP        / 2CD        / DUP (or even MIRRORLOCAL1)
> RAID0      / 1CmS       / STRIPE
> RAID1      / 2C         / MIRROR1
> RAID1c3    / 3C         / MIRROR2
> RAID1c4    / 4C         / MIRROR3
> RAID10     / 2CmS       / STRIPE.MIRROR1

Striping and mirroring/pairing are orthogonal properties; mirror and parity are mutually exclusive. What about:

RAID1            -> MIRROR1
RAID10           -> MIRROR1S
RAID1c3          -> MIRROR2
RAID1c3+striping -> MIRROR2S

and so on...

> RAID5      / 1CmS1P     / STRIPE.PARITY1
> RAID6      / 1CmS2P     / STRIPE.PARITY2

To me these should be called something like:

RAID5 -> PARITY1S
RAID6 -> PARITY2S

The final S is due to the fact that usually RAID5/6 spread the data on all available disks.

Question #1: for "parity" profiles, does it make sense to limit the maximum number of disks the data may be spread over? If the answer is no, we could omit the last S. IMHO it should.

Question #2: historically, RAID10 requires 4 disks.
However, I am wondering if the stripe could be done on a different number of disks: what about RAID1+striping on 3 (or 5) disks? The key of striping is that every 64k, the data are stored on a different disk.

> I find that writing something like "btrfs balance start
> -dconvert=stripe5.parity2 /mnt" is far less confusing and therefore
> less error prone than writing "-dconvert=1C5S2P".
>
> While Hugo's suggestion is compact and to the point, I would call for
> expanding it so it is a bit more descriptive and human readable.
>
> So for example: STRIPE<n>, where <n> is obviously the same as Hugo
> proposed - the number of storage devices for the stripe - and no <n>
> would be best to mean 'use max devices'. For PARITY, <n> is obviously
> required.
>
> Keep in mind that most people (...and I am willing to bet even Duncan,
> who probably HAS backups ;) ) get a bit stressed when their storage
> system is degraded. With that in mind I hope for more elaborate,
> descriptive and human readable profile names to be used, to avoid
> making mistakes using the "compact" layout.
>
> ...and yes, of course this could go both ways. A more compact (and dare
> I say cryptic) variant can cause people to stop and think before doing
> something, and thus avoid errors.
>
> Now that I made my point I can't help being a bit extra harsh,
> obnoxious and possibly difficult, so I would also suggest that Hugo's
> format could have been changed (dare I say improved?) from
>
> numCOPIESnumSTRIPESnumPARITY
>
> to:
>
> REPLICASnum.STRIPESnum.PARITYnum
>
> Which would make the above table look like so:
>
> Old format / My Format / My suggested alias
> SINGLE     / R0.S0.P0  / SINGLE
> DUP        / R1.S1.P0  / DUP (or even MIRRORLOCAL1)
> RAID0      / R0.Sm.P0  / STRIPE
> RAID1      / R1.S0.P0  / MIRROR1
> RAID1c3    / R2.S0.P0  / MIRROR2
> RAID1c4    / R3.S0.P0  / MIRROR3
> RAID10     / R1.Sm.P0  / STRIPE.MIRROR1
> RAID5      / R1.Sm.P1  / STRIPE.PARITY1
> RAID6      / R1.Sm.P2  / STRIPE.PARITY2
>
> And I think this is much more readable, but others may disagree. And as
> a side note... from a (hobby) coder's perspective this is probably
> simpler to parse as well.
--
gpg @keyserver.linux.it: Goffredo Baroncelli
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
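The "every 64k on a different disk" rule above is plain round-robin striping, and nothing in the arithmetic requires 4 disks. A minimal sketch of the address mapping (ignoring mirroring and parity), with the disk count as a free parameter:

```python
STRIPE_LEN = 64 * 1024  # 64 KiB strip size, as in the example above

def strip_location(offset, n_disks):
    """Map a logical byte offset to (disk index, offset within that
    disk) for a plain round-robin stripe over n_disks devices."""
    strip = offset // STRIPE_LEN
    disk = strip % n_disks
    disk_offset = (strip // n_disks) * STRIPE_LEN + offset % STRIPE_LEN
    return disk, disk_offset

# Three disks: consecutive 64 KiB strips land on disks 0, 1, 2, 0, ...
assert strip_location(0, 3) == (0, 0)
assert strip_location(64 * 1024, 3) == (1, 0)
assert strip_location(3 * 64 * 1024, 3) == (0, 64 * 1024)
```

The same mapping works unchanged for 3, 5, or any other number of disks, which is exactly the point of Question #2.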
Re: [PATCH 0/4] 3- and 4- copy RAID1
On Fri, Jul 13, 2018 at 08:46:28PM +0200, David Sterba wrote:
[snip]
> An interesting question is the naming of the extended profiles. I
> picked something that can be easily understood but it's not a final
> proposal. Years ago, Hugo proposed a naming scheme that described the
> non-standard raid varieties of the btrfs flavor:
>
> https://marc.info/?l=linux-btrfs=136286324417767
>
> Switching to this naming would be a good addition to the extended raid.

I'd suggest using lower-case letters for the c, s, p, rather than upper, as it makes them much easier to read. The upper-case version tends to make the letters and numbers merge into each other. With lower-case c, s, p, the taller digits (or M) stand out:

1c
1cMs2p
2c3s8p (OK, just kidding about this one)

Hugo.

--
Hugo Mills            | The English language has the mot juste for every
hugo@... carfax.org.uk | occasion.
http://carfax.org.uk/ | PGP: E2AB1DE4 |
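The readability difference is easy to see when both spellings are rendered side by side. A small sketch of a formatter for this (hypothetical) compact notation -- the function and its parameters are mine, not part of any btrfs tooling:

```python
def format_profile(copies, stripes=None, parity=0, lower=True):
    """Render Hugo-style compact notation: copies, then optional stripe
    count ('M' meaning max), then optional parity count.  lower=True
    gives the lower-case form suggested above."""
    c, s, p = ("c", "s", "p") if lower else ("C", "S", "P")
    out = f"{copies}{c}"
    if stripes is not None:
        out += f"{stripes}{s}"
    if parity:
        out += f"{parity}{p}"
    return out

assert format_profile(1) == "1c"
assert format_profile(1, "M", 2) == "1cMs2p"
assert format_profile(1, "M", 2, lower=False) == "1CMS2P"  # harder to scan
```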
Re: [PATCH 0/4] 3- and 4- copy RAID1
David Sterba wrote:
> An interesting question is the naming of the extended profiles. I
> picked something that can be easily understood but it's not a final
> proposal. Years ago, Hugo proposed a naming scheme that described the
> non-standard raid varieties of the btrfs flavor:
>
> https://marc.info/?l=linux-btrfs=136286324417767
>
> Switching to this naming would be a good addition to the extended raid.

As just a humble BTRFS user I agree, and really think it is about time to move far away from the RAID terminology. However, adding some more descriptive profile names (or at least some aliases) would be much better for the commoners (such as myself).

For example:

Old format / New Format / My suggested alias
SINGLE     / 1C         / SINGLE
DUP        / 2CD        / DUP (or even MIRRORLOCAL1)
RAID0      / 1CmS       / STRIPE
RAID1      / 2C         / MIRROR1
RAID1c3    / 3C         / MIRROR2
RAID1c4    / 4C         / MIRROR3
RAID10     / 2CmS       / STRIPE.MIRROR1
RAID5      / 1CmS1P     / STRIPE.PARITY1
RAID6      / 1CmS2P     / STRIPE.PARITY2

I find that writing something like "btrfs balance start -dconvert=stripe5.parity2 /mnt" is far less confusing and therefore less error prone than writing "-dconvert=1C5S2P".

While Hugo's suggestion is compact and to the point, I would call for expanding it so it is a bit more descriptive and human readable.

So for example: STRIPE<n>, where <n> is obviously the same as Hugo proposed - the number of storage devices for the stripe - and no <n> would be best to mean 'use max devices'. For PARITY, <n> is obviously required.

Keep in mind that most people (...and I am willing to bet even Duncan, who probably HAS backups ;) ) get a bit stressed when their storage system is degraded. With that in mind I hope for more elaborate, descriptive and human readable profile names to be used, to avoid making mistakes using the "compact" layout.

...and yes, of course this could go both ways.
A more compact (and dare I say cryptic) variant can cause people to stop and think before doing something, and thus avoid errors.

Now that I made my point I can't help being a bit extra harsh, obnoxious and possibly difficult, so I would also suggest that Hugo's format could have been changed (dare I say improved?) from

numCOPIESnumSTRIPESnumPARITY

to:

REPLICASnum.STRIPESnum.PARITYnum

Which would make the above table look like so:

Old format / My Format / My suggested alias
SINGLE     / R0.S0.P0  / SINGLE
DUP        / R1.S1.P0  / DUP (or even MIRRORLOCAL1)
RAID0      / R0.Sm.P0  / STRIPE
RAID1      / R1.S0.P0  / MIRROR1
RAID1c3    / R2.S0.P0  / MIRROR2
RAID1c4    / R3.S0.P0  / MIRROR3
RAID10     / R1.Sm.P0  / STRIPE.MIRROR1
RAID5      / R1.Sm.P1  / STRIPE.PARITY1
RAID6      / R1.Sm.P2  / STRIPE.PARITY2

And I think this is much more readable, but others may disagree. And as a side note... from a (hobby) coder's perspective this is probably simpler to parse as well.
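The "stripe5.parity2" balance example implies a simple mechanical mapping from the descriptive aliases back to the compact form. A sketch, with hypothetical syntax on both sides -- today's btrfs-progs accepts neither spelling:

```python
import re

def alias_to_compact(alias):
    """Expand a descriptive alias such as 'stripe5.parity2' into the
    compact nCnSnP spelling it is contrasted with above.  A bare
    'stripe' (no count) means 'stripe across the maximum devices'."""
    copies, stripes, parity = 1, "", ""
    for part in alias.lower().split("."):
        m = re.fullmatch(r"(mirror|stripe|parity)(\d*)", part)
        if m is None:
            raise ValueError(f"unknown component: {part!r}")
        kind, n = m.groups()
        if kind == "mirror":
            copies = int(n) + 1        # MIRROR1 = two copies (RAID1-like)
        elif kind == "stripe":
            stripes = f"{n or 'm'}S"
        else:
            parity = f"{n}P"
    return f"{copies}C{stripes}{parity}"

assert alias_to_compact("stripe5.parity2") == "1C5S2P"
assert alias_to_compact("stripe.mirror1") == "2CmS"   # RAID10-like
```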
[PATCH 0/4] 3- and 4- copy RAID1
Hi,

I have some goodies here that go into the RAID56 problem: although they don't implement all the remaining features, they can be useful independently.

This time my hackweek project

https://hackweek.suse.com/17/projects/do-something-about-btrfs-and-raid56

aimed to implement the fix for the write hole problem, but I spent more time on analysis and design of the solution and don't have a working prototype for that yet.

This patchset brings a feature that will be used by the raid56 log: the log has to be at the same redundancy level, and thus we need 3-copy replication for raid6. As it was easy to extend to higher replication, I've added 4-copy replication, which would allow a triple-copy raid (which does not have a standardized name). The number of copies is fixed, so it's not N-copy for an arbitrary N; that would complicate the implementation too much, though I'd be willing to add 5-copy replication for a small bribe.

The new raid profiles are covered by an incompatibility bit, called extended_raid; the (idealistic) plan is to stuff as many new raid-related features under it as possible. Patch 4/4 mentions the 3- and 4-copy raid1, configurable stripe length, the write hole log, and triple parity. If the plan turns out to be too ambitious, the ready and implemented features will be split out and merged.

An interesting question is the naming of the extended profiles. I picked something that can be easily understood, but it's not a final proposal. Years ago, Hugo proposed a naming scheme that described the non-standard raid varieties of the btrfs flavor:

https://marc.info/?l=linux-btrfs=136286324417767

Switching to this naming would be a good addition to the extended raid.

Regarding the missing raid56 features, I'll continue working on them as time permits in the following weeks/months, as I'm not aware of anybody working on that actively enough, so to speak.
Anyway, git branches with the patches:

kernel: git://github.com/kdave/btrfs-devel dev/extended-raid-ncopies
progs:  git://github.com/kdave/btrfs-progs dev/extended-raid-ncopies

David Sterba (4):
  btrfs: refactor block group replication factor calculation to a helper
  btrfs: add support for 3-copy replication (raid1c3)
  btrfs: add support for 4-copy replication (raid1c4)
  btrfs: add incompatibility bit for extended raid features

 fs/btrfs/ctree.h                |  1 +
 fs/btrfs/extent-tree.c          | 45 +++---
 fs/btrfs/relocation.c           |  1 +
 fs/btrfs/scrub.c                |  4 +-
 fs/btrfs/super.c                | 17 +++
 fs/btrfs/sysfs.c                |  2 +
 fs/btrfs/volumes.c              | 84 ++---
 fs/btrfs/volumes.h              |  6 +++
 include/uapi/linux/btrfs.h      | 12 -
 include/uapi/linux/btrfs_tree.h |  6 +++
 10 files changed, 134 insertions(+), 44 deletions(-)

--
2.18.0
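The redundancy requirement that motivates this series -- the raid56 log must survive the same number of device failures as the parity it protects -- reduces to a one-liner. A sketch; the function name and shape are mine, not from the patches:

```python
def log_copies_needed(parity_strips):
    """A write-hole log must tolerate as many device failures as the
    parity it protects, so it needs parity_strips + 1 copies."""
    return parity_strips + 1

assert log_copies_needed(1) == 2  # raid5: plain 2-copy raid1 suffices
assert log_copies_needed(2) == 3  # raid6: needs the new 3-copy raid1
```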