On 2017-08-23 11:28, Chris Murphy wrote:
On Wed, Aug 2, 2017 at 2:27 PM, Liu Bo <bo.li....@oracle.com> wrote:
On Wed, Aug 02, 2017 at 10:41:30PM +0200, Goffredo Baroncelli wrote:
What I want to understand is whether it is possible to log only the
"partial stripe" RMW cycle.
I think your point is valid if all data is written with datacow. In
the case of nodatacow, btrfs overwrites in place, so a full stripe
write may pollute on-disk data after an unclean shutdown. Checksums can
detect the errors, but repair through raid5 may not recover the correct data.
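
To make the failure mode above concrete, here is a minimal sketch of a torn
in-place write on a single-parity stripe. It is not btrfs code: the two-data-
plus-parity layout, the 4-byte blocks, and the names xor_blocks and
simulate_torn_overwrite are assumptions made purely for illustration.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte (single-parity, raid5-style)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def simulate_torn_overwrite():
    """Overwrite one data block in place, 'crash' before parity is updated,
    then rebuild the other data block from the stale parity."""
    d0_old = b"AAAA"
    d1 = b"BBBB"
    parity_old = xor_blocks(d0_old, d1)  # parity consistent with the old stripe

    # In-place overwrite (nodatacow-style): the new d0 reaches disk, but the
    # unclean shutdown happens before the matching parity is written.
    d0_new = b"CCCC"
    on_disk = {"d0": d0_new, "d1": d1, "parity": parity_old}

    # Later the device holding d1 is lost, so d1 is rebuilt from d0 and parity.
    d1_rebuilt = xor_blocks(on_disk["d0"], on_disk["parity"])
    return d1, d1_rebuilt


original, rebuilt = simulate_torn_overwrite()
print("original d1:", original)
print("rebuilt  d1:", rebuilt)
print("recovered correctly:", original == rebuilt)
```

The rebuilt block differs from the original even though d1 itself was never
written, which is why a checksum mismatch alone does not make the stripe
repairable.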
What's simpler? raid56 journal for everything (cow, nocow, data,
metadata), or to apply some limitations to available layouts?
- if raid56, then cow only (no such thing as nodatacow)
This will obviously be contentious for certain individuals.
- permit raid56 for data bg only, system and metadata can be raid1, or raid10
I'm hard pressed to think of a use case where metadata raid56 is
beneficial over raid10; a metadata-heavy workload is not well suited
to any parity raid. And if it isn't metadata heavy, then chances are
you don't even need raid10, and raid1 is sufficient.
Until BTRFS gets n-way replication, raid6 remains the only way to
configure a BTRFS volume to survive more than one device failure.
Of the more complicated ways to solve it:
- journal
- dynamically sized stripes, so that writes can always be full stripe
writes, no overwrites, and atomic
- mixed block groups where only sequential full stripe writes use a
raid56 block group; random and smaller writes go in a raid1 or raid10
block group.
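
The mixed block group idea in the last item can be sketched as a simple
routing decision. This is only an illustration under stated assumptions: the
Layout class, choose_block_group, and the 64 KiB per-device stripe element are
hypothetical and are not btrfs interfaces, and a real allocator would have to
weigh much more than offset and length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    data_devices: int    # devices holding data in one stripe (parity excluded)
    stripe_element: int  # bytes written per device per stripe (assumed value)

    @property
    def full_stripe(self) -> int:
        """Bytes of data covered by one full stripe."""
        return self.data_devices * self.stripe_element


def choose_block_group(offset: int, length: int, layout: Layout) -> str:
    """Return 'raid56' only for writes that cover whole stripes exactly;
    anything smaller or unaligned goes to a mirrored block group."""
    fs = layout.full_stripe
    if length > 0 and offset % fs == 0 and length % fs == 0:
        return "raid56"
    return "raid1"


layout = Layout(data_devices=3, stripe_element=64 * 1024)
print(choose_block_group(0, 3 * 64 * 1024, layout))  # aligned full stripe
print(choose_block_group(4096, 4096, layout))        # small random write
```

Under this split, the raid56 block group never sees a read-modify-write cycle,
so parity is always computed from data that is written in the same operation.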