Re: [PATCH v3] btrfs: zoned: move superblock logging zone location
On 2021/04/10 19:15, David Sterba wrote:
> From: Naohiro Aota
>
> Moves the location of the superblock logging zones. The new locations of
> the logging zones are now determined based on fixed block addresses
> instead of on fixed zone numbers.
>
> The old placement method based on fixed zone numbers causes problems when
> one needs to inspect a file system image without access to the drive zone
> information. In such case, the super block locations cannot be reliably
> determined as the zone size is unknown. By locating the superblock logging
> zones using fixed addresses, we can scan a dumped file system image without
> the zone information since a super block copy will always be present at or
> after the fixed known locations.
>
> Introduce the following three pairs of zones containing fixed offset
> locations, regardless of the device zone size.
>
>   - primary superblock: offset   0B (and the following zone)
>   - first copy:         offset 512G (and the following zone)
>   - second copy:        offset   4T (4096G, and the following zone)
>
> If a logging zone is outside of the disk capacity, we do not record the
> superblock copy.
>
> The first copy position is much larger than for a non-zoned filesystem,
> which is at 64M. This is to avoid overlapping with the log zones for
> the primary superblock. This higher location is arbitrary but allows
> supporting devices with very large zone sizes, plus some space around in
> between.
>
> Such large zone size is unrealistic and very unlikely to ever be seen in
> real devices. Currently, SMR disks have a zone size of 256MB, and we are
> expecting ZNS drives to be in the 1-4GB range, so this limit gives us
> room to breathe. For now, we only allow zone sizes up to 8GB. The
> maximum zone size that would still fit in the space is 256G.
>
> The fixed location addresses are somewhat arbitrary, with the intent of
> maintaining superblock reliability for smaller and larger devices, with
> the preference for the latter. For this reason, there are two superblocks
> under the first 1T. This should cover use cases for physical devices and
> for emulated/device-mapper devices.
>
> The superblock logging zones are reserved for superblock logging and
> never used for data or metadata blocks. Note that we only reserve the
> two zones per primary/copy actually used for superblock logging. We do
> not reserve the ranges of zones possibly containing superblocks with the
> largest supported zone size (0-16GB, 512G-528GB, 4096G-4112G).
>
> The zones containing the fixed location offsets used to store
> superblocks on a non-zoned volume are also reserved to avoid confusion.
>
> Signed-off-by: Naohiro Aota
> Signed-off-by: David Sterba
> ---
>
> For context see replies under
> https://lore.kernel.org/linux-btrfs/2f58edb74695825632c77349b000d31f16cb3226.1617870145.git.naohiro.a...@wdc.com/
>
>  fs/btrfs/zoned.c | 53 ++--
>  1 file changed, 42 insertions(+), 11 deletions(-)
>
> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
> index 1f972b75a9ab..eeb3ebe11d7a 100644
> --- a/fs/btrfs/zoned.c
> +++ b/fs/btrfs/zoned.c
> @@ -21,9 +21,30 @@
>  /* Pseudo write pointer value for conventional zone */
>  #define WP_CONVENTIONAL ((u64)-2)
>
> +/*
> + * Location of the first zone of superblock logging zone pairs.
> + *
> + * - primary superblock:    0B (zone 0)
> + * - first copy:          512G (zone starting at that offset)
> + * - second copy:           4T (zone starting at that offset)
> + */
> +#define BTRFS_SB_LOG_PRIMARY_OFFSET	(0ULL)
> +#define BTRFS_SB_LOG_FIRST_OFFSET	(512ULL * SZ_1G)
> +#define BTRFS_SB_LOG_SECOND_OFFSET	(4096ULL * SZ_1G)
> +
> +#define BTRFS_SB_LOG_FIRST_SHIFT	const_ilog2(BTRFS_SB_LOG_FIRST_OFFSET)
> +#define BTRFS_SB_LOG_SECOND_SHIFT	const_ilog2(BTRFS_SB_LOG_SECOND_OFFSET)
> +
>  /* Number of superblock log zones */
>  #define BTRFS_NR_SB_LOG_ZONES 2
>
> +/*
> + * Maximum supported zone size. Currently, SMR disks have a zone size of
> + * 256MiB, and we are expecting ZNS drives to be in the 1-4GiB range. We do not
> + * expect the zone size to become larger than 8GiB in the near future.
> + */
> +#define BTRFS_MAX_ZONE_SIZE		SZ_8G
> +
>  static int copy_zone_info_cb(struct blk_zone *zone, unsigned int idx, void *data)
>  {
>  	struct blk_zone *zones = data;
> @@ -111,23 +132,22 @@ static int sb_write_pointer(struct block_device *bdev, struct blk_zone *zones,
>  }
>
>  /*
> - * The following zones are reserved as the circular buffer on ZONED btrfs.
> - *  - The primary superblock: zones 0 and 1
> - *  - The first copy: zones 16 and 17
> - *  - The second copy: zones 1024 or zone at 256GB which is minimum, and
> - *                     the following one
> + * Get the first zone number of the superblock mirror
>   */
>  static inline u32 sb_zone_number(int shift, int mirror)
>  {
> -	ASSERT(mirror < BTRFS_SUPER_MIRROR_MAX);
> +	u64 zone;
>
> +	ASSERT
Re: [PATCH v3] btrfs: zoned: move superblock logging zone location
On Sat, Apr 10, 2021 at 12:12:23PM +0200, David Sterba wrote:
> From: Naohiro Aota
>
> Moves the location of the superblock logging zones. The new locations of
> the logging zones are now determined based on fixed block addresses
> instead of on fixed zone numbers.
[...]
Re: [PATCH v3] btrfs: zoned: move superblock logging zone location
On 10/04/2021 12:15, David Sterba wrote:
> From: Naohiro Aota
>
> Moves the location of the superblock logging zones. The new locations of
> the logging zones are now determined based on fixed block addresses
> instead of on fixed zone numbers.
[...]
> Signed-off-by: Naohiro Aota
> Signed-off-by: David Sterba

Looks good to me,
Thanks

Reviewed-by: Johannes Thumshirn
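
For anyone following the arithmetic in the commit message, here is a minimal user-space sketch of how a fixed byte offset maps to the first zone of each logging pair, and of the "outside of the disk capacity" rule. It is not the code from the patch: the helper names sb_first_zone() and sb_copy_fits() and the unprefixed macro names are made up for illustration, kernel types are replaced with stdint ones, and the zone size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed byte offsets taken from the patch description. */
#define SB_LOG_PRIMARY_OFFSET	0ULL
#define SB_LOG_FIRST_OFFSET	(512ULL << 30)	/* 512G */
#define SB_LOG_SECOND_OFFSET	(4096ULL << 30)	/* 4T */

/* Each superblock copy uses a pair of zones. */
#define NR_SB_LOG_ZONES		2

/* Zone size limit stated in the patch (8G). */
#define MAX_ZONE_SIZE		(8ULL << 30)

/*
 * Hypothetical helper: first zone number of superblock mirror 0, 1 or 2.
 * Assumes zone_size is a non-zero power of two, so every fixed offset
 * falls exactly on a zone boundary.
 */
static uint64_t sb_first_zone(uint64_t zone_size, int mirror)
{
	static const uint64_t offsets[] = {
		SB_LOG_PRIMARY_OFFSET,
		SB_LOG_FIRST_OFFSET,
		SB_LOG_SECOND_OFFSET,
	};

	assert(mirror >= 0 && mirror < 3);
	assert(zone_size != 0 && (zone_size & (zone_size - 1)) == 0);

	return offsets[mirror] / zone_size;
}

/*
 * Hypothetical helper: non-zero if both zones of the pair lie inside a
 * device with nr_zones zones. A copy that does not fit is not recorded.
 */
static int sb_copy_fits(uint64_t zone_size, uint64_t nr_zones, int mirror)
{
	return sb_first_zone(zone_size, mirror) + NR_SB_LOG_ZONES <= nr_zones;
}
```

With 256M zones the pairs start at zones 0, 2048 and 16384, so a 1T device (4096 zones) would keep the primary and the first copy and skip the second. With the 8G limit the first copy starts at zone 64, well clear of the primary pair in zones 0 and 1, and at 256G zones the first copy would start at zone 2, directly after the primary pair, which is where the "maximum zone size that would still fit" figure comes from.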