Re: [dm-devel] [RFC PATCH v2 3/3] dm zoned: add regular device info to metadata

2020-03-25 Thread Damien Le Moal
On 2020/03/25 19:00, Hannes Reinecke wrote:
> On 3/25/20 10:10 AM, Damien Le Moal wrote:
>> On 2020/03/25 17:52, Hannes Reinecke wrote:
>>> On 3/25/20 9:02 AM, Damien Le Moal wrote:
 On 2020/03/25 15:47, Hannes Reinecke wrote:
> On 3/25/20 7:29 AM, Damien Le Moal wrote:
>> On 2020/03/24 20:04, Bob Liu wrote:
>>> This patch implemented metadata support for regular device by:
>>> - Emulated zone information for regular device.
>>> - Store metadata at the beginning of regular device.
>>>
>>> | --- zoned device --- | -- regular device ||
>>> ^  ^
>>> |  |Metadata
>>> zone 0
>>>
>>> Signed-off-by: Bob Liu 
>>> ---
>>> drivers/md/dm-zoned-metadata.c | 135 
>>> +++--
>>> drivers/md/dm-zoned-target.c   |   6 +-
>>> drivers/md/dm-zoned.h  |   3 +-
>>> 3 files changed, 108 insertions(+), 36 deletions(-)
>>>
> Having thought about it some more, I think we cannot continue with this
> 'simple' approach.
> The immediate problem is that we lie about the disk size; clearly the
> metadata cannot be used for regular data, yet we expose a target device
> with the full size of the underlying device.
> Making me wonder if anybody ever tested a disk-full scenario...

 Current dm-zoned does not do that... What is exposed as target capacity is
 number of chunks * zone size, with the number of chunks being number of 
 zones
 minus metadata zones minus number of zones reserved for reclaim. And I did 
 test
 disk full scenario (when performance goes to the trash bin because reclaim
 struggles...)

>>> Thing is, the second number for the dmsetup target line is _supposed_ to
>>> be the target size.
>>> Which clearly is wrong here.
>>> I must admit I'm not sure what device-mapper will do with a target
>>> definition which is larger than the resulting target device ...
>>> Mike should know, but it's definitely awkward.
>>
>> AHh. OK. Never thought of it like this, especially considering the fact that 
>> the
>> table entry is checked to see if the entire drive is given. So instead of the
>> target size, I was in fact using the size parameter of dmsetup as the size to
>> use on the backend, which for dm-zoned must be the device capacity...
>>
>> Not sure if we can fix that now ? Especially considering that the number of
>> reserved seq zones for reclaim is not constant but a dmzadm format option. So
>> the average user would have to know exactly the useable size to dmsetup the
>> target. Akward too, or rather, not super easy to use. I wonder how dm-thin or
>> other targets with metadata handle this ? Do they format themselves
>> automatically on dmsetup using the size specified ?
>>
> Which is _precisely_ why I want to have the 'start' option to dmzadm.
> That can read the metadata, validate it, and then generate the correct 
> invocation for device-mapper.
> _And_ we get a device-uuid to boot, as this can only be set from the ioctl.

OK. Got it. Done like this, it will also be easy to support the v1 metadata.

> 
> Cheers,
> 
> Hannes
> 


-- 
Damien Le Moal
Western Digital Research



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] [RFC PATCH v2 3/3] dm zoned: add regular device info to metadata

2020-03-25 Thread Hannes Reinecke

On 3/25/20 10:10 AM, Damien Le Moal wrote:

On 2020/03/25 17:52, Hannes Reinecke wrote:

On 3/25/20 9:02 AM, Damien Le Moal wrote:

On 2020/03/25 15:47, Hannes Reinecke wrote:

On 3/25/20 7:29 AM, Damien Le Moal wrote:

On 2020/03/24 20:04, Bob Liu wrote:

This patch implemented metadata support for regular device by:
- Emulated zone information for regular device.
- Store metadata at the beginning of regular device.

| --- zoned device --- | -- regular device ||
^  ^
|  |Metadata
zone 0

Signed-off-by: Bob Liu 
---
drivers/md/dm-zoned-metadata.c | 135 
+++--
drivers/md/dm-zoned-target.c   |   6 +-
drivers/md/dm-zoned.h  |   3 +-
3 files changed, 108 insertions(+), 36 deletions(-)


Having thought about it some more, I think we cannot continue with this
'simple' approach.
The immediate problem is that we lie about the disk size; clearly the
metadata cannot be used for regular data, yet we expose a target device
with the full size of the underlying device.
Making me wonder if anybody ever tested a disk-full scenario...


Current dm-zoned does not do that... What is exposed as target capacity is
number of chunks * zone size, with the number of chunks being number of zones
minus metadata zones minus number of zones reserved for reclaim. And I did test
disk full scenario (when performance goes to the trash bin because reclaim
struggles...)


Thing is, the second number for the dmsetup target line is _supposed_ to
be the target size.
Which clearly is wrong here.
I must admit I'm not sure what device-mapper will do with a target
definition which is larger than the resulting target device ...
Mike should know, but it's definitely awkward.


AHh. OK. Never thought of it like this, especially considering the fact that the
table entry is checked to see if the entire drive is given. So instead of the
target size, I was in fact using the size parameter of dmsetup as the size to
use on the backend, which for dm-zoned must be the device capacity...

Not sure if we can fix that now ? Especially considering that the number of
reserved seq zones for reclaim is not constant but a dmzadm format option. So
the average user would have to know exactly the useable size to dmsetup the
target. Akward too, or rather, not super easy to use. I wonder how dm-thin or
other targets with metadata handle this ? Do they format themselves
automatically on dmsetup using the size specified ?


Which is _precisely_ why I want to have the 'start' option to dmzadm.
That can read the metadata, validate it, and then generate the correct 
invocation for device-mapper.

_And_ we get a device-uuid to boot, as this can only be set from the ioctl.

Cheers,

Hannes
--
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Re: [dm-devel] [RFC PATCH v2 3/3] dm zoned: add regular device info to metadata

2020-03-25 Thread Damien Le Moal
On 2020/03/25 17:52, Hannes Reinecke wrote:
> On 3/25/20 9:02 AM, Damien Le Moal wrote:
>> On 2020/03/25 15:47, Hannes Reinecke wrote:
>>> On 3/25/20 7:29 AM, Damien Le Moal wrote:
 On 2020/03/24 20:04, Bob Liu wrote:
> This patch implemented metadata support for regular device by:
>- Emulated zone information for regular device.
>- Store metadata at the beginning of regular device.
>
>| --- zoned device --- | -- regular device ||
>^  ^
>|  |Metadata
> zone 0
>
> Signed-off-by: Bob Liu 
> ---
>drivers/md/dm-zoned-metadata.c | 135 
> +++--
>drivers/md/dm-zoned-target.c   |   6 +-
>drivers/md/dm-zoned.h  |   3 +-
>3 files changed, 108 insertions(+), 36 deletions(-)
>
>>> Having thought about it some more, I think we cannot continue with this
>>> 'simple' approach.
>>> The immediate problem is that we lie about the disk size; clearly the
>>> metadata cannot be used for regular data, yet we expose a target device
>>> with the full size of the underlying device.
>>> Making me wonder if anybody ever tested a disk-full scenario...
>>
>> Current dm-zoned does not do that... What is exposed as target capacity is
>> number of chunks * zone size, with the number of chunks being number of zones
>> minus metadata zones minus number of zones reserved for reclaim. And I did 
>> test
>> disk full scenario (when performance goes to the trash bin because reclaim
>> struggles...)
>>
> Thing is, the second number for the dmsetup target line is _supposed_ to 
> be the target size.
> Which clearly is wrong here.
> I must admit I'm not sure what device-mapper will do with a target 
> definition which is larger than the resulting target device ...
> Mike should know, but it's definitely awkward.

AHh. OK. Never thought of it like this, especially considering the fact that the
table entry is checked to see if the entire drive is given. So instead of the
target size, I was in fact using the size parameter of dmsetup as the size to
use on the backend, which for dm-zoned must be the device capacity...

Not sure if we can fix that now ? Especially considering that the number of
reserved seq zones for reclaim is not constant but a dmzadm format option. So
the average user would have to know exactly the useable size to dmsetup the
target. Akward too, or rather, not super easy to use. I wonder how dm-thin or
other targets with metadata handle this ? Do they format themselves
automatically on dmsetup using the size specified ?

> 
> Cheers,
> 
> Hannes
> 


-- 
Damien Le Moal
Western Digital Research



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] [RFC PATCH v2 3/3] dm zoned: add regular device info to metadata

2020-03-25 Thread Hannes Reinecke

On 3/25/20 9:02 AM, Damien Le Moal wrote:

On 2020/03/25 15:47, Hannes Reinecke wrote:

On 3/25/20 7:29 AM, Damien Le Moal wrote:

On 2020/03/24 20:04, Bob Liu wrote:

This patch implemented metadata support for regular device by:
   - Emulated zone information for regular device.
   - Store metadata at the beginning of regular device.

   | --- zoned device --- | -- regular device ||
   ^  ^
   |  |Metadata
zone 0

Signed-off-by: Bob Liu 
---
   drivers/md/dm-zoned-metadata.c | 135 
+++--
   drivers/md/dm-zoned-target.c   |   6 +-
   drivers/md/dm-zoned.h  |   3 +-
   3 files changed, 108 insertions(+), 36 deletions(-)


Having thought about it some more, I think we cannot continue with this
'simple' approach.
The immediate problem is that we lie about the disk size; clearly the
metadata cannot be used for regular data, yet we expose a target device
with the full size of the underlying device.
Making me wonder if anybody ever tested a disk-full scenario...


Current dm-zoned does not do that... What is exposed as target capacity is
number of chunks * zone size, with the number of chunks being number of zones
minus metadata zones minus number of zones reserved for reclaim. And I did test
disk full scenario (when performance goes to the trash bin because reclaim
struggles...)

Thing is, the second number for the dmsetup target line is _supposed_ to 
be the target size.

Which clearly is wrong here.
I must admit I'm not sure what device-mapper will do with a target 
definition which is larger than the resulting target device ...

Mike should know, but it's definitely awkward.

Cheers,

Hannes
--
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Re: [dm-devel] [RFC PATCH v2 3/3] dm zoned: add regular device info to metadata

2020-03-25 Thread Hannes Reinecke

On 3/25/20 8:29 AM, Bob Liu wrote:

On 3/25/20 2:47 PM, Hannes Reinecke wrote:

On 3/25/20 7:29 AM, Damien Le Moal wrote:

On 2020/03/24 20:04, Bob Liu wrote:

This patch implemented metadata support for regular device by:
   - Emulated zone information for regular device.
   - Store metadata at the beginning of regular device.

   | --- zoned device --- | -- regular device ||
   ^  ^
   |  |Metadata
zone 0

Signed-off-by: Bob Liu 
---
   drivers/md/dm-zoned-metadata.c | 135 
+++--
   drivers/md/dm-zoned-target.c   |   6 +-
   drivers/md/dm-zoned.h  |   3 +-
   3 files changed, 108 insertions(+), 36 deletions(-)


Having thought about it some more, I think we cannot continue with this 
'simple' approach.
The immediate problem is that we lie about the disk size; clearly the
metadata cannot be used for regular data, yet we expose a target device with 
the full size of the underlying device.


The exposed size is "regular dev size + zoned dev size - metadata size - reserved 
seq zone size".
I didn't see why there is a lie?

The lie is in generating the device-mapper line for setting up the 
target device.

Format is

0  zoned  

and  is the capacity of the resulting device-mapper device.
So we should have adapted this to exclude the metadata size and the 
reserved seq zone size (even with the original implementation); 
'blksize' is certainly wrong here.



Making me wonder if anybody ever tested a disk-full scenario...
The other problem is that with two devices we need to be able to stitch them 
together
in an automated fashion, eg via a systemd service or udev rule.
But for this we need to be able to identify the devices, which means both need 
to carry
metadata, and both need to have unique identifier within the metadata. Which 
the current
metadata doesn't allow to.

Hence my plan is to implement a v2 metadata, carrying UUIDs for the dmz set 
_and_ the
component device. With that we can update blkid to create links etc so that the 
devices
can be identified in the system.
Additionally I would be updating dmzadm to write the new metadata.

And I will add a new command 'start' to dmzadm which will then create the 
device-mapper
device _with the correct size_. It also has the benefit that we can create the 
device-mapper
target with the UUID specified in the metadata, so the persistent device links 
will be
created automatically.

Bob, can you send me your improvements to dmzadm so that I can include them in 
my changes?



Attached, but it's a big patch I haven't split them to smaller one.
The dmz_check/repair can't work neither in current stage.

Yeah, of course. Plan is to start with V2 metadata handling first anyway 
(it adding UUIDs), then add the 'start' functionality, and only then 
implement cache device handling.


Thanks for the patch.

Cheers,

Hannes
--
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Re: [dm-devel] [RFC PATCH v2 3/3] dm zoned: add regular device info to metadata

2020-03-25 Thread Bob Liu
On 3/25/20 2:47 PM, Hannes Reinecke wrote:
> On 3/25/20 7:29 AM, Damien Le Moal wrote:
>> On 2020/03/24 20:04, Bob Liu wrote:
>>> This patch implemented metadata support for regular device by:
>>>   - Emulated zone information for regular device.
>>>   - Store metadata at the beginning of regular device.
>>>
>>>   | --- zoned device --- | -- regular device ||
>>>   ^  ^
>>>   |  |Metadata
>>> zone 0
>>>
>>> Signed-off-by: Bob Liu 
>>> ---
>>>   drivers/md/dm-zoned-metadata.c | 135 
>>> +++--
>>>   drivers/md/dm-zoned-target.c   |   6 +-
>>>   drivers/md/dm-zoned.h  |   3 +-
>>>   3 files changed, 108 insertions(+), 36 deletions(-)
>>>
> Having thought about it some more, I think we cannot continue with this 
> 'simple' approach.
> The immediate problem is that we lie about the disk size; clearly the
> metadata cannot be used for regular data, yet we expose a target device with 
> the full size of the underlying device.

The exposed size is "regular dev size + zoned dev size - metadata size - 
reserved seq zone size".
I didn't see why there is a lie?

> Making me wonder if anybody ever tested a disk-full scenario...
> The other problem is that with two devices we need to be able to stitch them 
> together in an automated fashion, eg via a systemd service or udev rule.
> But for this we need to be able to identify the devices, which means both 
> need to carry metadata, and both need to have unique identifier within the 
> metadata. Which the current metadata doesn't allow to.
> 
> Hence my plan is to implement a v2 metadata, carrying UUIDs for the dmz set 
> _and_ the component device. With that we can update blkid to create links etc 
> so that the devices can be identified in the system.
> Additionally I would be updating dmzadm to write the new metadata.
> 
> And I will add a new command 'start' to dmzadm which will then create the 
> device-mapper device _with the correct size_. It also has the benefit that we 
> can create the device-mapper target with the UUID specified in the metadata, 
> so the persistent device links will be created automatically.
> 
> Bob, can you send me your improvements to dmzadm so that I can include them 
> in my changes?
> 

Attached, but it's a big patch I haven't split them to smaller one.
The dmz_check/repair can't work neither in current stage.
diff --git a/src/dmz.h b/src/dmz.h
index 57741b1..51b5019 100644
--- a/src/dmz.h
+++ b/src/dmz.h
@@ -153,19 +153,33 @@ enum dmz_op {
 	DMZ_OP_REPAIR,
 };
 
+struct dmz_raw_dev {
+	char *path;
+	char *name;
+	int fd;
+	size_t		zone_nr_sectors;
+	size_t		zone_nr_blocks;
+	/* Device info */
+	__u64		capacity;
+	unsigned int nr_zones;
+	struct blk_zone	*zones;
+	struct dmz_dev *pdev;
+};
+
 /*
  * Device descriptor.
  */
 struct dmz_dev {
 
 	/* Device file path and basename */
-	char		*path;
-	char		*name;
+	struct dmz_raw_dev zoned_dev;
+	struct dmz_raw_dev regu_dev;
+
 	int		op;
 	unsigned int	flags;
+	size_t		zone_nr_blocks;
+	int 		has_regular;
 
-	/* Device info */
-	__u64		capacity;
 
 	unsigned int	nr_zones;
 	unsigned int	nr_meta_zones;
@@ -178,11 +192,6 @@ struct dmz_dev {
 	unsigned int	total_nr_meta_zones;
 	unsigned int	nr_rnd_zones;
 
-	struct blk_zone	*zones;
-
-	size_t		zone_nr_sectors;
-	size_t		zone_nr_blocks;
-
 	/* First metadata zone */
 	struct blk_zone	*sb_zone;
 	__u64		sb_block;
@@ -195,10 +204,6 @@ struct dmz_dev {
 	/* Mapping table */
 	unsigned int	nr_map_blocks;
 	__u64		map_block;
-
-	/* Device file descriptor */
-	int		fd;
-
 };
 
 /*
@@ -317,16 +322,16 @@ dmz_zone_cond_str(struct blk_zone *zone)
 
 extern int dmz_open_dev(struct dmz_dev *dev, enum dmz_op op);
 extern void dmz_close_dev(struct dmz_dev *dev);
-extern int dmz_sync_dev(struct dmz_dev *dev);
-extern int dmz_reset_zone(struct dmz_dev *dev, struct blk_zone *zone);
-extern int dmz_reset_zones(struct dmz_dev *dev);
-extern int dmz_write_block(struct dmz_dev *dev, __u64 block, __u8 *buf);
-extern int dmz_read_block(struct dmz_dev *dev, __u64 block, __u8 *buf);
+extern int dmz_sync_dev(struct dmz_raw_dev *dev);
+extern int dmz_reset_zone(struct dmz_raw_dev *dev, struct blk_zone *zone);
+extern int dmz_reset_zones(struct dmz_raw_dev *dev);
+extern int dmz_write_block(struct dmz_raw_dev *dev, __u64 block, __u8 *buf);
+extern int dmz_read_block(struct dmz_raw_dev *dev, __u64 block, __u8 *buf);
 
 extern __u32 dmz_crc32(__u32 crc, const void *address, size_t length);
 
 extern int dmz_locate_metadata(struct dmz_dev *dev);
-extern int dmz_write_super(struct dmz_dev *dev, __u64 gen, __u64 offset);
+extern int dmz_write_super(struct dmz_raw_dev *dev, __u64 gen, __u64 offset);
 extern int dmz_format(struct dmz_dev *dev);
 extern int dmz_check(struct dmz_dev *dev);
 extern int dmz_repair(struct dmz_dev *dev);
diff --git a/src/dmz_check.c b/src/dmz_check.c
index 25ce026..da8c1a5 100644
--- a/src/dmz_check.c
+++ b/src/dmz_check.c
@@ -29,7 +29,7 @@
 #include 
 #in

Re: [dm-devel] [RFC PATCH v2 3/3] dm zoned: add regular device info to metadata

2020-03-25 Thread Damien Le Moal
On 2020/03/25 15:47, Hannes Reinecke wrote:
> On 3/25/20 7:29 AM, Damien Le Moal wrote:
>> On 2020/03/24 20:04, Bob Liu wrote:
>>> This patch implemented metadata support for regular device by:
>>>   - Emulated zone information for regular device.
>>>   - Store metadata at the beginning of regular device.
>>>
>>>   | --- zoned device --- | -- regular device ||
>>>   ^  ^
>>>   |  |Metadata
>>> zone 0
>>>
>>> Signed-off-by: Bob Liu 
>>> ---
>>>   drivers/md/dm-zoned-metadata.c | 135 
>>> +++--
>>>   drivers/md/dm-zoned-target.c   |   6 +-
>>>   drivers/md/dm-zoned.h  |   3 +-
>>>   3 files changed, 108 insertions(+), 36 deletions(-)
>>>
> Having thought about it some more, I think we cannot continue with this 
> 'simple' approach.
> The immediate problem is that we lie about the disk size; clearly the
> metadata cannot be used for regular data, yet we expose a target device 
> with the full size of the underlying device.
> Making me wonder if anybody ever tested a disk-full scenario...

Current dm-zoned does not do that... What is exposed as target capacity is
number of chunks * zone size, with the number of chunks being number of zones
minus metadata zones minus number of zones reserved for reclaim. And I did test
disk full scenario (when performance goes to the trash bin because reclaim
struggles...)

> The other problem is that with two devices we need to be able to stitch 
> them together in an automated fashion, eg via a systemd service or udev 
> rule.

Yes, and that has been on my to-do list forever for the current dm-zoned...

> But for this we need to be able to identify the devices, which means 
> both need to carry metadata, and both need to have unique identifier 
> within the metadata. Which the current metadata doesn't allow to.
> 
> Hence my plan is to implement a v2 metadata, carrying UUIDs for the dmz 
> set _and_ the component device. With that we can update blkid to create 
> links etc so that the devices can be identified in the system.
> Additionally I would be updating dmzadm to write the new metadata.

Yep. I think that is needed. And the metadata for the disk that does not store
the mapping tables and bitmaps can be read-only at run time, that is a super
block only holding identification/UUID.

> And I will add a new command 'start' to dmzadm which will then create 
> the device-mapper device _with the correct size_. It also has the 
> benefit that we can create the device-mapper target with the UUID 
> specified in the metadata, so the persistent device links will be 
> created automatically.

The size now should be correct with single device current setup.

> 
> Bob, can you send me your improvements to dmzadm so that I can include 
> them in my changes?
> 
> Cheers,
> 
> Hannes
> 


-- 
Damien Le Moal
Western Digital Research



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] [RFC PATCH v2 3/3] dm zoned: add regular device info to metadata

2020-03-24 Thread Hannes Reinecke

On 3/25/20 7:29 AM, Damien Le Moal wrote:

On 2020/03/24 20:04, Bob Liu wrote:

This patch implemented metadata support for regular device by:
  - Emulated zone information for regular device.
  - Store metadata at the beginning of regular device.

  | --- zoned device --- | -- regular device ||
  ^  ^
  |  |Metadata
zone 0

Signed-off-by: Bob Liu 
---
  drivers/md/dm-zoned-metadata.c | 135 +++--
  drivers/md/dm-zoned-target.c   |   6 +-
  drivers/md/dm-zoned.h  |   3 +-
  3 files changed, 108 insertions(+), 36 deletions(-)

Having thought about it some more, I think we cannot continue with this 
'simple' approach.

The immediate problem is that we lie about the disk size; clearly the
metadata cannot be used for regular data, yet we expose a target device 
with the full size of the underlying device.

Making me wonder if anybody ever tested a disk-full scenario...
The other problem is that with two devices we need to be able to stitch 
them together in an automated fashion, eg via a systemd service or udev 
rule.
But for this we need to be able to identify the devices, which means 
both need to carry metadata, and both need to have unique identifier 
within the metadata. Which the current metadata doesn't allow to.


Hence my plan is to implement a v2 metadata, carrying UUIDs for the dmz 
set _and_ the component device. With that we can update blkid to create 
links etc so that the devices can be identified in the system.

Additionally I would be updating dmzadm to write the new metadata.

And I will add a new command 'start' to dmzadm which will then create 
the device-mapper device _with the correct size_. It also has the 
benefit that we can create the device-mapper target with the UUID 
specified in the metadata, so the persistent device links will be 
created automatically.


Bob, can you send me your improvements to dmzadm so that I can include 
them in my changes?


Cheers,

Hannes
--
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Re: [dm-devel] [RFC PATCH v2 3/3] dm zoned: add regular device info to metadata

2020-03-24 Thread Damien Le Moal
On 2020/03/24 20:04, Bob Liu wrote:
> This patch implemented metadata support for regular device by:
>  - Emulated zone information for regular device.
>  - Store metadata at the beginning of regular device.
> 
>  | --- zoned device --- | -- regular device ||
>  ^  ^
>  |  |Metadata
> zone 0
> 
> Signed-off-by: Bob Liu 
> ---
>  drivers/md/dm-zoned-metadata.c | 135 
> +++--
>  drivers/md/dm-zoned-target.c   |   6 +-
>  drivers/md/dm-zoned.h  |   3 +-
>  3 files changed, 108 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> index e0e8be0..a96158a 100644
> --- a/drivers/md/dm-zoned-metadata.c
> +++ b/drivers/md/dm-zoned-metadata.c
> @@ -131,6 +131,7 @@ struct dmz_sb {
>   */
>  struct dmz_metadata {
>   struct dmz_dev  *zoned_dev;
> + struct dmz_dev  *regu_dmz_dev;
>  
>   sector_tzone_bitmap_size;
>   unsigned intzone_nr_bitmap_blocks;
> @@ -187,6 +188,15 @@ struct dmz_metadata {
>  /*
>   * Various accessors
>   */
> +static inline struct dmz_dev *zmd_mdev(struct dmz_metadata *zmd)
> +{
> + /* Metadata always stores in regular device if there is. */
> + if (zmd->regu_dmz_dev)
> + return zmd->regu_dmz_dev;
> + else
> + return zmd->zoned_dev;

OK. I think we will be better off using an array of pointers to struct_dmz_dev
in dmz_target, i.e., a filed "struct dmz_dev*dev[2]". Doing so, we can be 
sure
to always have the device holding metatdata in entry 0, which will always be
true for the single drive case too.
With this, you will not need all these dance with "which device has metadata" ?
It always will be dmz->dev[0].

> +}
> +
>  unsigned int dmz_id(struct dmz_metadata *zmd, struct dm_zone *zone)
>  {
>   return ((unsigned int)(zone - zmd->zones));
> @@ -194,12 +204,33 @@ unsigned int dmz_id(struct dmz_metadata *zmd, struct 
> dm_zone *zone)
>  
>  sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
>  {
> - return (sector_t)dmz_id(zmd, zone) << 
> zmd->zoned_dev->zone_nr_sectors_shift;

With the array of dev trick, most of the changes below are simplified or go 
away.

> + int dmz_real_id;
> +
> + dmz_real_id = dmz_id(zmd, zone);
> + if (dmz_real_id >= zmd->zoned_dev->nr_zones) {
> + /* Regular dev. */
> + dmz_real_id -= zmd->zoned_dev->nr_zones;
> + WARN_ON(!zmd->regu_dmz_dev);
> +
> + return (sector_t)dmz_real_id << 
> zmd->zoned_dev->zone_nr_sectors_shift;
> + }
> + return (sector_t)dmz_real_id << zmd->zoned_dev->zone_nr_sectors_shift;
>  }
>  
>  sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
>  {
> - return (sector_t)dmz_id(zmd, zone) << 
> zmd->zoned_dev->zone_nr_blocks_shift;
> + int dmz_real_id;
> +
> + dmz_real_id = dmz_id(zmd, zone);
> + if (dmz_real_id >= zmd->zoned_dev->nr_zones) {
> + /* Regular dev. */
> + dmz_real_id -= zmd->zoned_dev->nr_zones;
> + WARN_ON(!zmd->regu_dmz_dev);
> +
> + return (sector_t)dmz_real_id << 
> zmd->zoned_dev->zone_nr_blocks_shift;
> + }
> +
> + return (sector_t)dmz_real_id << zmd->zoned_dev->zone_nr_blocks_shift;
>  }
>  
>  unsigned int dmz_nr_chunks(struct dmz_metadata *zmd)
> @@ -403,8 +434,10 @@ static struct dmz_mblock *dmz_get_mblock_slow(struct 
> dmz_metadata *zmd,
>   struct dmz_mblock *mblk, *m;
>   sector_t block = zmd->sb[zmd->mblk_primary].block + mblk_no;
>   struct bio *bio;
> + struct dmz_dev *mdev;
>  
> - if (dmz_bdev_is_dying(zmd->zoned_dev))
> + mdev = zmd_mdev(zmd);
> + if (dmz_bdev_is_dying(mdev))
>   return ERR_PTR(-EIO);
>  
>   /* Get a new block and a BIO to read it */
> @@ -440,7 +473,7 @@ static struct dmz_mblock *dmz_get_mblock_slow(struct 
> dmz_metadata *zmd,
>  
>   /* Submit read BIO */
>   bio->bi_iter.bi_sector = dmz_blk2sect(block);
> - bio_set_dev(bio, zmd->zoned_dev->bdev);
> + bio_set_dev(bio, mdev->bdev);
>   bio->bi_private = mblk;
>   bio->bi_end_io = dmz_mblock_bio_end_io;
>   bio_set_op_attrs(bio, REQ_OP_READ, REQ_META | REQ_PRIO);
> @@ -555,7 +588,7 @@ static struct dmz_mblock *dmz_get_mblock(struct 
> dmz_metadata *zmd,
>  TASK_UNINTERRUPTIBLE);
>   if (test_bit(DMZ_META_ERROR, &mblk->state)) {
>   dmz_release_mblock(zmd, mblk);
> - dmz_check_bdev(zmd->zoned_dev);
> + dmz_check_bdev(zmd_mdev(zmd));
>   return ERR_PTR(-EIO);
>   }
>  
> @@ -581,8 +614,10 @@ static int dmz_write_mblock(struct dmz_metadata *zmd, 
> struct dmz_mblock *mblk,
>  {
>   sector_t block = zmd->sb[set].block + mblk->no;
>   struct bio *bio;
> + struct dmz_dev *mdev;
>  
> - if (dmz_bdev_is_dying(zmd->zoned_dev))
> + mdev = zmd_mdev(zmd