Re: [PATCH v5 4/5] xen/vpci: header: status register handler

2023-09-08 Thread Jan Beulich
On 07.09.2023 23:29, Stewart Hildebrand wrote:
> On 9/6/23 04:22, Jan Beulich wrote:
>> On 01.09.2023 06:57, Stewart Hildebrand wrote:
>>> Introduce a handler for the PCI status register, with ability to mask the
>>> capabilities bit. The status register contains reserved bits, read-only 
>>> bits,
>>> and write-1-to-clear bits, so introduce bitmasks to handle these in vPCI. 
>>> If a
>>> bit in the bitmask is set, then the special meaning applies:
>>>
>>>   res_mask: read as zero, write ignore
>>>   ro_mask: read normal, write ignore
>>>   rw1c_mask: read normal, write 1 to clear
>>
>> With the last one's name being descriptive of what the behavior is, would
>> it make sense to name the first one "raz_mask"? (Also a question to Roger
>> as the maintainer of this code.)
> 
> Another possible name is rsvdz_mask. See below.

Ah, yes, that's better even if for now we won't need rsvdp support.

>>> --- a/xen/drivers/vpci/header.c
>>> +++ b/xen/drivers/vpci/header.c
>>> @@ -413,6 +413,18 @@ static void cf_check cmd_write(
>>>  pci_conf_write16(pdev->sbdf, reg, cmd);
>>>  }
>>>
>>> +static uint32_t cf_check status_read(const struct pci_dev *pdev,
>>> + unsigned int reg, void *data)
>>> +{
>>> +struct vpci_header *header = data;
>>> +uint32_t status = pci_conf_read16(pdev->sbdf, reg);
>>> +
>>> +if ( header->mask_cap_list )
>>> +status &= ~PCI_STATUS_CAP_LIST;
>>> +
>>> +return status;
>>> +}
>>
>> Do you actually need this function? Can't you ...
>>
>>> @@ -544,6 +556,12 @@ static int cf_check init_bars(struct pci_dev *pdev)
>>>  if ( rc )
>>>  return rc;
>>>
>>> +rc = vpci_add_register_mask(pdev->vpci, status_read, vpci_hw_write16,
>>> +PCI_STATUS, 2, header, 
>>> PCI_STATUS_RESERVED_MASK,
>>> +PCI_STATUS_RO_MASK, PCI_STATUS_RW1C_MASK);
>>
>> ... conditionally OR in PCI_STATUS_CAP_LIST right here? Without
>> capabilities the CAP_LIST bit becomes kind of reserved anyway.
> 
> Hm. The PCI_STATUS_CAP_LIST bit is a read-only bit according to spec. Given 
> the rationale below, we would want to preserve the value of r/o bits during 
> writes, so this would essentially want to become a RsvdP bit. I wasn't 
> planning on adding support for RsvdP bits here since there aren't any proper 
> RsvdP bits in the status register. If instead we are okay with treating the 
> PCI_STATUS_CAP_LIST bit as RsvdZ, then we could do as you suggest, but I'll 
> plan to keep it as is for now.

I'd say leverage rsvdz (with a comment) for the time being, but let's see
what Roger thinks.

>>> @@ -424,9 +450,13 @@ static void vpci_write_helper(const struct pci_dev 
>>> *pdev,
>>>  uint32_t val;
>>>
>>>  val = r->read(pdev, r->offset, r->private);
>>> +val &= ~r->res_mask;
>>> +val &= ~r->rw1c_mask;
>>
>> Personally I'd fold these two lines into just one (and similarly below).
> 
> Will do.
> 
>>>  data = merge_result(val, data, size, offset);
>>>  }
>>>
>>> +data &= ~r->res_mask;
>>> +data &= ~r->ro_mask;
>>
>> This seems risky to me. I'd rather see the same value being written back
>> for r/o bits, just in case a device doesn't actually implement a bit as
>> mandated. 
> 
> Regarding writes to read-only bits: okay, I'll assume that Xen should take 
> care of preserving the r/o bits (discarding the guests write value for the 
> r/o bits).
> 
> If, on the other hand, we would want to rely on the guest to preserve the r/o 
> bits during a write, then the ro_mask would effectively not be doing 
> anything, so may as well be removed.

We may never rely on the guest doing anything the way we want it.

> In either case, for partial register writes, Xen will take care of preserving 
> the r/o bits for the remaining bytes in the register.
> 
> I'll make this change for the next version of the series (assuming Xen will 
> handle preserving the r/o bits).
> 
>> For reserved flags it's less clear what's best, because in
>> principle they could also become rw1c once defined.
> 
> Well, it depends on what version of the spec we read.
> 
> PCI Local Bus 3.0 spec section 6.1 says: "On writes, software must ensure 
> that the values of reserved bit positions are preserved." PCI Local Bus 3.0 
> spec section 6.2.3 also says that we can essentially treat the whole status 
> register as rw1c. Huh, that's odd.
> 
> PCI Express Base 4.0 spec defines two reserved bit types:
> RsvdP: Reserved and Preserved - Reserved for future RW implementations. 
> Software must preserve the value read for writes to bits.
> RsvdZ: Reserved and Zero - Reserved for future RW1C implementations. Software 
> must use 0b for writes to bits.
> Both RsvdP and RsvdZ are read as zero.
> 
> I'm favoring using the definitions in the newer PCI Express Base 4.0 spec 
> since they are clearer.

+1

> Regarding writes to rsvdp bits: there are none of these in the status 
> register according to 

Re: [PATCH v5 4/5] xen/vpci: header: status register handler

2023-09-07 Thread Stewart Hildebrand
On 9/6/23 04:22, Jan Beulich wrote:
> On 01.09.2023 06:57, Stewart Hildebrand wrote:
>> Introduce a handler for the PCI status register, with ability to mask the
>> capabilities bit. The status register contains reserved bits, read-only bits,
>> and write-1-to-clear bits, so introduce bitmasks to handle these in vPCI. If 
>> a
>> bit in the bitmask is set, then the special meaning applies:
>>
>>   res_mask: read as zero, write ignore
>>   ro_mask: read normal, write ignore
>>   rw1c_mask: read normal, write 1 to clear
> 
> With the last one's name being descriptive of what the behavior is, would
> it make sense to name the first one "raz_mask"? (Also a question to Roger
> as the maintainer of this code.)

Another possible name is rsvdz_mask. See below.

>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -413,6 +413,18 @@ static void cf_check cmd_write(
>>  pci_conf_write16(pdev->sbdf, reg, cmd);
>>  }
>>
>> +static uint32_t cf_check status_read(const struct pci_dev *pdev,
>> + unsigned int reg, void *data)
>> +{
>> +struct vpci_header *header = data;
>> +uint32_t status = pci_conf_read16(pdev->sbdf, reg);
>> +
>> +if ( header->mask_cap_list )
>> +status &= ~PCI_STATUS_CAP_LIST;
>> +
>> +return status;
>> +}
> 
> Do you actually need this function? Can't you ...
> 
>> @@ -544,6 +556,12 @@ static int cf_check init_bars(struct pci_dev *pdev)
>>  if ( rc )
>>  return rc;
>>
>> +rc = vpci_add_register_mask(pdev->vpci, status_read, vpci_hw_write16,
>> +PCI_STATUS, 2, header, 
>> PCI_STATUS_RESERVED_MASK,
>> +PCI_STATUS_RO_MASK, PCI_STATUS_RW1C_MASK);
> 
> ... conditionally OR in PCI_STATUS_CAP_LIST right here? Without
> capabilities the CAP_LIST bit becomes kind of reserved anyway.

Hm. The PCI_STATUS_CAP_LIST bit is a read-only bit according to spec. Given the 
rationale below, we would want to preserve the value of r/o bits during writes, 
so this would essentially want to become a RsvdP bit. I wasn't planning on 
adding support for RsvdP bits here since there aren't any proper RsvdP bits in 
the status register. If instead we are okay with treating the 
PCI_STATUS_CAP_LIST bit as RsvdZ, then we could do as you suggest, but I'll 
plan to keep it as is for now.

>> @@ -424,9 +450,13 @@ static void vpci_write_helper(const struct pci_dev 
>> *pdev,
>>  uint32_t val;
>>
>>  val = r->read(pdev, r->offset, r->private);
>> +val &= ~r->res_mask;
>> +val &= ~r->rw1c_mask;
> 
> Personally I'd fold these two lines into just one (and similarly below).

Will do.

>>  data = merge_result(val, data, size, offset);
>>  }
>>
>> +data &= ~r->res_mask;
>> +data &= ~r->ro_mask;
> 
> This seems risky to me. I'd rather see the same value being written back
> for r/o bits, just in case a device doesn't actually implement a bit as
> mandated. 

Regarding writes to read-only bits: okay, I'll assume that Xen should take care 
of preserving the r/o bits (discarding the guests write value for the r/o bits).

If, on the other hand, we would want to rely on the guest to preserve the r/o 
bits during a write, then the ro_mask would effectively not be doing anything, 
so may as well be removed.

In either case, for partial register writes, Xen will take care of preserving 
the r/o bits for the remaining bytes in the register.

I'll make this change for the next version of the series (assuming Xen will 
handle preserving the r/o bits).

> For reserved flags it's less clear what's best, because in
> principle they could also become rw1c once defined.

Well, it depends on what version of the spec we read.

PCI Local Bus 3.0 spec section 6.1 says: "On writes, software must ensure that 
the values of reserved bit positions are preserved." PCI Local Bus 3.0 spec 
section 6.2.3 also says that we can essentially treat the whole status register 
as rw1c. Huh, that's odd.

PCI Express Base 4.0 spec defines two reserved bit types:
RsvdP: Reserved and Preserved - Reserved for future RW implementations. 
Software must preserve the value read for writes to bits.
RsvdZ: Reserved and Zero - Reserved for future RW1C implementations. Software 
must use 0b for writes to bits.
Both RsvdP and RsvdZ are read as zero.

I'm favoring using the definitions in the newer PCI Express Base 4.0 spec since 
they are clearer.

Regarding writes to rsvdp bits: there are none of these in the status register 
according to the PCI Express Base 4.0 spec. The older PCI Local Bus 3.0 spec is 
in conflict with this definition, but at the same time the PCI Local Bus 3.0 
spec also conflicts with itself by saying that the entire status register is 
essentially a rw1c register with reserved bits that should be preserved, which 
doesn't make sense. The newer PCI Express Base 4.0 spec is clearer, and appears 
to have resolved this inconsistency, so I'm favoring 

Re: [PATCH v5 4/5] xen/vpci: header: status register handler

2023-09-06 Thread Jan Beulich
On 01.09.2023 06:57, Stewart Hildebrand wrote:
> Introduce a handler for the PCI status register, with ability to mask the
> capabilities bit. The status register contains reserved bits, read-only bits,
> and write-1-to-clear bits, so introduce bitmasks to handle these in vPCI. If a
> bit in the bitmask is set, then the special meaning applies:
> 
>   res_mask: read as zero, write ignore
>   ro_mask: read normal, write ignore
>   rw1c_mask: read normal, write 1 to clear

With the last one's name being descriptive of what the behavior is, would
it make sense to name the first one "raz_mask"? (Also a question to Roger
as the maintainer of this code.)

> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -413,6 +413,18 @@ static void cf_check cmd_write(
>  pci_conf_write16(pdev->sbdf, reg, cmd);
>  }
>  
> +static uint32_t cf_check status_read(const struct pci_dev *pdev,
> + unsigned int reg, void *data)
> +{
> +struct vpci_header *header = data;
> +uint32_t status = pci_conf_read16(pdev->sbdf, reg);
> +
> +if ( header->mask_cap_list )
> +status &= ~PCI_STATUS_CAP_LIST;
> +
> +return status;
> +}

Do you actually need this function? Can't you ...

> @@ -544,6 +556,12 @@ static int cf_check init_bars(struct pci_dev *pdev)
>  if ( rc )
>  return rc;
>  
> +rc = vpci_add_register_mask(pdev->vpci, status_read, vpci_hw_write16,
> +PCI_STATUS, 2, header, 
> PCI_STATUS_RESERVED_MASK,
> +PCI_STATUS_RO_MASK, PCI_STATUS_RW1C_MASK);

... conditionally OR in PCI_STATUS_CAP_LIST right here? Without
capabilities the CAP_LIST bit becomes kind of reserved anyway.

> @@ -424,9 +450,13 @@ static void vpci_write_helper(const struct pci_dev *pdev,
>  uint32_t val;
>  
>  val = r->read(pdev, r->offset, r->private);
> +val &= ~r->res_mask;
> +val &= ~r->rw1c_mask;

Personally I'd fold these two lines into just one (and similarly below).

>  data = merge_result(val, data, size, offset);
>  }
>  
> +data &= ~r->res_mask;
> +data &= ~r->ro_mask;

This seems risky to me. I'd rather see the same value being written back
for r/o bits, just in case a device doesn't actually implement a bit as
mandated. For reserved flags it's less clear what's best, because in
principle they could also become rw1c once defined.

> --- a/xen/include/xen/pci_regs.h
> +++ b/xen/include/xen/pci_regs.h
> @@ -66,6 +66,14 @@
>  #define  PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */
>  #define  PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */
>  #define  PCI_STATUS_DETECTED_PARITY  0x8000 /* Set on parity error */
> +#define  PCI_STATUS_RESERVED_MASK0x06

I'd recommend separating the "derived" constants by a blank line. I'd
further like to ask that you add two more padding zeros above.

> +#define  PCI_STATUS_RO_MASK (PCI_STATUS_IMM_READY | PCI_STATUS_INTERRUPT | \
> +PCI_STATUS_CAP_LIST | PCI_STATUS_CAP_LIST | PCI_STATUS_66MHZ | \

CAP_LIST twice?

Jan



[PATCH v5 4/5] xen/vpci: header: status register handler

2023-08-31 Thread Stewart Hildebrand
Introduce a handler for the PCI status register, with ability to mask the
capabilities bit. The status register contains reserved bits, read-only bits,
and write-1-to-clear bits, so introduce bitmasks to handle these in vPCI. If a
bit in the bitmask is set, then the special meaning applies:

  res_mask: read as zero, write ignore
  ro_mask: read normal, write ignore
  rw1c_mask: read normal, write 1 to clear

The mask_cap_list flag will be set in a follow-on patch.

Signed-off-by: Stewart Hildebrand 
---
v4->v5:
* add support for res_mask
* add support for ro_mask (squash ro_mask patch)
* add constants for reserved, read-only, and rw1c masks

v3->v4:
* move mask_cap_list setting to the capabilities patch
* single pci_conf_read16 in status_read
* align mask_cap_list bitfield in struct vpci_header
* change to rw1c bit mask instead of treating whole register as rw1c
* drop subsystem prefix on renamed add_register function

v2->v3:
* new patch
---
 xen/drivers/vpci/header.c  | 18 +++
 xen/drivers/vpci/vpci.c| 46 +++---
 xen/include/xen/pci_regs.h |  8 +++
 xen/include/xen/vpci.h | 10 +
 4 files changed, 74 insertions(+), 8 deletions(-)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 767c1ba718d7..791791e6c9b6 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -413,6 +413,18 @@ static void cf_check cmd_write(
 pci_conf_write16(pdev->sbdf, reg, cmd);
 }
 
+static uint32_t cf_check status_read(const struct pci_dev *pdev,
+ unsigned int reg, void *data)
+{
+struct vpci_header *header = data;
+uint32_t status = pci_conf_read16(pdev->sbdf, reg);
+
+if ( header->mask_cap_list )
+status &= ~PCI_STATUS_CAP_LIST;
+
+return status;
+}
+
 static void cf_check bar_write(
 const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
 {
@@ -544,6 +556,12 @@ static int cf_check init_bars(struct pci_dev *pdev)
 if ( rc )
 return rc;
 
+rc = vpci_add_register_mask(pdev->vpci, status_read, vpci_hw_write16,
+PCI_STATUS, 2, header, 
PCI_STATUS_RESERVED_MASK,
+PCI_STATUS_RO_MASK, PCI_STATUS_RW1C_MASK);
+if ( rc )
+return rc;
+
 if ( pdev->ignore_bars )
 return 0;
 
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 3bec9a4153da..6e6ad4b80a0d 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -29,6 +29,9 @@ struct vpci_register {
 unsigned int offset;
 void *private;
 struct list_head node;
+uint32_t res_mask;
+uint32_t ro_mask;
+uint32_t rw1c_mask;
 };
 
 #ifdef __XEN__
@@ -145,9 +148,16 @@ uint32_t cf_check vpci_hw_read32(
 return pci_conf_read32(pdev->sbdf, reg);
 }
 
-int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
-  vpci_write_t *write_handler, unsigned int offset,
-  unsigned int size, void *data)
+void cf_check vpci_hw_write16(
+const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
+{
+pci_conf_write16(pdev->sbdf, reg, val);
+}
+
+static int add_register(struct vpci *vpci, vpci_read_t *read_handler,
+vpci_write_t *write_handler, unsigned int offset,
+unsigned int size, void *data, uint32_t res_mask,
+uint32_t ro_mask, uint32_t rw1c_mask)
 {
 struct list_head *prev;
 struct vpci_register *r;
@@ -167,6 +177,9 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t 
*read_handler,
 r->size = size;
 r->offset = offset;
 r->private = data;
+r->res_mask = res_mask & (0xU >> (32 - 8 * size));
+r->ro_mask = ro_mask & (0xU >> (32 - 8 * size));
+r->rw1c_mask = rw1c_mask & (0xU >> (32 - 8 * size));
 
 spin_lock(>lock);
 
@@ -193,6 +206,23 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t 
*read_handler,
 return 0;
 }
 
+int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
+  vpci_write_t *write_handler, unsigned int offset,
+  unsigned int size, void *data)
+{
+return add_register(vpci, read_handler, write_handler, offset, size, data,
+0, 0, 0);
+}
+
+int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler,
+   vpci_write_t *write_handler, unsigned int offset,
+   unsigned int size, void *data, uint32_t res_mask,
+   uint32_t ro_mask, uint32_t rw1c_mask)
+{
+return add_register(vpci, read_handler, write_handler, offset, size, data,
+res_mask, ro_mask, rw1c_mask);
+}
+
 int vpci_remove_register(struct vpci *vpci, unsigned int offset,
  unsigned int size)
 {
@@ -376,6 +406,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg,