Re: [RFC 1/2] vhost: IFC VF hardware operation layer

2019-10-23 Thread Jason Wang


On 2019/10/23 6:13 PM, Simon Horman wrote:

On Tue, Oct 22, 2019 at 09:32:36AM +0800, Jason Wang wrote:

On 2019/10/22 12:31 AM, Simon Horman wrote:

On Mon, Oct 21, 2019 at 05:55:33PM +0800, Zhu, Lingshan wrote:

On 10/16/2019 5:53 PM, Simon Horman wrote:

Hi Zhu,

thanks for your patch.

On Wed, Oct 16, 2019 at 09:10:40AM +0800, Zhu Lingshan wrote:

...


+static void ifcvf_read_dev_config(struct ifcvf_hw *hw, u64 offset,
+  void *dst, int length)
+{
+   int i;
+   u8 *p;
+   u8 old_gen, new_gen;
+
+   do {
+   old_gen = ioread8(&hw->common_cfg->config_generation);
+
+   p = dst;
+   for (i = 0; i < length; i++)
+   *p++ = ioread8((u8 *)hw->dev_cfg + offset + i);
+
+   new_gen = ioread8(&hw->common_cfg->config_generation);
+   } while (old_gen != new_gen);

Would it be wise to limit the number of iterations of the loop above?

Thanks, but I don't quite get it. This loop is used to make sure the function
reads the latest config.

I am worried about the possibility that it will loop forever.
Could that happen?

...

My understanding is that the function here is similar to virtio config
generation [1]. So this can only happen with buggy hardware.

OK, so this circles back to my original question.
Should we put a bound on the number of times the loop runs,
or should we accept that the kernel locks up if the HW is buggy?



I'm not sure; similar logic has been used by virtio-pci drivers for
years. Considering that this logic is pretty simple, and that it would not
be the only place where buggy virtio hardware could lock up the kernel, we
can keep it as is.


Actually, there's no need for hardware to implement the generation logic;
it could be emulated by software or even ignored. In the new version of
virtio-mdev, get_generation() is optional; when it is not implemented,
the virtio-mdev transport simply returns 0.


Thanks


Re: [RFC 1/2] vhost: IFC VF hardware operation layer

2019-10-21 Thread Jason Wang


On 2019/10/22 12:31 AM, Simon Horman wrote:

On Mon, Oct 21, 2019 at 05:55:33PM +0800, Zhu, Lingshan wrote:

On 10/16/2019 5:53 PM, Simon Horman wrote:

Hi Zhu,

thanks for your patch.

On Wed, Oct 16, 2019 at 09:10:40AM +0800, Zhu Lingshan wrote:

...


+static void ifcvf_read_dev_config(struct ifcvf_hw *hw, u64 offset,
+  void *dst, int length)
+{
+   int i;
+   u8 *p;
+   u8 old_gen, new_gen;
+
+   do {
+   old_gen = ioread8(&hw->common_cfg->config_generation);
+
+   p = dst;
+   for (i = 0; i < length; i++)
+   *p++ = ioread8((u8 *)hw->dev_cfg + offset + i);
+
+   new_gen = ioread8(&hw->common_cfg->config_generation);
+   } while (old_gen != new_gen);

Would it be wise to limit the number of iterations of the loop above?

Thanks, but I don't quite get it. This loop is used to make sure the function
reads the latest config.

I am worried about the possibility that it will loop forever.
Could that happen?

...



My understanding is that the function here is similar to virtio config 
generation [1]. So this can only happen with buggy hardware.


Thanks

[1] https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html, Section 2.4.1






+static void io_write64_twopart(u64 val, u32 *lo, u32 *hi)
+{
+   iowrite32(val & ((1ULL << 32) - 1), lo);
+   iowrite32(val >> 32, hi);
+}

I see this macro is also in virtio_pci_modern.c

Assuming lo and hi aren't guaranteed to be sequential,
and thus iowrite64_hi_lo() cannot be used, perhaps
it would be good to add a common helper somewhere.

Thanks, I will try that after this IFC patch set; I will cc you.

Thanks.
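
If the two halves really can live at non-adjacent addresses, a shared
helper might look like this (a sketch; the helper name and its eventual
home are assumptions, to be settled in the follow-up Zhu mentions):

static inline void iowrite64_twopart(u64 val,
				     u32 __iomem *lo, u32 __iomem *hi)
{
	/* Low 32 bits first, then high, matching the open-coded versions. */
	iowrite32((u32)val, lo);
	iowrite32(val >> 32, hi);
}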

...



Re: [RFC 1/2] vhost: IFC VF hardware operation layer

2019-10-21 Thread Jason Wang


On 2019/10/21 6:00 PM, Zhu, Lingshan wrote:


On 10/16/2019 4:40 PM, Jason Wang wrote:


On 2019/10/16 9:30 AM, Zhu Lingshan wrote:

This commit introduces the ifcvf_base layer, which handles IFC VF NIC
hardware operations and configuration.



It's better to describe the difference between IFC VF and virtio in 
the commit log. Or is there an open doc for this?




Hi Jason,

Sure, I will split this code into small patches with detailed commit 
logs in the v1 patchset.


Signed-off-by: Zhu Lingshan 
---
  drivers/vhost/ifcvf/ifcvf_base.c | 390 +++
  drivers/vhost/ifcvf/ifcvf_base.h | 137 ++
  2 files changed, 527 insertions(+)
  create mode 100644 drivers/vhost/ifcvf/ifcvf_base.c
  create mode 100644 drivers/vhost/ifcvf/ifcvf_base.h

diff --git a/drivers/vhost/ifcvf/ifcvf_base.c b/drivers/vhost/ifcvf/ifcvf_base.c
new file mode 100644
index ..b85e14c9bdcf
--- /dev/null
+++ b/drivers/vhost/ifcvf/ifcvf_base.c
@@ -0,0 +1,390 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2019 Intel Corporation.
+ */
+
+#include "ifcvf_base.h"
+
+static void *get_cap_addr(struct ifcvf_hw *hw, struct virtio_pci_cap *cap)
+{
+    u8 bar = cap->bar;
+    u32 length = cap->length;
+    u32 offset = cap->offset;
+    struct ifcvf_adapter *ifcvf =
+    container_of(hw, struct ifcvf_adapter, vf);
+
+    if (bar >= IFCVF_PCI_MAX_RESOURCE) {
+    IFC_ERR(ifcvf->dev,
+    "Invalid bar number %u to get capabilities.\n", bar);
+    return NULL;
+    }
+
+    if (offset + length < offset) {
+    IFC_ERR(ifcvf->dev, "offset(%u) + length(%u) overflows\n",
+    offset, length);
+    return NULL;
+    }
+
+    if (offset + length > hw->mem_resource[cap->bar].len) {
+    IFC_ERR(ifcvf->dev,
+    "offset(%u) + len(%u) overflows bar%u to get 
capabilities.\n",

+    offset, length, bar);
+    return NULL;
+    }
+
+    return hw->mem_resource[bar].addr + offset;
+}
+
+int ifcvf_read_config_range(struct pci_dev *dev,
+    uint32_t *val, int size, int where)
+{
+    int i;
+
+    for (i = 0; i < size; i += 4) {
+    if (pci_read_config_dword(dev, where + i, val + i / 4) < 0)
+    return -1;
+    }
+    return 0;
+}
+
+int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *dev)
+{
+    int ret;
+    u8 pos;
+    struct virtio_pci_cap cap;
+    u32 i;
+    u16 notify_off;
+
+    ret = pci_read_config_byte(dev, PCI_CAPABILITY_LIST, &pos);
+
+    if (ret < 0) {
+    IFC_ERR(&dev->dev, "failed to read PCI capability list.\n");
+    return -EIO;
+    }
+
+    while (pos) {
+    ret = ifcvf_read_config_range(dev, (u32 *)&cap,
+    sizeof(cap), pos);
+
+    if (ret < 0) {
+    IFC_ERR(&dev->dev, "failed to get PCI capability at %x",
+    pos);
+    break;
+    }
+
+    if (cap.cap_vndr != PCI_CAP_ID_VNDR)
+    goto next;
+
+    IFC_INFO(&dev->dev, "read PCI config:\n"
+    "config type: %u.\n"
+    "PCI bar: %u.\n"
+    "PCI bar offset: %u.\n"
+    "PCI config len: %u.\n",
+    cap.cfg_type, cap.bar,
+    cap.offset, cap.length);
+
+    switch (cap.cfg_type) {
+    case VIRTIO_PCI_CAP_COMMON_CFG:
+    hw->common_cfg = get_cap_addr(hw, &cap);
+    IFC_INFO(&dev->dev, "hw->common_cfg = %p.\n",
+    hw->common_cfg);
+    break;
+    case VIRTIO_PCI_CAP_NOTIFY_CFG:
+    pci_read_config_dword(dev, pos + sizeof(cap),
+    &hw->notify_off_multiplier);
+    hw->notify_bar = cap.bar;
+    hw->notify_base = get_cap_addr(hw, &cap);
+    IFC_INFO(&dev->dev, "hw->notify_base = %p.\n",
+    hw->notify_base);
+    break;
+    case VIRTIO_PCI_CAP_ISR_CFG:
+    hw->isr = get_cap_addr(hw, &cap);
+    IFC_INFO(&dev->dev, "hw->isr = %p.\n", hw->isr);
+    break;
+    case VIRTIO_PCI_CAP_DEVICE_CFG:
+    hw->dev_cfg = get_cap_addr(hw, &cap);
+    IFC_INFO(&dev->dev, "hw->dev_cfg = %p.\n", hw->dev_cfg);
+    break;
+    }
+next:
+    pos = cap.cap_next;
+    }
+
+    if (hw->common_cfg == NULL || hw->notify_base == NULL ||
+    hw->isr == NULL || hw->dev_cfg == NULL) {
+    IFC_ERR(&dev->dev, "Incomplete PCI capabilities.\n");
+    return -1;
+    }
+
+    for (i = 0; i < (IFCVF_MAX_QUEUE_PAIRS * 2); i++) {



Any reason for using a hard-coded queue pair limit rather than 
max_queue_pairs from the net config?

Hi Jason, thanks for your kind comments. For now the driver doesn't 
support MQ; we intend to provide a minimal feature set in this 
version 1 driver.



OK, it's better to add a comment above IFCVF_MAX_QUEUE_PAIRS.
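
Something along these lines would do (a sketch; the value 1 is an
assumption based on the statement that MQ is unsupported, as the actual
define is not shown in this thread):

/*
 * The v1 driver does not support multiqueue (VIRTIO_NET_F_MQ), so the
 * number of queue pairs is deliberately hard-coded rather than taken
 * from max_virtqueue_pairs in the net config.
 */
#define IFCVF_MAX_QUEUE_PAIRS	1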






+    iowrite16(i, &hw->common_cfg->queue_select);
+    notify_off = ioread16(&hw->common_cfg->queue_notify_off);
+    hw->notify_addr[i] = (void *)((u8 *)hw->notify_base +
+    

Re: [RFC 1/2] vhost: IFC VF hardware operation layer

2019-10-21 Thread Jason Wang


On 2019/10/21 5:57 PM, Zhu, Lingshan wrote:


On 10/16/2019 4:45 PM, Jason Wang wrote:


On 2019/10/16 9:30 AM, Zhu Lingshan wrote:

+ */
+#define IFCVF_TRANSPORT_F_START 28
+#define IFCVF_TRANSPORT_F_END   34
+
+#define IFC_SUPPORTED_FEATURES \
+    ((1ULL << VIRTIO_NET_F_MAC)    | \
+ (1ULL << VIRTIO_F_ANY_LAYOUT)    | \
+ (1ULL << VIRTIO_F_VERSION_1) | \
+ (1ULL << VHOST_F_LOG_ALL)    | \



Let's avoid using VHOST_F_LOG_ALL; use get_mdev_features() 
instead.

Thanks, I will remove VHOST_F_LOG_ALL




+ (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE)    | \
+ (1ULL << VIRTIO_NET_F_CTRL_VQ)    | \
+ (1ULL << VIRTIO_NET_F_STATUS)    | \
+ (1ULL << VIRTIO_NET_F_MRG_RXBUF)) /* not fully supported */



Why not having VIRTIO_F_IOMMU_PLATFORM and VIRTIO_F_ORDER_PLATFORM?


I will add VIRTIO_F_ORDER_PLATFORM. As for VIRTIO_F_IOMMU_PLATFORM: if we 
add this bit, QEMU may enable vIOMMU, which can cause trouble in live migration (LM).
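
Taken together, the replies so far would imply a revised mask along
these lines (a sketch; whether VIRTIO_F_IOMMU_PLATFORM also joins it is
still being discussed below):

/* VHOST_F_LOG_ALL dropped and VIRTIO_F_ORDER_PLATFORM added per review. */
#define IFC_SUPPORTED_FEATURES \
	((1ULL << VIRTIO_NET_F_MAC)		| \
	 (1ULL << VIRTIO_F_ANY_LAYOUT)		| \
	 (1ULL << VIRTIO_F_VERSION_1)		| \
	 (1ULL << VIRTIO_F_ORDER_PLATFORM)	| \
	 (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE)	| \
	 (1ULL << VIRTIO_NET_F_CTRL_VQ)		| \
	 (1ULL << VIRTIO_NET_F_STATUS)		| \
	 (1ULL << VIRTIO_NET_F_MRG_RXBUF))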



QEMU has mature vIOMMU support for VFIO devices; it can shadow 
IO page tables and set them up through the DMA ioctls of VFIO containers. Any 
issue you saw here?


Btw, to test them quickly, you can implement set_config/get_config and 
test them through virtio-mdev/kernel drivers as well.
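
A minimal sketch of that delegation, assuming hypothetical hook names
and signatures (the virtio-mdev API was still an RFC at this point, and
ifcvf_write_dev_config is assumed to exist as the mirror of the read
path shown earlier in this thread):

/* Hypothetical virtio-mdev style config hooks, delegating to the
 * existing ifcvf accessors so the device can be exercised through
 * the virtio-mdev/kernel drivers. */
static void ifcvf_mdev_get_config(struct ifcvf_hw *hw, u64 offset,
				  void *buf, u32 len)
{
	ifcvf_read_dev_config(hw, offset, buf, len);
}

static void ifcvf_mdev_set_config(struct ifcvf_hw *hw, u64 offset,
				  const void *buf, u32 len)
{
	ifcvf_write_dev_config(hw, offset, buf, len);
}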


Thanks



(though we don't support LM in this version of the driver)

Thanks,
BR
Zhu Lingshan


Thanks




Re: [RFC 1/2] vhost: IFC VF hardware operation layer

2019-10-16 Thread Jason Wang


On 2019/10/16 9:30 AM, Zhu Lingshan wrote:

+ */
+#define IFCVF_TRANSPORT_F_START 28
+#define IFCVF_TRANSPORT_F_END   34
+
+#define IFC_SUPPORTED_FEATURES \
+   ((1ULL << VIRTIO_NET_F_MAC)   | \
+(1ULL << VIRTIO_F_ANY_LAYOUT)| \
+(1ULL << VIRTIO_F_VERSION_1) | \
+(1ULL << VHOST_F_LOG_ALL)| \



Let's avoid using VHOST_F_LOG_ALL; use get_mdev_features() instead.



+(1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE)| \
+(1ULL << VIRTIO_NET_F_CTRL_VQ)   | \
+(1ULL << VIRTIO_NET_F_STATUS)| \
+(1ULL << VIRTIO_NET_F_MRG_RXBUF)) /* not fully supported */



Why not having VIRTIO_F_IOMMU_PLATFORM and VIRTIO_F_ORDER_PLATFORM?

Thanks


Re: [RFC 1/2] vhost: IFC VF hardware operation layer

2019-10-16 Thread Jason Wang


On 2019/10/16 9:30 AM, Zhu Lingshan wrote:

This commit introduces the ifcvf_base layer, which handles IFC VF NIC
hardware operations and configuration.



It's better to describe the difference between IFC VF and virtio in the 
commit log. Or is there an open doc for this?





Signed-off-by: Zhu Lingshan 
---
  drivers/vhost/ifcvf/ifcvf_base.c | 390 +++
  drivers/vhost/ifcvf/ifcvf_base.h | 137 ++
  2 files changed, 527 insertions(+)
  create mode 100644 drivers/vhost/ifcvf/ifcvf_base.c
  create mode 100644 drivers/vhost/ifcvf/ifcvf_base.h

diff --git a/drivers/vhost/ifcvf/ifcvf_base.c b/drivers/vhost/ifcvf/ifcvf_base.c
new file mode 100644
index ..b85e14c9bdcf
--- /dev/null
+++ b/drivers/vhost/ifcvf/ifcvf_base.c
@@ -0,0 +1,390 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2019 Intel Corporation.
+ */
+
+#include "ifcvf_base.h"
+
+static void *get_cap_addr(struct ifcvf_hw *hw, struct virtio_pci_cap *cap)
+{
+   u8 bar = cap->bar;
+   u32 length = cap->length;
+   u32 offset = cap->offset;
+   struct ifcvf_adapter *ifcvf =
+   container_of(hw, struct ifcvf_adapter, vf);
+
+   if (bar >= IFCVF_PCI_MAX_RESOURCE) {
+   IFC_ERR(ifcvf->dev,
+   "Invalid bar number %u to get capabilities.\n", bar);
+   return NULL;
+   }
+
+   if (offset + length < offset) {
+   IFC_ERR(ifcvf->dev, "offset(%u) + length(%u) overflows\n",
+   offset, length);
+   return NULL;
+   }
+
+   if (offset + length > hw->mem_resource[cap->bar].len) {
+   IFC_ERR(ifcvf->dev,
+   "offset(%u) + len(%u) overflows bar%u to get 
capabilities.\n",
+   offset, length, bar);
+   return NULL;
+   }
+
+   return hw->mem_resource[bar].addr + offset;
+}
+
+int ifcvf_read_config_range(struct pci_dev *dev,
+   uint32_t *val, int size, int where)
+{
+   int i;
+
+   for (i = 0; i < size; i += 4) {
+   if (pci_read_config_dword(dev, where + i, val + i / 4) < 0)
+   return -1;
+   }
+   return 0;
+}
+
+int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *dev)
+{
+   int ret;
+   u8 pos;
+   struct virtio_pci_cap cap;
+   u32 i;
+   u16 notify_off;
+
+   ret = pci_read_config_byte(dev, PCI_CAPABILITY_LIST, &pos);
+
+   if (ret < 0) {
+   IFC_ERR(&dev->dev, "failed to read PCI capability list.\n");
+   return -EIO;
+   }
+
+   while (pos) {
+   ret = ifcvf_read_config_range(dev, (u32 *)&cap,
+   sizeof(cap), pos);
+
+   if (ret < 0) {
+   IFC_ERR(&dev->dev, "failed to get PCI capability at %x",
+   pos);
+   break;
+   }
+
+   if (cap.cap_vndr != PCI_CAP_ID_VNDR)
+   goto next;
+
+   IFC_INFO(&dev->dev, "read PCI config:\n"
+   "config type: %u.\n"
+   "PCI bar: %u.\n"
+   "PCI bar offset: %u.\n"
+   "PCI config len: %u.\n",
+   cap.cfg_type, cap.bar,
+   cap.offset, cap.length);
+
+   switch (cap.cfg_type) {
+   case VIRTIO_PCI_CAP_COMMON_CFG:
+   hw->common_cfg = get_cap_addr(hw, &cap);
+   IFC_INFO(&dev->dev, "hw->common_cfg = %p.\n",
+   hw->common_cfg);
+   break;
+   case VIRTIO_PCI_CAP_NOTIFY_CFG:
+   pci_read_config_dword(dev, pos + sizeof(cap),
+   &hw->notify_off_multiplier);
+   hw->notify_bar = cap.bar;
+   hw->notify_base = get_cap_addr(hw, &cap);
+   IFC_INFO(&dev->dev, "hw->notify_base = %p.\n",
+   hw->notify_base);
+   break;
+   case VIRTIO_PCI_CAP_ISR_CFG:
+   hw->isr = get_cap_addr(hw, &cap);
+   IFC_INFO(&dev->dev, "hw->isr = %p.\n", hw->isr);
+   break;
+   case VIRTIO_PCI_CAP_DEVICE_CFG:
+   hw->dev_cfg = get_cap_addr(hw, &cap);
+   IFC_INFO(&dev->dev, "hw->dev_cfg = %p.\n", hw->dev_cfg);
+   break;
+   }
+next:
+   pos = cap.cap_next;
+   }
+
+   if (hw->common_cfg == NULL || hw->notify_base == NULL ||
+   hw->isr == NULL || hw->dev_cfg == NULL) {
+   IFC_ERR(&dev->dev, "Incomplete PCI capabilities.\n");
+   return -1;
+   }
+
+   for (i = 0; i < (IFCVF_MAX_QUEUE_PAIRS * 2); i++) {



Any reason 

Re: [RFC 1/2] vhost: IFC VF hardware operation layer

2019-10-15 Thread Stephen Hemminger
On Wed, 16 Oct 2019 09:03:17 +0800
Zhu Lingshan  wrote:

> + IFC_INFO(&dev->dev, "PCI capability mapping:\n"
> + "common cfg: %p\n"
> + "notify base: %p\n"
> + "isr cfg: %p\n"
> + "device cfg: %p\n"
> + "multiplier: %u\n",
> + hw->common_cfg,
> + hw->notify_base,
> + hw->isr,
> + hw->dev_cfg,
> + hw->notify_off_multiplier);

Since kernel messages go to syslog, and syslog does not handle multi-line
messages very well, this should be a single line.

Also, this is the kind of message that should be at the debug
level: something that is useful to driver developers,
but not something that needs to fill up every end user's log.
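
Combining both points, the message might be collapsed to something like
this (a sketch):

dev_dbg(&dev->dev,
	"PCI capability mapping: common cfg %p, notify base %p, isr cfg %p, device cfg %p, multiplier %u\n",
	hw->common_cfg, hw->notify_base, hw->isr, hw->dev_cfg,
	hw->notify_off_multiplier);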


Re: [RFC 1/2] vhost: IFC VF hardware operation layer

2019-10-15 Thread Stephen Hemminger
On Wed, 16 Oct 2019 09:03:17 +0800
Zhu Lingshan  wrote:

> +int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *dev)
> +{
> + int ret;
> + u8 pos;
> + struct virtio_pci_cap cap;
> + u32 i;
> + u16 notify_off;

For network code, the preferred declaration style is
reverse Christmas tree (longest line first).
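
Reordered per that convention, the block above would read:

	struct virtio_pci_cap cap;
	u16 notify_off;
	int ret;
	u8 pos;
	u32 i;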