On 2021/3/1 8:06 下午, Sean Mooney wrote:
On Mon, 2021-03-01 at 11:28 +0100, Adrian Moreno wrote:
On 3/1/21 8:50 AM, Jason Wang wrote:
On 2021/3/1 3:29 下午, Parav Pandit wrote:
From: Jason Wang <[email protected]>
Sent: Monday, March 1, 2021 9:29 AM

On 2021/2/26 4:50 下午, Parav Pandit wrote:
From: Jason Wang <[email protected]>
Sent: Friday, February 26, 2021 1:56 PM


On 2021/2/26 1:32 下午, Parav Pandit wrote:
From: Jason Wang <[email protected]>
Sent: Wednesday, February 24, 2021 2:18 PM

On 2021/2/24 2:18 下午, Parav Pandit wrote:
+
+    if (nla_put_u16(msg,
VDPA_ATTR_DEV_NET_CFG_MAX_VQP,
+            config->max_virtqueue_pairs))
+        return -EMSGSIZE;
We probably need to check VIRTIO_NET_F_RSS here.
Yes. Will use it in v2.

+    if (nla_put_u8(msg,
VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,
+               config->rss_max_key_size))
+        return -EMSGSIZE;
+
+    rss_field = le16_to_cpu(config->rss_max_key_size);
+    if (nla_put_u16(msg,
VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,
rss_field))
+        return -EMSGSIZE;
+
+    hash_types = le32_to_cpu(config->supported_hash_types);
+    if (nla_put_u32(msg,
VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,
+            config->supported_hash_types))
+        return -EMSGSIZE;
+    return 0;
+}
+
+static int vdpa_dev_net_config_fill(struct vdpa_device *vdev,
+struct sk_buff *msg) {
+    struct virtio_net_config config = {};
+
+    vdev->config->get_config(vdev, 0, &config, sizeof(config));
Do we need to sync with other possible get_config calls here?
To readers of get_config() is ok. No need to sync.
However, I think set_config() and get_config() should sync otherwise
get_config can get partial value.
Will probably have to add vdpa_device->config_access_lock().
Probably. And the set_config() should be synchronized with the
requrest that comes from vDPA bus.
Yes, a rw semaphore is good enough.
Device config fields can be change from the device side anyway, such as
link status anyway.
Sync using lock shouldn’t be problem.

This makes me think whether we should go back to two phases method,
create and enable.

The vDPA device is only registred after enabling, then there's no
need to sync with vDPA bus, and mgmt is not allowed to change config
after enalbing?
In that case we should be able to disable it as well. Disable should take the
device of the bus.
I find it weird to have this model to linger around the device without
anchoring on the bus.
For example device is not yet enabled, so it is not anchored on the bus, but
its name is still have to remain unique.


Can we do some synchornization between vdpa device allocation and vdpa
device registier to solve the naming issue?
mgmt tool querying the device config space after vdpa device is in use is real.
So I don't see solving it any differently.

Now upper layers of vdpa to not access the device on the placed on the vdpa
bus, can be taken care by existing driver code using
echo 0 > /sys/bus/vdpa/drivers_autoprobe

By default vdpa device placed on the bus wont bind to any driver (net/vhost 
etc).
1. Decision to bind to which driver is done after config of the vdpa device is
done.
2. orchestration sw does create and config before it binds to the right driver
3. which driver to bind to decided based on the use case of that individual
vdpa device
For example,
(a) vdpa0 binds to net driver to have Netdev for container
(b) vdpa1 binds to vhost driver to map vdpa device to VM

I think it should work.
Adding Adrian, Maxime and Sean for more comments.
Is this a strong requirement or just a optional operation?


So my understanding is that something needs to be done to make sure:

1) the vpda device is not bound to virtio-vdpa
2) config is not changed if it has been set by management


This might break some existing logic that expects netdev or vhost-vdpa device to
exist before the some config (e.g:mac address) is set.
the orcestration software should not require permission to do dirver binding 
and in the case of openstack will
not have the capablity to bind the vdpa device to a dirvier in the current 
proposed design.
the ordering we are expecting to support in openstack at least initally is that 
the adminsitrator
will use either systemd service files or udev rules to allocate the VF and vdpa 
device staictly at boot.

in the case of openstack we will expect all vdpa devies that will be used by 
openstack(nova) for assignment
to vms will be prebound by the adminstrators to vhost-vdpa. openstack will 
simply track the /dev/vhost* paths
corallated to the vdpa devices whose parent VF are allowed by the pci 
passthough whitelist.

if the adminstrator binds the vdpa device to virtio-vdpa so that virtio-net-pci 
is used on the host nova and libvirt
will not correct that and the vm boot will fail. but provided you do not list 
that parent VF in the whitelist nova
will also ignore it. so if you want to mix and match the vdpa device driver 
that should not break anything provided
you do not list its parent in the pci whitelist.

but one thing to keep in mind is whlie we are not planning to create VF/VDPA 
devices dynamiclaly at lesat not for the initall
implentation we do expact to be able to configure attibute of the vdpa device 
after tehy are create like any other nic.
for example the mac, mtu, link-state should all be configureable at runtime via 
normal ip route2 commands.


So in the case of vDPA, this task is expected to be done by vdpa tool. (There's no guarantee that vDPA is a VF).



at the morment we are encounterng issue with the mac adress specificly which 
qemu is hopefully going to work around since
libvirt and nova cannot currently configure the mac address of a vhost-vdpa 
device so we are relying on qemu
to take the mac adress form the qemu command line so that it does not end up as 
all 0 macs in the guest.


Note that a kernel patch is posted by Eli[1] which assign a randomized mac. This should solve the issue in another way.


as some one that works on orcestration level i would have expect that setting 
the mac/mtu on either the vf or the netdev
representor to match the deireed mac/mtu in the guest would have been sufficent 
to alter the vdpa device confirutation.
likely  the netdev representor would make the most sence to alter to match the 
desired mac so that the mac adress see on the ovs
bridge matches the mac of the vm.


Speaking for genreal vDPA, it's not guaranteed that the vendor will implmenet the device with a representor. So vDPA tool is expected to be the only way to configure the device.

Thanks

[1] https://www.spinics.net/lists/linux-virtualization/msg48451.html


the current case where the vdpa representor netdev seams to have no bareing on 
the vdpa device
config makes debugging much more difficalt and is likely to confuse monitoring 
software. fortunetly openstack is not matching on the mac
address in this spefic case to identify the netdev represntor port but we have 
other backend that do look at the mac. in the case of ovs
we store the uuid of the neutron port in the ovsdb when adding a port to ovs so 
we rely on that not the mac but  for legacy mode
sriov which we hope to support with vdpa eventulally after we comple the 
hardware offloaed ovs support, currenlty relys on the mac in some cases.

i hope that answered your questions?
we are aware that we likely will have to support vdpa device creation at some 
point if we intend to add support for vdpa devices
created form subfunctions or intels siov howver that was not ready when we 
started this work so it was declared out of scope and
the curernt static approch was taken as the only path we could enable before 
code free in 10 days.

regards sean.


Thanks


https://www.spinics.net/lists/linux-virtualization/msg48451.html

_______________________________________________
Virtualization mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Reply via email to