This continues the breakup of the huge DPDK "howto" into smaller components. There are a couple of related changes included, such as using "Rx queue" instead of "rxq" and noting how Tx queues cannot be configured.
We enable the TODO directive, so we can actually start calling out some TODOs. Signed-off-by: Stephen Finucane <step...@that.guru> --- Documentation/conf.py | 2 +- Documentation/howto/dpdk.rst | 86 ------------------- Documentation/topics/dpdk/index.rst | 1 + Documentation/topics/dpdk/phy.rst | 10 +++ Documentation/topics/dpdk/pmd.rst | 139 +++++++++++++++++++++++++++++++ Documentation/topics/dpdk/vhost-user.rst | 17 ++-- 6 files changed, 159 insertions(+), 96 deletions(-) create mode 100644 Documentation/topics/dpdk/pmd.rst diff --git a/Documentation/conf.py b/Documentation/conf.py index 6ab144c5d..babda21de 100644 --- a/Documentation/conf.py +++ b/Documentation/conf.py @@ -32,7 +32,7 @@ needs_sphinx = '1.1' # Add any Sphinx extension module names here, as strings. They can be # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom # ones. -extensions = [] +extensions = ['sphinx.ext.todo'] # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst index d717d2ebe..c2324118d 100644 --- a/Documentation/howto/dpdk.rst +++ b/Documentation/howto/dpdk.rst @@ -81,92 +81,6 @@ To stop ovs-vswitchd & delete bridge, run:: $ ovs-appctl -t ovsdb-server exit $ ovs-vsctl del-br br0 -PMD Thread Statistics ---------------------- - -To show current stats:: - - $ ovs-appctl dpif-netdev/pmd-stats-show - -To clear previous stats:: - - $ ovs-appctl dpif-netdev/pmd-stats-clear - -Port/RXQ Assigment to PMD Threads ---------------------------------- - -To show port/rxq assignment:: - - $ ovs-appctl dpif-netdev/pmd-rxq-show - -To change default rxq assignment to pmd threads, rxqs may be manually pinned to -desired cores using:: - - $ ovs-vsctl set Interface <iface> \ - other_config:pmd-rxq-affinity=<rxq-affinity-list> - -where: - -- ``<rxq-affinity-list>`` is a CSV list of ``<queue-id>:<core-id>`` values - -For example:: - - $ ovs-vsctl set interface dpdk-p0 
options:n_rxq=4 \ - other_config:pmd-rxq-affinity="0:3,1:7,3:8" - -This will ensure: - -- Queue #0 pinned to core 3 -- Queue #1 pinned to core 7 -- Queue #2 not pinned -- Queue #3 pinned to core 8 - -After that PMD threads on cores where RX queues was pinned will become -``isolated``. This means that this thread will poll only pinned RX queues. - -.. warning:: - If there are no ``non-isolated`` PMD threads, ``non-pinned`` RX queues will - not be polled. Also, if provided ``core_id`` is not available (ex. this - ``core_id`` not in ``pmd-cpu-mask``), RX queue will not be polled by any PMD - thread. - -If pmd-rxq-affinity is not set for rxqs, they will be assigned to pmds (cores) -automatically. The processing cycles that have been stored for each rxq -will be used where known to assign rxqs to pmd based on a round robin of the -sorted rxqs. - -For example, in the case where here there are 5 rxqs and 3 cores (e.g. 3,7,8) -available, and the measured usage of core cycles per rxq over the last -interval is seen to be: - -- Queue #0: 30% -- Queue #1: 80% -- Queue #3: 60% -- Queue #4: 70% -- Queue #5: 10% - -The rxqs will be assigned to cores 3,7,8 in the following order: - -Core 3: Q1 (80%) | -Core 7: Q4 (70%) | Q5 (10%) -core 8: Q3 (60%) | Q0 (30%) - -To see the current measured usage history of pmd core cycles for each rxq:: - - $ ovs-appctl dpif-netdev/pmd-rxq-show - -.. note:: - - A history of one minute is recorded and shown for each rxq to allow for - traffic pattern spikes. An rxq's pmd core cycles usage changes due to traffic - pattern or reconfig changes will take one minute before they are fully - reflected in the stats. 
- -Rxq to pmds assignment takes place whenever there are configuration changes -or can be triggered by using:: - - $ ovs-appctl dpif-netdev/pmd-rxq-rebalance - QoS --- diff --git a/Documentation/topics/dpdk/index.rst b/Documentation/topics/dpdk/index.rst index 5f836a6e9..dfde88377 100644 --- a/Documentation/topics/dpdk/index.rst +++ b/Documentation/topics/dpdk/index.rst @@ -31,3 +31,4 @@ The DPDK Datapath phy vhost-user ring + pmd diff --git a/Documentation/topics/dpdk/phy.rst b/Documentation/topics/dpdk/phy.rst index 1c18e4e3d..222fa3e9f 100644 --- a/Documentation/topics/dpdk/phy.rst +++ b/Documentation/topics/dpdk/phy.rst @@ -109,3 +109,13 @@ tool:: For more information, refer to the `DPDK documentation <dpdk-drivers>`__. .. _dpdk-drivers: http://dpdk.org/doc/guides/linux_gsg/linux_drivers.html + +Multiqueue +---------- + +Poll Mode Driver (PMD) threads are the threads that do the heavy lifting for +the DPDK datapath. Correct configuration of PMD threads and the Rx queues they +utilize is required to deliver the high performance possible with the DPDK +datapath. It is possible to configure multiple Rx queues for ``dpdk`` +ports, thus ensuring this is not a bottleneck for performance. For information +on configuring PMD threads, refer to :doc:`pmd`. diff --git a/Documentation/topics/dpdk/pmd.rst b/Documentation/topics/dpdk/pmd.rst new file mode 100644 index 000000000..e15e8cc3b --- /dev/null +++ b/Documentation/topics/dpdk/pmd.rst @@ -0,0 +1,139 @@ +.. + Licensed under the Apache License, Version 2.0 (the "License"); you may + not use this file except in compliance with the License. You may obtain + a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
See the + License for the specific language governing permissions and limitations + under the License. + + Convention for heading levels in Open vSwitch documentation: + + ======= Heading 0 (reserved for the title in a document) + ------- Heading 1 + ~~~~~~~ Heading 2 + +++++++ Heading 3 + ''''''' Heading 4 + + Avoid deeper levels because they do not render well. + +=========== +PMD Threads +=========== + +Poll Mode Driver (PMD) threads are the threads that do the heavy lifting for +the DPDK datapath and perform tasks such as continuous polling of input ports +for packets, classifying packets once received, and executing actions on the +packets once they are classified. + +PMD threads utilize Receive (Rx) and Transmit (Tx) queues, commonly known as +*rxq*\s and *txq*\s. While Tx queue configuration happens automatically, Rx +queues can be configured by the user. This can happen in one of two ways: + +- For physical interfaces, configuration is done using the + :program:`ovs-vsctl` utility. + +- For virtual interfaces, configuration is also done using the + :program:`ovs-vsctl` utility, but this configuration must be reflected in the + guest configuration (e.g. QEMU command line arguments). + +The :program:`ovs-appctl` utility provides a number of commands for +querying PMD threads and their respective queues. This, and all of the above, +is discussed here. + +PMD Thread Statistics +--------------------- + +To show current stats:: + + $ ovs-appctl dpif-netdev/pmd-stats-show + +To clear previous stats:: + + $ ovs-appctl dpif-netdev/pmd-stats-clear + +Port/Rx Queue Assignment to PMD Threads +--------------------------------------- + +.. todo:: + + This needs a more detailed overview of *why* this should be done, along with + the impact on things like NUMA affinity. + +To show port/Rx queue assignment:: + + $ ovs-appctl dpif-netdev/pmd-rxq-show + +Rx queues may be manually pinned to cores. 
This will change the default Rx +queue assignment to PMD threads:: + + $ ovs-vsctl set Interface <iface> \ + other_config:pmd-rxq-affinity=<rxq-affinity-list> + +where: + +- ``<rxq-affinity-list>`` is a CSV list of ``<queue-id>:<core-id>`` values + +For example:: + + $ ovs-vsctl set interface dpdk-p0 options:n_rxq=4 \ + other_config:pmd-rxq-affinity="0:3,1:7,3:8" + +This will ensure there are *4* Rx queues and that these queues are configured +like so: + +- Queue #0 pinned to core 3 +- Queue #1 pinned to core 7 +- Queue #2 not pinned +- Queue #3 pinned to core 8 + +PMD threads on cores where Rx queues are *pinned* will become *isolated*. This +means that these threads will poll only the *pinned* Rx queues. + +.. warning:: + + If there are no *non-isolated* PMD threads, *non-pinned* Rx queues will not + be polled. Also, if the provided ``<core-id>`` is not available (e.g. the + ``<core-id>`` is not in ``pmd-cpu-mask``), the Rx queue will not be polled by + any PMD thread. + +If ``pmd-rxq-affinity`` is not set for Rx queues, they will be assigned to PMDs +(cores) automatically. Where known, the processing cycles that have been stored +for each Rx queue will be used to assign Rx queues to PMDs based on a round +robin of the sorted Rx queues. For example, take a case where there are five Rx +queues and three cores - 3, 7, and 8 - available, and the measured usage of +core cycles per Rx queue over the last interval is seen to be: + +- Queue #0: 30% +- Queue #1: 80% +- Queue #3: 60% +- Queue #4: 70% +- Queue #5: 10% + +The Rx queues will be assigned to the cores in the following order: + +- Core 3: Q1 (80%) +- Core 7: Q4 (70%) and Q5 (10%) +- Core 8: Q3 (60%) and Q0 (30%) + +To see the current measured usage history of PMD core cycles for each Rx +queue:: + + $ ovs-appctl dpif-netdev/pmd-rxq-show + +.. note:: + + A history of one minute is recorded and shown for each Rx queue to allow for + traffic pattern spikes. 
Any changes in the Rx queue's PMD core cycles usage, + due to traffic pattern or reconfig changes, will take one minute to be fully + reflected in the stats. + +Rx queue to PMD assignment takes place whenever there are configuration changes +or can be triggered by using:: + + $ ovs-appctl dpif-netdev/pmd-rxq-rebalance diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst index 95517a676..d84d99246 100644 --- a/Documentation/topics/dpdk/vhost-user.rst +++ b/Documentation/topics/dpdk/vhost-user.rst @@ -127,11 +127,10 @@ an additional set of parameters:: -netdev type=vhost-user,id=mynet2,chardev=char2,vhostforce -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2 -In addition, QEMU must allocate the VM's memory on hugetlbfs. vhost-user -ports access a virtio-net device's virtual rings and packet buffers mapping the -VM's physical memory on hugetlbfs. To enable vhost-user ports to map the VM's -memory into their process address space, pass the following parameters to -QEMU:: +In addition, QEMU must allocate the VM's memory on hugetlbfs. vhost-user ports +access a virtio-net device's virtual rings and packet buffers mapping the VM's +physical memory on hugetlbfs. To enable vhost-user ports to map the VM's memory +into their process address space, pass the following parameters to QEMU:: -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on -numa node,memdev=mem -mem-prealloc @@ -151,18 +150,18 @@ where: The number of vectors, which is ``$q`` * 2 + 2 The vhost-user interface will be automatically reconfigured with required -number of rx and tx queues after connection of virtio device. Manual +number of Rx and Tx queues after connection of virtio device. Manual configuration of ``n_rxq`` is not supported because OVS will work properly only if ``n_rxq`` will match number of queues configured in QEMU. -A least 2 PMDs should be configured for the vswitch when using multiqueue. 
+At least two PMDs should be configured for the vswitch when using multiqueue. Using a single PMD will cause traffic to be enqueued to the same vhost queue rather than being distributed among different vhost queues for a vhost-user interface. If traffic destined for a VM configured with multiqueue arrives to the vswitch -via a physical DPDK port, then the number of rxqs should also be set to at -least 2 for that physical DPDK port. This is required to increase the +via a physical DPDK port, then the number of Rx queues should also be set to at +least two for that physical DPDK port. This is required to increase the probability that a different PMD will handle the multiqueue transmission to the guest using a different vhost queue. -- 2.14.3 _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
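For reviewers who want to sanity-check the automatic assignment described in the new pmd.rst text, the behaviour can be sketched as follows. This is an illustrative model only, not OVS source code: the function name and data structures are invented, and it implements a "give each sorted queue to the currently least-loaded core" heuristic, which reproduces the Core 3/7/8 placement in the example above; the exact in-tree algorithm may differ.

```python
# Illustrative sketch only -- not OVS code. Models the automatic
# Rx-queue-to-PMD assignment described in pmd.rst: Rx queues are sorted by
# their measured core-cycle usage, then each queue is placed on the core
# carrying the least accumulated load. assign_rxqs() and the queue/core
# data below are invented for this example.

def assign_rxqs(queue_cycles, cores):
    """Map {queue-id: measured-cycles} onto cores; return {core: [queues]}."""
    load = {core: 0 for core in cores}
    placement = {core: [] for core in cores}
    # Busiest queues are placed first.
    for queue, cycles in sorted(queue_cycles.items(), key=lambda kv: -kv[1]):
        target = min(cores, key=lambda core: load[core])
        placement[target].append(queue)
        load[target] += cycles
    return placement

# The five-queue, three-core example from the documentation text.
queues = {0: 30, 1: 80, 3: 60, 4: 70, 5: 10}
print(assign_rxqs(queues, [3, 7, 8]))
# -> {3: [1], 7: [4, 5], 8: [3, 0]}, matching the Core 3/7/8 list above.
```

In a real deployment the equivalent recomputation happens on configuration changes or when triggered with `ovs-appctl dpif-netdev/pmd-rxq-rebalance`, as the documentation notes.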