The run-time performance of PMDs is often difficult to understand and troubleshoot. The existing PMD statistics counters only provide a coarse-grained average picture. At packet rates of several Mpps, sporadic drops of packet bursts happen at sub-millisecond time scales and are impossible to capture and analyze with existing tools.

This patch set collects a large number of important PMD performance metrics per PMD iteration, maintaining histograms and circular histories for iteration metrics and millisecond averages. To capture sporadic drop events, the patch set can be configured to monitor iterations for suspicious metrics and to log the neighborhood of such iterations for off-line analysis.

The extra cost of the performance metric collection and the supervision has been measured to be on the order of 1% compared to the base commit in a PVP setup with an L3 pipeline over VXLAN tunnels. For that reason the metrics collection is disabled by default and can be enabled at run-time through configuration.

v11 -> v12:
* Rebased to master (commit 83c2757bd)
* Clarified meaning of recv() param *qfill in netdev-provider.h (Ben)

v10 -> v11:
* Rebased to master (commit 00a0a011d)
* Implemented comments on v10 by Ilya, Aaron and Ian.
* Replaced broken macro ATOMIC_LLONG_LOCK_FREE with working macro
  ATOMIC_ALWAYS_LOCK_FREE_8B.
* Changed iteration key in iteration history from TSC timestamp to
  iteration counter.
* Bugfix: Suspicious iteration logged was one off the actual suspicious
  iteration.

v9 -> v10:
* Implemented missed comment by Ilya on v8: use ATOMIC_LLONG_LOCK_FREE
* Fixed travis and checkpatch errors reported by Ian on v9.

v8 -> v9:
* Rebased to master (commit cb8cbbbe9)
* Implemented minor comments on v8 by Billy

v7 -> v8:
* Rebased on to master (commit 4e99b70df)
* Implemented comments from Ilya Maximets and Billy O'Mahony.
* Replaced netdev_rxq_length() introduced in v7 by optional out parameter
  for the remaining rx queue len in netdev_rxq_recv().
* Fixed thread synchronization issues in clearing PMD stats:
  - Use mutex to control whether to clear from main thread directly
    or in PMD at start of next iteration.
  - Use mutex to prevent concurrent clearing and printing of metrics.
* Added tx packet and batch stats to pmd-perf-show output.
* Delay warning for suspicious iteration to the iteration in which we also
  log the neighborhood to not pollute the logged iteration stats with
  logging costs.
* Corrected the exact number of iterations logged before and after a
  suspicious iteration.
* Introduced options -e and -ne in pmd-perf-log-set to control whether to
  *extend* the range of logged iterations when additional suspicious
  iterations are detected before the scheduled end of the logging interval
  is reached.
* Exclude logging cycles from the iteration stats to avoid confusing ghost
  peaks.
* Performance impact compared to master less than 1% even with supervision
  enabled.

v5 -> v7:
* Rebased on to dpdk_merge (commit e666668)
  - New base contains earlier refactoring parts of series.
* Implemented comments from Ilya Maximets and Billy O'Mahony.
* Replaced piggybacking qlen on dp_packet_batch with a new netdev API
  netdev_rxq_length().
* Thread-safe clearing of pmd counters in pmd_perf_start_iteration().
* Fixed bug in reporting datapath stats.
* Work around a bug in DPDK rte_vhost_rx_queue_count(), which sometimes
  returns bogus values in the upper 16 bits of the uint32_t return value.

v4 -> v5:
* Rebased to master (commit e9de6c0)
* Implemented comments from Aaron Conole and Darrel Ball

v3 -> v4:
* Rebased to master (commit 4d0a31b)
  - Reverting changes to struct dp_netdev_pmd_thread.
* Make metrics collection configurable.
* Several bugfixes.

v2 -> v3:
* Rebased to OVS master (commit 3728b3b).
* Non-trivial adaptation to struct dp_netdev_pmd_thread.
  - refactored in commit a807c157 (Bhanu).
* No other changes compared to v2.

v1 -> v2:
* Rebased to OVS master (commit 7468ec788).
* No other changes compared to v1.
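
For readers unfamiliar with the supervision idea, the following is a minimal, self-contained sketch of a circular iteration history keyed by iteration counter, with delayed logging of the neighborhood of a suspicious iteration and an optional "extend" mode. It is an illustration only and not the code in this series: the names (struct history, history_init(), history_add()), the ring size, and the single cycle threshold are all hypothetical, and the actual patches track more metrics than shown here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define HISTORY_LEN 16          /* Hypothetical ring size. */

struct iter_stats {
    uint64_t iter;              /* Iteration counter, used as history key. */
    uint64_t cycles;            /* Cycles spent in this iteration. */
    uint32_t pkts;              /* Packets processed in this iteration. */
};

struct history {
    struct iter_stats ring[HISTORY_LEN];
    uint64_t next_iter;         /* Counter assigned to the next iteration. */
    uint64_t cycle_threshold;   /* Iterations above this are suspicious. */
    uint32_t before, after;     /* Neighborhood size around a suspect. */
    bool extend;                /* Extend the range on further suspects? */
    bool log_pending;
    uint64_t log_begin, log_end;    /* Inclusive range to log. */
};

static void
history_init(struct history *h, uint64_t cycle_threshold,
             uint32_t before, uint32_t after, bool extend)
{
    /* The whole neighborhood must fit into the ring. */
    assert(before + after + 1 <= HISTORY_LEN);
    *h = (struct history) {
        .cycle_threshold = cycle_threshold,
        .before = before, .after = after, .extend = extend,
    };
}

/* Records one iteration.  Returns the number of iterations logged, which is
 * nonzero only when a scheduled logging range ends at this iteration. */
static int
history_add(struct history *h, uint64_t cycles, uint32_t pkts)
{
    uint64_t iter = h->next_iter++;
    int logged = 0;

    h->ring[iter % HISTORY_LEN] = (struct iter_stats) { iter, cycles, pkts };

    if (cycles > h->cycle_threshold) {
        if (!h->log_pending) {
            h->log_pending = true;
            h->log_begin = iter >= h->before ? iter - h->before : 0;
            h->log_end = iter + h->after;
        } else if (h->extend) {
            /* Extend, but never beyond what the ring can still hold. */
            uint64_t max_end = h->log_begin + HISTORY_LEN - 1;
            uint64_t new_end = iter + h->after;
            h->log_end = new_end < max_end ? new_end : max_end;
        }
    }

    if (h->log_pending && iter == h->log_end) {
        /* Log only at the end of the range, so that the cost of logging
         * does not show up in the stats of the logged iterations. */
        for (uint64_t i = h->log_begin; i <= h->log_end; i++) {
            const struct iter_stats *s = &h->ring[i % HISTORY_LEN];
            printf("iter %"PRIu64": %"PRIu64" cycles, %"PRIu32" pkts\n",
                   s->iter, s->cycles, s->pkts);
            logged++;
        }
        h->log_pending = false;
    }
    return logged;
}
```

In this sketch a caller would invoke history_add() once per PMD iteration. With before = 2 and after = 2, a suspicious iteration 5 results in iterations 3 through 7 being printed when iteration 7 completes. With extend enabled, a second suspicious iteration inside the pending range pushes the end of the range out, capped at the ring size.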

Jan Scheurich (3):
  netdev: Add optional qfill output parameter to rxq_recv()
  dpif-netdev: Detailed performance stats for PMDs
  dpif-netdev: Detection and logging of suspicious PMD iterations

 NEWS                        |   6 +
 lib/automake.mk             |   1 +
 lib/dpif-netdev-perf.c      | 685 +++++++++++++++++++++++++++++++++++++++++++-
 lib/dpif-netdev-perf.h      | 218 ++++++++++++--
 lib/dpif-netdev-unixctl.man | 216 ++++++++++++++
 lib/dpif-netdev.c           | 192 ++++++++++++-
 lib/netdev-bsd.c            |   8 +-
 lib/netdev-dpdk.c           |  41 ++-
 lib/netdev-dummy.c          |   8 +-
 lib/netdev-linux.c          |   7 +-
 lib/netdev-provider.h       |   8 +-
 lib/netdev.c                |   5 +-
 lib/netdev.h                |   3 +-
 manpages.mk                 |   2 +
 vswitchd/ovs-vswitchd.8.in  |  27 +-
 vswitchd/vswitch.xml        |  12 +
 16 files changed, 1363 insertions(+), 76 deletions(-)
 create mode 100644 lib/dpif-netdev-unixctl.man

-- 
1.9.1