Hi Honnappa,
On 10/11/22 21:43, Honnappa Nagarahalli wrote:
Hi Andrew,
A few questions inline.
-----Original Message-----
From: Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>
Sent: Monday, October 3, 2022 5:00 AM
To: tho...@monjalon.net
Cc: dev@dpdk.org; Ferruh Yigit <ferruh.yi...@xilinx.com>; Ajit Khaparde
(ajit.khapa...@broadcom.com) <ajit.khapa...@broadcom.com>;
jer...@marvell.com; Dongdong Liu <liudongdo...@huawei.com>; Qiming
Yang <qiming.y...@intel.com>; Yuying Zhang <yuying.zh...@intel.com>;
Beilei Xing <beilei.x...@intel.com>; Qi Zhang <qi.z.zh...@intel.com>;
hemant.agra...@nxp.com; Maxime Coquelin
<maxime.coque...@redhat.com>; Viacheslav Ovsiienko
<viachesl...@nvidia.com>; Stephen Hemminger
<sthem...@microsoft.com>
Subject: Opensource ethdev tests
Hi Thomas and community,
May I ask to add https://ts-factory.io/ to the DPDK ecosystem?
I'm not 100% sure that it is suitable for the ecosystem since it is not
consuming DPDK, but rather testing DPDK ethdev.
A few questions:
1) Are you asking that we add these to the UNH infrastructure?
I have no opinion here. If the community considers it, we'll
try to help as much as we can.
2) We have DTS already and the community is working on integrating DTS into
DPDK. Does it make sense to understand any gaps in the test cases and
incorporate them in DTS instead?
I'm afraid it can take man-years for DTS to catch up.
I could be wrong here since I don't know the DTS status in detail.
3) Are there any additional benefits this brings compared to DTS?
I looked at DTS a long time ago, about 5 years ago when we
started our DPDK activities. Maybe something has changed there,
I don't know. If I'm not mistaken, DTS was concentrated on
application-level testing and usage of external tools and HW
(Ixia in those days).
I've failed to find a list of features/tests in DTS. If you point
me to one, I'll try to compare. For the ethdev feature coverage of
these tests, see [1].
[1]
https://ts-factory.io/logs/2022/09/29/dain-sfc-p0-18/tce_log_dpdk_files/463.html
However, it is not about ethdev features only since some
features are absolutely transparent to ethdev and handled
by PMDs and, further, by HW.
In fact, I think it is very hard to compare it vs DTS, since
the approaches are different.
Some key aspects of these tests:
1. API-level testing using RPC. Basically you say from the
test which function to call remotely and with which arguments.
2. A raw socket is used on the peer to generate traffic on Tx
(full control over which packets to send) and capture on Rx
(i.e. you have everything to analyze what comes from the
wire since we try to disable all offloads).
3. Configuration tracking and rollback. I.e. if one test
changes some settings on peer via provided interface,
these changes will be automatically rolled back before
the next test.
4. Of course it can run applications like testpmd remotely,
capture and analyze output. We have tests with testpmd and
l2fwd which report measurement results using dedicated
log messages in JSON format.
5. Infrastructure to report noticed behaviour aspects,
remember and track them. It is essential for regression
tracking.
6. Tooling to keep logs, view testing results and the history
of the corresponding test, i.e. how the corresponding test
behaved before on the same or other test configurations.
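To illustrate point 3 (configuration tracking and rollback), here is a
minimal sketch of the idea in Python. It is not the Test Environment
implementation: the class name TrackedConfig, its methods and the setting
names "mtu" and "promisc" are all hypothetical and chosen only for
illustration. Every change is journaled, and the journal is replayed in
reverse before the next test.

```python
class TrackedConfig:
    """Hypothetical stand-in for peer settings changed by a test."""

    def __init__(self, initial):
        self._values = dict(initial)
        # Journal of (key, existed_before, old_value), oldest change first.
        self._journal = []

    def set(self, key, value):
        # Remember the previous state before applying the change.
        self._journal.append((key, key in self._values, self._values.get(key)))
        self._values[key] = value

    def snapshot(self):
        # Return a copy so callers cannot bypass the journal.
        return dict(self._values)

    def rollback(self):
        # Undo changes newest-first so repeated changes to one key
        # end up at the original value.
        while self._journal:
            key, existed, old_value = self._journal.pop()
            if existed:
                self._values[key] = old_value
            else:
                self._values.pop(key, None)


# A test changes some settings (names are assumptions), then the
# harness rolls everything back before the next test starts.
cfg = TrackedConfig({"mtu": 1500})
cfg.set("mtu", 9000)
cfg.set("promisc", True)
cfg.set("mtu", 4000)
cfg.rollback()
print(cfg.snapshot())  # {'mtu': 1500}
```

The point of journaling in the harness rather than in each test is that a
test which fails halfway still leaves the peer in a known state.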
I need to stop somewhere. These are just a few points which come to
my mind right now.
Since we have some HW and testing results, Ivan is submitting
examples of found bugs to DPDK bugzilla.
Anyway it could be useful for ethdev PMD developers and maintainers.
I'll not repeat what is written on the site [1] and documentation (including
the framework [2] and test scenarios documentation [3]) to keep the mail
small enough.
[1] https://ts-factory.io/
[2] https://ts-factory.io/doc/test-environment/
[3] https://ts-factory.io/doc/dpdk-ethdev-ts/
First of all I'd like to thank Xilinx/AMD for making these tests open source.
The testing framework (Test Environment) used by these tests is open source as
well.
The database has examples of testing log for QEMU virtio [4], Solarflare
SFN8522 [5], Intel X710 [6] and Mellanox ConnectX-5 [7] NICs.
[4] https://ts-factory.io/bublik/v2/runs?runData=pci-1af4%3BTS_NAME%3Ddpdk-ethdev-ts
[5] https://ts-factory.io/bublik/v2/runs?runData=pci-1924%3BTS_NAME%3Ddpdk-ethdev-ts
[6] https://ts-factory.io/bublik/v2/runs?runData=pci-8086-1572%3BTS_NAME%3Ddpdk-ethdev-ts
[7] https://ts-factory.io/bublik/v2/runs?runData=pci-15b3-1017%3BTS_NAME%3Ddpdk-ethdev-ts
Full list of sample DPDK ethdev logs [8].
[8] https://ts-factory.io/bublik/v2/runs?runData=TS_NAME%3Ddpdk-ethdev-ts
Testing results are classified into 6 categories. There are 3 results:
passed, failed and skipped (when a test fails to do its job because the tested
functionality itself or some required pre-conditions are not supported).
Each result could be either expected in accordance with filled in expectations
or unexpected if obtained result does not match expectations. These
expectations could differ for different NICs, tested DPDK version etc.
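The 3 results x expected/unexpected classification above can be sketched
as follows. This is only an illustration in Python, not how the framework
stores expectations: the EXPECTATIONS table, the test names and the driver
keys in it are made up for the example.

```python
from itertools import product

RESULTS = ("passed", "failed", "skipped")
VERDICTS = ("expected", "unexpected")

# 3 results x 2 verdicts = 6 categories.
ALL_CATEGORIES = list(product(RESULTS, VERDICTS))


def classify(obtained, expected_results):
    """Return (result, verdict) for one test run.

    expected_results is the set of results considered acceptable for
    this test on this NIC / DPDK version.
    """
    if obtained not in RESULTS:
        raise ValueError(f"unknown result: {obtained!r}")
    verdict = "expected" if obtained in expected_results else "unexpected"
    return obtained, verdict


# Hypothetical expectations keyed by (test name, driver).
EXPECTATIONS = {
    ("vlan_rx", "virtio"): {"passed"},
    ("tso_single_seg", "mlx5"): {"failed", "skipped"},
}

print(len(ALL_CATEGORIES))  # 6
print(classify("failed", EXPECTATIONS[("vlan_rx", "virtio")]))
# ('failed', 'unexpected')
print(classify("skipped", EXPECTATIONS[("tso_single_seg", "mlx5")]))
# ('skipped', 'expected')
```

Note that a failed test can still be "expected": the expectation records a
known behaviour, so only deviations from it need attention.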
The high rate of expected results for SFN8522 and virtio is explained by the
origin of the tests. Expectations for these NICs are mostly filled in.
The high number of unexpected results for the i40e and mlx5 drivers does not
mean these drivers or NICs are bad. First of all, it is the tests which could
be wrong, too strict or just have bugs. Second, expectations (because of
missing functionality or known aspects of the behaviour) for these NICs are
not filled in in many cases.
Let's get down to a few examples of unexpected results.
1. QEMU virtio. VLAN tagged packet is not delivered. It is a virtio test setup
where two VMs talk to each other via a Linux bridge.
The sent packet is observed on Peer (line 38), but DPDK fails to receive it
(line 65). Most likely it is some kind of misconfiguration.
https://ts-factory.io/bublik/v2/log/93205?focusId=93288&mode=treeAndinfoAndlog
2. QEMU virtio. Inconsistent number of xstats: the get-number call reports 38,
but the actual get returns just 13.
The trick here is that the xstats API is called just after
rte_eth_dev_configure().
I.e. queues are not configured and the device is not started yet.
https://ts-factory.io/bublik/v2/log/93205?focusId=93983&mode=treeAndinfoAndlog
3. Intel X710. CWR TCP flag loss in the dummy TSO case (i.e. when the TSO
payload is less than the TCP MSS).
The sent packet has the CWR bit set in TCP flags, but the packet received on
Peer does not have it.
Since it is a dummy TSO case it is hardly critical, but it is still an
interesting aspect of the behaviour.
https://ts-factory.io/bublik/v2/log/70553?focusId=72767&mode=treeAndinfoAndlog
4. ConnectX-5. A single-segment TSO packet is accepted by Tx prepare but gets
stuck on Tx burst.
Tx prepare on line 51 accepts the packet, but the attempt to transmit fails on
line 62.
Of course the packet layout is specific since typically the TSO header goes in
its own segment, but the behaviour is still unfriendly since the application
never knows whether Tx burst returns 0 because the Tx ring is full or because
something else is wrong.
https://ts-factory.io/bublik/v2/log/85618?focusId=87546&mode=treeAndinfoAndlog
5. ConnectX-5. Sometimes running a huge number of tests bears fruit like this,
when all further tests fail because of a NIC probe failure.
Most likely it was a bug in a particular driver version since the problem is
not always repeatable.
https://ts-factory.io/bublik/v2/log/48348?focusId=51740&mode=treeAndinfoAndlog
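To give a flavour of the checks behind examples 2 and 3, here is a small
Python sketch. It is not code from the test suite: the function names are
hypothetical, and the headers are built by hand instead of being captured
from a raw socket. Only the TCP flag bit values come from the TCP header
layout (flags live in byte 13 of the header, CWR is 0x80).

```python
import struct

# TCP flag bits in byte 13 of the TCP header.
TCP_FLAGS = {0x80: "CWR", 0x40: "ECE", 0x20: "URG", 0x10: "ACK",
             0x08: "PSH", 0x04: "RST", 0x02: "SYN", 0x01: "FIN"}


def tcp_flags(tcp_header):
    """Return the set of flag names set in a raw TCP header."""
    bits = tcp_header[13]
    return {name for bit, name in TCP_FLAGS.items() if bits & bit}


def lost_tcp_flags(sent_header, received_header):
    """Flags present in the sent header but missing on the wire."""
    return tcp_flags(sent_header) - tcp_flags(received_header)


def xstats_mismatch(advertised_count, returned_entries):
    """Compare the advertised number of xstats with what was returned.

    Returns None when consistent, else (advertised, returned).
    """
    returned = len(returned_entries)
    if returned == advertised_count:
        return None
    return advertised_count, returned


def make_tcp_header(flags):
    # Illustration only: src port, dst port, seq, ack, data offset
    # (5 words), flags, window, checksum, urgent pointer.
    return struct.pack("!HHIIBBHHH", 1234, 80, 0, 0, 5 << 4, flags,
                       65535, 0, 0)


sent = make_tcp_header(0x80 | 0x10 | 0x08)   # CWR + ACK + PSH
received = make_tcp_header(0x10 | 0x08)      # ACK + PSH
print(lost_tcp_flags(sent, received))        # {'CWR'}
print(xstats_mismatch(38, [0] * 13))         # (38, 13)
```

Having the raw bytes from the peer is what makes such checks possible at
all, since nothing on the path is trusted to report the flags correctly.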
Performance testing results using testpmd are not representative since the
hosts used are too weak and not really tuned for performance testing. If the
testing hosts are good, these tests can do their job as well.
For example, a number of bugs in net/virtio were found using these tests and
the corresponding patches were sent upstream in the past.
These tests are fully automatic and suitable for release testing as well as
everyday regressions tracking as soon as expectations are filled in.
Also, having expectations for different NICs filled in makes it possible to
generate comparison reports to understand the difference in behaviour and
supported features.
The testing framework supports collection of gcov-based coverage and
generates reports if requested, for example [9] for i40e.
[9] https://ts-factory.io/logs/2022/09/30/fror-x710-p0-7/tce_log_dpdk.html
Besides DPDK ethdev tests, there are testing logs for Linux net drivers for
the same NICs [10].
[10] https://ts-factory.io/bublik/v2/runs?runData=TS_NAME%3Dnet-drv-ts
Andrew.