On Wed, 27 Aug 2025 08:52:26 +0800 "Doraemon" <nobitan...@qq.com> wrote:
> Hello DPDK / net_ice maintainers,
>
> We are seeing a reproducible and concerning issue when using the net_ice PMD
> with DPDK 22.11.2, and we would appreciate your help diagnosing it.
>
> Summary
> - Environment:
>   - DPDK: 22.11.2
>   - net_ice PCI device: 8086:159b
>   - ice kernel driver: 1.12.7
>   - NIC firmware: FW 7.3.6111681 (NVM 4.30)
>   - IOVA mode: PA, VFIO enabled
>   - Multi-process socket: /var/run/dpdk/PGW/mp_socket
>   - NUMA: 2, detected lcores: 112
>   - Bonding: pmd_bond with bonded devices created (net_bonding0 on port 4,
>     net_bonding1 on port 5)
>   - Driver enabled AVX2 OFFLOAD Vector Tx (log shows "ice_set_tx_function():
>     Using AVX2 OFFLOAD Vector Tx")
>
> - Problem statement:
>   - Our application calls rte_eth_tx_prepare before calling rte_eth_tx_burst as
>     part of the normal transmission path.
>   - After the application has been running for some time (not immediate), the
>     kernel/driver emits the following messages repeatedly:
>     - ice_interrupt_handler(): OICR: MDD event
>     - ice_interrupt_handler(): Malicious Driver Detection event 3 by TCLAN on TX
>       queue 1025 PF# 1
>   - We are using a single TX queue (application-level single queue) and are
>     sending only one packet per burst (burst size = 1).
>   - The sequence is: rte_eth_tx_prepare (returns) -> rte_eth_tx_burst ->
>     MDD events occur later.
>   - The events affect stability and repeat over time.
>
> Relevant startup logs (excerpt)
> EAL: Detected CPU lcores: 112
> EAL: Detected NUMA nodes: 2
> EAL: Selected IOVA mode 'PA'
> EAL: VFIO support initialized
> EAL: Probe PCI driver: net_ice (8086:159b) device: 0000:3b:00.1 (socket 0)
> ice_load_pkg_type(): Active package is: 1.3.45.0, ICE COMMS Package (double
> VLAN mode)
> ice_dev_init(): FW 7.3.6111681 API 1.7
> ...
> bond_probe(3506) - Initializing pmd_bond for net_bonding0
> bond_probe(3592) - Create bonded device net_bonding0 on port 4 in mode 1 on
> socket 0.
> ...
> ice_set_tx_function(): Using AVX2 OFFLOAD Vector Tx (port 0).
> TELEMETRY: No legacy callbacks, legacy socket not created
>
> What we have tried / preliminary observations
> - Confirmed application calls rte_eth_tx_prepare prior to rte_eth_tx_burst.
> - Confirmed single TX queue configuration and small bursts (size = 1) - not
>   high-rate, not a typical high-burst/malicious pattern.
> - The MDD log identifies "TX queue 1025"; unclear how that maps to our
>   DPDK queue numbering (we use queue 0 in the app).
> - No obvious other DPDK errors at startup; interface initializes
>   normally and vector TX is enabled.
> - We suspect the driver's Malicious Driver Detection (MDD) is triggering due
>   to some descriptor/doorbell ordering or offload interaction, possibly related
>   to AVX2 Vector Tx offload.
>
> Questions / requests to the maintainers
> 1. What specifically triggers "MDD event 3 by TCLAN" in net_ice?
>    Which driver check/threshold corresponds to event type 3?
> 2. How is the "TX queue 1025" value computed/mapped in the log?
>    (Is it queue id + offset, VF mapping, or an internal vector id?) We
>    need to map that log value to our DPDK queue index.
> 3. Can the rte_eth_tx_prepare + rte_eth_tx_burst call pattern cause MDD
>    detections under any circumstances? If so, are there recommended usage
>    patterns or ordering constraints to avoid false positives?
> 4. Are there known firmware/driver/DPDK version combinations with
>    similar MDD behavior? Do you recommend specific NIC firmware, kernel
>    driver, or DPDK versions as a workaround/fix?
> 5. Any suggested workarounds we can test quickly (e.g., disable vector
>    TX offload, disable specific HW offloads, change interrupt/queue bindings, or
>    adjust doorbell behavior)?
>
> Best regards.

Did you make sure that the source MAC address of the packet matches the MAC
address of the VF?
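
One way to test this is a debug-only check on the transmit path, just before
rte_eth_tx_burst. The following is only a sketch, not code from the driver or
from the application above: src_mac_matches is a hypothetical helper, and it
assumes an untagged or tagged Ethernet frame whose first 12 bytes are the
destination and source MAC. In a DPDK application the frame pointer would
typically come from rte_pktmbuf_mtod() and the port address from
rte_eth_macaddr_get().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN 6
#define ETH_HDR_LEN  14 /* dst MAC (6) + src MAC (6) + EtherType (2) */

/*
 * Hypothetical helper: return true if the source MAC of the frame
 * (bytes 6..11) equals the MAC address assigned to the sending port.
 * Frames shorter than an Ethernet header are reported as a mismatch.
 */
bool
src_mac_matches(const uint8_t *frame, size_t len,
		const uint8_t port_mac[ETH_ADDR_LEN])
{
	assert(frame != NULL && port_mac != NULL);

	if (len < ETH_HDR_LEN)
		return false;

	/* The source address follows the 6-byte destination address. */
	return memcmp(frame + ETH_ADDR_LEN, port_mac, ETH_ADDR_LEN) == 0;
}
```

Logging every frame for which this returns false, together with the time of
the next MDD message, should show whether the two are correlated.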