You have been subscribed to a public bug:

Issue Environment:
==================

root@npx:~# cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy


root@npx:~# uname -r
5.15.0-88-generic


root@npx:~# lscpu | head -n 5
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Address sizes:                      52 bits physical, 57 bits virtual
Byte Order:                         Little Endian
CPU(s):                             256


root@npx:~# ethtool -i ens2f0
driver: ice
version: 5.15.0-88-generic
firmware-version: 4.40 0x8001c7d5 1.3534.0
expansion-rom-version:
bus-info: 0000:16:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes


root@npx:~# lspci -s 16:00.0 -vvv
16:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-C for 
SFP (rev 02)
        Subsystem: Intel Corporation Ethernet Network Adapter E810-XXV-4
        Physical Slot: 2
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ 
Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- 
<MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 16
        NUMA node: 0
        IOMMU group: 19
        Region 0: Memory at 201ffa000000 (64-bit, prefetchable) [size=32M]
        Region 3: Memory at 201ffe030000 (64-bit, prefetchable) [size=64K]
        Expansion ROM at 95800000 [disabled] [size=1M]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
                Address: 0000000000000000  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [70] MSI-X: Enable+ Count=512 Masked-
                Vector table: BAR=3 offset=00000000
                PBA: BAR=3 offset=00008000
        Capabilities: [a0] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, 
L1 <64us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ 
SlotPowerLimit 0.000W
                DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
                        MaxPayload 512 bytes, MaxReadReq 4096 bytes
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ 
TransPend-
                LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM not supported
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 16GT/s (ok), Width x16 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- 
LTR-
                         10BitTagComp+ 10BitTagReq- OBFF Not Supported, ExtFmt+ 
EETLPPrefix+, MaxEETLPPrefixes 1
                         EmergencyPowerReduction Not Supported, 
EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 
OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 
2Retimers+ DRS-
                LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, 
EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ 
EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ 
LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [e0] Vital Product Data
                Product Name: Intel(R) Ethernet Network Adapter E810-XXVDA4
                Read-only fields:
                        [V1] Vendor specific: Intel(R) Ethernet Network Adapter 
E810-XXVDA4
                        [PN] Part number: ~PBA-----~
                        [SN] Serial number: ~MAC-------~
                        [V2] Vendor specific: ~WY~
                        [RV] Reserved: checksum good, 0 byte(s) reserved
                End
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- 
RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- 
RxOF- MalfTLP- ECRC+ UnsupReq+ ACSViol-
                UESvrt: DLP+ SDES- TLP+ FCP+ CmpltTO- CmpltAbrt- UnxCmplt- 
RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
AdvNonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- 
AdvNonFatalErr-
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- 
ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [148 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 1
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [150 v1] Device Serial Number 50-7c-6f-ff-ff-3b-78-30
        Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable+ Migration- Interrupt- MSE+ ARIHierarchy+
                IOVSta: Migration-
                Initial VFs: 64, Total VFs: 64, Number of VFs: 4, Function 
Dependency Link: 00
                VF offset: 8, stride: 1, Device ID: 1889
                Supported Page Size: 00000553, System Page Size: 00000001
                Region 0: Memory at 0000201ffd800000 (64-bit, prefetchable)
                Region 3: Memory at 0000201ffe340000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [1a0 v1] Transaction Processing Hints
                Device specific mode supported
                No steering table available
        Capabilities: [1b0 v1] Access Control Services
                ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- 
EgressCtrl- DirectTrans-
                ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- 
EgressCtrl- DirectTrans-
        Capabilities: [1d0 v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                LaneErrStat: 0
        Capabilities: [200 v1] Data Link Feature <?>
        Capabilities: [210 v1] Physical Layer 16.0 GT/s <?>
        Capabilities: [250 v1] Lane Margining at the Receiver <?>
        Kernel driver in use: ice
        Kernel modules: ice


Issue Description:
==================
# echo 1 > /sys/class/net/ens2f0/device/sriov_numvfs

[ 5734.469217] ice 0000:16:00.0: Enabling 1 VFs
[ 5734.574945] pci 0000:16:01.0: [8086:1889] type 00 class 0x020000
[ 5734.574970] pci 0000:16:01.0: enabling Extended Tags
[ 5734.575471] pci 0000:16:01.0: Adding to iommu group 443
[ 5734.575718] ice 0000:16:00.0: Only 0 MSI-X interrupts available for SR-IOV. Not enough to support minimum of 2 MSI-X interrupts per VF for 1 VFs
[ 5734.575815] ice 0000:16:00.0: Not enough resources for 1 VFs, try with fewer number of VFs
[ 5734.576861] pci 0000:16:01.0: Removing from iommu group 443
[ 5734.623292] iavf: Intel(R) Ethernet Adaptive Virtual Function Network Driver
[ 5734.623297] Copyright (c) 2013 - 2018 Intel Corporation.
[ 5735.598871] ice 0000:16:00.0: Failed to enable SR-IOV: -28
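
The -28 in the last line is ENOSPC. Before retrying, the PF's SR-IOV
counters can be read from sysfs. The following is a minimal sketch:
vf_status is a hypothetical helper name, and the path in the usage comment
assumes the ens2f0 interface shown above.

```shell
#!/bin/sh
# vf_status is a hypothetical helper, not part of any existing tool.
# It prints how many VFs are currently enabled out of the maximum the PF
# advertises, using the sriov_numvfs and sriov_totalvfs sysfs attributes.
vf_status() {
    dev_dir=$1
    if [ ! -r "$dev_dir/sriov_totalvfs" ] || [ ! -r "$dev_dir/sriov_numvfs" ]; then
        echo "no SR-IOV attributes under $dev_dir" >&2
        return 1
    fi
    total=$(cat "$dev_dir/sriov_totalvfs")
    current=$(cat "$dev_dir/sriov_numvfs")
    echo "VFs enabled: $current of $total"
}

# Example on the system above (assumed path):
#   vf_status /sys/class/net/ens2f0/device
```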


Issue Found:
============
1> After disabling RDMA, VF creation works fine. According to the kernel code,
MSI-X vectors are reserved for LAN and RDMA based on the number of CPU cores,
which exhausts all available MSI-X vectors on systems with many cores (some PF
ports have only 512 MSI-X vectors in total). This does not make sense as a
default; the driver should at least ensure that a few VFs can be created
successfully if the NIC supports it.
2> Manually reallocating the MSI-X resources still raises the error below.
This is somewhat strange behavior; it would be better for the kernel to allow
such actions by default:
    root@npx:~# devlink resource show pci/0000:16:00.0
    kernel answers: Operation not supported
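
The arithmetic behind point 1 can be sketched as follows. This is a
simplified model, not the driver's actual allocation code: it assumes LAN
and RDMA each reserve one vector per online CPU and ignores any small fixed
reservations. The 512-vector total, the 256 CPUs, and the minimum of 2
vectors per VF come from the output above; the function names are
hypothetical.

```shell
#!/bin/sh
# msix_left TOTAL CPUS RDMA_ENABLED
# Vectors left for SR-IOV under the simplified model: one vector per CPU
# for LAN, plus one more per CPU for RDMA when RDMA_ENABLED is 1.
msix_left() {
    total=$1
    cpus=$2
    rdma=$3
    used=$cpus
    if [ "$rdma" -eq 1 ]; then
        used=$((used + cpus))
    fi
    left=$((total - used))
    if [ "$left" -lt 0 ]; then
        left=0
    fi
    echo "$left"
}

# max_vfs LEFT PER_VF
# How many VFs fit when each VF needs at least PER_VF vectors.
max_vfs() {
    echo $(( $1 / $2 ))
}

echo "RDMA on:  $(msix_left 512 256 1) vectors left"    # 0
echo "RDMA off: $(msix_left 512 256 0) vectors left"    # 256
echo "VFs possible with RDMA off: $(max_vfs "$(msix_left 512 256 0)" 2)"   # 128
```

Under this model nothing is left for VFs while RDMA is enabled, which
matches the "Only 0 MSI-X interrupts available for SR-IOV" message. With
RDMA disabled the vector budget is no longer the limit; the adapter's own
"Total VFs: 64" cap applies first.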

** Affects: linux-hwe-5.15 (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: bot-comment
-- 
VF cannot creation with large CPU core systems when RDMA enabled with intel ice 
driver
https://bugs.launchpad.net/bugs/2044810
You received this bug notification because you are a member of Kernel Packages, 
which is subscribed to linux-hwe-5.15 in Ubuntu.

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
