[Kernel-packages] [Bug 2062384] Re: mlxbf-gige: autonegotiation fails to complete on BF2
** Also affects: linux-bluefield (Ubuntu Focal)
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2062384

Title:
  mlxbf-gige: autonegotiation fails to complete on BF2

Status in linux-bluefield package in Ubuntu: New
Status in linux-bluefield source package in Focal: New
Status in linux-bluefield source package in Jammy: New

Bug description:
  SRU Justification:

  [Impact]
  During the reboot test, QA found an intermittent issue where the OOB link stays down because the KSZ9031 PHY fails to complete autonegotiation. Even under "normal" circumstances where autonegotiation completes, it takes abnormally long (at least 8 seconds on average). The hardware team and Microchip are involved in debugging this, but the root cause is still unknown. In the meantime, we need to provide a software workaround since customers are starting to see this issue as well.

  [Fix]
  * Restart autonegotiation when it fails the first time.

  [Test Case]
  * On BF2, run the reboot test for 2000 loops.
  * Check that the OOB link is up and an IP address is assigned.

  [Regression Potential]
  * No known regression.

  [Other]
  * This issue is specific to BF2 hardware. The same Ethernet code is used for BF3, where we see no issues: the link-up time is <= 1 s on BF3 but > 8 s on BF2.
  * We have been aware of this issue for 2 years and have shared it with the PHY vendor and the hardware team, but no root cause has been identified.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2062384/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
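The "restart autonegotiation when it fails the first time" workaround can be sketched as a small Python simulation. This is not the driver code, which this report does not show: `FakePhy`, `bring_up_link`, the poll budget and the single-restart limit are all hypothetical names and values chosen only to illustrate the retry pattern.

```python
class FakePhy:
    """Hypothetical stand-in for a PHY; not a real driver or library API."""

    def __init__(self, polls_needed):
        # One entry per autoneg attempt: number of polls until that attempt
        # completes, or None if that attempt never completes.
        self.polls_needed = list(polls_needed)
        self.attempt = 0
        self.polls = 0

    def restart_autoneg(self):
        # Begin a fresh autonegotiation attempt.
        self.attempt += 1
        self.polls = 0

    def autoneg_done(self):
        # Report whether the current attempt has completed yet.
        self.polls += 1
        if self.attempt >= len(self.polls_needed):
            return False
        needed = self.polls_needed[self.attempt]
        return needed is not None and self.polls >= needed


def bring_up_link(phy, max_polls=10, max_restarts=1):
    """Poll for autoneg completion; restart autoneg if an attempt times out."""
    for restarts_used in range(max_restarts + 1):
        for _ in range(max_polls):
            if phy.autoneg_done():
                return True
        if restarts_used < max_restarts:
            phy.restart_autoneg()
    return False


# First attempt stalls forever, second attempt completes after 3 polls.
stalled_once = FakePhy([None, 3])
link_up = bring_up_link(stalled_once)
```

With the stalled first attempt above, the link only comes up because the second attempt is started; with `max_restarts=0` the same PHY would be reported as down.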
[Kernel-packages] [Bug 2064163] Re: mlxbf-gige: Vitesse PHY stuck in a bad state during reboot test
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

https://bugs.launchpad.net/bugs/2064163

Title:
  mlxbf-gige: Vitesse PHY stuck in a bad state during reboot test

Status in linux-bluefield package in Ubuntu: New
Status in linux-bluefield source package in Jammy: New

Bug description:
  SRU Justification:

  [Impact]
  During the QA reboot test, the BF3 Vitesse PHY gets stuck in a bad state, resulting in no IP provisioning. The only way to recover is to power cycle. We may have found a software workaround that avoids entering this state in the first place: suspend the PHY during graceful shutdown. Suspending the PHY means powering it down, i.e. setting bit 11 to 1 in register 0 of the PHY. This workaround passed 1800 reboots on QA's setup.

  [Fix]
  * During reboot, the mlxbf_gige_shutdown() function calls phy_stop(), which calls phy_suspend().
  * Certain Linux PHY drivers, like the Vitesse PHY driver, don't support suspend() to power down the PHY during shutdown.
  * Our hardware also does not toggle the hard reset signal of the PHY during reboot.
  * Hence, once the PHY is in a bad state, it stays there until a power cycle.
  * We have found a way to prevent the PHY from entering this bad state by suspending the PHY on reboot.

  [Test Case]
  * Run the reboot test (at least 2000 reboots): run 'reboot' from Linux.
  * Check that the oob_net0 interface is up and an IP address is assigned.
  * Note: if the OOB doesn't get an IP address, try reloading the driver (rmmod/modprobe). If that solves the issue, it is a different bug. In the bug at stake, nothing recovers the OOB IP address except a power cycle.

  [Regression Potential]
  * Make sure the Redfish DHCP is still working during the reboot test.
  * Make sure the OOB gets an IP address.

  [Other]
  These changes were made both in the mlxbf-gige driver and in UEFI.
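The "set bit 11 to 1 in reg 0" step amounts to a single bit operation on the MII basic mode control register. A minimal Python sketch of that arithmetic follows; the constant names mirror the Linux `MII_BMCR` and `BMCR_PDOWN` definitions, while the helper functions and the sample register value are illustrative assumptions rather than anything taken from the driver.

```python
MII_BMCR = 0x00        # register 0: basic mode control register
BMCR_PDOWN = 1 << 11   # bit 11: power-down, i.e. 0x0800


def power_down(bmcr_value):
    """Return the register value with the power-down bit set."""
    return bmcr_value | BMCR_PDOWN


def is_powered_down(bmcr_value):
    """Return True if the power-down bit is set in the register value."""
    return bool(bmcr_value & BMCR_PDOWN)


# Illustrative starting value: autoneg enabled, full duplex, 1000 Mb/s.
before = 0x1140
after = power_down(before)
```

Only bit 11 changes; all other control bits in the register are preserved, so clearing the bit again restores the previous configuration.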
[Kernel-packages] [Bug 2063210] [NEW] New upstream release 550.76
Public bug reported:

[Impact]
These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements.

See the changelog entry below for a full list of changes and bugs.

[Test Case]
The following development and SRU process was followed:
https://wiki.ubuntu.com/NVidiaUpdates

Certification test suite must pass on a range of hardware:
https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened.

[Regression Potential]
In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug.

[Discussion]

[Changelog]

** Affects: nvidia-graphics-drivers-550 (Ubuntu) Importance: Undecided Status: New
** Affects: nvidia-graphics-drivers-550 (Ubuntu Bionic) Importance: Undecided Status: New
** Affects: nvidia-graphics-drivers-550 (Ubuntu Focal) Importance: Undecided Status: New
** Affects: nvidia-graphics-drivers-550 (Ubuntu Jammy) Importance: Undecided Status: New
** Affects: nvidia-graphics-drivers-550 (Ubuntu Mantic) Importance: Undecided Status: New
** Affects: nvidia-graphics-drivers-550 (Ubuntu Noble) Importance: Undecided Status: New

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to nvidia-graphics-drivers-550 in Ubuntu.
https://bugs.launchpad.net/bugs/2063210

Title:
  New upstream release 550.76

Status in nvidia-graphics-drivers-550 package in Ubuntu: New
Status in nvidia-graphics-drivers-550 source package in Bionic: New
Status in nvidia-graphics-drivers-550 source package in Focal: New
Status in nvidia-graphics-drivers-550 source package in Jammy: New
Status in nvidia-graphics-drivers-550 source package in Mantic: New
Status in nvidia-graphics-drivers-550 source package in Noble: New
[Kernel-packages] [Bug 2062384] Re: mlxbf-gige: autonegotiation fails to complete on BF2
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

https://bugs.launchpad.net/bugs/2062384

Title:
  mlxbf-gige: autonegotiation fails to complete on BF2

Status in linux-bluefield package in Ubuntu: New
Status in linux-bluefield source package in Jammy: New

Bug description:
  SRU Justification:

  [Impact]
  During the reboot test, QA found an intermittent issue where the OOB link stays down because the KSZ9031 PHY fails to complete autonegotiation. Even under "normal" circumstances where autonegotiation completes, it takes abnormally long (at least 8 seconds on average). The hardware team and Microchip are involved in debugging this, but the root cause is still unknown. In the meantime, we need to provide a software workaround since customers are starting to see this issue as well.

  [Fix]
  * Restart autonegotiation when it fails the first time.

  [Test Case]
  * On BF2, run the reboot test for 2000 loops.
  * Check that the OOB link is up and an IP address is assigned.

  [Regression Potential]
  * No known regression.

  [Other]
  * This issue is specific to BF2 hardware. The same Ethernet code is used for BF3, where we see no issues: the link-up time is <= 1 s on BF3 but > 8 s on BF2.
  * We have been aware of this issue for 2 years and have shared it with the PHY vendor and the hardware team, but no root cause has been identified.
[Kernel-packages] [Bug 2059279] Re: mlxbf_gige: replace SAUCE patch for pause frame counters
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2059279

Title:
  mlxbf_gige: replace SAUCE patch for pause frame counters

Status in linux-bluefield package in Ubuntu: New
Status in linux-bluefield source package in Jammy: Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  The support for mlxbf_gige pause frame counters was added to Jammy via a SAUCE patch. After upstream review, a small issue was fixed that makes the SAUCE patch different from the upstream commit.

  [Fix]
  The fix is to replace the SAUCE patch in the Jammy repo with the upstream equivalent.

  [Test Case]
  * Boot BF platform and bring up "oob_net0" interface
  * Execute the command "ethtool -I -a oob_net0" to get baseline stats
  * Send heavy traffic into "oob_net0" interface
  * Re-run the above ethtool command, noting the pause frame counters

  [Regression Potential]
  There is low potential for regression as this brings in upstream content.

  [Other]
  None
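The baseline-versus-rerun comparison in the test case can be scripted. The Python sketch below assumes that "ethtool -I -a" prints the counters as lines of the form "tx_pause_frames: N" and "rx_pause_frames: N"; that layout, the helper names, and the two sample snapshots are assumptions for illustration and were not captured from a BlueField system.

```python
import re

# Matches lines such as "  tx_pause_frames: 42" (assumed output layout).
_COUNTER_RE = re.compile(r"^[ \t]*(\w+_pause_frames):[ \t]*(\d+)[ \t]*$", re.MULTILINE)


def parse_pause_stats(text):
    """Return {counter_name: value} for every pause frame counter line found."""
    return {m.group(1): int(m.group(2)) for m in _COUNTER_RE.finditer(text)}


def pause_delta(before_text, after_text):
    """Return how much each counter grew between two ethtool snapshots."""
    before = parse_pause_stats(before_text)
    after = parse_pause_stats(after_text)
    return {name: after[name] - before.get(name, 0) for name in after}


# Illustrative snapshots only; real output may differ.
BASELINE = """Pause parameters for oob_net0:
Autonegotiate:  on
RX:             on
TX:             on
Statistics:
  tx_pause_frames: 0
  rx_pause_frames: 0
"""

AFTER_TRAFFIC = BASELINE.replace("tx_pause_frames: 0", "tx_pause_frames: 42")

delta = pause_delta(BASELINE, AFTER_TRAFFIC)
```

In practice the two strings would come from running the ethtool command before and after the traffic burst; a non-zero delta shows that the counters are being updated.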
[Kernel-packages] [Bug 2059310] Re: mlxbf_gige: call request_irq() after NAPI initialized
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2059310

Title:
  mlxbf_gige: call request_irq() after NAPI initialized

Status in linux-bluefield package in Ubuntu: New
Status in linux-bluefield source package in Jammy: Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  The mlxbf_gige driver encounters a NULL pointer exception in mlxbf_gige_open() when kdump is enabled. The exception happens because an RX interrupt is already pending before the call to request_irq(RX IRQ) executes, so the RX IRQ handler fires immediately after this request_irq() completes. The RX IRQ handler runs "napi_schedule()" before NAPI is fully initialized via "netif_napi_add()" and "napi_enable()", both of which happen later in the open() logic.

  [Fix]
  The logic in mlxbf_gige_open() must fully initialize NAPI before any calls to request_irq() execute.

  [Test Case]
  * Boot BF platform and bring up "oob_net0" interface
  * Enable kdump completely
  * Trigger kdump via "echo c > /proc/sysrq-trigger"
  * There should be no exceptions from mlxbf_gige driver

  [Regression Potential]
  There is low potential for regression as this brings in upstream content.

  [Other]
  None
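The ordering problem described above can be illustrated with a toy Python model. It is not kernel code: `ToyNetdev`, `open_buggy` and `open_fixed` are hypothetical names, and the model only captures the one property that matters here, namely that a pending interrupt is delivered as soon as its handler is registered.

```python
class ToyNetdev:
    """Toy model of a NIC with a possibly pending RX interrupt."""

    def __init__(self, rx_irq_pending):
        self.rx_irq_pending = rx_irq_pending
        self.napi_ready = False
        self.events = []

    def init_napi(self):
        # Stands in for netif_napi_add() followed by napi_enable().
        self.napi_ready = True
        self.events.append("napi_init")

    def request_rx_irq(self):
        # Stands in for request_irq(): a pending IRQ fires right away.
        self.events.append("request_irq")
        if self.rx_irq_pending:
            self._rx_handler()

    def _rx_handler(self):
        # Stands in for the RX IRQ handler calling napi_schedule().
        if not self.napi_ready:
            raise RuntimeError("napi_schedule() before NAPI initialized")
        self.rx_irq_pending = False
        self.events.append("napi_schedule")


def open_buggy(dev):
    """Old ordering: register the IRQ first, initialize NAPI afterwards."""
    dev.request_rx_irq()
    dev.init_napi()


def open_fixed(dev):
    """Fixed ordering: fully initialize NAPI before registering the IRQ."""
    dev.init_napi()
    dev.request_rx_irq()


fixed_dev = ToyNetdev(rx_irq_pending=True)
open_fixed(fixed_dev)
```

Calling `open_buggy()` on a device with a pending interrupt raises the `RuntimeError`, which plays the role of the NULL pointer exception seen under kdump; without a pending interrupt both orderings succeed, which is why the bug is only seen in that scenario.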
[Kernel-packages] [Bug 2059951] Re: mlxbf_gige: stop interface during shutdown
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2059951

Title:
  mlxbf_gige: stop interface during shutdown

Status in linux-bluefield package in Ubuntu: New
Status in linux-bluefield source package in Jammy: Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  The mlxbf_gige driver shutdown() is invoked during reboot processing, but the stop() logic is not guaranteed to execute on all distros. If stop() does not execute, NAPI remains enabled during system shutdown and can cause an exception if NAPI is scheduled when the interface is shut down but not stopped.

  [Fix]
  The networking interface managed by the mlxbf_gige driver must be stopped during reboot processing so that it is put into a clean state and driver callbacks won't be called.

  [Test Case]
  * Put BF platform into a reboot loop
  * Ensure that BF platform brings up "oob_net0" interface each reboot, and that no mlxbf_gige driver exceptions occur

  [Regression Potential]
  There is low potential for regression as this brings in upstream content.

  [Other]
  None
[Kernel-packages] [Bug 2059961] Re: genetlink: fix single op policy dump when do is present
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2059961 Title: genetlink: fix single op policy dump when do is present Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Jammy: Fix Committed Bug description: intro - Our internal test triggers a kernel crash dump below [ 888.690348] Sun Mar 24 23:51:59 2024: DriVerTest - Start Test [ 888.691834] [ 888.983912] mlx5_core :08:00.1 eth3: Link up [ 888.987644] IPv6: ADDRCONF(NETDEV_CHANGE): eth3: link becomes ready [ 889.336577] mlx5_core :08:00.0 eth2: Link up [ 894.635836] Sun Mar 24 11:52:04 PM IST 2024 - DriVerTest Debug Heartbeat [ 940.431644] general protection fault, probably for non-canonical address 0x80020014: [#1] SMP NOPTI [ 940.432866] CPU: 7 PID: 94305 Comm: ethtool Tainted: G OE 5.15.0-1039.17.g0d63875-bluefield #1 [ 940.433970] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 940.435220] RIP: 0010:netlink_policy_dump_add_policy+0x95/0x160 [ 940.435893] Code: 48 c1 e0 04 4c 8b 34 01 4d 85 f6 74 5b 31 db eb 10 4c 89 e8 83 c3 01 48 c1 e0 04 39 5c 01 08 72 3f 89 d8 48 c1 e0 04 4c 01 f0 <0f> b6 10 83 ea 08 83 fa 01 77 dc 0f b7 50 02 48 8b 70 08 48 8d 7c [ 940.437921] RSP: 0018:ffa002d37a08 EFLAGS: 00010286 [ 940.438551] RAX: 80020014 RBX: RCX: ff1100027d00 [ 940.439351] RDX: fff8 RSI: 0018 RDI: ffa002d37a10 [ 940.440131] RBP: 0003 R08: 0040 R09: ff1100027d2d0f10 [ 940.440900] R10: 0318 R11: R12: ff1100011fa59bc0 [ 940.441683] R13: 0004 R14: 80020014 R15: 83fa6540 [ 940.442459] FS: 7f4a17993740() GS:ff1100085f9c() knlGS: [ 940.443394] CS: 0010 DS: ES: CR0: 80050033 [ 940.444044] CR2: 00429f50 CR3: 00012fc2e002 CR4: 00771ee0 [ 940.444847] DR0: DR1: DR2: [ 940.445639] DR3: DR6: fffe0ff0 DR7: 0400 [ 940.446431] PKRU: 
5554 [ 940.446795] Call Trace: [ 940.447144] [ 940.447444] ? __die_body+0x1b/0x60 [ 940.447880] ? die_addr+0x39/0x60 [ 940.448315] ? exc_general_protection+0x1bc/0x3c0 [ 940.448867] ? asm_exc_general_protection+0x22/0x30 [ 940.449445] ? netlink_policy_dump_add_policy+0x95/0x160 [ 940.450058] ? netlink_policy_dump_add_policy+0xb2/0x160 [ 940.450714] ? ethtool_get_phc_vclocks+0x70/0x70 [ 940.451272] ctrl_dumppolicy_start+0xc4/0x2a0 [ 940.451788] ? ethnl_reply_init+0xd0/0xd0 [ 940.452284] ? __nla_parse+0x22/0x30 [ 940.452734] ? __cond_resched+0x15/0x30 [ 940.453211] ? kmem_cache_alloc_trace+0x44/0x390 [ 940.453750] genl_start+0xc3/0x150 [ 940.454179] __netlink_dump_start+0x175/0x250 [ 940.454706] genl_family_rcv_msg_dumpit.isra.0+0x9a/0x100 [ 940.455334] ? genl_family_rcv_msg_attrs_parse.isra.0+0xe0/0xe0 [ 940.455998] ? genl_unlock+0x20/0x20 [ 940.456453] ? genl_parallel_done+0x40/0x40 [ 940.456957] genl_rcv_msg+0x11f/0x2b0 [ 940.457421] ? genl_get_cmd+0x170/0x170 [ 940.457890] ? ctrl_dumppolicy_put_op.isra.0+0x1e0/0x1e0 [ 940.458515] ? genl_lock_done+0x60/0x60 [ 940.458987] ? genl_family_rcv_msg_doit.isra.0+0x110/0x110 [ 940.459634] netlink_rcv_skb+0x54/0x100 [ 940.460107] genl_rcv+0x24/0x40 [ 940.460504] netlink_unicast+0x18d/0x230 [ 940.460983] netlink_sendmsg+0x240/0x4a0 [ 940.461472] __sock_sendmsg+0x2f/0x40 [ 940.461922] __sys_sendto+0xee/0x160 [ 940.462384] ? __sys_recvmsg+0x56/0xa0 [ 940.462854] ? 
exit_to_user_mode_prepare+0x35/0x170 [ 940.463439] __x64_sys_sendto+0x25/0x30 [ 940.463906] do_syscall_64+0x35/0x80 [ 940.464368] entry_SYSCALL_64_after_hwframe+0x61/0xcb [ 940.464955] RIP: 0033:0x7f4a17aa940a [ 940.465415] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 [ 940.467418] RSP: 002b:7ffc3612cac8 EFLAGS: 0246 ORIG_RAX: 002c [ 940.468284] RAX: ffda RBX: 00c3b3b0 RCX: 7f4a17aa940a [ 940.469057] RDX: 0024 RSI: 00c3b3b0 RDI: 0003 [ 940.469852] RBP: 00c3b2a0 R08: 7f4a17ba4200 R09: 000c [ 940.470674] R10:
[Kernel-packages] [Bug 2058498] Re: pwr-mlxbf: support Bobcat graceful shutdown via gpio6
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2058498

Title:
  pwr-mlxbf: support Bobcat graceful shutdown via gpio6

Status in linux-bluefield package in Ubuntu: New
Status in linux-bluefield source package in Jammy: Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  The OCP3.0 project was discontinued and canceled, so all stale code related to it was removed. The HID MLNXBF29 is now used for triggering a graceful shutdown on the Bobcat board. On the Bobcat board, the main board CPLD issues the shutdown request to the DPU CPLD. The DPU CPLD forwards that request to ARM via GPIO6, and ARM should trigger a graceful shutdown and set the ARM boot progress to 6.

  [Fix]
  The HID MLNXBF29 is now used for triggering a graceful shutdown on the Bobcat board. On the Bobcat board, the main board CPLD issues the shutdown request to the DPU CPLD. The DPU CPLD forwards that request to ARM via GPIO6, and ARM should trigger a graceful shutdown and set the ARM boot progress to 6.

  [Test Case]
  * Test shutdown on Bobcat via CPLD command.

  [Regression Potential]
  * No known regression, as the OCP3.0 board was discontinued.
[Kernel-packages] [Bug 2059951] Re: mlxbf_gige: stop interface during shutdown
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

https://bugs.launchpad.net/bugs/2059951

Title:
  mlxbf_gige: stop interface during shutdown

Status in linux-bluefield package in Ubuntu: New
Status in linux-bluefield source package in Jammy: New

Bug description:
  SRU Justification:

  [Impact]
  The mlxbf_gige driver shutdown() is invoked during reboot processing, but the stop() logic is not guaranteed to execute on all distros. If stop() does not execute, NAPI remains enabled during system shutdown and can cause an exception if NAPI is scheduled when the interface is shut down but not stopped.

  [Fix]
  The networking interface managed by the mlxbf_gige driver must be stopped during reboot processing so that it is put into a clean state and driver callbacks won't be called.

  [Test Case]
  * Put BF platform into a reboot loop
  * Ensure that BF platform brings up "oob_net0" interface each reboot, and that no mlxbf_gige driver exceptions occur

  [Regression Potential]
  There is low potential for regression as this brings in upstream content.

  [Other]
  None
[Kernel-packages] [Bug 2059310] Re: mlxbf_gige: call request_irq() after NAPI initialized
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

https://bugs.launchpad.net/bugs/2059310

Title:
  mlxbf_gige: call request_irq() after NAPI initialized

Status in linux-bluefield package in Ubuntu: New
Status in linux-bluefield source package in Jammy: New

Bug description:
  SRU Justification:

  [Impact]
  The mlxbf_gige driver encounters a NULL pointer exception in mlxbf_gige_open() when kdump is enabled. The exception happens because an RX interrupt is already pending before the call to request_irq(RX IRQ) executes, so the RX IRQ handler fires immediately after this request_irq() completes. The RX IRQ handler runs "napi_schedule()" before NAPI is fully initialized via "netif_napi_add()" and "napi_enable()", both of which happen later in the open() logic.

  [Fix]
  The logic in mlxbf_gige_open() must fully initialize NAPI before any calls to request_irq() execute.

  [Test Case]
  * Boot BF platform and bring up "oob_net0" interface
  * Enable kdump completely
  * Trigger kdump via "echo c > /proc/sysrq-trigger"
  * There should be no exceptions from mlxbf_gige driver

  [Regression Potential]
  There is low potential for regression as this brings in upstream content.

  [Other]
  None
[Kernel-packages] [Bug 2059961] Re: genetlink: fix single op policy dump when do is present
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2059961 Title: genetlink: fix single op policy dump when do is present Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Jammy: New Bug description: intro - Our internal test triggers a kernel crash dump below [ 888.690348] Sun Mar 24 23:51:59 2024: DriVerTest - Start Test [ 888.691834] [ 888.983912] mlx5_core :08:00.1 eth3: Link up [ 888.987644] IPv6: ADDRCONF(NETDEV_CHANGE): eth3: link becomes ready [ 889.336577] mlx5_core :08:00.0 eth2: Link up [ 894.635836] Sun Mar 24 11:52:04 PM IST 2024 - DriVerTest Debug Heartbeat [ 940.431644] general protection fault, probably for non-canonical address 0x80020014: [#1] SMP NOPTI [ 940.432866] CPU: 7 PID: 94305 Comm: ethtool Tainted: G OE 5.15.0-1039.17.g0d63875-bluefield #1 [ 940.433970] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 940.435220] RIP: 0010:netlink_policy_dump_add_policy+0x95/0x160 [ 940.435893] Code: 48 c1 e0 04 4c 8b 34 01 4d 85 f6 74 5b 31 db eb 10 4c 89 e8 83 c3 01 48 c1 e0 04 39 5c 01 08 72 3f 89 d8 48 c1 e0 04 4c 01 f0 <0f> b6 10 83 ea 08 83 fa 01 77 dc 0f b7 50 02 48 8b 70 08 48 8d 7c [ 940.437921] RSP: 0018:ffa002d37a08 EFLAGS: 00010286 [ 940.438551] RAX: 80020014 RBX: RCX: ff1100027d00 [ 940.439351] RDX: fff8 RSI: 0018 RDI: ffa002d37a10 [ 940.440131] RBP: 0003 R08: 0040 R09: ff1100027d2d0f10 [ 940.440900] R10: 0318 R11: R12: ff1100011fa59bc0 [ 940.441683] R13: 0004 R14: 80020014 R15: 83fa6540 [ 940.442459] FS: 7f4a17993740() GS:ff1100085f9c() knlGS: [ 940.443394] CS: 0010 DS: ES: CR0: 80050033 [ 940.444044] CR2: 00429f50 CR3: 00012fc2e002 CR4: 00771ee0 [ 940.444847] DR0: DR1: DR2: [ 940.445639] DR3: DR6: fffe0ff0 DR7: 0400 [ 940.446431] PKRU: 5554 
[ 940.446795] Call Trace: [ 940.447144] [ 940.447444] ? __die_body+0x1b/0x60 [ 940.447880] ? die_addr+0x39/0x60 [ 940.448315] ? exc_general_protection+0x1bc/0x3c0 [ 940.448867] ? asm_exc_general_protection+0x22/0x30 [ 940.449445] ? netlink_policy_dump_add_policy+0x95/0x160 [ 940.450058] ? netlink_policy_dump_add_policy+0xb2/0x160 [ 940.450714] ? ethtool_get_phc_vclocks+0x70/0x70 [ 940.451272] ctrl_dumppolicy_start+0xc4/0x2a0 [ 940.451788] ? ethnl_reply_init+0xd0/0xd0 [ 940.452284] ? __nla_parse+0x22/0x30 [ 940.452734] ? __cond_resched+0x15/0x30 [ 940.453211] ? kmem_cache_alloc_trace+0x44/0x390 [ 940.453750] genl_start+0xc3/0x150 [ 940.454179] __netlink_dump_start+0x175/0x250 [ 940.454706] genl_family_rcv_msg_dumpit.isra.0+0x9a/0x100 [ 940.455334] ? genl_family_rcv_msg_attrs_parse.isra.0+0xe0/0xe0 [ 940.455998] ? genl_unlock+0x20/0x20 [ 940.456453] ? genl_parallel_done+0x40/0x40 [ 940.456957] genl_rcv_msg+0x11f/0x2b0 [ 940.457421] ? genl_get_cmd+0x170/0x170 [ 940.457890] ? ctrl_dumppolicy_put_op.isra.0+0x1e0/0x1e0 [ 940.458515] ? genl_lock_done+0x60/0x60 [ 940.458987] ? genl_family_rcv_msg_doit.isra.0+0x110/0x110 [ 940.459634] netlink_rcv_skb+0x54/0x100 [ 940.460107] genl_rcv+0x24/0x40 [ 940.460504] netlink_unicast+0x18d/0x230 [ 940.460983] netlink_sendmsg+0x240/0x4a0 [ 940.461472] __sock_sendmsg+0x2f/0x40 [ 940.461922] __sys_sendto+0xee/0x160 [ 940.462384] ? __sys_recvmsg+0x56/0xa0 [ 940.462854] ? 
exit_to_user_mode_prepare+0x35/0x170 [ 940.463439] __x64_sys_sendto+0x25/0x30 [ 940.463906] do_syscall_64+0x35/0x80 [ 940.464368] entry_SYSCALL_64_after_hwframe+0x61/0xcb [ 940.464955] RIP: 0033:0x7f4a17aa940a [ 940.465415] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 [ 940.467418] RSP: 002b:7ffc3612cac8 EFLAGS: 0246 ORIG_RAX: 002c [ 940.468284] RAX: ffda RBX: 00c3b3b0 RCX: 7f4a17aa940a [ 940.469057] RDX: 0024 RSI: 00c3b3b0 RDI: 0003 [ 940.469852] RBP: 00c3b2a0 R08: 7f4a17ba4200 R09: 000c [ 940.470674] R10:
[Kernel-packages] [Bug 2059147] [NEW] New upstream release 535.171.04
Public bug reported:

[Impact]
These releases provide both bug fixes and new features, and we would like to make sure all of our users have access to these improvements.

See the changelog entry below for a full list of changes and bugs.

[Test Case]
The following development and SRU process was followed:
https://wiki.ubuntu.com/NVidiaUpdates

Certification test suite must pass on a range of hardware:
https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

The QA team that executed the tests will be in charge of attaching the artifacts and console output of the appropriate run to the bug. nVidia maintainers team members will not mark ‘verification-done’ until this has happened.

[Regression Potential]
In order to mitigate the regression potential, the results of the aforementioned system level tests are attached to this bug.

[Discussion]

[Changelog]

** Affects: nvidia-graphics-drivers-535 (Ubuntu) Importance: Undecided Status: New
** Affects: nvidia-graphics-drivers-535 (Ubuntu Bionic) Importance: Undecided Status: New
** Affects: nvidia-graphics-drivers-535 (Ubuntu Focal) Importance: Undecided Status: New
** Affects: nvidia-graphics-drivers-535 (Ubuntu Jammy) Importance: Undecided Status: New
** Affects: nvidia-graphics-drivers-535 (Ubuntu Mantic) Importance: Undecided Status: New
** Affects: nvidia-graphics-drivers-535 (Ubuntu Noble) Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/2059147

Title:
  New upstream release 535.171.04

Status in nvidia-graphics-drivers-535 package in Ubuntu: New
Status in nvidia-graphics-drivers-535 source package in Bionic: New
Status in nvidia-graphics-drivers-535 source package in Focal: New
Status in nvidia-graphics-drivers-535 source package in Jammy: New
Status in nvidia-graphics-drivers-535 source package in Mantic: New
Status in nvidia-graphics-drivers-535 source package in Noble: New
[Kernel-packages] [Bug 2053155] Re: Add DPLL and syncE support
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2053155 Title: Add DPLL and syncE support Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: * intro Synchronous Ethernet, or SyncE, is an ITU-T standard for computer networking that facilitates the transference of clock signals over the Ethernet physical layer. It is used to pass timing from node to node and is particularly important for mobile networks. The DPLL subsystem in the Linux kernel provides a general interface for configuring devices that use any kind of Digital PLL. This subsystem is designed to manage the clock signal synchronization of a device with an external clock signal. * Explain the bug(s) We need to support mlx5 SyncE feature. The following patches are needed. [net-next,v8,0/9] Create common DPLL configuration API [net-next,v8,1/9] dpll: documentation on DPLL subsystem interface [net-next,v8,2/9] dpll: spec: Add Netlink spec in YAML [net-next,v8,3/9] dpll: core: Add DPLL framework base functions [net-next,v8,4/9] dpll: netlink: Add DPLL framework base functions [net-next,v8,5/9] netdev: expose DPLL pin handle for netdevice [net-next,v8,6/9] ice: add admin commands to access cgu configuration [net-next,v8,7/9] ice: implement dpll interface to control cgu [net-next,v8,8/9] ptp_ocp: implement DPLL ops [net-next,v8,9/9] mlx5: Implement SyncE support using DPLL infrastructure https://lore.kernel.org/netdev/20230913204943.1051233-1-vadim.fedore...@linux.dev/ * Brief explanation of fixes We identify several dependent patches, especially related to netlink gap between current master-next. We cherry-pick/backport series of patches related to netlink. 
* How to test $ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \ --dump device-get ex: root@bfqa-dell013-roy-oob:~/mlnx-ofa_kernel-4.0# /root/tools-net/ynl/cli.py --spec ~/netlink/specs/dpll.yaml --dump device-get [{'clock-id': 5237736944144095348, 'id': 0, 'lock-status': 'unlocked', 'mode': 'manual', 'mode-supported': ['manual'], 'module-name': 'mlx5_core', 'type': 'eec'}] $ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \ --do pin-get ex: root@bfqa-dell013-roy-oob:~# /root/tools-net/ynl/cli.py --spec ~/netlink/specs/dpll.yaml --dump pin-get [{'capabilities': 4, 'clock-id': 5237736944144095348, 'id': 0, 'module-name': 'mlx5_core', 'parent-device': [{'direction': 'input', 'parent-id': 0, 'state': 'disconnected'}], 'phase-adjust-max': 0, 'phase-adjust-min': 0, 'type': 'synce-eth-port'}, {'capabilities': 4, 'clock-id': 5237736944144095348, 'id': 1, 'module-name': 'mlx5_core', 'parent-device': [{'direction': 'input', 'parent-id': 0, 'state': 'disconnected'}], 'phase-adjust-max': 0, 'phase-adjust-min': 0, 'type': 'synce-eth-port'}] * detect whether your device supports DPLL/SyncE root@bfqa-dell013-roy-oob:~/linux-bluefield-jammy# mlxreg -d 03:00.0 --reg_name MCAM --get -i "access_reg_group=2,feature_group=0" Field Name| Data === access_reg_group | 0x0002 feature_group | 0x mng_access_reg_cap_mask[0]| 0x0004 mng_access_reg_cap_mask[1]| 0x0060 --> must see 6 OR, $ mlxfwmanager showing “Enhanced-SyncE & PTP GM support”, * list of patches applied to 5.15 jammy based on 911f816f4c04 mlxbf_gige: fix receive packet race condition we applied the following 83a11d94c436 UBUNTU: SAUCE: fix build error after resv_start_op 036b2fecd315 genetlink: allow families to use split ops directly 9f40a82f73ea genetlink: inline old iteration helpers d8ae137b98fc genetlink: use iterator in the op to policy map dumping 095792a3a723 genetlink: add iterator for walking family ops 764747ba0085 genetlink: limit the use of validation workarounds to 
old ops 35b95f016392 genetlink: inline genl_get_cmd() c448680485e6 genetlink: support split policies in ctrl_dumppolicy_put_op() a17efffb8ce8 genetlink: add policies for both doit and dumpit in ctrl_dumppolicy_start() 82af441de2fa genetlink: check for callback type at op load time dc17c9675d6d genetlink: load policy based on validation flags d867b1e130d3 genetlink: move the private fields in struct genl_family b515a3664ef5 genetlink: piggy back on resv_op to default to a reject policy 1fa6e0ec60a4 genetlink: refactor the cmd <> policy mapping dump c8ba54011c1d netlink: add helpers for extack attr presence checking
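The `--dump device-get` check from the "How to test" step can be scripted. The following is a minimal sketch, assuming the CLI prints its reply as a Python literal, as in the example output quoted in this report. The helper name `find_dpll_devices` is hypothetical and is not part of the ynl tooling.

```python
import ast

# Sample reply copied from the device-get example in this bug report.
SAMPLE_DEVICE_GET = (
    "[{'clock-id': 5237736944144095348, 'id': 0, 'lock-status': 'unlocked', "
    "'mode': 'manual', 'mode-supported': ['manual'], "
    "'module-name': 'mlx5_core', 'type': 'eec'}]"
)


def find_dpll_devices(cli_output, module_name="mlx5_core"):
    """Hypothetical helper: return the DPLL devices registered by module_name.

    Assumes cli_output is a Python-literal list of dicts, which is how the
    example output in this report is formatted.
    """
    devices = ast.literal_eval(cli_output.strip())
    return [dev for dev in devices if dev.get("module-name") == module_name]


if __name__ == "__main__":
    for dev in find_dpll_devices(SAMPLE_DEVICE_GET):
        print(dev["id"], dev["type"], dev["lock-status"])
```

An empty result would suggest that no mlx5 DPLL device was registered, which is the condition the test step is meant to catch.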
[Kernel-packages] [Bug 2055052] Re: net/sched: flower: Add lock protection when remove filter handle
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2055052 Title: net/sched: flower: Add lock protection when remove filter handle Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Jammy: Fix Committed Bug description: * intro As IDR can't protect itself from the concurrent modification, place idr_remove() under the protection of tp->lock. * how to fix need to backport the patch from Jianbo "net/sched: flower: Add lock protection when remove filter handle" https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=1fde0ca3a0de7e9f917668941156959dd5e9108b * reproduce and verification It's hard to reproduce because it is a race condition we found in k8s+ovn deployment. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2055052/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2055086] Re: mlxbf_gige: add support for pause frame counters
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2055086 Title: mlxbf_gige: add support for pause frame counters Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] The BlueField-2 and BlueField-3 silicon designs include a provision for reporting pause frame counters, but the driver (mlxbf_gige) does not have software support yet. [Fix] The fix is to update the mlxbf_gige driver to support the "get_pause_stats()" callback, which enables display of pause frame counters via "ethtool -I -a oob_net0". Support for this callback was added in Linux kernel 5.9.0. [Test Case] * Reboot BF platform, and configure oob_net0 interface * Issue command "ethtool -I -a oob_net0" to display pause frame counters * Send in multiple streams of traffic to oob_net0 interface, e.g. from two or three simultaneous SCP operations * Re-issue "ethtool -I -a oob_net0" and note difference in output [Regression Potential] The changed logic only extends the ethtool support, so the risk of regression is low. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2055086/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
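The "note difference in output" step of the test case can be made less error-prone by diffing two captures. This is a rough sketch only: it assumes the counters appear in the `ethtool -I -a` output as lines of the form `tx_pause_frames: N` and `rx_pause_frames: N`, and the sample captures and function names below are illustrative, not taken from a real run.

```python
import re

# Assumed layout: one "<name>pause_frames: <value>" line per counter.
_COUNTER_RE = re.compile(r"^\s*(\w*pause_frames):\s*(\d+)\s*$", re.MULTILINE)

# Illustrative captures (made-up values), before and after the traffic test.
BEFORE = """Pause parameters for oob_net0:
Statistics:
  tx_pause_frames: 0
  rx_pause_frames: 0
"""
AFTER = """Pause parameters for oob_net0:
Statistics:
  tx_pause_frames: 3
  rx_pause_frames: 12
"""


def parse_pause_stats(text):
    """Extract pause frame counters from one captured output."""
    return {name: int(value) for name, value in _COUNTER_RE.findall(text)}


def pause_stats_delta(before, after):
    """Return how much each counter grew between the two captures."""
    old = parse_pause_stats(before)
    new = parse_pause_stats(after)
    return {name: new[name] - old.get(name, 0) for name in new}


if __name__ == "__main__":
    print(pause_stats_delta(BEFORE, AFTER))
```

If the real output uses a different layout, only the regular expression should need adjusting.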
[Kernel-packages] [Bug 2054845] Re: mlxbf-gige: support fixed phy for Bobcat
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2054845 Title: mlxbf-gige: support fixed phy for Bobcat Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] Bobcat is a new board which doesn't have an external PHY connected to the OOB. There are no MDIO busses involved and no PHY involved. The OOB is directly connected to the switch (MAC to MAC). So we need to use the Linux fixed PHY to emulate the MDIO behavior and load the ethernet driver. [Fix] * Register the fixed PHY when running on Bobcat. [Test Case] * Important: For testing on the Bobcat board, make sure the corresponding UEFI image is also loaded; otherwise, the oob driver will fail to load. For other boards, it is backward compatible. * Check if the mlxbf-gige is loaded on Bobcat * Check that it is using the fixed-phy via dmesg | grep PHY * check that the oob interface is pingable * check that files can be copied over via oob interface * rmmod/modprobe * ifconfig up/down * reboot test [Regression Potential] * Redo all the OOB testing on other boards (moonraker) and make sure the Bobcat changes don't trigger any regressions. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2054845/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2056364] Re: Add test script for DPLL
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2056364 Title: Add test script for DPLL Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Jammy: Fix Committed Bug description: * intro In bug 2053155 "Add DPLL and syncE support" below: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2053155 It requires using a yaml spec file, dpll.yaml, and a python script, cli.py, to verify the correctness. ex: $ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \ --dump device-get We've found that the script and spec file are missing in the current repo (Ubuntu-bluefield-5.15.0-1037.39). * how to fix Since the existing Bluefield-5.15 doesn't have the tools/net/ynl directory, cherry-picking the individual patches shouldn't be too hard, since there are no dependencies and most likely no conflicts, but there are around 200 patches in tools/net/ynl $ git log --oneline tools/net/ynl/ | wc -l 205 and for Documentation/netlink/genetlink.yaml (a dependent file for dpll.yaml) $ git log --oneline Documentation/netlink/genetlink.yaml | wc -l 15 So we decided to just create a new patch consisting of all the required files, shown below: create mode 100644 Documentation/netlink/genetlink.yaml create mode 100644 tools/net/ynl/cli.py create mode 100644 tools/net/ynl/lib/__init__.py create mode 100644 tools/net/ynl/lib/nlspec.py create mode 100644 tools/net/ynl/lib/ynl.py To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2056364/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2056718] Re: openvswitch gentling validation warning: missing .resv_start_op
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2056718 Title: openvswitch gentling validation warning: missing .resv_start_op Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Jammy: Fix Committed Bug description: Intro: == We hit a kernel warning when loading the openvswitch kernel module. Digging into the source code, we found it's due to the code snippet if (WARN_ON(i.cmd >= family->resv_start_op && (i.doit.validate || i.dumpit.validate))) return -EINVAL; in genl_validate_ops() in net/netlink/genetlink.c, introduced in 108880a07bab genetlink: add iterator for walking family ops from the bug about DPLL/SyncE https://bugs.launchpad.net/bugs/2053155 How to fix: === We need to cherry-pick the missing patch Fixes: e4ba4554209f ("net: openvswitch: add missing .resv_start_op") Author: Jakub Kicinski Date: Thu Oct 27 20:25:01 2022 -0700 net: openvswitch: add missing .resv_start_op I missed one of the families in OvS when annotating .resv_start_op. This triggers the warning added in commit ce48ebdd5651 ("genetlink: limit the use of validation workarounds to old ops"). Reported-by: syzbot+40eb8c0447c0e47a7...@syzkaller.appspotmail.com Fixes: 9c5d03d36251 ("genetlink: start to validate reserved header bytes") Link: https://lore.kernel.org/r/20221028032501.2724270-1-k...@kernel.org Signed-off-by: Jakub Kicinski Thanks! How to reproduce: = simply load openvswitch.ko and check dmesg [ 1083.518212] WARNING: CPU: 2 PID: 17269 at net/netlink/genetlink.c:554 genl_validate_ops+0x134/0x254 ...
[ 1083.518306] CPU: 2 PID: 17269 Comm: modprobe Tainted: GW OE 5.15.0-1037.39.10.g319565b-bluefield #g319565b [ 1083.518309] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.7.0.13056 Feb 28 2024 [ 1083.518311] pstate: 0049 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 1083.518313] pc : genl_validate_ops+0x134/0x254 [ 1083.518315] lr : genl_validate_ops+0x68/0x254 [ 1083.518317] sp : 8a773810 [ 1083.518318] x29: 8a773810 x28: 8a773ba0 x27: b1ea36f87318 [ 1083.518321] x26: b1ea36f8cd20 x25: 0001 x24: b1ea36f8cda8 [ 1083.518323] x23: x22: 0001 x21: b1ea36f87210 [ 1083.518325] x20: b1ea36f8b410 x19: 0001 x18: [ 1083.518328] x17: 000d00020008 x16: b1ea4b70c2d0 x15: 003c00010006 [ 1083.518330] x14: 68746170 x13: x12: 0001 [ 1083.518332] x11: x10: x9 : b1ea4b709a5c [ 1083.518335] x8 : x7 : x6 : b1ea4d4218c0 [ 1083.518337] x5 : 0004 x4 : x3 : 0001 [ 1083.518339] x2 : x1 : x0 : 0003 [ 1083.518341] Call trace: [ 1083.518343] genl_validate_ops+0x134/0x254 [ 1083.518344] genl_register_family+0x30/0x1f4 [ 1083.518347] dp_init+0xd4/0x174 [openvswitch] [ 1083.518360] do_one_initcall+0x4c/0x250 [ 1083.518364] do_init_module+0x50/0x260 [ 1083.518368] load_module+0x9fc/0xbe0 [ 1083.518370] __do_sys_finit_module+0xa8/0x114 [ 1083.518372] __arm64_sys_finit_module+0x28/0x3c [ 1083.518375] invoke_syscall+0x78/0x100 [ 1083.518379] el0_svc_common.constprop.0+0x54/0x184 [ 1083.518381] do_el0_svc+0x30/0xac [ 1083.518383] el0_svc+0x48/0x160 [ 1083.518387] el0t_64_sync_handler+0xa4/0x12c [ 1083.518390] el0t_64_sync+0x1a4/0x1a8 [ 1083.518392] ---[ end trace ec4279298c2ae7be ]--- [ 1083.830668] openvswitch: Open vSwitch switching datapath To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2056718/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : 
https://help.launchpad.net/ListHelp
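To illustrate why a family that never sets `.resv_start_op` trips the warning, here is a simplified Python model of the quoted condition. It is not kernel code: the function name and the numeric values are hypothetical, and it only mirrors the boolean expression shown in the bug description, under the assumption that an unset `.resv_start_op` is zero.

```python
def op_triggers_warning(cmd, resv_start_op, doit_validate, dumpit_validate):
    """Model of the quoted check:

    i.cmd >= family->resv_start_op &&
        (i.doit.validate || i.dumpit.validate)
    """
    return cmd >= resv_start_op and bool(doit_validate or dumpit_validate)


if __name__ == "__main__":
    # Family that never set resv_start_op (assumed to be 0): any op that
    # carries validation-workaround flags satisfies cmd >= 0 and is rejected.
    print(op_triggers_warning(cmd=1, resv_start_op=0,
                              doit_validate=1, dumpit_validate=0))

    # Same op once the family declares resv_start_op above its old commands
    # (the value 2 is illustrative): the op counts as "old" and is accepted.
    print(op_triggers_warning(cmd=1, resv_start_op=2,
                              doit_validate=1, dumpit_validate=0))
```

In this model the cherry-picked patch corresponds to moving a family from the first case to the second.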
[Kernel-packages] [Bug 2056718] Re: openvswitch gentling validation warning: missing .resv_start_op
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2056718 Title: openvswitch gentling validation warning: missing .resv_start_op Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Jammy: New Bug description: Intro: == We hit a kernel warning when loading the openvswitch kernel module. Digging into the source code, we found it's due to the code snippet if (WARN_ON(i.cmd >= family->resv_start_op && (i.doit.validate || i.dumpit.validate))) return -EINVAL; in genl_validate_ops() in net/netlink/genetlink.c, introduced in 108880a07bab genetlink: add iterator for walking family ops from the bug about DPLL/SyncE https://bugs.launchpad.net/bugs/2053155 How to fix: === We need to cherry-pick the missing patch Fixes: e4ba4554209f ("net: openvswitch: add missing .resv_start_op") Author: Jakub Kicinski Date: Thu Oct 27 20:25:01 2022 -0700 net: openvswitch: add missing .resv_start_op I missed one of the families in OvS when annotating .resv_start_op. This triggers the warning added in commit ce48ebdd5651 ("genetlink: limit the use of validation workarounds to old ops"). Reported-by: syzbot+40eb8c0447c0e47a7...@syzkaller.appspotmail.com Fixes: 9c5d03d36251 ("genetlink: start to validate reserved header bytes") Link: https://lore.kernel.org/r/20221028032501.2724270-1-k...@kernel.org Signed-off-by: Jakub Kicinski Thanks! How to reproduce: = simply load openvswitch.ko and check dmesg [ 1083.518212] WARNING: CPU: 2 PID: 17269 at net/netlink/genetlink.c:554 genl_validate_ops+0x134/0x254 ...
[ 1083.518306] CPU: 2 PID: 17269 Comm: modprobe Tainted: GW OE 5.15.0-1037.39.10.g319565b-bluefield #g319565b [ 1083.518309] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.7.0.13056 Feb 28 2024 [ 1083.518311] pstate: 0049 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 1083.518313] pc : genl_validate_ops+0x134/0x254 [ 1083.518315] lr : genl_validate_ops+0x68/0x254 [ 1083.518317] sp : 8a773810 [ 1083.518318] x29: 8a773810 x28: 8a773ba0 x27: b1ea36f87318 [ 1083.518321] x26: b1ea36f8cd20 x25: 0001 x24: b1ea36f8cda8 [ 1083.518323] x23: x22: 0001 x21: b1ea36f87210 [ 1083.518325] x20: b1ea36f8b410 x19: 0001 x18: [ 1083.518328] x17: 000d00020008 x16: b1ea4b70c2d0 x15: 003c00010006 [ 1083.518330] x14: 68746170 x13: x12: 0001 [ 1083.518332] x11: x10: x9 : b1ea4b709a5c [ 1083.518335] x8 : x7 : x6 : b1ea4d4218c0 [ 1083.518337] x5 : 0004 x4 : x3 : 0001 [ 1083.518339] x2 : x1 : x0 : 0003 [ 1083.518341] Call trace: [ 1083.518343] genl_validate_ops+0x134/0x254 [ 1083.518344] genl_register_family+0x30/0x1f4 [ 1083.518347] dp_init+0xd4/0x174 [openvswitch] [ 1083.518360] do_one_initcall+0x4c/0x250 [ 1083.518364] do_init_module+0x50/0x260 [ 1083.518368] load_module+0x9fc/0xbe0 [ 1083.518370] __do_sys_finit_module+0xa8/0x114 [ 1083.518372] __arm64_sys_finit_module+0x28/0x3c [ 1083.518375] invoke_syscall+0x78/0x100 [ 1083.518379] el0_svc_common.constprop.0+0x54/0x184 [ 1083.518381] do_el0_svc+0x30/0xac [ 1083.518383] el0_svc+0x48/0x160 [ 1083.518387] el0t_64_sync_handler+0xa4/0x12c [ 1083.518390] el0t_64_sync+0x1a4/0x1a8 [ 1083.518392] ---[ end trace ec4279298c2ae7be ]--- [ 1083.830668] openvswitch: Open vSwitch switching datapath To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2056718/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : 
https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2056364] Re: Add test script for DPLL
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2056364 Title: Add test script for DPLL Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Jammy: New Bug description: * intro In bug 2053155 "Add DPLL and syncE support" below: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2053155 It requires using a yaml spec file, dpll.yaml, and a python script, cli.py, to verify the correctness. ex: $ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \ --dump device-get We've found that the script and spec file are missing in the current repo (Ubuntu-bluefield-5.15.0-1037.39). * how to fix Since the existing Bluefield-5.15 doesn't have the tools/net/ynl directory, cherry-picking the individual patches shouldn't be too hard, since there are no dependencies and most likely no conflicts, but there are around 200 patches in tools/net/ynl $ git log --oneline tools/net/ynl/ | wc -l 205 and for Documentation/netlink/genetlink.yaml (a dependent file for dpll.yaml) $ git log --oneline Documentation/netlink/genetlink.yaml | wc -l 15 So we decided to just create a new patch consisting of all the required files, shown below: create mode 100644 Documentation/netlink/genetlink.yaml create mode 100644 tools/net/ynl/cli.py create mode 100644 tools/net/ynl/lib/__init__.py create mode 100644 tools/net/ynl/lib/nlspec.py create mode 100644 tools/net/ynl/lib/ynl.py To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2056364/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2055086] Re: mlxbf_gige: add support for pause frame counters
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2055086 Title: mlxbf_gige: add support for pause frame counters Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Jammy: New Bug description: SRU Justification: [Impact] The BlueField-2 and BlueField-3 silicon designs include a provision for reporting pause frame counters, but the driver (mlxbf_gige) does not have software support yet. [Fix] The fix is to update the mlxbf_gige driver to support the "get_pause_stats()" callback, which enables display of pause frame counters via "ethtool -I -a oob_net0". Support for this callback was added in Linux kernel 5.9.0. [Test Case] * Reboot BF platform, and configure oob_net0 interface * Issue command "ethtool -I -a oob_net0" to display pause frame counters * Send in multiple streams of traffic to oob_net0 interface, e.g. from two or three simultaneous SCP operations * Re-issue "ethtool -I -a oob_net0" and note difference in output [Regression Potential] The changed logic only extends the ethtool support, so the risk of regression is low. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2055086/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2055052] Re: net/sched: flower: Add lock protection when remove filter handle
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2055052 Title: net/sched: flower: Add lock protection when remove filter handle Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Jammy: New Bug description: * intro As IDR can't protect itself from the concurrent modification, place idr_remove() under the protection of tp->lock. * how to fix need to backport the patch from Jianbo "net/sched: flower: Add lock protection when remove filter handle" https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=1fde0ca3a0de7e9f917668941156959dd5e9108b * reproduce and verification It's hard to reproduce because it is a race condition we found in k8s+ovn deployment. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2055052/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2054845] Re: mlxbf-gige: support fixed phy for Bobcat
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2054845 Title: mlxbf-gige: support fixed phy for Bobcat Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Jammy: New Bug description: SRU Justification: [Impact] Bobcat is a new board which doesn't have an external PHY connected to the OOB. There are no MDIO busses involved and no PHY involved. The OOB is directly connected to the switch (MAC to MAC). So we need to use the Linux fixed PHY to emulate the MDIO behavior and load the ethernet driver. [Fix] * Register the fixed PHY when running on Bobcat. [Test Case] * Important: For testing on the Bobcat board, make sure the corresponding UEFI image is also loaded; otherwise, the oob driver will fail to load. For other boards, it is backward compatible. * Check if the mlxbf-gige is loaded on Bobcat * Check that it is using the fixed-phy via dmesg | grep PHY * check that the oob interface is pingable * check that files can be copied over via oob interface * rmmod/modprobe * ifconfig up/down * reboot test [Regression Potential] * Redo all the OOB testing on other boards (moonraker) and make sure the Bobcat changes don't trigger any regressions. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2054845/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2051871] Re: mlxbf-tmfifo: Drop Tx network packet when Tx TmFIFO is full
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2051871 Title: mlxbf-tmfifo: Drop Tx network packet when Tx TmFIFO is full Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] This is a cherry-pick fix from upstream to drop Tx network packet when Tx TmFIFO is full. [Fix] Starting with the Linux 5.16 kernel, a Tx timeout mechanism was added in the virtio_net driver which prints the "Tx timeout" warning message when a packet stays in the Tx queue for too long. Below is an example of the reported message: "[494105.316739] virtio_net virtio1 tmfifo_net0: TX timeout on queue: 0, sq: output.0, vq: 0x1, name: output.0, usecs since last trans: 3079892256". This issue could happen when the external host driver which drains the FIFO is restarted, stopped or upgraded. To avoid such confusing "Tx timeout" messages, this commit adds logic to drop the outstanding Tx packet if it's not able to transmit in two seconds due to Tx FIFO full, which can be considered as congestion or out-of-resource drop. This commit also handles the special case that the packet is half-transmitted into the Tx FIFO. In such a case, the packet is discarded with remaining length stored in vring->rem_padding. So paddings with zeros can be sent out when Tx space is available to maintain the integrity of the packet format. The padded packet will be dropped on the receiving side. [Test Case] Same functionality and testing as on BlueField-1/2/3. No functionality change. [Regression Potential] Same behavior from user perspective.
To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2051871/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
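The drop-and-pad behavior described in the [Fix] section can be pictured with a small Python model. This is only a sketch of the described logic, not the driver code: `TxVring`, `maybe_drop`, and `pad_when_space` are hypothetical names, and only the two-second threshold and the `rem_padding` field come from the commit description.

```python
TX_TIMEOUT_SECONDS = 2.0  # drop threshold stated in the commit description


class TxVring:
    """Toy stand-in for the driver's vring state (not the kernel structure)."""

    def __init__(self):
        self.rem_padding = 0  # zero bytes still owed for a half-sent packet
        self.dropped = 0      # number of packets dropped so far


def maybe_drop(vring, pkt_len, bytes_sent, stalled_seconds):
    """Drop the outstanding packet once the Tx FIFO has been full too long."""
    if stalled_seconds < TX_TIMEOUT_SECONDS:
        return False  # keep waiting for FIFO space
    if bytes_sent > 0:
        # Half-transmitted packet: remember how much padding is still owed
        # so the packet framing seen by the receiver stays intact.
        vring.rem_padding = pkt_len - bytes_sent
    vring.dropped += 1
    return True


def pad_when_space(vring, fifo_space):
    """Emit as many owed zero bytes as the FIFO can take right now."""
    count = min(vring.rem_padding, fifo_space)
    vring.rem_padding -= count
    return bytes(count)


if __name__ == "__main__":
    ring = TxVring()
    maybe_drop(ring, pkt_len=100, bytes_sent=40, stalled_seconds=2.5)
    print(ring.rem_padding, len(pad_when_space(ring, fifo_space=32)))
```

The padding may go out in several pieces as FIFO space becomes available, which is why the remaining length is kept as state rather than sent in one call.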
[Kernel-packages] [Bug 2053155] Re: Add DPLL and syncE support
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2053155 Title: Add DPLL and syncE support Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: New Bug description: * intro Synchronous Ethernet, or SyncE, is an ITU-T standard for computer networking that facilitates the transference of clock signals over the Ethernet physical layer. It is used to pass timing from node to node and is particularly important for mobile networks. The DPLL subsystem in the Linux kernel provides a general interface for configuring devices that use any kind of Digital PLL. This subsystem is designed to manage the clock signal synchronization of a device with an external clock signal. * Explain the bug(s) We need to support mlx5 SyncE feature. The following patches are needed. [net-next,v8,0/9] Create common DPLL configuration API [net-next,v8,1/9] dpll: documentation on DPLL subsystem interface [net-next,v8,2/9] dpll: spec: Add Netlink spec in YAML [net-next,v8,3/9] dpll: core: Add DPLL framework base functions [net-next,v8,4/9] dpll: netlink: Add DPLL framework base functions [net-next,v8,5/9] netdev: expose DPLL pin handle for netdevice [net-next,v8,6/9] ice: add admin commands to access cgu configuration [net-next,v8,7/9] ice: implement dpll interface to control cgu [net-next,v8,8/9] ptp_ocp: implement DPLL ops [net-next,v8,9/9] mlx5: Implement SyncE support using DPLL infrastructure https://lore.kernel.org/netdev/20230913204943.1051233-1-vadim.fedore...@linux.dev/ * Brief explanation of fixes We identify several dependent patches, especially related to netlink gap between current master-next. We cherry-pick/backport series of patches related to netlink. 
* How to test
$ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \
    --dump device-get
(mlx5 device should show up)
$ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \
    --do pin-get
* list of patches applied to 5.15 jammy
based on
911f816f4c04 mlxbf_gige: fix receive packet race condition
we applied the following
83a11d94c436 UBUNTU: SAUCE: fix build error after resv_start_op
036b2fecd315 genetlink: allow families to use split ops directly
9f40a82f73ea genetlink: inline old iteration helpers
d8ae137b98fc genetlink: use iterator in the op to policy map dumping
095792a3a723 genetlink: add iterator for walking family ops
764747ba0085 genetlink: limit the use of validation workarounds to old ops
35b95f016392 genetlink: inline genl_get_cmd()
c448680485e6 genetlink: support split policies in ctrl_dumppolicy_put_op()
a17efffb8ce8 genetlink: add policies for both doit and dumpit in ctrl_dumppolicy_start()
82af441de2fa genetlink: check for callback type at op load time
dc17c9675d6d genetlink: load policy based on validation flags
d867b1e130d3 genetlink: move the private fields in struct genl_family
b515a3664ef5 genetlink: piggy back on resv_op to default to a reject policy
1fa6e0ec60a4 genetlink: refactor the cmd <> policy mapping dump
c8ba54011c1d netlink: add helpers for extack attr presence checking
3d9c79882136 netlink: add support for ext_ack missing attributes
a0edd1e6e99c netlink: factor out extack composition
181e77073d7c genetlink: introduce split op representation
58077e5acc95 genetlink: reject use of nlmsg_flags for new commands
199935182e01 genetlink: start to validate reserved header bytes
137d6fb11693 kernel: make taskstats available from all net namespaces
75acbd978295 genetlink: fix kdoc warnings
05670bbfb561 dpll: remove leftover mode_supported() op and use mode_get() instead
e878653f5370 dpll: netlink/core: change pin frequency set behavior
d483a8e43651 dpll: netlink/core: add support for pin-dpll signal phase offset/adjust
71574fa963d0 dpll: spec: add support for pin-dpll signal phase offset/adjust
2b004af8572d netlink: specs: remove redundant type keys from attributes in subsets
473b6cb3696f dpll: docs: add support for pin signal phase offset/adjust
015ea1e2fbda Documentation: dpll: wrap DPLL_CMD_PIN_GET output in a code block
5167a94e40e1 Documentation: dpll: Fix code blocks
16ab1010e92d netdev: expose DPLL pin handle for netdevice
aa78df66e66b dpll: netlink: Add DPLL framework base functions
d0eafb969ef9 dpll: core: Add DPLL framework base functions
8bb4159242d6 dpll: spec: Add Netlink spec in YAML
3dbcbef38d8e dpll: documentation on DPLL subsystem interface
To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2053155/+subscriptions
[Kernel-packages] [Bug 2051871] Re: mlxbf-tmfifo: Drop Tx network packet when Tx TmFIFO is full
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2051871 Title: mlxbf-tmfifo: Drop Tx network packet when Tx TmFIFO is full Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: New Bug description: SRU Justification: [Impact] This is a cherry-pick of an upstream fix that drops the Tx network packet when the Tx TmFIFO is full. [Fix] Starting with the Linux 5.16 kernel, a Tx timeout mechanism was added to the virtio_net driver, which prints a "Tx timeout" warning message when a packet stays in the Tx queue for too long. Below is an example of the reported message: "[494105.316739] virtio_net virtio1 tmfifo_net0: TX timeout on queue: 0, sq: output.0, vq: 0x1, name: output.0, usecs since last trans: 3079892256". This issue can happen when the external host driver that drains the FIFO is restarted, stopped or upgraded. To avoid such confusing "Tx timeout" messages, this commit adds logic to drop the outstanding Tx packet if it cannot be transmitted within two seconds because the Tx FIFO is full, which can be considered a congestion or out-of-resource drop. This commit also handles the special case where the packet is half-transmitted into the Tx FIFO. In that case, the packet is discarded and the remaining length is stored in vring->rem_padding, so that zero padding can be sent out when Tx space is available, maintaining the integrity of the packet format. The padded packet will be dropped on the receiving side. [Test Case] Same functionality and testing as on BlueField-1/2/3. No functionality change. [Regression Potential] Same behavior from the user's perspective.
To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2051871/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
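The drop behaviour described in the [Fix] section of this bug can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the driver code: the class TxFifoModel, its fields other than rem_padding, and the constant name TX_TIMEOUT_S are hypothetical. Only the two-second timeout and the idea of storing the remaining length in rem_padding come from the description.

```python
from dataclasses import dataclass

TX_TIMEOUT_S = 2.0  # the description says packets are dropped after two seconds


@dataclass
class TxFifoModel:
    """Toy user-space model of a Tx FIFO (hypothetical, not the driver code)."""
    space: int              # free bytes in the FIFO
    pending: bytes = b""    # unsent remainder of the current packet
    partial: bool = False   # True once part of the current packet is in the FIFO
    rem_padding: int = 0    # zero bytes still owed for a dropped, half-sent packet
    dropped: int = 0        # packets dropped so far
    wire: bytes = b""       # everything written into the FIFO

    def _write(self, data: bytes) -> int:
        """Write as much of data as fits; return the number of bytes written."""
        n = min(len(data), self.space)
        self.wire += data[:n]
        self.space -= n
        return n

    def poll(self, stalled_for: float) -> None:
        """One transmit attempt; stalled_for is seconds without Tx progress."""
        if self.rem_padding:
            # Finish the dropped packet with zeros so the framing stays intact.
            self.rem_padding -= self._write(b"\x00" * self.rem_padding)
            return
        if not self.pending:
            return
        n = self._write(self.pending)
        self.pending = self.pending[n:]
        self.partial = self.partial or n > 0
        if self.pending and stalled_for >= TX_TIMEOUT_S:
            # FIFO still full after the timeout: drop the packet.
            self.dropped += 1
            if self.partial:
                # Half-transmitted: remember how many bytes must be padded.
                self.rem_padding = len(self.pending)
            self.pending = b""
        if not self.pending:
            self.partial = False
```

In this model a packet that never got into the FIFO is simply counted as dropped, while a half-sent packet leaves rem_padding set, and the next poll() with free space emits zeros before any new packet is accepted.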
[Kernel-packages] [Bug 2033406] Re: [SRU][J/L/M] UBUNTU: [Packaging] Make WWAN driver a loadable module
$ dpkg -X linux-buildinfo-5.15.0-1033-bluefield_5.15.0-1033.35_arm64.deb unpack $ grep ^wwan unpack/usr/lib/linux/5.15.0-1033-bluefield/modules wwan wwan_hwsim ** Tags removed: verification-needed-jammy-linux-bluefield ** Tags added: verification-done-jammy-linux-bluefield -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2033406 Title: [SRU][J/L/M] UBUNTU: [Packaging] Make WWAN driver a loadable module Status in linux package in Ubuntu: Fix Released Status in linux source package in Jammy: Fix Released Status in linux source package in Lunar: Fix Released Status in linux source package in Mantic: Fix Committed Bug description: == SRU Justification == The CONFIG_WWAN config is set to 'Y' for the generic and most derivative kernels. This is affecting custom driver development for some partners. Change this config to be a loadable module and include it in linux-modules-*. Make this change to the -generic kernels, so that all derivatives inherit it. == Fix == UBUNTU: [Packaging] Make WWAN driver loadable modules == Regression Potential == Medium. This change only affects WWAN; it turns the driver into a loadable module and does not remove it. == Test Case == A test kernel was built with this patch and tested by a partner. It was also compile and boot tested internally. Testing will also be performed on a WWAN device. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2033406/+subscriptions
[Kernel-packages] [Bug 2049930] Re: mlxbf_gige: disable RX_DMA during NAPI poll
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2049930 Title: mlxbf_gige: disable RX_DMA during NAPI poll Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] A regression has been introduced in the latest 22.04 kernel by including the following two commits: the revert of "UBUNTU: SAUCE: Fix OOB handling RX packets in heavy traffic" and the addition of upstream "mlxbf_gige: fix receive packet race condition". The regression manifests as connectivity problems on the "oob_net0" interface. [Fix] Modify the GIGE driver to include the RX_DMA disable logic in its NAPI poll routine. This logic was part of "UBUNTU: SAUCE: Fix OOB handling RX packets in heavy traffic" but is not part of upstream "mlxbf_gige: fix receive packet race condition". [Test Case] Boot the BF3 platform and ensure the oob_net0 interface is up and has an IP address. Bring up many IP interfaces, including oob_net0, and make sure the OOB interface is continually accessible (e.g. via ping). [Regression Potential] Low-risk change, adding back in prior logic. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2049930/+subscriptions
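The "continually accessible (e.g. via ping)" check in the test case above could be scripted roughly as follows. This is a sketch, not part of the SRU: the helper names ping_once and longest_outage are made up for the example, and it assumes the Linux iputils ping with its -c (count) and -W (reply timeout in seconds) options is available on the machine running the test.

```python
import subprocess
from typing import Iterable


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo request with the Linux iputils ping; True on a reply."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def longest_outage(replies: Iterable[bool]) -> int:
    """Return the longest run of consecutive missed replies."""
    longest = current = 0
    for ok in replies:
        current = 0 if ok else current + 1
        longest = max(longest, current)
    return longest
```

A caller would collect ping_once() results against the oob_net0 address of the card while the other interfaces are brought up, then look at longest_outage() of that sequence; a value of 0 means no reply was missed during the run.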
[Kernel-packages] [Bug 2049930] Re: mlxbf_gige: disable RX_DMA during NAPI poll
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2049930 Title: mlxbf_gige: disable RX_DMA during NAPI poll Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: New Bug description: SRU Justification: [Impact] A regression has been introduced in the latest 22.04 kernel by including the following two commits: the revert of "UBUNTU: SAUCE: Fix OOB handling RX packets in heavy traffic" and the addition of upstream "mlxbf_gige: fix receive packet race condition". The regression manifests as connectivity problems on the "oob_net0" interface. [Fix] Modify the GIGE driver to include the RX_DMA disable logic in its NAPI poll routine. This logic was part of "UBUNTU: SAUCE: Fix OOB handling RX packets in heavy traffic" but is not part of upstream "mlxbf_gige: fix receive packet race condition". [Test Case] Boot the BF3 platform and ensure the oob_net0 interface is up and has an IP address. Bring up many IP interfaces, including oob_net0, and make sure the OOB interface is continually accessible (e.g. via ping). [Regression Potential] Low-risk change, adding back in prior logic. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2049930/+subscriptions
[Kernel-packages] [Bug 2044427] Re: Kernel panic in restart driver after configuring IPsec full offload
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2044427 Title: Kernel panic in restart driver after configuring IPsec full offload Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: Bug description: Restarting the driver with IPsec full offload transparent mode configuration causes kernel panic. Kernel version is linux-bluefield 5.15 Test step: 1) configure xfrm rules 2) configure VF 3) configure FW steering mode 4) restart driver 5) check dmesg Test result: [ 937.989359] [ cut here ] [ 937.989786] WARNING: CPU: 11 PID: 60463 at /tmp/23.10-0.1.8/6.5.0-rc6_mlnx/fedora_32/mlnx-ofa_kernel/BUILD/mlnx-ofa_kernel-23.10/obj/default/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c:1828 mlx5e_accel_ipsec_fs_cleanup+0x298/0x2b0 [mlx5_core] [ 937.991650] Modules linked in: esp4_offload esp4 esp6_offload esp6 act_tunnel_key vxlan act_mirred act_skbedit cls_matchall act_gact cls_flower sch_ingress vringh vhost_iotlb udp_diag tcp_diag inet_diag iptable_raw mst_pciconf(OE) bonding ip6_gre ip6_tunnel tunnel6 vfio_pci vfio_pci_core vfio_iommu_type1 vfio ipip tunnel4 geneve ip6_udp_tunnel udp_tunnel ip_gre ip_tunnel gre rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) ib_uverbs(OE) mlx5_core(OE-) mlxdevm(OE) ib_core(OE) mlx_compat(OE) mlxfw(OE) memtrack(OE) pci_hyperv_intf openvswitch nsh nf_conncount nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_nat br_netfilter bridge stp llc rfkill overlay kvm_intel sch_fq_codel kvm iTCO_wdt irqbypass iTCO_vendor_support crc32_pclmul pcspkr ghash_clmulni_intel i2c_i801 lpc_ich sha512_ssse3 i2c_smbus 
mfd_core sunrpc drm i2c_ core ip_tables crc32c_intel serio_raw [ 937.991698] fuse virtio_net net_failover failover [last unloaded: vdpa] [ 937.999155] CPU: 11 PID: 60463 Comm: modprobe Tainted: G OE 6.5.0-rc6_mlnx #1 [ 937.999891] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 938.000823] RIP: 0010:mlx5e_accel_ipsec_fs_cleanup+0x298/0x2b0 [mlx5_core] [ 938.001459] Code: f6 45 31 c0 48 89 ea 31 ff e8 d4 d5 df ff 59 e9 8c fe ff ff c3 0f 0b e9 3b fe ff ff 0f 0b e9 e8 fd ff ff 0f 0b e9 07 fe ff ff <0f> 0b e9 65 fe ff ff 0f 0b e9 82 fe ff ff 66 2e 0f 1f 84 00 00 00 [ 938.002949] RSP: 0018:c90001183c08 EFLAGS: 00010202 [ 938.003418] RAX: RBX: 8882f3869c00 RCX: 0001 [ 938.004024] RDX: 82a305c0 RSI: 0002 RDI: 888103aa2b30 [ 938.004624] RBP: 888103aa2d80 R08: 0001 R09: 888100042800 [ 938.005238] R10: 0002 R11: c90001183ba8 R12: 8881312e6800 [ 938.005836] R13: 8881127401a0 R14: 8881312e6800 R15: 888148bbd160 [ 938.006444] FS: 7fd22b82c740() GS:5fac() knlGS: [ 938.009456] CS: 0010 DS: ES: CR0: 80050033 [ 938.009970] CR2: 7f26ca697000 CR3: 00012e73f003 CR4: 00770ee0 [ 938.010568] DR0: DR1: DR2: [ 938.011173] DR3: DR6: fffe0ff0 DR7: 0400 [ 938.011772] PKRU: 5554 [ 938.012065] Call Trace: [ 938.012333] [ 938.012583] ? __warn+0x7d/0x120 [ 938.012921] ? mlx5e_accel_ipsec_fs_cleanup+0x298/0x2b0 [mlx5_core] [ 938.013494] ? report_bug+0xf1/0x1c0 [ 938.013850] ? handle_bug+0x44/0x70 [ 938.014201] ? exc_invalid_op+0x13/0x60 [ 938.014568] ? asm_exc_invalid_op+0x16/0x20 [ 938.014970] ? mlx5e_accel_ipsec_fs_cleanup+0x298/0x2b0 [mlx5_core] [ 938.015532] ? 
mlx5e_accel_ipsec_fs_cleanup+0xf2/0x2b0 [mlx5_core] [ 938.016093] mlx5e_ipsec_cleanup+0x1e/0x100 [mlx5_core] [ 938.016594] mlx5e_detach_netdev+0x46/0x80 [mlx5_core] [ 938.017098] mlx5e_vport_rep_unload+0x147/0x1a0 [mlx5_core] [ 938.017623] mlx5_eswitch_unregister_vport_reps+0x13e/0x190 [mlx5_core] [ 938.018221] auxiliary_bus_remove+0x18/0x30 [ 938.018616] device_release_driver_internal+0xaa/0x130 [ 938.019076] bus_remove_device+0xc3/0x130 [ 938.019451] device_del+0x157/0x380 [ 938.019792] ? kobject_put+0xb3/0x200 [ 938.020153] delete_drivers+0x72/0xa0 [mlx5_core] [ 938.020608] mlx5_unregister_device+0x34/0x70 [mlx5_core] [ 938.021113] mlx5_uninit_one+0x25/0x130 [mlx5_core] [ 938.021572] remove_one+0x72/0xc0 [mlx5_core] [ 938.022002] pci_device_remove+0x31/0xb0 [
[Kernel-packages] [Bug 2045919] Re: mlxbf-bootctl: fails to report info about cards using dev keys
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2045919 Title: mlxbf-bootctl: fails to report info about cards using dev keys Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] There is a gap in the mlxbf-bootctl driver logic, resulting in a failure to report when a secure-boot-enabled card is using development keys. The program instead reports the lifecycle state as "GA Secured", which is misleading. [Fix] Bring in the following upstream commit from linux-next: "mlxbf-bootctl: correctly identify secure boot with development keys". [Test Case] Enable secure boot on a BF3 card that is using development keys. Log in as root on the BF3. Run the command "mlxbf-bootctl". Verify that the "lifecycle state" row shows "Secured (development)". [Regression Potential] Low-risk change, cherry-pick of an upstream-approved commit. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2045919/+subscriptions
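The last two steps of the test case above (run "mlxbf-bootctl", check the "lifecycle state" row) can be automated along these lines. This is a sketch only: the function names are invented for the example, and the parser assumes each row of the tool's output has the form "name: value", which is a guess about the layout rather than something stated in the bug.

```python
import subprocess
from typing import Optional


def lifecycle_state(output: str) -> Optional[str]:
    """Return the value of the 'lifecycle state' row, or None if it is absent.

    Assumes rows look like 'name: value'; the real layout may differ.
    """
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "lifecycle state":
            return value.strip()
    return None


def uses_development_keys() -> bool:
    """Run mlxbf-bootctl (as root on the card) and check the reported state."""
    out = subprocess.run(
        ["mlxbf-bootctl"], capture_output=True, text=True, check=True
    ).stdout
    return lifecycle_state(out) == "Secured (development)"
```

On a kernel without the fix, the same check would be expected to fail, because the row reads "GA Secured" instead.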
[Kernel-packages] [Bug 2047853] Re: mlxbf_gige: replace SAUCE patch for RX race condition
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2047853 Title: mlxbf_gige: replace SAUCE patch for RX race condition Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] There is a SAUCE patch that addresses a mlxbf-gige RX race condition: "UBUNTU: SAUCE: Fix OOB handling RX packets in heavy traffic". This SAUCE patch should be reverted and replaced with upstream content. [Fix] Bring in the following upstream commit from linux-next: "mlxbf_gige: fix receive packet race condition". [Test Case] Boot the BF3 platform and ensure the oob_net0 interface is up and has an IP address. Send a heavy level of traffic into the BF3 oob_net0 and make sure no stalls occur. [Regression Potential] Low-risk change, cherry-pick of an upstream-approved commit. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2047853/+subscriptions
[Kernel-packages] [Bug 2046657] Re: mmc: sdhci-of-dwcmshc: Replace sauce patches with upstream commits
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2046657 Title: mmc: sdhci-of-dwcmshc: Replace sauce patches with upstream commits Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] The sdhci-of-dwcmshc driver in the Jammy repo includes some SAUCE patches. These need to be replaced. [Fix] The fix is to revert the four SAUCE patches, replacing them with upstream commits for the same functionality. [Test Case] * Boot BF3 platform, verify no new errors [Regression Potential] The upstream commits are not exactly the same as the SAUCE patches, so technically there is a chance of regression, but the change has been well tested and the functionality is the same. [Other] n/a To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2046657/+subscriptions
[Kernel-packages] [Bug 2047853] Re: mlxbf_gige: replace SAUCE patch for RX race condition
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2047853 Title: mlxbf_gige: replace SAUCE patch for RX race condition Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: New Bug description: SRU Justification: [Impact] There is a SAUCE patch that addresses a mlxbf-gige RX race condition: "UBUNTU: SAUCE: Fix OOB handling RX packets in heavy traffic". This SAUCE patch should be reverted and replaced with upstream content. [Fix] Bring in the following upstream commit from linux-next: "mlxbf_gige: fix receive packet race condition". [Test Case] Boot the BF3 platform and ensure the oob_net0 interface is up and has an IP address. Send a heavy level of traffic into the BF3 oob_net0 and make sure no stalls occur. [Regression Potential] Low-risk change, cherry-pick of an upstream-approved commit. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2047853/+subscriptions
[Kernel-packages] [Bug 2046657] Re: mmc: sdhci-of-dwcmshc: Replace sauce patches with upstream commits
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2046657 Title: mmc: sdhci-of-dwcmshc: Replace sauce patches with upstream commits Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: New Bug description: SRU Justification: [Impact] The sdhci-of-dwcmshc driver in the Jammy repo includes some SAUCE patches. These need to be replaced. [Fix] The fix is to revert the four SAUCE patches, replacing them with upstream commits for the same functionality. [Test Case] * Boot BF3 platform, verify no new errors [Regression Potential] The upstream commits are not exactly the same as the SAUCE patches, so technically there is a chance of regression, but the change has been well tested and the functionality is the same. [Other] n/a To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2046657/+subscriptions
[Kernel-packages] [Bug 2044427] Re: Kernel panic in restart driver after configuring IPsec full offload
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2044427 Title: Kernel panic in restart driver after configuring IPsec full offload Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: New Bug description: Bug description: Restarting the driver with IPsec full offload transparent mode configuration causes kernel panic. Kernel version is linux-bluefield 5.15 Test step: 1) configure xfrm rules 2) configure VF 3) configure FW steering mode 4) restart driver 5) check dmesg Test result: [ 937.989359] [ cut here ] [ 937.989786] WARNING: CPU: 11 PID: 60463 at /tmp/23.10-0.1.8/6.5.0-rc6_mlnx/fedora_32/mlnx-ofa_kernel/BUILD/mlnx-ofa_kernel-23.10/obj/default/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c:1828 mlx5e_accel_ipsec_fs_cleanup+0x298/0x2b0 [mlx5_core] [ 937.991650] Modules linked in: esp4_offload esp4 esp6_offload esp6 act_tunnel_key vxlan act_mirred act_skbedit cls_matchall act_gact cls_flower sch_ingress vringh vhost_iotlb udp_diag tcp_diag inet_diag iptable_raw mst_pciconf(OE) bonding ip6_gre ip6_tunnel tunnel6 vfio_pci vfio_pci_core vfio_iommu_type1 vfio ipip tunnel4 geneve ip6_udp_tunnel udp_tunnel ip_gre ip_tunnel gre rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) ib_uverbs(OE) mlx5_core(OE-) mlxdevm(OE) ib_core(OE) mlx_compat(OE) mlxfw(OE) memtrack(OE) pci_hyperv_intf openvswitch nsh nf_conncount nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_nat br_netfilter bridge stp llc rfkill overlay kvm_intel sch_fq_codel kvm iTCO_wdt irqbypass iTCO_vendor_support crc32_pclmul pcspkr 
ghash_clmulni_intel i2c_i801 lpc_ich sha512_ssse3 i2c_smbus mfd_core sunrpc drm i2c_ core ip_tables crc32c_intel serio_raw [ 937.991698] fuse virtio_net net_failover failover [last unloaded: vdpa] [ 937.999155] CPU: 11 PID: 60463 Comm: modprobe Tainted: G OE 6.5.0-rc6_mlnx #1 [ 937.999891] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 938.000823] RIP: 0010:mlx5e_accel_ipsec_fs_cleanup+0x298/0x2b0 [mlx5_core] [ 938.001459] Code: f6 45 31 c0 48 89 ea 31 ff e8 d4 d5 df ff 59 e9 8c fe ff ff c3 0f 0b e9 3b fe ff ff 0f 0b e9 e8 fd ff ff 0f 0b e9 07 fe ff ff <0f> 0b e9 65 fe ff ff 0f 0b e9 82 fe ff ff 66 2e 0f 1f 84 00 00 00 [ 938.002949] RSP: 0018:c90001183c08 EFLAGS: 00010202 [ 938.003418] RAX: RBX: 8882f3869c00 RCX: 0001 [ 938.004024] RDX: 82a305c0 RSI: 0002 RDI: 888103aa2b30 [ 938.004624] RBP: 888103aa2d80 R08: 0001 R09: 888100042800 [ 938.005238] R10: 0002 R11: c90001183ba8 R12: 8881312e6800 [ 938.005836] R13: 8881127401a0 R14: 8881312e6800 R15: 888148bbd160 [ 938.006444] FS: 7fd22b82c740() GS:5fac() knlGS: [ 938.009456] CS: 0010 DS: ES: CR0: 80050033 [ 938.009970] CR2: 7f26ca697000 CR3: 00012e73f003 CR4: 00770ee0 [ 938.010568] DR0: DR1: DR2: [ 938.011173] DR3: DR6: fffe0ff0 DR7: 0400 [ 938.011772] PKRU: 5554 [ 938.012065] Call Trace: [ 938.012333] [ 938.012583] ? __warn+0x7d/0x120 [ 938.012921] ? mlx5e_accel_ipsec_fs_cleanup+0x298/0x2b0 [mlx5_core] [ 938.013494] ? report_bug+0xf1/0x1c0 [ 938.013850] ? handle_bug+0x44/0x70 [ 938.014201] ? exc_invalid_op+0x13/0x60 [ 938.014568] ? asm_exc_invalid_op+0x16/0x20 [ 938.014970] ? mlx5e_accel_ipsec_fs_cleanup+0x298/0x2b0 [mlx5_core] [ 938.015532] ? 
mlx5e_accel_ipsec_fs_cleanup+0xf2/0x2b0 [mlx5_core] [ 938.016093] mlx5e_ipsec_cleanup+0x1e/0x100 [mlx5_core] [ 938.016594] mlx5e_detach_netdev+0x46/0x80 [mlx5_core] [ 938.017098] mlx5e_vport_rep_unload+0x147/0x1a0 [mlx5_core] [ 938.017623] mlx5_eswitch_unregister_vport_reps+0x13e/0x190 [mlx5_core] [ 938.018221] auxiliary_bus_remove+0x18/0x30 [ 938.018616] device_release_driver_internal+0xaa/0x130 [ 938.019076] bus_remove_device+0xc3/0x130 [ 938.019451] device_del+0x157/0x380 [ 938.019792] ? kobject_put+0xb3/0x200 [ 938.020153] delete_drivers+0x72/0xa0 [mlx5_core] [ 938.020608] mlx5_unregister_device+0x34/0x70 [mlx5_core] [ 938.021113] mlx5_uninit_one+0x25/0x130 [mlx5_core] [ 938.021572] remove_one+0x72/0xc0
[Kernel-packages] [Bug 2045919] Re: mlxbf-bootctl: fails to report info about cards using dev keys
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2045919 Title: mlxbf-bootctl: fails to report info about cards using dev keys Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: New Bug description: SRU Justification: [Impact] There is a gap in the mlxbf-bootctl driver logic, resulting in a failure to report when a secure-boot-enabled card is using development keys. The program instead reports the lifecycle state as "GA Secured", which is misleading. [Fix] Bring in the following upstream commit from linux-next: "mlxbf-bootctl: correctly identify secure boot with development keys". [Test Case] Enable secure boot on a BF3 card that is using development keys. Log in as root on the BF3. Run the command "mlxbf-bootctl". Verify that the "lifecycle state" row shows "Secured (development)". [Regression Potential] Low-risk change, cherry-pick of an upstream-approved commit. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2045919/+subscriptions
[Kernel-packages] [Bug 2041996] Re: pwr-mlxbf: Several bug fixes for focal
** Changed in: linux-bluefield (Ubuntu Focal) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2041996 Title: pwr-mlxbf: Several bug fixes for focal Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Focal: Fix Committed Bug description: SRU Justification: [Impact] Several changes need to be made to pwr-mlxbf in focal: * There is a race condition between the gpio-mlxbf2.c driver being loaded and pwr-mlxbf.c being loaded * When the module is removed, there is a panic due to a NULL pointer access * Soft reset needs to be replaced by graceful reboot [Fix] * Fix the race condition between the gpio-mlxbf2.c driver being loaded and pwr-mlxbf.c being loaded * Fix the panic due to a NULL pointer access when the driver is removed via rmmod * Support graceful reboot instead of soft reset [Test Case] * All test cases are for BF2: * Trigger the GPIO toggling from the BMC: ipmitool raw 0x32 0xA1 0x02 This should trigger a graceful reboot of the DPU. * rmmod/modprobe * Reboot test; make sure the driver is always loaded [Regression Potential] * Run the 100 reboot test and make sure that the driver is loaded with no issues. Also verify that the GPIO graceful reboot works. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2041996/+subscriptions
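For the "make sure the driver is always loaded" step of the reboot test above, a per-boot check could look like the sketch below. It is an illustration, not part of the SRU: the function names are invented, and it assumes the driver is built as a module that appears in /proc/modules as pwr_mlxbf (the kernel lists module names with underscores in place of hyphens).

```python
from pathlib import Path


def module_loaded(name: str, proc_modules: str) -> bool:
    """Check whether a module is listed in the given /proc/modules contents."""
    wanted = name.replace("-", "_")  # /proc/modules uses underscores
    for line in proc_modules.splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False


def pwr_mlxbf_loaded() -> bool:
    """Read /proc/modules on the running system and look for pwr_mlxbf."""
    return module_loaded("pwr-mlxbf", Path("/proc/modules").read_text())
```

Running pwr_mlxbf_loaded() once after each reboot of the loop, and failing the run on the first False, would cover the load-order race described in the [Impact] section.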
[Kernel-packages] [Bug 2041996] Re: pwr-mlxbf: Several bug fixes for focal
** Also affects: linux-bluefield (Ubuntu Focal) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2041996 Title: pwr-mlxbf: Several bug fixes for focal Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Focal: New Bug description: SRU Justification: [Impact] Several changes need to be made to pwr-mlxbf in focal: * There is a race condition between the gpio-mlxbf2.c driver being loaded and pwr-mlxbf.c being loaded * When the module is removed, there is a panic due to a NULL pointer access * Soft reset needs to be replaced by graceful reboot [Fix] * Fix the race condition between the gpio-mlxbf2.c driver being loaded and pwr-mlxbf.c being loaded * Fix the panic due to a NULL pointer access when the driver is removed via rmmod * Support graceful reboot instead of soft reset [Test Case] * All test cases are for BF2: * Trigger the GPIO toggling from the BMC: ipmitool raw 0x32 0xA1 0x02 This should trigger a graceful reboot of the DPU. * rmmod/modprobe * Reboot test; make sure the driver is always loaded [Regression Potential] * Run the 100 reboot test and make sure that the driver is loaded with no issues. Also verify that the GPIO graceful reboot works. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2041996/+subscriptions
[Kernel-packages] [Bug 2043005] Re: mlxbf-pmc: Fix crspace event offsets
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2043005 Title: mlxbf-pmc: Fix crspace event offsets Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] Currently, the offsets calculated for programming crspace type events on BlueField-3 are incorrect and this leads to a mismatch of counter numbers between the driver and actual HW where, for example, event2 in the driver is actually event4 in HW. [Fix] Fix the offset calculation for programming crspace events. [Test Case] On a BlueField-3 platform, program any valid event number to the event sysfs and check that the corresponding counter value increments. [Regression Potential] This is a bug fix and the regression potential can be considered minimal. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2043005/+subscriptions
[Kernel-packages] [Bug 2043005] Re: mlxbf-pmc: Fix crspace event offsets
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2043005 Title: mlxbf-pmc: Fix crspace event offsets Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: New Bug description: SRU Justification: [Impact] Currently, the offsets calculated for programming crspace type events on BlueField-3 are incorrect and this leads to a mismatch of counter numbers between the driver and actual HW where, for example, event2 in the driver is actually event4 in HW. [Fix] Fix the offset calculation for programming crspace events. [Test Case] On a BlueField-3 platform, program any valid event number to the event sysfs and check that the corresponding counter value increments. [Regression Potential] This is a bug fix and the regression potential can be considered minimal. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2043005/+subscriptions
[Kernel-packages] [Bug 2042527] Re: mlxbf-pmc: add support for clock_measure counters
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2042527

Title:
  mlxbf-pmc: add support for clock_measure counters

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  Currently, the counters under the clock_measure HW block are not
  supported by the driver. Supporting them is a new requirement for
  calculating performance metrics.

  [Fix]
  Add support for the "clock_measure" block by passing the block info
  via ACPI. The events monitored by these counters are added to the
  driver and exposed via a new sysfs sub-directory for this block.

  [Test Case]
  Read any of the registers listed under the "clock_measure"
  sub-directory. These fields are read-only and hence all writes will be
  blocked by the driver.

  [Regression Potential]
  Can be considered minimal.
[Kernel-packages] [Bug 2041995] Re: pwr-mlxbf: support graceful reboot instead of soft reset
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2041995

Title:
  pwr-mlxbf: support graceful reboot instead of soft reset

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  There is a new feature request to replace the soft reset with a
  graceful reboot. We will use ACPI events triggered by the IRQ in the
  pwr-mlxbf file to trigger the graceful reboot.

  [Fix]
  * Change the handling of GPIO7 (BF2) and GPIO6 (BF3). This GPIO will
    trigger a graceful reboot instead of a forced reboot (soft reset).
    This is an ACPI event.

  [Test Case]
  * Trigger the GPIO toggling from the BMC:
      ipmitool raw 0x32 0xA1 0x02
    This should trigger a graceful reboot of the DPU.

  [Regression Potential]
  * Make sure it works on BF2 and BF3.
[Kernel-packages] [Bug 2042525] Re: mlxbf-pmc: Support 64-bit counter and counting cycles
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu)
       Status: New => Invalid

** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2042525

Title:
  mlxbf-pmc: Support 64-bit counter and counting cycles

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  Currently, there is no method to keep track of the cycle count when an
  event is being monitored. Because the cycle count increments quickly,
  32-bit counter values can wrap around, so support for 64-bit counters
  is also needed.

  [Fix]
  Expose 2 additional sysfs entries: count_clock and use_odd_counter.
  These fields are supported in BlueField-3 PMC hardware. Each bit in
  count_clock corresponds to one counter, while each bit in
  use_odd_counter corresponds to an even counter. Exposing these fields
  allows the user to program any counter of choice to monitor the cycle
  count for the required duration. Similarly, use_odd_counter can be set
  to couple 2 adjacent odd and even counters to form a 64-bit counter.

  [Test Case]
  1. Verify that count_clock and use_odd_counter sysfs entries are
     created for each BlueField-3 HW block. These are not supported by
     BlueField-1 or BlueField-2 HW.
  2. Set any bit in count_clock and check if the corresponding counter
     values increment after enabling.
  3. Set any bit in use_odd_counter and check if the cycle count
     increments on 2 counters with the odd counter being the lower 32
     bits and the even counter being the upper 32 bits.

  [Regression Potential]
  Can be considered minimal.
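The counter pairing in test step 3 of bug 2042525 can be sketched in shell. This is only an illustration: `combine_counters` is a hypothetical helper, not part of the driver, and the two readings are passed as plain decimal arguments because the exact sysfs paths are not given in the report. It assumes a shell with 64-bit signed arithmetic (bash and dash on 64-bit systems), so combined values at or above 2^63 would print as negative numbers.

```shell
# Hypothetical helper: combine two 32-bit counter readings into one
# 64-bit value. Per the bug description, the even counter holds the
# upper 32 bits and the odd counter holds the lower 32 bits.
combine_counters() {
    # $1 = even counter reading (upper half)
    # $2 = odd counter reading (lower half)
    echo $(( ($1 << 32) | ($2 & 0xFFFFFFFF) ))
}

# Example: upper half 1, lower half 2
combine_counters 1 2    # prints 4294967298
```

Masking the lower half with 0xFFFFFFFF keeps a stray wider value from leaking into the upper 32 bits.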
[Kernel-packages] [Bug 2042527] Re: mlxbf-pmc: add support for clock_measure counters
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu)
       Status: New => Invalid

https://bugs.launchpad.net/bugs/2042527

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  New
[Kernel-packages] [Bug 2041995] Re: pwr-mlxbf: support graceful reboot instead of soft reset
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu)
       Status: New => Invalid

https://bugs.launchpad.net/bugs/2041995

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  New
[Kernel-packages] [Bug 2042455] Re: Devlink backport: Fix mlx5 driver hangs due to mlx5_sf_hw_table_init
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2042455

Title:
  Devlink backport: Fix mlx5 driver hangs due to mlx5_sf_hw_table_init

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  Summary:
  Machine hangs when loading OFED 2310 mlx5 driver at BlueField

  How to reproduce:
  # load the OFED driver

  Reason:
  BF got stuck and observed call trace "mlx5_sf_hw_table_init+0xf4/0x2d0 [mlx5_core]"

  dmesg from minicom:
  [ 726.569928] INFO: task systemd-udevd:297 blocked for more than 604 seconds.
  [ 726.576895] Tainted: G OE 5.15.0-1029-bluefield #31-Ubuntu
  [ 726.584101] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 726.591913] task:systemd-udevd state:D stack:0 pid: 297 ppid: 280 flags:0x000d
  [ 726.600248] Call trace:
  [ 726.602680] __switch_to+0xf8/0x150
  [ 726.606159] __schedule+0x2b8/0x790
  [ 726.609634] schedule+0x64/0x140
  [ 726.612850] schedule_preempt_disabled+0x18/0x24
  [ 726.617453] __mutex_lock.constprop.0+0x1a0/0x680
  [ 726.622141] __mutex_lock_slowpath+0x40/0x90
  [ 726.626396] mutex_lock+0x64/0x70
  [ 726.629695] devlink_resource_register+0x50/0x1a0
  [ 726.634386] mlx5_sf_hw_table_init+0xf4/0x2d0 [mlx5_core]
  [ 726.639882] mlx5_init_one_devl_locked+0x1c8/0x784 [mlx5_core]
  [ 726.645791] probe_one+0x300/0x5f0 [mlx5_core]
  [ 726.650307] local_pci_probe+0x48/0xb4
  [ 726.654043] pci_device_probe+0x18c/0x200
  [ 726.658039] really_probe+0xd0/0x490
  [ 726.661600] __driver_probe_device+0x148/0x190
  [ 726.666029] driver_probe_device+0x48/0x180
  [ 726.670198] __driver_attach+0x104/0x240
  [ 726.674106] bus_for_each_dev+0x78/0xdc
  [ 726.677927] driver_attach+0x2c/0x40
  [ 726.681486] bus_add_driver+0x154/0x270
  [ 726.685307] driver_register+0x80/0x13c
  [ 726.689129] __pci_register_driver+0x4c/0x60
  [ 726.693386] __init_backport+0xf0/0x1000 [mlx5_core]
  [ 726.698425] do_one_initcall+0x4c/0x250
  [ 726.702248] do_init_module+0x50/0x260
  [ 726.705983] load_module+0x9fc/0xbe0
  [ 726.709543] __do_sys_finit_module+0xa8/0x114
  [ 726.713885] __arm64_sys_finit_module+0x28/0x3c
  [ 726.718401] invoke_syscall+0x78/0x100
  [ 726.722137] el0_svc_common.constprop.0+0x54/0x184
  [ 726.726913] do_el0_svc+0x30/0xac
  [ 726.730215] el0_svc+0x48/0x160
  [ 726.733341] el0t_64_sync_handler+0xa4/0x130
  [ 726.737597] el0t_64_sync+0x1a4/0x1a8
  [ 847.401924] INFO: task systemd-udevd:297 blocked for more than 724 seconds.
  [ 847.408891] Tainted: G OE 5.15.0-1029-bluefield #31-Ubuntu

  How to fix:
  This is related to
  https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2039869
  and we need to backport/cherry-pick more patches from the series.

  Patches are below:
  Backport: f655dacb59ac net: devlink: remove unused locked functions
  Backport: 012ec02ae441 netdevsim: convert driver to use unlocked devlink API during init/fini
  Cherry-pick: eb0e9fa2c635 net: devlink: add unlocked variants of devlink_region_create/destroy() functions
  SKIP: 72a4c8c94efa mlxsw: convert driver to use unlocked devlink API during init/fini
  Backport: 70a2ff89369d net: devlink: add unlocked variants of devlink_dpipe*() functions
  Cherry-pick: 755cfa69c4ec net: devlink: add unlocked variants of devlink_sb*() functions
  Cherry-pick: c223d6a4bf6d net: devlink: add unlocked variants of devlink_resource*() functions
  Cherry-pick: 852e85a704c2 net: devlink: add unlocked variants of devling_trap*() functions
  Cherry-pick: e26fde2f5bef net: devlink: avoid false DEADLOCK warning reported by lock

  Thanks!
[Kernel-packages] [Bug 2042455] Re: Devlink backport: Fix mlx5 driver hangs due to mlx5_sf_hw_table_init
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu)
       Status: New => Invalid

https://bugs.launchpad.net/bugs/2042455

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  New
[Kernel-packages] [Bug 2039869] Re: Devlink reload hangs: fix race and lock issue
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: In Progress => Fix Committed

https://bugs.launchpad.net/bugs/2039869

Title:
  Devlink reload hangs: fix race and lock issue

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  Summary:
  Machine hangs when doing devlink reload

  How to reproduce:
  Host:
  [root@bu-lab24v ~]# echo '2' > /sys/class/net/ens2f0np0/device/sriov_numvfs

  Arm:
  root@bu-lab24v-oob:~# uname -r
  5.15.0-1027-bluefield
  root@bu-lab24v-oob:~# devlink dev eswitch set pci/:03:00.0 mode switchdev
  root@bu-lab24v-oob:~# devlink dev reload pci/:03:00.0
  *Hangs*

  Arm dmesg:
  [ 1089.747409] INFO: task devlink:8753 blocked for more than 120 seconds.
  [ 1089.760560] Tainted: G OE 5.15.0-1027-bluefield #29-Ubuntu
  [ 1089.775086] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 1089.790829] task:devlink state:D stack:0 pid: 8753 ppid: 5090 flags:0x0004
  [ 1089.790838] Call trace:
  [ 1089.790840] __switch_to+0xf8/0x150
  [ 1089.790857] __schedule+0x2b8/0x790
  [ 1089.790865] schedule+0x64/0x140
  [ 1089.790870] schedule_preempt_disabled+0x18/0x24
  [ 1089.790874] __mutex_lock.constprop.0+0x1a0/0x680
  [ 1089.790878] __mutex_lock_slowpath+0x40/0x90
  [ 1089.790883] mutex_lock+0x64/0x70
  [ 1089.790887] devl_lock+0x1c/0x30
  [ 1089.790893] mlx5_detach_device+0x58/0x190 [mlx5_core]
  [ 1089.791055] mlx5_unload_one+0x40/0xe4 [mlx5_core]
  [ 1089.791177] mlx5_devlink_reload_down+0x184/0x270 [mlx5_core]
  [ 1089.791318] devlink_reload+0x214/0x290

  Fixes:
  Checking the OFED source code, we found that the missing devl trap
  group support also needs to be backported to avoid the deadlock.

  void mlx5_detach_device(struct mlx5_core_dev *dev, bool suspend)
  {
  ...
  #ifdef HAVE_DEVL_PORT_REGISTER
  #ifdef HAVE_DEVL_TRAP_GROUPS_REGISTER
          devl_assert_locked(priv_to_devlink(dev));
  #else
          devl_lock(devlink);
  #endif /* HAVE_DEVL_TRAP_GROUPS_REGISTER */
  #endif /* HAVE_DEVL_PORT_REGISTER */
          mutex_lock(_intf_mutex);
  #ifdef HAVE_DEVL_PORT_REGISTER

  Related issue:
  #2032378 Devlink backport: fix race and lock issue

  So cherry-pick the patch below:

  commit 852e85a704c2e11c050bdea286bc438aba4f4a22
  Author: Jiri Pirko
  Date: Sat Jul 16 13:02:34 2022 +0200

      net: devlink: add unlocked variants of devling_trap*() functions

      Add unlocked variants of devl_trap*() functions to be used in
      drivers called-in with devlink->lock held.
[Kernel-packages] [Bug 2039869] Re: Devlink reload hangs: fix race and lock issue
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu)
       Status: New => Invalid

** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: Fix Committed => In Progress

https://bugs.launchpad.net/bugs/2039869

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  In Progress
[Kernel-packages] [Bug 2035123] Re: scripts/pahole-flags.sh change return to exit 0
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: In Progress => Fix Committed

https://bugs.launchpad.net/bugs/2035123

Title:
  scripts/pahole-flags.sh change return to exit 0

Status in linux package in Ubuntu:
  Fix Released
Status in linux-bluefield package in Ubuntu:
  Triaged
Status in linux source package in Jammy:
  In Progress
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  BugLink: https://bugs.launchpad.net/bugs/2035123

  [Impact]

  When building the Jammy linux-bluefield kernel tree on a system
  without pahole installed, the following warning is emitted:

  ./scripts/pahole-flags.sh: line 7: return: can only `return' from a
  function or sourced script

  scripts/pahole-flags.sh attempts to return from an if statement that
  is not within a function, and generates a warning.

  The fix is straightforward: change the return to an exit 0.

  --- a/scripts/pahole-flags.sh
  +++ b/scripts/pahole-flags.sh
  @@ -4,7 +4,7 @@ extra_paholeopt=

   if ! [ -x "$(command -v ${PAHOLE})" ]; then
  -       return
  +       exit 0
   fi

  [Testcase]

  Clone the linux-bluefield kernel tree and build it on an arm64 system
  without pahole installed.

  A test kernel is available with the fix applied in:

  https://launchpad.net/~mruffell/+archive/ubuntu/sf368560-test

  Both linux-bluefield and ubuntu-jammy build correctly.

  [Fix]

  This was fixed by Linus Torvalds in the following merge commit:

  commit fc02cb2b37fe2cbf1d3334b9f0f0eab9431766c4
  Merge: bfc484fe6abb 84882cf72cd7
  Author: Linus Torvalds
  Date: Tue Nov 2 06:20:58 2021 -0700
  Subject: Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
  Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fc02cb2b37fe2cbf1d3334b9f0f0eab9431766c4

  Note that the original commit does not include the fix:

  commit 9741e07ece7c247dd65e1aa01e16b683f01c05a8
  Author: Jiri Olsa
  Date: Fri Oct 29 14:57:29 2021 +0200
  Subject: kbuild: Unify options for BTF generation for vmlinux and modules
  Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9741e07ece7c247dd65e1aa01e16b683f01c05a8

  The Ubuntu kernel cherry-picked the original commit and did not pick
  up the silent fix made in the merge commit.

  Submitting the silent fix as a SAUCE patch with a changelog describing
  the change.

  Note: I know SAUCE patches are bad, but in this scenario, reverting
  the initial commit and re-applying the fixed version would require us
  to revert two additional dependency commits, making six patches to
  review, versus a one-line change in a SAUCE commit that has a real
  changelog entry.

  [Where problems could occur]

  The Ubuntu kernel is built with pahole enabled, and requires pahole to
  be installed as a build dependency. It is extremely unlikely that any
  users are disabling pahole at build time, apart from linux-bluefield
  engineers.

  If a regression were to occur, engineers would see errors during build
  time about scripts/pahole-flags.sh not executing properly.

  [Other info]

  Linus remarked about the issue in the following lkml discussion:

  https://lore.kernel.org/lkml/CAHk-=wgdE6=ob5nf60gvryag24mkajbgjf3jpufme1k_upb...@mail.gmail.com/
  https://lore.kernel.org/lkml/CAHk-=wgPZM4bN=LUCrMkG3FX808QSLm6Uv6ixm5P350_7c=x...@mail.gmail.com/

  This was silently incorporated into the linux-stable commit:

  commit 0baced0e0938f2895ceba54038eaf15ed91032e7 5.15.y
  From: Jiri Olsa
  Date: Sun, 4 Sep 2022 15:19:00 +0200
  Subject: kbuild: Unify options for BTF generation for vmlinux and modules
  Link: https://github.com/gregkh/linux/commit/0baced0e0938f2895ceba54038eaf15ed91032e7
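The difference between `return` and `exit 0` described in bug 2035123 can be reproduced outside the kernel tree. The sketch below assumes bash is installed (the quoted warning text is bash's; POSIX leaves a top-level `return` unspecified, so other shells may behave differently). `no-such-tool-xyz` is a made-up command name standing in for a missing pahole binary.

```shell
# Same shape as the check in scripts/pahole-flags.sh, run as a
# standalone (non-sourced) script body under bash. The tool name is a
# placeholder that is assumed not to exist on the system.

# Variant 1: top-level 'return'. bash prints a diagnostic to stderr.
with_return_err=$(bash -c \
    'if ! [ -x "$(command -v no-such-tool-xyz)" ]; then return; fi' 2>&1)

# Variant 2: 'exit 0'. The script stops quietly with success.
with_exit_err=$(bash -c \
    'if ! [ -x "$(command -v no-such-tool-xyz)" ]; then exit 0; fi' 2>&1)
with_exit_status=$?

echo "return variant stderr: ${with_return_err:-<none>}"
echo "exit variant stderr:   ${with_exit_err:-<none>}"
echo "exit variant status:   $with_exit_status"
```

Only the first variant should produce the "can only `return' from a function or sourced script" message, which matches the one-line change in the patch.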
[Kernel-packages] [Bug 2038868] Re: gpio-mlxbf3: support valid mask
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2038868

Title:
  gpio-mlxbf3: support valid mask

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]

  After syncing the gpio-mlxbf3.c driver with the upstreamed version, we
  dropped the use of the valid_mask variable, because kernel versions
  >= 6.2.0 do not need it: there, valid_mask is populated by the core
  gpio code. The 5.15 kernel does not support that feature, so we still
  need to explicitly define valid_mask as we did before. This does not
  impact the functionality of the GPIO driver, but it is a security
  issue because access is no longer restricted to the gpios defined in
  the ACPI table by valid_mask.

  [Fix]

  * Define valid_mask and init_valid_mask.

  [Test Case]

  * Make sure that the user (libgpiod) cannot access any gpio other than
    0->4 (gpiochip0) and 22-23 in gpiochip1.

  [Regression Potential]

  No known regression.
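To make the idea of a valid mask in bug 2038868 concrete, here is a rough shell sketch of the bookkeeping involved. It is not the driver's code (the driver does this in C via valid_mask and init_valid_mask); `build_mask` and `line_allowed` are hypothetical helpers, and the line numbers are simply the ones listed in the test case.

```shell
# Build a bitmask with one bit set per permitted GPIO line.
build_mask() {
    mask=0
    for line in "$@"; do
        mask=$(( mask | (1 << line) ))
    done
    echo "$mask"
}

# Succeed (exit status 0) only when the bit for line $2 is set in mask $1.
line_allowed() {
    [ $(( ($1 >> $2) & 1 )) -eq 1 ]
}

chip0_mask=$(build_mask 0 1 2 3 4)   # lines 0-4 on gpiochip0
chip1_mask=$(build_mask 22 23)       # lines 22-23 on gpiochip1

line_allowed "$chip0_mask" 3  && echo "gpiochip0 line 3: allowed"
line_allowed "$chip0_mask" 5  || echo "gpiochip0 line 5: rejected"
line_allowed "$chip1_mask" 22 && echo "gpiochip1 line 22: allowed"
```

Any line whose bit is clear in the mask is treated as off limits, which is the restriction the test case checks from user space.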
[Kernel-packages] [Bug 2039561] Re: mlxbf_pmc: Replace sauce patches with upstream commits
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2039561

Title:
  mlxbf_pmc: Replace sauce patches with upstream commits

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  The mlxbf_pmc BlueField platform driver in the Jammy repo consists of
  some SAUCE patches. These need to be replaced.

  [Fix]
  The fix is to revert the four SAUCE patches, replacing them with
  upstream commits for the same functionality. 2 patches are from
  upstream linux, while the 3rd is from linux-next.

  [Test Case]
  * Boot BF2/BF3 platform, verify no new errors.
  * Test the sysfs interface exposed by the driver for programming
    various counters and events.

  [Regression Potential]
  The upstream commits are not exactly the same as the SAUCE patches, so
  technically there is a chance of regression, but it has been well
  tested and the functionality is the same.

  [Other]
  n/a
[Kernel-packages] [Bug 2039325] Re: mmc: sdhci-of-dwcmshc: Intermittent data error under stress test
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2039325

Title:
  mmc: sdhci-of-dwcmshc: Intermittent data error under stress test

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  sdhci-dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110

  SRU Justification:

  [Impact]
  This change is needed to avoid the intermittent "sdhci-dwcmshc
  MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110" error under stress
  test.

  [Fix]
  The fix adds a quirk for the BF3 sdhci driver to avoid this data
  error.

  [Test Case]
  1. Same functionality and testing as on BlueField-1/2.
  2. Install the OS on NVMe, and run the FIO test on eMMC.

  fio --name=mmcblk0_randrw_stress_round_1 \
      --ioengine=libaio --direct=1 --time_based=1 \
      --end_fsync=1 --ramp_time=5 --norandommap=1 \
      --randrepeat=0 --group_reporting=1 --numjobs=8 \
      --iodepth=128 --rw=randrw --overwrite=1 \
      --runtime=3600 --bs=4K --filename=/dev/mmcblk0 &

  [Regression Potential]
  Same behavior from user perspective.
[Kernel-packages] [Bug 2032378] Re: Devlink backport: fix race and lock issue
Mark the verification as already done using the new tag.

** Tags removed: verification-needed-jammy-linux-bluefield
** Tags added: verification-done-jammy-linux-bluefield

https://bugs.launchpad.net/bugs/2032378

Title:
  Devlink backport: fix race and lock issue

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  Summary:
  The machine gets stuck when switching to switchdev mode while toggling
  network namespaces on the BF kernel. This is due to the missing
  devlink and driver lock refactor series from kernel 6.0.

  How to test:
  1. Configure SRIOV
  2. Enable switchdev mode
  3. Devlink reload
  4. Add/Del network namespace

  Note: The test needs to be run multiple times to reproduce the issue.

  How to fix:
  Backport a series of devlink patches from the 6.0 upstream kernel into
  the BlueField 5.15 kernel.

  Patches needed for 5.4/5.15 kernel:
  First series: 367dfa121205^..f0680ef0f949
  Second series: 5502e8712c9b^..c90005b5f75c
  Also needed: 644a66c60f02
[Kernel-packages] [Bug 2039325] Re: mmc: sdhci-of-dwcmshc: Intermittent data error under stress test
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2039325 Title: mmc: sdhci-of-dwcmshc: Intermittent data error under stress test Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: New Bug description: sdhci-dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110 SRU Justification: [Impact] This change is needed to avoid intermittent "sdhci-dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110" error under stress test [Fix] The fix added a quirk for the BF3 sdhci driver to avoid such data error. [Test Case] 1. Same functionality and testing as on BlueField-1/2. 2. Install OS on NVME, and run the FIO test on EMMC. fio --name=mmcblk0_randrw_stress_round_1 \ --ioengine=libaio --direct=1 --time_based=1 \ --end_fsync=1 --ramp_time=5 --norandommap=1 \ --randrepeat=0 --group_reporting=1 --numjobs=8 \ --iodepth=128 --rw=randrw --overwrite=1 \ --runtime=3600 --bs=4K --filename=/dev/mmcblk0 & [Regression Potential] Same behavior from user perspective. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2039325/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2035374] Re: pwr-mlxbf: update Kconfig with upstream changes
Mark the verification as already done using the new tag. ** Tags removed: verification-needed-jammy-linux-bluefield ** Tags added: verification-done-jammy-linux-bluefield -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2035374 Title: pwr-mlxbf: update Kconfig with upstream changes Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] The Mellanox BlueField power driver (pwr-mlxbf) has a dependency on either the BlueField-2 or BlueField-3 GPIO driver. The current Kconfig does not have the BlueField-3 dependency. [Fix] There is an upstream commit that extends the Kconfig to include the GPIO_MLXBF3 dependency. [Test Case] Build the power driver with and without GPIO_MLXBF2 and GPIO_MLXBF3 present in the kernel configuration. Verify power driver is built. [Regression Potential] * none [Other] * none To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2035374/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
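For the build test above, a quick way to see which of the relevant options a given kernel was built with is to read its config file. The sketch below is a helper for that, not part of the fix. GPIO_MLXBF2 and GPIO_MLXBF3 come from the bug description; POWER_MLXBF as the symbol for the pwr-mlxbf driver is an assumption and should be checked against the Kconfig in the tree being tested.

```shell
#!/bin/sh
# Print y, m or n for one Kconfig symbol in a kernel config file.
# $1 = config file, $2 = symbol name without the CONFIG_ prefix
config_state() {
    val=$(sed -n "s/^CONFIG_$2=\([ym]\)\$/\1/p" "$1" | head -n 1)
    # Symbols that are unset or absent are reported as "n"
    echo "${val:-n}"
}

# Report the three symbols involved in the pwr-mlxbf dependency.
# POWER_MLXBF is an assumed symbol name (see the note above).
report_pwr_mlxbf() {
    for sym in GPIO_MLXBF2 GPIO_MLXBF3 POWER_MLXBF; do
        printf '%s=%s\n' "$sym" "$(config_state "$1" "$sym")"
    done
}
```

On a running system this could be called as `report_pwr_mlxbf "/boot/config-$(uname -r)"`.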
[Kernel-packages] [Bug 2033439] Re: Cherry-pick gpio and pinctrl drivers from upstream
Mark the verification as already done using the new tag.

** Tags removed: verification-needed-jammy-linux-bluefield

** Tags added: verification-done-jammy-linux-bluefield

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2033439

Title:
  Cherry-pick gpio and pinctrl drivers from upstream

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  Cherry-pick the gpio-mlxbf3.c and pinctrl-mlxbf3.c patches from linux-next.

  [Fix]
  * Revert existing pinctrl-mlxbf3.c driver changes
  * Revert existing gpio-mlxbf3.c driver changes
  * Cherry-pick all the following commits from linux-next:

  gpio-mlxbf3.c:
  cd33f216d241520385a5166ae73a0771197a9f0b
  38a700efc51080c7184f71edbf5e49561da9754f
  1d2a22fa6d2511d5871d87c15b4fe7a944fe3b2a

  pinctrl-mlxbf3.c:
  d11f932808dc689717e409bbc81b5093e7902fc9
  743d3336029ffe2bb38e982a3b572ced243c6d43
  c0f84760b01e8d8b59e9e186a4f7fa8f081a4488
  69657e60b8a7faf83b583c658ec7ce1f5ece9eb3

  [Test Case]
  * All test cases are for BF3 only
  * Check that the gpio-mlxbf3 driver gets loaded at boot time without issues
  * Check that the pinctrl-mlxbf3 driver gets loaded at boot time without issues
  * Use libgpiod to test read and write access to gpio pins 0 through 4
  * rmmod and modprobe both drivers
  * Check that the pwr-mlxbf driver is loaded successfully
  * Check that the mlxbf-gige driver is loaded successfully and the irq is initialized properly (dmesg | grep PHY)

  [Regression Potential]
  * We introduced a dependency of the gpio-mlxbf3 driver on pinctrl-mlxbf3, so we need to make sure that this doesn't trigger any regressions when loading other dependent drivers
  * Make sure that removing or reloading the driver, or restarting the DPU, doesn't cause any panic related to these drivers.
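The "driver is loaded" checks in the test case can be scripted by looking the module up in /proc/modules. The helper below is a sketch written for this note; it assumes the drivers are built as modules (a built-in driver does not appear in /proc/modules) and relies on the kernel listing module names with underscores in place of dashes.

```shell
#!/bin/sh
# Succeed if the named module is present in the modules list.
# $1 = module name, dashes or underscores accepted
# $2 = file in /proc/modules format (defaults to /proc/modules)
module_loaded() {
    name=$(printf '%s' "$1" | tr '-' '_')
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

# Print one status line per driver named in the test case.
check_bf3_drivers() {
    for mod in gpio-mlxbf3 pinctrl-mlxbf3 pwr-mlxbf mlxbf-gige; do
        if module_loaded "$mod" "${1:-/proc/modules}"; then
            echo "$mod: loaded"
        else
            echo "$mod: NOT loaded"
        fi
    done
}
```

Running `check_bf3_drivers` after boot, and again after the rmmod/modprobe cycle, gives a quick pass/fail view of the loading steps.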
[Kernel-packages] [Bug 2033551] Re: mlxbf_bootctl: replace SAUCE patches with upstream
Mark the verification as already done using the new tag. ** Tags removed: verification-needed-jammy-linux-bluefield ** Tags added: verification-done-jammy-linux-bluefield -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2033551 Title: mlxbf_bootctl: replace SAUCE patches with upstream Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] The mlxbf_bootctl BlueField platform driver in the Jammy repo consists of some SAUCE patches. These need to be replaced. [Fix] The fix is to revert the four SAUCE patches, replacing them with upstream commits for the same functionality. One patch for the mlxbf_bootctl driver is not yet upstreamed, so that patch will remain as SAUCE patch for now. [Test Case] * Boot BF2/BF3 platform, verify no new errors * Program MFG fields via "bfcfg" tool: 1) With current image, use 'bfcfg -d' to display current values 2) Reboot, stopping at UEFI menu, trigger 'Reset MFG Info' 3) With new image, use 'bfcfg' to push new or same MFG fields 4) Verify that the MFG fields are programmed properly via 'bfcfg -d' 5) Repeat process to replace proper MFG fields * Display BlueField boot log 1) cd /sys/bus/platform/devices/MLNXBF04:00 2) cat rsh_log 3) Should see boot log entries relevant to last boot [Regression Potential] The upstream commits are not exactly the same as the SAUCE patches, so technically there is a chance of regression, but its been well-tested and the functionality is the same. [Other] n/a To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2033551/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2034594] Re: mlxbf-ptm: add thermal envelope debugfs node
Mark the verification as already done using the new tag.

** Tags removed: verification-needed-jammy-linux-bluefield

** Tags added: verification-done-jammy-linux-bluefield

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2034594

Title:
  mlxbf-ptm: add thermal envelope debugfs node

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  mlxbf-ptm is a kernel driver that provides a debugfs interface for system software to monitor BlueField devices' power and thermal management parameters.

  [Fix]
  * Add a debugfs node that reports the current thermal envelope set in the thermal state machine

  [Test Case]
  * After installing the kernel module, a debugfs entry at /sys/kernel/debug/mlxbf-ptm will be created
  * The monitors/status folder contains read-only parameters reported from the BlueField device
  * cat /sys/kernel/debug/mlxbf-ptm/temp_envelope shows the current thermal envelope

  [Regression Potential]
  * Minimal, as these are read-only parameters that report BlueField device information.

  [Other]
  * This code is likely to change depending on feedback we receive from maintainers.
[Kernel-packages] [Bug 2039561] Re: mlxbf_pmc: Replace sauce patches with upstream commits
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2039561 Title: mlxbf_pmc: Replace sauce patches with upstream commits Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: New Bug description: SRU Justification: [Impact] The mlxbf_pmc BlueField platform driver in the Jammy repo consists of some SAUCE patches. These need to be replaced. [Fix] The fix is to revert the four SAUCE patches, replacing them with upstream commits for the same functionality. 2 patches are from upstream linux, while the 3rd is from linux-next. [Test Case] * Boot BF2/BF3 platform, verify no new errors * Test sysfs interface exposed by driver for programming various counters and events. [Regression Potential] The upstream commits are not exactly the same as the SAUCE patches, so technically there is a chance of regression, but its been well-tested and the functionality is the same. [Other] n/a To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2039561/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2038868] Re: gpio-mlxbf3: support valid mask
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Changed in: linux-bluefield (Ubuntu)
   Status: New => Invalid

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2038868

Title:
  gpio-mlxbf3: support valid mask

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  New

Bug description:
  SRU Justification:

  [Impact]
  After syncing the gpio-mlxbf3.c driver with the upstream version, we dropped the use of the valid_mask variable, because in kernel versions >= 6.2.0 valid_mask is populated by the core gpio code. The 5.15 kernel doesn't support that feature, so we still need to define valid_mask explicitly, as we did before. The missing mask doesn't affect the functionality of the GPIO driver, but it is a security issue: access is no longer restricted to the gpios defined in the ACPI table by valid_mask.

  [Fix]
  * Define valid_mask and init_valid_mask

  [Test Case]
  * Make sure that the user (libgpiod) cannot access any gpio other than 0-4 in gpiochip0 and 22-23 in gpiochip1.

  [Regression Potential]
  No known regression.
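The test case above amounts to comparing actual access against an expected allow/deny table. The helper below is a sketch that encodes only the expectation stated in the bug (lines 0-4 on gpiochip0, lines 22-23 on gpiochip1); the function name is made up for this note.

```shell
#!/bin/sh
# Print "allow" if userspace is expected to reach this line once valid_mask
# is restored, "deny" otherwise. The allowed set is taken from the test case.
# $1 = chip name, $2 = line offset
expected_access() {
    chip=$1
    line=$2
    case "$chip:$line" in
        gpiochip0:[0-4]) echo allow ;;
        gpiochip1:2[23]) echo allow ;;
        *) echo deny ;;
    esac
}
```

A test loop could then request each line with a libgpiod tool and compare the result, for example with `gpioget gpiochip0 "$l"`; that invocation assumes the libgpiod 1.x command syntax, and the number of lines to iterate over depends on the chip and is not given in the bug.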
[Kernel-packages] [Bug 2037403] Re: PCI BARs larger than 128GB are disabled
This bug is awaiting verification that the linux- bluefield/5.15.0-1027.29 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification- done-jammy'. If the problem still exists, change the tag 'verification- needed-jammy' to 'verification-failed-jammy'. If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you! ** Tags added: verification-needed-jammy -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2037403 Title: PCI BARs larger than 128GB are disabled Status in linux package in Ubuntu: Incomplete Status in linux-bluefield package in Ubuntu: New Status in linux source package in Jammy: Fix Committed Status in linux-bluefield source package in Jammy: Fix Committed Bug description: [SRU Justification] [Impact] Current linux-bluefield kernel reports that PCI BARs larger than 128GB are disabled. Increase the maximum PCI BAR size from 128GB to 8TB for future expansion. [Test Plan] Use updated kernel with PCI device with PCI BARs larger than 128GB. [Where problems could occur] No potential problems are expected. [Other Info] Testing has been provided by the customer. This patch is a cherry-pick from the upstream 5.18 kernel. It is intended for linux-bluefield kernel re-spin but it may be also suitable for the generic kernel (separate submission). 
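Before reporting a verification result, it helps to confirm that the machine is actually running the kernel named in the request. The sketch below derives the expected `uname -r` string from the package version; it assumes the usual Ubuntu naming, where the last dotted component of the package version is the upload number and the flavour name is appended to the release string.

```shell
#!/bin/sh
# Convert a package version such as 5.15.0-1027.29 plus a flavour into the
# release string the booted kernel is expected to report.
# $1 = package version, $2 = flavour
expected_uname() {
    printf '%s-%s\n' "${1%.*}" "$2"
}

# Succeed if the running (or given) release matches the proposed package.
# $3 = release string to check (defaults to the output of uname -r)
running_proposed() {
    [ "${3:-$(uname -r)}" = "$(expected_uname "$1" "$2")" ]
}
```

For this bug the check would be `running_proposed 5.15.0-1027.29 bluefield`, run after installing the kernel from -proposed and rebooting.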
[Kernel-packages] [Bug 2035128] Re: mlxbf-gige: Enable the OOB port in mlxbf_gige_open
This bug is awaiting verification that the linux-bluefield/5.4.0-1072.78 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

** Tags added: verification-needed-focal

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2035128

Title:
  mlxbf-gige: Enable the OOB port in mlxbf_gige_open

Status in linux-bluefield package in Ubuntu:
  New
Status in linux-bluefield source package in Focal:
  Fix Committed
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  At the moment, the OOB port is enabled in the mlxbf_gige_probe function. If mlxbf_gige_open is not executed, this can cause pause frames to increase when there is high background traffic, which in turn clogs the BMC port as well.

  [Fix]
  * Move enabling the OOB port to mlxbf_gige_open.

  [Test Case]
  * Main test for this bug: check that the BMC is always pingable while pushing a BFB

  Other tests:
  * Check that the gige driver is loaded
  * Check that the oob_net0 interface is up and operational
  * Do the reboot test and the powercycle test, and check the oob_net0 interface again
  * Push a BFB multiple times and make sure the OOB is up and running

  [Regression Potential]
  Since we are moving code that hasn't moved since BF2, it is important to make sure that there is no regression. Make sure that the OOB interface is always up and pingable after the reboot test, after the powercycle test and after pushing a BFB.
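The "interface is up" part of the checks above can be scripted by reading the interface's operstate from sysfs. This is a minimal sketch; the optional second argument exists only so the helper can be pointed at a different directory, and the BMC address used in the usage note is a placeholder.

```shell
#!/bin/sh
# Succeed if the interface reports operstate "up".
# $1 = interface name, $2 = sysfs net directory (defaults to /sys/class/net)
link_is_up() {
    state_file="${2:-/sys/class/net}/$1/operstate"
    [ -r "$state_file" ] && [ "$(cat "$state_file")" = "up" ]
}
```

After each reboot, powercycle or BFB push, something like `link_is_up oob_net0 && ping -c 3 "$BMC_IP"` covers both the link state and the reachability check, with `BMC_IP` set by the tester.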
[Kernel-packages] [Bug 2037403] Re: PCI BARs larger than 128GB are disabled
** Also affects: linux (Ubuntu) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu Jammy) Status: In Progress => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2037403 Title: PCI BARs larger than 128GB are disabled Status in linux package in Ubuntu: New Status in linux-bluefield package in Ubuntu: New Status in linux source package in Jammy: New Status in linux-bluefield source package in Jammy: Fix Committed Bug description: [SRU Justification] [Impact] Current linux-bluefield kernel reports that PCI BARs larger than 128GB are disabled. Increase the maximum PCI BAR size from 128GB to 8TB for future expansion. [Test Plan] Use updated kernel with PCI device with PCI BARs larger than 128GB. [Where problems could occur] No potential problems are expected. [Other Info] Testing has been provided by the customer. This patch is a cherry-pick from the upstream 5.18 kernel. It is intended for linux-bluefield kernel re-spin but it may be also suitable for the generic kernel (separate submission). To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2037403/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2037403] [NEW] PCI BARs larger than 128GB are disabled
Public bug reported: [SRU Justification] [Impact] Current linux-bluefield kernel reports that PCI BARs larger than 128GB are disabled. Increase the maximum PCI BAR size from 128GB to 8TB for future expansion. [Test Plan] Use updated kernel with PCI device with PCI BARs larger than 128GB. [Where problems could occur] No potential problems are expected. [Other Info] Testing has been provided by the customer. This patch is a cherry-pick from the upstream 5.18 kernel. It is intended for linux-bluefield kernel re-spin but it may be also suitable for the generic kernel (separate submission). ** Affects: linux-bluefield (Ubuntu) Importance: Undecided Status: New ** Affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: In Progress ** Package changed: linux-azure (Ubuntu) => linux-bluefield (Ubuntu) ** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => In Progress -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2037403 Title: PCI BARs larger than 128GB are disabled Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Jammy: In Progress Bug description: [SRU Justification] [Impact] Current linux-bluefield kernel reports that PCI BARs larger than 128GB are disabled. Increase the maximum PCI BAR size from 128GB to 8TB for future expansion. [Test Plan] Use updated kernel with PCI device with PCI BARs larger than 128GB. [Where problems could occur] No potential problems are expected. [Other Info] Testing has been provided by the customer. This patch is a cherry-pick from the upstream 5.18 kernel. It is intended for linux-bluefield kernel re-spin but it may be also suitable for the generic kernel (separate submission). 
[Kernel-packages] [Bug 2032378] Re: Devlink backport: fix race and lock issue
This bug is awaiting verification that the linux- bluefield/5.15.0-1025.27 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification- done-jammy'. If the problem still exists, change the tag 'verification- needed-jammy' to 'verification-failed-jammy'. If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you! ** Tags added: verification-needed-jammy -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2032378 Title: Devlink backport: fix race and lock issue Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: Summary: machine get stuck if try to switch to switchdev mode while toggling ns over BF kernel. This is due to missing lock refactor series devlink and driver from kernel 6.0 How to test: 1. Configure SRIOV 2. Enable switchdev mode 3. Devlink reload 4. Add/Del network namespace Note: Need to run the test multiple times to reproduce the issue. How to fix: Need to backport a series of devlink patches into BlueField 5.15 kernel, from 6.0 upstream kernel. Patches needed for 5.4/5.15 kernel: First series: 367dfa121205^..f0680ef0f949 second series: 5502e8712c9b^..c90005b5f75c Also needed: 644a66c60f02 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2032378/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2033316] Re: UBUNTU: [Config] optee: Adjust CONFIG_OPTEE_SHM_NUM_PRIV_PAGES to 4
This bug is awaiting verification that the linux- bluefield/5.15.0-1025.27 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification- done-jammy'. If the problem still exists, change the tag 'verification- needed-jammy' to 'verification-failed-jammy'. If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you! ** Tags added: verification-needed-jammy -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2033316 Title: UBUNTU: [Config] optee: Adjust CONFIG_OPTEE_SHM_NUM_PRIV_PAGES to 4 Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] This change is needed to OPTEE as the FTPM driver requires 2 contiguous SHMEM pages to boot. Without this change the ftpm_tee_probe function will fail. [Fix] Change the CONFIG_OPTEE_SHM_NUM_PRIV_PAGES from 1 to 4. Note that other SHMEM structures are allocated out of this OPTEE pool as well. [Test Case] Verified the Microsoft TPM 2.0 FTPM is fully functional with this change, running the tpm2-tss Test Suite (mandatory & optional) tests as well as many of the TPM2 Command tools. [Regression Potential] This only has an effect if OPTEE is configured/enabled in the kernel. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2033316/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2033439] Re: Cherry-pick gpio and pinctrl drivers from upstream
This bug is awaiting verification that the linux- bluefield/5.15.0-1025.27 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification- done-jammy'. If the problem still exists, change the tag 'verification- needed-jammy' to 'verification-failed-jammy'. If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you! ** Tags added: verification-needed-jammy -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2033439 Title: Cherry-pick gpio and pinctrl drivers from upstream Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] Cherry pick gpio-mlxbf3.c and pinctrl-mlxbf3.c patches from linux- next. [Fix] * Revert existing pinctrl-mlxbf3.c driver changes * Revert existing gpio-mlxbf3.c driver changes * Cherry-pick all the following commits from linux-next: gpio-mlxbf3.c: cd33f216d241520385a5166ae73a0771197a9f0b 38a700efc51080c7184f71edbf5e49561da9754f 1d2a22fa6d2511d5871d87c15b4fe7a944fe3b2a pinctrl-mlxbf3.c: d11f932808dc689717e409bbc81b5093e7902fc9 743d3336029ffe2bb38e982a3b572ced243c6d43 c0f84760b01e8d8b59e9e186a4f7fa8f081a4488 69657e60b8a7faf83b583c658ec7ce1f5ece9eb3 [Test Case] * All test cases are for BF3 only * Check that the gpio-mlxbf3 driver gets loaded at boot time without issues * Check that the pinctrl-mlxbf3 driver gets loaded at boot time without issues * use libgpiod to test the access to gpio pin 0 through 4 i.e. read and write. 
* rmmod and modprobe of both drivers * Check that pwr-mlxbf driver is loaded successfully * Check that mlxbf-gige driver is loaded successfully and the irq is initialized properly (dmesg | grep PHY) [Regression Potential] * We introduced a dependency of the gpio-mlxbf3 driver on pinctrl-mlxbf3. So we need to make sure that doesnt trigger any regressions with loading other dependent drivers * make sure that removing/reloading the driver/restarting the DPU doesnt cause any panic related to these drivers. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2033439/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2033551] Re: mlxbf_bootctl: replace SAUCE patches with upstream
This bug is awaiting verification that the linux- bluefield/5.15.0-1025.27 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification- done-jammy'. If the problem still exists, change the tag 'verification- needed-jammy' to 'verification-failed-jammy'. If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you! ** Tags added: verification-needed-jammy -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2033551 Title: mlxbf_bootctl: replace SAUCE patches with upstream Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] The mlxbf_bootctl BlueField platform driver in the Jammy repo consists of some SAUCE patches. These need to be replaced. [Fix] The fix is to revert the four SAUCE patches, replacing them with upstream commits for the same functionality. One patch for the mlxbf_bootctl driver is not yet upstreamed, so that patch will remain as SAUCE patch for now. 
[Test Case] * Boot BF2/BF3 platform, verify no new errors * Program MFG fields via "bfcfg" tool: 1) With current image, use 'bfcfg -d' to display current values 2) Reboot, stopping at UEFI menu, trigger 'Reset MFG Info' 3) With new image, use 'bfcfg' to push new or same MFG fields 4) Verify that the MFG fields are programmed properly via 'bfcfg -d' 5) Repeat process to replace proper MFG fields * Display BlueField boot log 1) cd /sys/bus/platform/devices/MLNXBF04:00 2) cat rsh_log 3) Should see boot log entries relevant to last boot [Regression Potential] The upstream commits are not exactly the same as the SAUCE patches, so technically there is a chance of regression, but its been well-tested and the functionality is the same. [Other] n/a To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2033551/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2034578] Re: Support IPSEC full offload implementation
This bug is awaiting verification that the linux- bluefield/5.15.0-1025.27 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification- done-jammy'. If the problem still exists, change the tag 'verification- needed-jammy' to 'verification-failed-jammy'. If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you! ** Tags added: verification-needed-jammy -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2034578 Title: Support IPSEC full offload implementation Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: Summary: Align Kernel IPsec Full offload implementation in the DPU to the upstream Full offload in all components: OFED, Strongswan, etc. This is in order for DPU Kernel IPsec to include policy offload and be fully aligned to what CX Kernel customers will use. 
  How to test:

  Host 1:
  /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/:03:00.0 mode legacy
  echo 'dmfs' > /sys/bus/pci/devices/:03:00.0/net/p0/compat/devlink/steering_mode
  echo 'full' > /sys/class/net/p0/compat/devlink/ipsec_mode
  /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/:03:00.0 mode switchdev

  BF on host 1:
  /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.181.165 dst 196.234.182.166 dir out tmpl src 196.234.181.165/16 dst 196.234.182.166/16 proto esp reqid 0xefa83812 mode transport priority 10
  /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.182.166 dst 196.234.181.165 dir in tmpl src 196.234.182.166/16 dst 196.234.181.165/16 proto esp reqid 0x63a7db74 mode transport priority 10
  /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.182.166 dst 196.234.181.165 dir fwd tmpl src 196.234.182.166/16 dst 196.234.181.165/16 proto esp reqid 0x63a7db74 mode transport priority 10
  /opt/mellanox/iproute2/sbin/ip xfrm state add src 196.234.181.165/16 dst 196.234.182.166/16 proto esp spi 0xefa83812 reqid 0xefa83812 mode transport aead 'rfc4106(gcm(aes))' 0xe2fe3857301d8f72b5d71d295a462ef21868e407 128 offload packet dev p0 dir out sel src 196.234.181.165/16 dst 196.234.182.166/16 flag esn replay-window 32
  /opt/mellanox/iproute2/sbin/ip xfrm state add src 196.234.182.166/16 dst 196.234.181.165/16 proto esp spi 0x63a7db74 reqid 0x63a7db74 mode transport aead 'rfc4106(gcm(aes))' 0xe916c4d0db1886e8c877b023e8cebef53b4d2d0f 128 offload packet dev p0 dir in sel src 196.234.182.166/16 dst 196.234.181.165/16 flag esn replay-window 32

  Start OVS and set following configure on BF:
  /usr/bin/ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
  /usr/bin/ovs-vsctl set Open_vSwitch . other_config:max-idle=30

  Host 2:
  /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/:03:00.1 mode legacy
  echo 'dmfs' > /sys/bus/pci/devices/:03:00.1/net/p1/compat/devlink/steering_mode
  echo 'full' > /sys/class/net/p1/compat/devlink/ipsec_mode
  /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/:03:00.1 mode switchdev

  BF on host 2:
  /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.182.166 dst 196.234.181.165 dir out tmpl src 196.234.182.166/16 dst 196.234.181.165/16 proto esp reqid 0xefa83812 mode transport priority 10
  /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.181.165 dst 196.234.182.166 dir in tmpl src 196.234.181.165/16 dst 196.234.182.166/16 proto esp reqid 0x63a7db74 mode transport priority 10
  /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.181.165 dst 196.234.182.166 dir fwd tmpl src 196.234.181.165/16 dst 196.234.182.166/16 proto esp reqid 0x63a7db74 mode transport priority 10
  /opt/mellanox/iproute2/sbin/ip xfrm state add src 196.234.181.165 dst 196.234.182.166 proto esp spi 0xefa83812 reqid 0xefa83812 mode transport aead 'rfc4106(gcm(aes))' 0xe2fe3857301d8f72b5d71d295a462ef21868e407 128 offload packet dev p0 dir out sel src 196.234.181.165/16 dst 196.234.182.166/16 flag esn replay-window 32
  /opt/mellanox/iproute2/sbin/ip xfrm state add src 196.234.181.165 dst 196.234.182.166 proto esp spi 0x63a7db74 reqid 0x63a7db74 mode transport aead 'rfc4106(gcm(aes))' 0xe916c4d0db1886e8c877b023e8cebef53b4d2d0f 128 offload packet dev p0 dir in sel src 196.234.181.165/16 dst 196.234.182.166/16 flag esn replay-window 32

  Start OVS and set following configure on BF:
  /usr/bin/ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
  /usr/bin/ovs-vsctl set Open_vSwitch . other_config:max-idle=30

  Send the traffic between host 1 and host 2 and check IPsec
[Kernel-packages] [Bug 2034594] Re: mlxbf-ptm: add thermal envelope debugfs node
This bug is awaiting verification that the linux-bluefield/5.15.0-1025.27 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!
** Tags added: verification-needed-jammy
--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2034594
Title: mlxbf-ptm: add thermal envelope debugfs node
Status in linux-bluefield package in Ubuntu: Invalid
Status in linux-bluefield source package in Jammy: Fix Committed
Bug description:
SRU Justification:
[Impact]
mlxbf-ptm is a kernel driver that provides a debugfs interface for system software to monitor Bluefield devices' power and thermal management parameters.
[Fix]
* This change adds an additional debugfs node that reports the current thermal envelope set in the thermal state machine.
[Test Case]
* After installing the kernel module, a debugfs entry at /sys/kernel/debug/mlxbf-ptm will be created
* The monitors/status folder contains read-only parameters reported from the Bluefield device
* cat /sys/kernel/debug/mlxbf-ptm/temp_envelope shows the current thermal envelope
[Regression Potential]
* Minimal, as these are read-only parameters that report Bluefield device information.
[Other]
* This code is likely to change depending on feedback we receive from maintainers.
To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2034594/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
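The test case for bug 2034594 above reads the new node with cat. The same check could be sketched in Python as follows. This is only an illustration: it assumes the node path quoted in the bug description and that debugfs is mounted at /sys/kernel/debug (reading it normally requires root). The helper name read_debugfs_node is hypothetical, and since the report does not say what format the value has, it is returned as raw text.

```python
from pathlib import Path

# Directory quoted in the bug description; the helper below is illustrative only.
PTM_DEBUGFS_ROOT = Path("/sys/kernel/debug/mlxbf-ptm")


def read_debugfs_node(name: str, root: Path = PTM_DEBUGFS_ROOT) -> str:
    """Return the contents of a debugfs node as stripped text.

    Raises FileNotFoundError if the node does not exist, for example
    when the mlxbf-ptm module is not loaded or debugfs is not mounted.
    """
    return (root / name).read_text().strip()
```

Calling read_debugfs_node("temp_envelope") would then correspond to the cat command in the test case. The root parameter exists only so the helper can be pointed at a different directory when debugfs is mounted elsewhere.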
[Kernel-packages] [Bug 2035128] Re: mlxbf-gige: Enable the OOB port in mlxbf_gige_open
This bug is awaiting verification that the linux-bluefield/5.15.0-1025.27 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!
** Tags added: verification-needed-jammy
--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2035128
Title: mlxbf-gige: Enable the OOB port in mlxbf_gige_open
Status in linux-bluefield package in Ubuntu: New
Status in linux-bluefield source package in Focal: Fix Committed
Status in linux-bluefield source package in Jammy: Fix Committed
Bug description:
SRU Justification:
[Impact]
At the moment, the OOB port is enabled in the mlxbf_gige_probe function. If mlxbf_gige_open is not executed, this could cause pause frames to increase when there is high background traffic. This results in clogging the BMC port as well.
[Fix]
* Move enabling the OOB port to mlxbf_gige_open.
[Test Case]
* Main test for this bug: Check that the BMC is always pingable while pushing a BFB
Other tests:
* Check if the gige driver is loaded
* Check that the oob_net0 interface is up and operational
* Do the reboot test and powercycle test and check the oob_net0 interface again
* Push BFB multiple times and make sure the OOB is up and running
[Regression Potential]
Since we are moving code that hasn't moved since BF2, it is important to make sure that there is no regression.
Make sure that the OOB interface is always up and pingable after the reboot test, the powercycle test and after pushing a BFB. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2035128/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
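One of the checks listed for bug 2035128 is that the oob_net0 interface is up. A minimal sketch of that single check, assuming the standard /sys/class/net/<iface>/operstate attribute is what "up" refers to; the function name link_is_up is hypothetical, and this does not replace the ping test against the BMC described in the test case.

```python
from pathlib import Path


def link_is_up(iface: str, sysfs_root: Path = Path("/sys/class/net")) -> bool:
    """Return True if the kernel reports the interface operstate as "up"."""
    operstate = sysfs_root / iface / "operstate"
    try:
        return operstate.read_text().strip() == "up"
    except FileNotFoundError:
        # Interface does not exist, e.g. the mlxbf-gige driver is not loaded.
        return False
```

link_is_up("oob_net0") could be called after each reboot, power cycle, or BFB push in the test loop.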
[Kernel-packages] [Bug 2035374] Re: pwr-mlxbf: update Kconfig with upstream changes
This bug is awaiting verification that the linux-bluefield/5.15.0-1025.27 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!
** Tags added: verification-needed-jammy
--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2035374
Title: pwr-mlxbf: update Kconfig with upstream changes
Status in linux-bluefield package in Ubuntu: Invalid
Status in linux-bluefield source package in Jammy: Fix Committed
Bug description:
SRU Justification:
[Impact]
The Mellanox BlueField power driver (pwr-mlxbf) has a dependency on either the BlueField-2 or BlueField-3 GPIO driver. The current Kconfig does not have the BlueField-3 dependency.
[Fix]
There is an upstream commit that extends the Kconfig to include the GPIO_MLXBF3 dependency.
[Test Case]
Build the power driver with and without GPIO_MLXBF2 and GPIO_MLXBF3 present in the kernel configuration. Verify power driver is built.
[Regression Potential]
* none
[Other]
* none
To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2035374/+subscriptions
--
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp
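The test case for bug 2035374 builds with and without each of the two GPIO options. The four combinations can be written out as a small truth table. This is a sketch under assumptions: the "old" expression is inferred from the statement that the current Kconfig lacks the BlueField-3 dependency, and any other terms the real Kconfig entry may contain are not mentioned in the report and are left out here.

```python
from itertools import product


def old_dependency(gpio_mlxbf2: bool, gpio_mlxbf3: bool) -> bool:
    """Assumed dependency before the fix: only GPIO_MLXBF2 counts."""
    return gpio_mlxbf2


def new_dependency(gpio_mlxbf2: bool, gpio_mlxbf3: bool) -> bool:
    """Dependency after the fix: either GPIO driver satisfies it."""
    return gpio_mlxbf2 or gpio_mlxbf3


def build_matrix():
    """Return (GPIO_MLXBF2, GPIO_MLXBF3, old result, new result) rows."""
    return [
        (bf2, bf3, old_dependency(bf2, bf3), new_dependency(bf2, bf3))
        for bf2, bf3 in product((False, True), repeat=2)
    ]
```

Under these assumptions the only row where the old and new results differ is GPIO_MLXBF3 enabled without GPIO_MLXBF2, which is the BlueField-3-only configuration the fix is meant to cover.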
[Kernel-packages] [Bug 2035128] Re: mlxbf-gige: Enable the OOB port in mlxbf_gige_open
** Changed in: linux-bluefield (Ubuntu Focal) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2035128 Title: mlxbf-gige: Enable the OOB port in mlxbf_gige_open Status in linux-bluefield package in Ubuntu: New Status in linux-bluefield source package in Focal: Fix Committed Status in linux-bluefield source package in Jammy: Fix Committed
[Kernel-packages] [Bug 2016398] Re: stacked overlay file system mounts that have chroot() called against them appear to be getting locked (by the kernel most likely?)
Nothing special for bluefield. I assume that the source package was not built against the previous version and thus the verification has been triggered again.
** Tags removed: verification-needed-focal-linux-bluefield
** Tags added: verification-done-focal-linux-bluefield
--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2016398
Title: stacked overlay file system mounts that have chroot() called against them appear to be getting locked (by the kernel most likely?)
Status in linux package in Ubuntu: Confirmed
Status in linux source package in Focal: Fix Committed
Status in linux source package in Jammy: In Progress
Status in linux source package in Kinetic: Won't Fix
Status in linux source package in Lunar: Fix Committed
Status in linux source package in Mantic: Confirmed
Bug description:
[Impact]
Opened files reported in /proc/pid/map_files can be shown with the wrong mount point when using overlayfs with filesystem namespaces. This incorrect behavior is fixed by:
UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files
However, the fix introduced a new regression: the reference to the original file stored in vma->vm_prfile is not properly released when vma->vm_prfile is replaced with a new file. This can cause a reference counter imbalance, leading to errors such as "target is busy" when trying to unmount overlayfs, even if the filesystem has no active references.
[Test case]
Reproducer provided by original bug reporter: https://launchpadlibrarian.net/663151659/overlayfsscript_example
[Fix]
Fix by properly releasing the original file stored in vm_prfile.
[Regression potential]
This fix seems to solve the reported bug (verified with the reproducer) and it doesn't seem to introduce other regressions or behavior change.
However, we may experience regressions in overlayfs or potentially other "target is busy" errors when unmounting overlayfs filesystems with this fix applied, if there are still other corner cases not covered properly. [Original bug report] Hi BACKGROUND: --- I have a set of scripts that debootstraps and builds vanilla Debian installs, where the only modifications are what packages are installed. Then that becomes the lower directory of an overlayfs mount, a tmpfs becomes the upperdir, and then that becomes the chroot where config changes are made with more scripts. This overlayfs is mounted in its own mount namespace as the script is unshare'd Within these scripts, I make packages with a fork of checkinstall I made which uses the / as the lowerdir, as overlayfs allows that, a tmpfs as the upperdir, the install command is run with chroot, and then the contents are copied. THE ISSUE: -- I noticed that it appears that my ramdisks are remaining in memory, I have isolated it out to the overlayfs mounts with the overlayfs as the lowerdir where chroot is being called on it. I tried entering the namespace myself and umounting it, and notice I get the error "Target is busy" With the "Target Is Busy" issue, I tried to `lsof` and `fuser` and `lsfd` as much as I could, but every user mode tool shows absolutely NO files open at all I was using Kubuntu 18.04, it was an old install and 32 bit, so I had no upgrade path to be using more recent versions. Now I am on 23.04, but I also think this is happening on a 22.10 laptop DIAGNOSIS: -- I am able to isolate it out, and get this to replicate in the smallest script that I could that is now attached. the chroot is important. I am able to unmount the second overlayfs if chroot is never called. 
Nothing ever has to run IN the chroot, trying to call a binary that does not exist, and then trying to unmount it results in "target is busy" with no open files reportable by user mode tools FURTHER TESTING: Trying to find out the version of Linux this was introduced, I made a script that builds a very minimal ext4 file for a VM with a small debootstrapped system, builds a kernel, and runs the kernel with QEMU.. but when I started, after all that, I can't replicate it at all with a vanilla 6.2.0 kernel. I used the same config out of my /boot to the .config, only tweaking a few options like SYSTEM_TRUSTED_KEYS. Trying to unmount the second filesystem actually results in success I then built a Kubuntu Lunar VM, and was able to replicate it on not just my system. When I built and install the vanilla kernel with bindeb-pkg, and then I boot the vanilla kernel, I can't replicate it, the second filesystem unmounts. When I reboot the VM, and start the default distribution kernel, I am able to replicate the issue, the second filesystem being unable to unmount, with "target is busy" with no files open according to standard usermode tools, when chroot is
[Kernel-packages] [Bug 2031093] Re: libgnutls report "trap invalid opcode" when trying to install packages over https
bluefield is arm64 only kernel and this bug is x86 specific. ** Tags removed: verification-needed-focal-linux-bluefield ** Tags added: verification-done-focal-linux-bluefield -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2031093 Title: libgnutls report "trap invalid opcode" when trying to install packages over https Status in ubuntu-kernel-tests: New Status in gnutls28 package in Ubuntu: Confirmed Status in linux package in Ubuntu: Confirmed Status in gnutls28 source package in Focal: Confirmed Status in linux source package in Focal: Fix Released Status in gnutls28 source package in Jammy: Confirmed Status in linux source package in Jammy: Fix Released Status in gnutls28 source package in Lunar: Confirmed Status in linux source package in Lunar: Fix Released Bug description: [Impact] When booting linux with Gather Data Sampling mitigations without updated microcode on an affected CPU, AVX will be disabled. This will cause programs connecting to https using gnutls on Jammy to break, including apt and git. [Test case] git clone https://git.launchpad.net/~canonical-kernel-team/+git/autotest-client-tests Cloning into 'autotest-client-tests'... error: git-remote-https died of signal 4 dmesg: [ 806.072080] traps: git-remote-http[2561] trap invalid opcode ip:7fa2e7dac44a sp:7ffed6796480 error:0 in libgnutls.so.30.31.0[7fa2e7c85000+129000] Works fine with the mitigation disabled by default. [Potential regressions] Users booting on affected parts without microcode updates will be subject to Gather Data Sampling attacks (which can be done by local untrusted attackers), which may leak confidential data, including keys. - When trying to install linux-libc-dev on Oracle BM.Standard2.52 (seems to be the only affected instance) with Jammy 5.15.0-81-generic, it will get interrupted with: E: Method https has died unexpectedly! E: Sub-process https received signal 4. 
$ sudo apt install linux-libc-dev Reading package lists... Done Building dependency tree... Done Reading state information... Done The following NEW packages will be installed: linux-libc-dev 0 upgraded, 1 newly installed, 0 to remove and 54 not upgraded. Need to get 1353 kB of archives. After this operation, 6943 kB of additional disk space will be used. E: Method https has died unexpectedly! E: Sub-process https received signal 4. From dmesg you will see: [ 1078.750067] traps: https[4572] trap invalid opcode ip:7f3c1e6316be sp:7ffea26b61c0 error:0 in libgnutls.so.30.31.0[7f3c1e50f000+129000] Also, git clone is not working as well. $ git clone --depth=1 https://git.launchpad.net/~canonical-kernel-team/+git/autotest-client-tests Cloning into 'autotest-client-tests'... error: git-remote-https died of signal 4 dmesg: [ 806.072080] traps: git-remote-http[2561] trap invalid opcode ip:7fa2e7dac44a sp:7ffed6796480 error:0 in libgnutls.so.30.31.0[7fa2e7c85000+129000] libgnutls30 version:3.7.3-4ubuntu1.2 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/2031093/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
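The dmesg lines quoted in bug 2031093 carry the faulting instruction pointer and the start and size of the mapping it fell into. A small sketch of how such a line could be decoded to get the offset inside that mapping; the function name and the returned field names are hypothetical, and the regular expression is written only against the line format shown in this report. Note that the result is an offset into the mapped region the kernel printed. Turning it into a file offset or a symbol would also need the mapping's file offset, which the log line does not contain.

```python
import re

# Matches lines such as:
# traps: https[4572] trap invalid opcode ip:... sp:... error:0 in lib.so[base+size]
TRAP_RE = re.compile(
    r"traps: (?P<comm>\S+)\[(?P<pid>\d+)\] trap invalid opcode "
    r"ip:(?P<ip>[0-9a-f]+) sp:[0-9a-f]+ error:[0-9a-f]+ "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_invalid_opcode(line: str):
    """Decode a 'trap invalid opcode' dmesg line, or return None if it does not match."""
    m = TRAP_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    return {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "object": m["obj"],
        # Distance of the faulting instruction from the start of the mapping.
        "offset": ip - base,
        "inside_mapping": base <= ip < base + size,
    }
```

For the git-remote-http line above, ip 7fa2e7dac44a minus base 7fa2e7c85000 gives offset 0x12744a, which lies inside the 0x129000-byte mapping of libgnutls.so.30.31.0.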
[Kernel-packages] [Bug 2031022] Re: Fix boot test warning for log_check "CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:878 get_xsave_addr+0x98/0xb0"
Similar situation for bluefield kernel as with azure kernel. Moreover, bluefield is arm64 only and this is an x86-specific bug.
** Tags removed: verification-needed-focal-linux-bluefield
** Tags added: verification-done-focal-linux-bluefield
--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2031022
Title: Fix boot test warning for log_check "CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:878 get_xsave_addr+0x98/0xb0"
Status in linux package in Ubuntu: Incomplete
Status in linux source package in Focal: Fix Released
Bug description:
SRU Justification
[Impact]
Gather Data Sampling, affecting Intel processors and assigned CVE-2022-40982, introduced this warning. The fix is in microcode, but part of the mitigation on the kernel side is to detect if the microcode update is not there and disable AVX in case it's supported. This needed some reshuffle during initialization so that turning off AVX was possible without it being too late, which also moved the FPU initialization.
See commit https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.4.252&id=6e60443668978131a442df485db3deccb31d5651
This causes the following warning during boot:
CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:878 get_xsave_addr+0x98/0xb0
Logs:
1546 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.438234] [ cut here ]
1547 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.439198] get of unsupported state
1548 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.439206] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:878 get_xsave_addr+0x98/0xb0
1549 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] Modules linked in:
1550 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-158-lowlatency #175-Ubuntu
1551 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] Hardware name: NVIDIA NVIDIA DGX-2/NVIDIA DGX-2, BIOS 0.29 06/07/2021
1552 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] RIP: 0010:get_xsave_addr+0x98/0xb0
1553 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] Code: 7e ff ff ff 48 83 c4 08 5b 5d c3 80 3d ed d8 bc 01 00 75 ae 48 c7 c7 52 76 51 9a 89 75 f4 c6 05 da d8 bc 01 01 e8 1a c8 a7 00 <0f> 0b 8b 75 f4 eb 91 31 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f
1554 09:02:40 INFO | Aug 10 09:01:05 akis systemd-udevd[3200]: enp58s0np0: Process '/usr/bin/killall -SIGHUP irqbalance' failed with exit code 1.
1555 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] RSP: :9ac03e80 EFLAGS: 00010282
1556 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] RAX: RBX: 9ae47180 RCX: 00032109e11a
1557 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] RDX: 0018 RSI: 9bd8c620 RDI: 0246
1558 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] RBP: 9ac03e90 R08: 9bd8c620 R09: 74726f707075736e
1559 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] R10: 74726f707075736e R11: 6574617473206465 R12: 9ae47040
1560 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] R13: 0246 R14: 5a1c7469 R15: 5a1d7ee0
1561 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] FS: () GS:a0840040() knlGS:
1562 09:02:40 INFO | Aug 10 09:01:05 akis systemd[1]: Starting Load Kernel Module pstore_zone...
1563 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] CS: 0010 DS: ES: CR0: 80050033
1564 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] CR2: a146f000 CR3: 00802c80a001 CR4: 007200b0
1565 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] DR0: DR1: DR2:
1566 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] DR3: DR6: fffe0ff0 DR7: 0400
1567 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] Call Trace:
1568 09:02:40 INFO | Aug 10 09:01:05 akis kernel: [ 13.440197] ? show_regs.cold+0x1a/0x1f
1569 09:02:40 INFO | Aug 10 09:01:05
[Kernel-packages] [Bug 2033316] Re: UBUNTU: [Config] optee: Adjust CONFIG_OPTEE_SHM_NUM_PRIV_PAGES to 4
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed
--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2033316
Title: UBUNTU: [Config] optee: Adjust CONFIG_OPTEE_SHM_NUM_PRIV_PAGES to 4
Status in linux-bluefield package in Ubuntu: Invalid
Status in linux-bluefield source package in Jammy: Fix Committed
Bug description:
SRU Justification:
[Impact]
This OPTEE change is needed because the FTPM driver requires 2 contiguous SHMEM pages to boot. Without this change the ftpm_tee_probe function will fail.
[Fix]
Change CONFIG_OPTEE_SHM_NUM_PRIV_PAGES from 1 to 4. Note that other SHMEM structures are allocated out of this OPTEE pool as well.
[Test Case]
Verified that the Microsoft TPM 2.0 FTPM is fully functional with this change, running the tpm2-tss Test Suite (mandatory & optional) tests as well as many of the TPM2 Command tools.
[Regression Potential]
This only has an effect if OPTEE is configured/enabled in the kernel.
To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2033316/+subscriptions
--
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp
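Whether a given kernel carries the new value from bug 2033316 can be checked from its configuration file. A minimal sketch, assuming the configuration text is available as a string (on Ubuntu it is typically installed as /boot/config-<version>); the function names are hypothetical. The 2-page threshold comes from the bug description and is only a necessary condition, since the report notes that other SHMEM structures share the same pool.

```python
import re


def kconfig_int(config_text: str, option: str):
    """Return the integer value of CONFIG_<option>, or None if it is not set."""
    pattern = rf"^CONFIG_{re.escape(option)}=(\d+)$"
    m = re.search(pattern, config_text, flags=re.MULTILINE)
    return int(m.group(1)) if m else None


def ftpm_shm_pages_sufficient(config_text: str, pages_needed: int = 2) -> bool:
    """True if the private SHM pool has at least the pages the FTPM driver needs."""
    value = kconfig_int(config_text, "OPTEE_SHM_NUM_PRIV_PAGES")
    return value is not None and value >= pages_needed
```

With the old value of 1 the check returns False, and with the new value of 4 it returns True.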
[Kernel-packages] [Bug 2033439] Re: Cherry-pick gpio and pinctrl drivers from upstream
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed
--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2033439
Title: Cherry-pick gpio and pinctrl drivers from upstream
Status in linux-bluefield package in Ubuntu: Invalid
Status in linux-bluefield source package in Jammy: Fix Committed
Bug description:
SRU Justification:
[Impact]
Cherry-pick gpio-mlxbf3.c and pinctrl-mlxbf3.c patches from linux-next.
[Fix]
* Revert existing pinctrl-mlxbf3.c driver changes
* Revert existing gpio-mlxbf3.c driver changes
* Cherry-pick all the following commits from linux-next:
gpio-mlxbf3.c:
cd33f216d241520385a5166ae73a0771197a9f0b
38a700efc51080c7184f71edbf5e49561da9754f
1d2a22fa6d2511d5871d87c15b4fe7a944fe3b2a
pinctrl-mlxbf3.c:
d11f932808dc689717e409bbc81b5093e7902fc9
743d3336029ffe2bb38e982a3b572ced243c6d43
c0f84760b01e8d8b59e9e186a4f7fa8f081a4488
69657e60b8a7faf83b583c658ec7ce1f5ece9eb3
[Test Case]
* All test cases are for BF3 only
* Check that the gpio-mlxbf3 driver gets loaded at boot time without issues
* Check that the pinctrl-mlxbf3 driver gets loaded at boot time without issues
* Use libgpiod to test access to gpio pins 0 through 4, i.e. read and write
* rmmod and modprobe both drivers
* Check that the pwr-mlxbf driver is loaded successfully
* Check that the mlxbf-gige driver is loaded successfully and the irq is initialized properly (dmesg | grep PHY)
[Regression Potential]
* We introduced a dependency of the gpio-mlxbf3 driver on pinctrl-mlxbf3, so we need to make sure that this doesn't trigger any regressions when loading other dependent drivers
* Make sure that removing/reloading the driver or restarting the DPU doesn't cause any panic related to these drivers.
To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2033439/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
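Several of the test steps for bug 2033439 amount to "check that driver X is loaded". A minimal sketch of that check against text in /proc/modules format. It assumes the drivers are built as loadable modules (built-in drivers do not appear in /proc/modules) and that the module names match the driver names used in the report; both function names are hypothetical.

```python
def loaded_modules(proc_modules_text: str) -> set:
    """Return the module names found in text in /proc/modules format."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def missing_modules(proc_modules_text: str, wanted) -> list:
    """Return the entries of `wanted` that are not currently loaded."""
    loaded = loaded_modules(proc_modules_text)
    # The kernel lists module names with underscores, so normalise hyphens.
    return [name for name in wanted if name.replace("-", "_") not in loaded]
```

Passing the contents of /proc/modules together with ["gpio-mlxbf3", "pinctrl-mlxbf3", "pwr-mlxbf", "mlxbf-gige"] would return an empty list when all four are loaded.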
[Kernel-packages] [Bug 2034578] Re: Support IPSEC full offload implementation
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu. https://bugs.launchpad.net/bugs/2034578 Title: Support IPSEC full offload implementation Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: Summary: Align Kernel IPsec Full offload implementation in the DPU to the upstream Full offload in all components: OFED, Strongswan, etc. This is in order for DPU Kernel IPsec to include policy offload and be fully aligned to what CX Kernel customers will use. How to test: Host 1: /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/:03:00.0 mode legacy echo 'dmfs' > /sys/bus/pci/devices/:03:00.0/net/p0/compat/devlink/steering_mode echo 'full' > /sys/class/net/p0/compat/devlink/ipsec_mode /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/:03:00.0 mode switchdev BF on host 1: /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.181.165 dst 196.234.182.166 dir out tmpl src 196.234.181.165/16 dst 196.234.182.166/16 proto esp reqid 0xefa83812 mode transport priority 10 /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.182.166 dst 196.234.181.165 dir in tmpl src 196.234.182.166/16 dst 196.234.181.165/16 proto esp reqid 0x63a7db74 mode transport priority 10 /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.182.166 dst 196.234.181.165 dir fwd tmpl src 196.234.182.166/16 dst 196.234.181.165/16 proto esp reqid 0x63a7db74 mode transport priority 10 /opt/mellanox/iproute2/sbin/ip xfrm state add src 196.234.181.165/16 dst 196.234.182.166/16 proto esp spi 0xefa83812 reqid 0xefa83812 mode transport aead 'rfc4106(gcm(aes))' 0xe2fe3857301d8f72b5d71d295a462ef21868e407 128 offload packet dev p0 dir out sel src 196.234.181.165/16 dst 196.234.182.166/16 flag esn replay-window 32 /opt/mellanox/iproute2/sbin/ip xfrm state add src 
196.234.182.166/16 dst 196.234.181.165/16 proto esp spi 0x63a7db74 reqid 0x63a7db74 mode transport aead 'rfc4106(gcm(aes))' 0xe916c4d0db1886e8c877b023e8cebef53b4d2d0f 128 offload packet dev p0 dir in sel src 196.234.182.166/16 dst 196.234.181.165/16 flag esn replay-window 32 Start OVS and set following configure on BF: /usr/bin/ovs-vsctl set Open_vSwitch . other_config:hw-offload=true /usr/bin/ovs-vsctl set Open_vSwitch . other_config:max-idle=30 Host2: /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/:03:00.1 mode legacy echo 'dmfs' > /sys/bus/pci/devices/:03:00.1/net/p1/compat/devlink/steering_mode echo 'full' > /sys/class/net/p1/compat/devlink/ipsec_mode /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/:03:00.1 mode switchdev BF on host 2: /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.182.166 dst 196.234.181.165 dir out tmpl src 196.234.182.166/16 dst 196.234.181.165/16 proto esp reqid 0xefa83812 mode transport priority 10 /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.181.165 dst 196.234.182.166 dir in tmpl src 196.234.181.165/16 dst 196.234.182.166/16 proto esp reqid 0x63a7db74 mode transport priority 10 /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.181.165 dst 196.234.182.166 dir fwd tmpl src 196.234.181.165/16 dst 196.234.182.166/16 proto esp reqid 0x63a7db74 mode transport priority 10 /opt/mellanox/iproute2/sbin/ip xfrm state add src 196.234.181.165 dst 196.234.182.166 proto esp spi 0xefa83812 reqid 0xefa83812 mode transport aead 'rfc4106(gcm(aes))' 0xe2fe3857301d8f72b5d71d295a462ef21868e407 128 offload packet dev p0 dir out sel src 196.234.181.165/16 dst 196.234.182.166/16 flag esn replay-window 32 /opt/mellanox/iproute2/sbin/ip xfrm state add src 196.234.181.165 dst 196.234.182.166 proto esp spi 0x63a7db74 reqid 0x63a7db74 mode transport aead 'rfc4106(gcm(aes))' 0xe916c4d0db1886e8c877b023e8cebef53b4d2d0f 128 offload packet dev p0 dir in sel src 196.234.181.165/16 dst 196.234.182.166/16 flag esn 
replay-window 32 Start OVS and set following configure on BF: /usr/bin/ovs-vsctl set Open_vSwitch . other_config:hw-offload=true /usr/bin/ovs-vsctl set Open_vSwitch . other_config:max-idle=30 Send the traffic between host 1 and host 2 and check IPsec counters in "ethtool -S" statistics on both BF. How to fix: Need to backport a series of xfrm patches into BlueField 5.15 kernel, from 6.0 upstream kernel. Patches needed for 5.15 kernel: afe9e47 xfrm: fix conflict for netdev and tx stats 6aff54d xfrm: don't skip free of empty state in acquire policy 692fecb xfrm: delete offloaded policy 91b6276 xfrm: Support UDP encapsulation in packet offload mode 69e168a xfrm: add missed call to delete offloaded policies 9724724 xfrm: release all offloaded policy memory e57b7ec xfrm: don't require advance ESN callback for packet
[Kernel-packages] [Bug 2032378] Re: Devlink backport: fix race and lock issue
** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed
--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2032378
Title: Devlink backport: fix race and lock issue
Status in linux-bluefield package in Ubuntu: Invalid
Status in linux-bluefield source package in Jammy: Fix Committed
Bug description:
Summary: The machine gets stuck when switching to switchdev mode while toggling network namespaces on the BF kernel. This is due to the missing devlink and driver lock refactor series from kernel 6.0.
How to test:
1. Configure SRIOV
2. Enable switchdev mode
3. Devlink reload
4. Add/Del network namespace
Note: Need to run the test multiple times to reproduce the issue.
How to fix: Need to backport a series of devlink patches into BlueField 5.15 kernel, from 6.0 upstream kernel.
Patches needed for 5.4/5.15 kernel:
First series: 367dfa121205^..f0680ef0f949
Second series: 5502e8712c9b^..c90005b5f75c
Also needed: 644a66c60f02
To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2032378/+subscriptions
--
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp
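The patch list in bug 2032378 uses git range notation: A^..B selects every commit reachable from B but not from the parent of A, which on a linear history is A through B inclusive. The sketch below only builds the corresponding command lines. Applying the series with git cherry-pick -x, and in this order, is an assumption for illustration; the report does not say how the backport was actually done, and the function name is hypothetical.

```python
def cherry_pick_commands(ranges, singles=()):
    """Build git cherry-pick argument lists for inclusive ranges and single commits."""
    cmds = [
        ["git", "cherry-pick", "-x", f"{first}^..{last}"]
        for first, last in ranges
    ]
    cmds += [["git", "cherry-pick", "-x", sha] for sha in singles]
    return cmds


# Commit identifiers copied from the bug description.
BACKPORT = cherry_pick_commands(
    ranges=[("367dfa121205", "f0680ef0f949"), ("5502e8712c9b", "c90005b5f75c")],
    singles=["644a66c60f02"],
)
```

Each entry of BACKPORT is an argument list that could be passed to subprocess.run inside a checkout that contains the upstream commits.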
[Kernel-packages] [Bug 2034594] Re: mlxbf-ptm: add thermal envelope debugfs node
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2034594

Title:
  mlxbf-ptm: add thermal envelope debugfs node

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  mlxbf-ptm is a kernel driver that provides a debugfs interface for system software to monitor BlueField devices' power and thermal management parameters.

  [Fix]
  * This change adds an additional debugfs node that reports the current thermal envelope set in the thermal state machine.

  [Test Case]
  * After installing the kernel module, a debugfs entry is created at /sys/kernel/debug/mlxbf-ptm
  * The monitors/status folder contains read-only parameters reported from the BlueField device
  * cat /sys/kernel/debug/mlxbf-ptm/temp_envelope shows the current thermal envelope

  [Regression Potential]
  * Minimal, as these are read-only parameters that report BlueField device information.

  [Other]
  * This code is likely to change depending on feedback we receive from maintainers.
[Kernel-packages] [Bug 2035128] Re: mlxbf-gige: Enable the OOB port in mlxbf_gige_open
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2035128

Title:
  mlxbf-gige: Enable the OOB port in mlxbf_gige_open

Status in linux-bluefield package in Ubuntu:
  New
Status in linux-bluefield source package in Focal:
  New
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  At the moment, the OOB port is enabled in the mlxbf_gige_probe function. If mlxbf_gige_open is not executed, this can cause pause frames to increase when there is high background traffic, which in turn clogs the BMC port as well.

  [Fix]
  * Move enabling the OOB port to mlxbf_gige_open.

  [Test Case]
  * Main test for this bug: check that the BMC is always pingable while pushing a BFB
  Other tests:
  * Check that the gige driver is loaded
  * Check that the oob_net0 interface is up and operational
  * Do the reboot test and powercycle test and check the oob_net0 interface again
  * Push a BFB multiple times and make sure the OOB is up and running

  [Regression Potential]
  Since we are moving code that hasn't moved since BF2, it is important to make sure that there is no regression. Make sure that the OOB interface is always up and pingable after the reboot test, after the powercycle test and after pushing a BFB.
[Kernel-packages] [Bug 2033551] Re: mlxbf_bootctl: replace SAUCE patches with upstream
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2033551

Title:
  mlxbf_bootctl: replace SAUCE patches with upstream

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  The mlxbf_bootctl BlueField platform driver in the Jammy repo consists of some SAUCE patches. These need to be replaced.

  [Fix]
  The fix is to revert the four SAUCE patches, replacing them with upstream commits for the same functionality. One patch for the mlxbf_bootctl driver is not yet upstreamed, so that patch will remain as a SAUCE patch for now.

  [Test Case]
  * Boot BF2/BF3 platform, verify no new errors
  * Program MFG fields via the "bfcfg" tool:
    1) With current image, use 'bfcfg -d' to display current values
    2) Reboot, stopping at UEFI menu, trigger 'Reset MFG Info'
    3) With new image, use 'bfcfg' to push new or same MFG fields
    4) Verify that the MFG fields are programmed properly via 'bfcfg -d'
    5) Repeat process to replace proper MFG fields
  * Display BlueField boot log
    1) cd /sys/bus/platform/devices/MLNXBF04:00
    2) cat rsh_log
    3) Should see boot log entries relevant to last boot

  [Regression Potential]
  The upstream commits are not exactly the same as the SAUCE patches, so technically there is a chance of regression, but the code has been well tested and the functionality is the same.

  [Other]
  n/a
[Kernel-packages] [Bug 2035374] Re: pwr-mlxbf: update Kconfig with upstream changes
** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2035374

Title:
  pwr-mlxbf: update Kconfig with upstream changes

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  The Mellanox BlueField power driver (pwr-mlxbf) has a dependency on either the BlueField-2 or BlueField-3 GPIO driver. The current Kconfig does not have the BlueField-3 dependency.

  [Fix]
  There is an upstream commit that extends the Kconfig to include the GPIO_MLXBF3 dependency.

  [Test Case]
  Build the power driver with and without GPIO_MLXBF2 and GPIO_MLXBF3 present in the kernel configuration. Verify power driver is built.

  [Regression Potential]
  * none

  [Other]
  * none
[Kernel-packages] [Bug 2033439] Re: Cherry-pick gpio and pinctrl drivers from upstream
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu)
       Status: New => Invalid

https://bugs.launchpad.net/bugs/2033439

Title:
  Cherry-pick gpio and pinctrl drivers from upstream

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  New

Bug description:
  SRU Justification:

  [Impact]
  Cherry-pick the gpio-mlxbf3.c and pinctrl-mlxbf3.c patches from linux-next.

  [Fix]
  * Revert existing pinctrl-mlxbf3.c driver changes
  * Revert existing gpio-mlxbf3.c driver changes
  * Cherry-pick all the following commits from linux-next:
    gpio-mlxbf3.c:
      cd33f216d241520385a5166ae73a0771197a9f0b
      38a700efc51080c7184f71edbf5e49561da9754f
      1d2a22fa6d2511d5871d87c15b4fe7a944fe3b2a
    pinctrl-mlxbf3.c:
      d11f932808dc689717e409bbc81b5093e7902fc9
      743d3336029ffe2bb38e982a3b572ced243c6d43
      c0f84760b01e8d8b59e9e186a4f7fa8f081a4488
      69657e60b8a7faf83b583c658ec7ce1f5ece9eb3

  [Test Case]
  * All test cases are for BF3 only
  * Check that the gpio-mlxbf3 driver gets loaded at boot time without issues
  * Check that the pinctrl-mlxbf3 driver gets loaded at boot time without issues
  * Use libgpiod to test access (read and write) to gpio pins 0 through 4
  * rmmod and modprobe both drivers
  * Check that the pwr-mlxbf driver is loaded successfully
  * Check that the mlxbf-gige driver is loaded successfully and the irq is initialized properly (dmesg | grep PHY)

  [Regression Potential]
  * We introduced a dependency of the gpio-mlxbf3 driver on pinctrl-mlxbf3, so we need to make sure that this doesn't trigger any regressions when loading other dependent drivers
  * Make sure that removing/reloading the driver or restarting the DPU doesn't cause any panic related to these drivers.
[Kernel-packages] [Bug 2035374] Re: pwr-mlxbf: update Kconfig with upstream changes
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu)
       Status: New => Invalid

https://bugs.launchpad.net/bugs/2035374

Title:
  pwr-mlxbf: update Kconfig with upstream changes

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  New

Bug description:
  SRU Justification:

  [Impact]
  The Mellanox BlueField power driver (pwr-mlxbf) has a dependency on either the BlueField-2 or BlueField-3 GPIO driver. The current Kconfig does not have the BlueField-3 dependency.

  [Fix]
  There is an upstream commit that extends the Kconfig to include the GPIO_MLXBF3 dependency.

  [Test Case]
  Build the power driver with and without GPIO_MLXBF2 and GPIO_MLXBF3 present in the kernel configuration. Verify power driver is built.

  [Regression Potential]
  * none

  [Other]
  * none
[Kernel-packages] [Bug 2032378] Re: Devlink backport: fix race and lock issue
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu)
       Status: New => Invalid

https://bugs.launchpad.net/bugs/2032378

Title:
  Devlink backport: fix race and lock issue

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  New

Bug description:
  Summary:
  The machine gets stuck when switching to switchdev mode while toggling network namespaces on the BF kernel. This is due to the missing devlink and driver lock refactor series from kernel 6.0.

  How to test:
  1. Configure SR-IOV
  2. Enable switchdev mode
  3. Devlink reload
  4. Add/Del network namespace
  Note: The test needs to be run multiple times to reproduce the issue.

  How to fix:
  Backport a series of devlink patches from the 6.0 upstream kernel into the BlueField 5.15 kernel.
  Patches needed for 5.4/5.15 kernel:
  First series: 367dfa121205^..f0680ef0f949
  Second series: 5502e8712c9b^..c90005b5f75c
  Also needed: 644a66c60f02
[Kernel-packages] [Bug 2034578] Re: Support IPSEC full offload implementation
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu)
       Status: New => Invalid

https://bugs.launchpad.net/bugs/2034578

Title:
  Support IPSEC full offload implementation

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  New

Bug description:
  Summary:
  Align the kernel IPsec full offload implementation in the DPU with the upstream full offload in all components: OFED, strongSwan, etc. This allows DPU kernel IPsec to include policy offload and be fully aligned with what CX kernel customers will use.

  How to test:
  Host 1:
  /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/:03:00.0 mode legacy
  echo 'dmfs' > /sys/bus/pci/devices/:03:00.0/net/p0/compat/devlink/steering_mode
  echo 'full' > /sys/class/net/p0/compat/devlink/ipsec_mode
  /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/:03:00.0 mode switchdev

  BF on host 1:
  /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.181.165 dst 196.234.182.166 dir out tmpl src 196.234.181.165/16 dst 196.234.182.166/16 proto esp reqid 0xefa83812 mode transport priority 10
  /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.182.166 dst 196.234.181.165 dir in tmpl src 196.234.182.166/16 dst 196.234.181.165/16 proto esp reqid 0x63a7db74 mode transport priority 10
  /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.182.166 dst 196.234.181.165 dir fwd tmpl src 196.234.182.166/16 dst 196.234.181.165/16 proto esp reqid 0x63a7db74 mode transport priority 10
  /opt/mellanox/iproute2/sbin/ip xfrm state add src 196.234.181.165/16 dst 196.234.182.166/16 proto esp spi 0xefa83812 reqid 0xefa83812 mode transport aead 'rfc4106(gcm(aes))' 0xe2fe3857301d8f72b5d71d295a462ef21868e407 128 offload packet dev p0 dir out sel src 196.234.181.165/16 dst 196.234.182.166/16 flag esn replay-window 32
  /opt/mellanox/iproute2/sbin/ip xfrm state add src 196.234.182.166/16 dst 196.234.181.165/16 proto esp spi 0x63a7db74 reqid 0x63a7db74 mode transport aead 'rfc4106(gcm(aes))' 0xe916c4d0db1886e8c877b023e8cebef53b4d2d0f 128 offload packet dev p0 dir in sel src 196.234.182.166/16 dst 196.234.181.165/16 flag esn replay-window 32

  Start OVS and set the following configuration on BF:
  /usr/bin/ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
  /usr/bin/ovs-vsctl set Open_vSwitch . other_config:max-idle=30

  Host 2:
  /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/:03:00.1 mode legacy
  echo 'dmfs' > /sys/bus/pci/devices/:03:00.1/net/p1/compat/devlink/steering_mode
  echo 'full' > /sys/class/net/p1/compat/devlink/ipsec_mode
  /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/:03:00.1 mode switchdev

  BF on host 2:
  /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.182.166 dst 196.234.181.165 dir out tmpl src 196.234.182.166/16 dst 196.234.181.165/16 proto esp reqid 0xefa83812 mode transport priority 10
  /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.181.165 dst 196.234.182.166 dir in tmpl src 196.234.181.165/16 dst 196.234.182.166/16 proto esp reqid 0x63a7db74 mode transport priority 10
  /opt/mellanox/iproute2/sbin/ip xfrm policy add src 196.234.181.165 dst 196.234.182.166 dir fwd tmpl src 196.234.181.165/16 dst 196.234.182.166/16 proto esp reqid 0x63a7db74 mode transport priority 10
  /opt/mellanox/iproute2/sbin/ip xfrm state add src 196.234.181.165 dst 196.234.182.166 proto esp spi 0xefa83812 reqid 0xefa83812 mode transport aead 'rfc4106(gcm(aes))' 0xe2fe3857301d8f72b5d71d295a462ef21868e407 128 offload packet dev p0 dir out sel src 196.234.181.165/16 dst 196.234.182.166/16 flag esn replay-window 32
  /opt/mellanox/iproute2/sbin/ip xfrm state add src 196.234.181.165 dst 196.234.182.166 proto esp spi 0x63a7db74 reqid 0x63a7db74 mode transport aead 'rfc4106(gcm(aes))' 0xe916c4d0db1886e8c877b023e8cebef53b4d2d0f 128 offload packet dev p0 dir in sel src 196.234.181.165/16 dst 196.234.182.166/16 flag esn replay-window 32

  Start OVS and set the following configuration on BF:
  /usr/bin/ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
  /usr/bin/ovs-vsctl set Open_vSwitch . other_config:max-idle=30

  Send traffic between host 1 and host 2 and check the IPsec counters in the "ethtool -S" statistics on both BFs.

  How to fix:
  Backport a series of xfrm patches from the 6.0 upstream kernel into the BlueField 5.15 kernel.
  Patches needed for 5.15 kernel:
  afe9e47 xfrm: fix conflict for netdev and tx stats
  6aff54d xfrm: don't skip free of empty state in acquire policy
  692fecb xfrm: delete offloaded policy
  91b6276 xfrm: Support UDP encapsulation in packet offload mode
  69e168a xfrm: add missed call to delete offloaded policies
  9724724 xfrm: release all offloaded policy
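For the "check IPsec counters" step of the test procedure above, a small helper could pick the relevant entries out of captured "ethtool -S" output. This is a minimal sketch, not part of the bug report: it assumes the usual "name: value" layout of ethtool statistics and assumes the counters of interest have "ipsec" in their names; the function names are hypothetical.

```python
from typing import Dict


def ipsec_counters(ethtool_output: str) -> Dict[str, int]:
    """Extract counters whose name contains 'ipsec' from `ethtool -S` text."""
    counters = {}
    for line in ethtool_output.splitlines():
        # Split on the last ':' so the value is always the final field.
        name, sep, value = line.strip().rpartition(":")
        name, value = name.strip(), value.strip()
        # Skips the "NIC statistics:" header (no value) and non-numeric entries.
        if sep and "ipsec" in name.lower() and value.isdigit():
            counters[name] = int(value)
    return counters


def counters_increased(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return the counters that grew between two snapshots, with their deltas."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] > before.get(name, 0)
    }
```

One way to use it: capture `ethtool -S p0` on each BF before and after sending traffic, run both captures through `ipsec_counters`, and pass the results to `counters_increased`. If nothing increased, the traffic was probably not handled by the offload path.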
[Kernel-packages] [Bug 2034594] Re: mlxbf-ptm: add thermal envelope debugfs node
** Changed in: linux-bluefield (Ubuntu)
       Status: New => Invalid

https://bugs.launchpad.net/bugs/2034594

Title:
  mlxbf-ptm: add thermal envelope debugfs node

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  New

Bug description:
  SRU Justification:

  [Impact]
  mlxbf-ptm is a kernel driver that provides a debugfs interface for system software to monitor BlueField devices' power and thermal management parameters.

  [Fix]
  * This change adds an additional debugfs node that reports the current thermal envelope set in the thermal state machine.

  [Test Case]
  * After installing the kernel module, a debugfs entry is created at /sys/kernel/debug/mlxbf-ptm
  * The monitors/status folder contains read-only parameters reported from the BlueField device
  * cat /sys/kernel/debug/mlxbf-ptm/temp_envelope shows the current thermal envelope

  [Regression Potential]
  * Minimal, as these are read-only parameters that report BlueField device information.

  [Other]
  * This code is likely to change depending on feedback we receive from maintainers.
[Kernel-packages] [Bug 2033551] Re: mlxbf_bootctl: replace SAUCE patches with upstream
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu)
       Status: New => Invalid

https://bugs.launchpad.net/bugs/2033551

Title:
  mlxbf_bootctl: replace SAUCE patches with upstream

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  New

Bug description:
  SRU Justification:

  [Impact]
  The mlxbf_bootctl BlueField platform driver in the Jammy repo consists of some SAUCE patches. These need to be replaced.

  [Fix]
  The fix is to revert the four SAUCE patches, replacing them with upstream commits for the same functionality. One patch for the mlxbf_bootctl driver is not yet upstreamed, so that patch will remain as a SAUCE patch for now.

  [Test Case]
  * Boot BF2/BF3 platform, verify no new errors
  * Program MFG fields via the "bfcfg" tool:
    1) With current image, use 'bfcfg -d' to display current values
    2) Reboot, stopping at UEFI menu, trigger 'Reset MFG Info'
    3) With new image, use 'bfcfg' to push new or same MFG fields
    4) Verify that the MFG fields are programmed properly via 'bfcfg -d'
    5) Repeat process to replace proper MFG fields
  * Display BlueField boot log
    1) cd /sys/bus/platform/devices/MLNXBF04:00
    2) cat rsh_log
    3) Should see boot log entries relevant to last boot

  [Regression Potential]
  The upstream commits are not exactly the same as the SAUCE patches, so technically there is a chance of regression, but the code has been well tested and the functionality is the same.

  [Other]
  n/a
[Kernel-packages] [Bug 2033316] Re: UBUNTU: [Config] optee: Adjust CONFIG_OPTEE_SHM_NUM_PRIV_PAGES to 4
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu)
       Status: New => Invalid

https://bugs.launchpad.net/bugs/2033316

Title:
  UBUNTU: [Config] optee: Adjust CONFIG_OPTEE_SHM_NUM_PRIV_PAGES to 4

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  New

Bug description:
  SRU Justification:

  [Impact]
  This OPTEE change is needed because the FTPM driver requires 2 contiguous SHMEM pages to boot. Without this change the ftpm_tee_probe function will fail.

  [Fix]
  Change CONFIG_OPTEE_SHM_NUM_PRIV_PAGES from 1 to 4. Note that other SHMEM structures are allocated out of this OPTEE pool as well.

  [Test Case]
  Verified that the Microsoft TPM 2.0 FTPM is fully functional with this change by running the tpm2-tss Test Suite (mandatory and optional tests) as well as many of the TPM2 command tools.

  [Regression Potential]
  This only has an effect if OPTEE is configured/enabled in the kernel.
[Kernel-packages] [Bug 2028190] Re: kernel panic when using conntrack tcp pedit
This bug is awaiting verification that the linux-bluefield/5.15.0-1020.22 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

** Tags added: verification-needed-jammy

https://bugs.launchpad.net/bugs/2028190

Title:
  kernel panic when using conntrack tcp pedit

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  * Explain the bug(s)

  When setting up conntrack offload with tcp pedit (test-ct-tcp-pedit.sh), a kernel panic is encountered.

  * Brief explanation of fixes

  In tc_setup_flow_action, the action needs to be properly assigned. This fixes the previous commit ("UBUNTU: SAUCE: net/sched: Provide act to offload action").

  * Kernel log

  [ 226.156222] Unable to handle kernel access to user memory outside uaccess routines at
  [ 226.177783] Mem abort info:
  [ 226.183408] ESR = 0x9604
  [ 226.190953] EC = 0x25: DABT (current EL), IL = 32 bits
  [ 226.201641] SET = 0, FnV = 0
  [ 226.207786] EA = 0, S1PTW = 0
  [ 226.214095] FSC = 0x04: level 0 translation fault
  [ 226.223906] Data abort info:
  [ 226.229695] ISV = 0, ISS = 0x0004
  [ 226.237410] CM = 0, WnR = 0
  [ 226.243372] user pgtable: 4k pages, 48-bit VAs, pgdp=000123f25000
  [ 226.256328] [0090] pgd=, p4d=
  [ 226.269984] Internal error: Oops: 9604 [#1] SMP
  [ 226.279779] Modules linked in: act_pedit act_ct nf_flow_table iptable_raw xt_CT xt_tcpudp bpfilter xt_comment xt_mark
  [ 226.279938] async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq raid1 raid0 multipath
  [ 226.544260] CPU: 2 PID: 4293 Comm: handler3 Tainted: G
  [ 226.565581] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.2.0.12795 Jun 30 2023
  [ 226.585497] pstate: a045 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  [ 226.599481] pc : tcf_action_update_stats+0x8/0xc4
  [ 226.608933] lr : mlx5e_tc_act_stats_fill_stats+0xf8/0x19c [mlx5_core]
  [ 226.622089] sp : 8e073130
  [ 226.628735] x29: 8e073130 x28: 0008 x27: 0020
  [ 226.643067] x26: ffe0 x25: 5913c62dfe71 x24: 5913c62dfe00
  [ 226.657398] x23: x22: x21: 5913c62dfe70
  [ 226.671730] x20: 5913c2fc3b00 x19: 5913f0058000 x18: 0014
  [ 226.686059] x17: b96d1a87 x16: c6320acb93e0 x15:
  [ 226.700390] x14: 0001 x13: x12: 0002
  [ 226.714720] x11: 7f7f7f7f7f7f7f7f x10: x9 : c631ce0b66ac
  [ 226.729052] x8 : 8e073130 x7 : x6 : 000d
  [ 226.743384] x5 : 0c62 x4 : 0001 x3 :
  [ 226.757715] x2 : x1 : x0 :
  [ 226.772047] Call trace:
  [ 226.776947] tcf_action_update_stats+0x8/0xc4
  [ 226.785695] mlx5e_tc_act_stats_fill_stats_flow+0x78/0xc0 [mlx5_core]
  [ 226.798833] mlx5e_stats_flower+0x394/0x3c0 [mlx5_core]
  [ 226.809502] mlx5e_rep_setup_tc_cls_flower+0x8c/0xa0 [mlx5_core]
  [ 226.821732] mlx5e_rep_setup_tc_cb+0x74/0xb0 [mlx5_core]
  [ 226.832549] tc_setup_cb_call+0xa4/0x160
  [ 226.840426] fl_hw_update_stats+0x98/0x164 [cls_flower]
  [ 226.850927] fl_dump.part.0+0x224/0x260 [cls_flower]
  [ 226.860891] fl_dump+0x20/0x34 [cls_flower]
  [ 226.869284] tcf_fill_node+0x164/0x244
  [ 226.876803] tfilter_notify+0xc0/0x140
  [ 226.884323] tc_new_tfilter+0x454/0x8bc
  [ 226.892018] rtnetlink_rcv_msg+0x2e8/0x3cc
  [ 226.900245] netlink_rcv_skb+0x64/0x130
  [ 226.907942] rtnetlink_rcv+0x20/0x30
  [ 226.915110] netlink_unicast+0x2ec/0x360
  [ 226.922977] netlink_sendmsg+0x278/0x490
  [ 226.930846] sock_sendmsg+0x5c/0x6c
  [ 226.937845] sys_sendmsg+0x290/0x2d4
  [ 226.945712] ___sys_sendmsg+0x84/0xd0
  [ 226.953059] __sys_sendmsg+0x70/0xd0
  [ 226.960229] __arm64_sys_sendmsg+0x2c/0x40
  [ 226.968447] invoke_syscall+0x78/0x100
  [ 226.975974] el0_svc_common.constprop.0+0x54/0x184
  [ 226.985587] do_el0_svc+0x30/0xac
  [ 226.992231] el0_svc+0x48/0x160
  [ 226.998528] el0t_64_sync_handler+0xa4/0x130
  [ 227.007094] el0t_64_sync+0x1a4/0x1a8
  [ 227.01]
[Kernel-packages] [Bug 2028197] Re: rshim console truncates dmesg output due to tmfifo issue
This bug is awaiting verification that the linux-bluefield/5.15.0-1020.22 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

** Tags added: verification-needed-jammy

https://bugs.launchpad.net/bugs/2028197

Title:
  rshim console truncates dmesg output due to tmfifo issue

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Focal:
  New
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  The mlxbf_tmfifo driver does not always schedule work for a VIRTIO CONSOLE notification. As a result, on the rshim console the output is cut off when running dmesg or when simply printing a big text file to stdout with cat.

  [Fix]
  * Make sure the TM_TX_LWM_IRQ bit is set when there is a notification of a virtio console event.

  [Test Case]
  * Compare the output of dmesg from the rshim console and from an ssh session. Make sure both are the same.
  * Without the fix, the issue is reproduced by the same dmesg comparison as above.

  [Regression Potential]
  The changes affect only virtio console event handling and have been fully tested.

  [Others]
  This issue is the same on Focal and Jammy.
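The dmesg comparison in the test case above could be automated along these lines. This is a sketch under assumptions: both outputs have already been captured to strings (for example from the rshim console and from an ssh session at about the same time, so that messages logged after one capture do not show up as a false difference), and the function name is hypothetical.

```python
from typing import Optional, Tuple


def first_divergence(console_log: str, ssh_log: str) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Return (line_index, console_line, ssh_line) at the first difference, or None if equal.

    A missing line on one side (for example a truncated console capture)
    is reported as None for that side.
    """
    console_lines = console_log.splitlines()
    ssh_lines = ssh_log.splitlines()

    # Compare line by line over the common prefix.
    for index, (console_line, ssh_line) in enumerate(zip(console_lines, ssh_lines)):
        if console_line != ssh_line:
            return index, console_line, ssh_line

    # Same common prefix but different lengths: one capture ends early.
    if len(console_lines) != len(ssh_lines):
        index = min(len(console_lines), len(ssh_lines))
        console_line = console_lines[index] if index < len(console_lines) else None
        ssh_line = ssh_lines[index] if index < len(ssh_lines) else None
        return index, console_line, ssh_line

    return None
```

A result of None means both captures match. A result whose console side is None, or is a shortened copy of the ssh side, is consistent with the truncation described in this bug.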
[Kernel-packages] [Bug 2028309] Re: mlxbf-bootctl: Fix kernel panic due to buffer overflow
This bug is awaiting verification that the linux-bluefield/5.15.0-1020.22 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

** Tags added: verification-needed-jammy

https://bugs.launchpad.net/bugs/2028309

Title:
  mlxbf-bootctl: Fix kernel panic due to buffer overflow

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]
  Running the following LTP (linux-test-project) script causes a kernel panic and a reboot of the DPU:
  ltp/testcases/bin/read_all -d /sys -q -r 10

  The above test reads all directories and files under /sys. Reading the sysfs entry "large_icm" causes the kernel panic because a garbage value is returned via the i2c read. That garbage value causes a buffer overflow in sprintf.

  [Fix]
  * Replace sprintf with snprintf. Also add the missing lock and increase the buffer size to PAGE_SIZE.

  [Test Case]
  * Run from Linux:
  ltp/testcases/bin/read_all -d /sys -q -r 10

  [Regression Potential]
  No known regression.
[Kernel-packages] [Bug 2028190] Re: kernel panic when using conntrack tcp pedit
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux-bluefield (Ubuntu)
       Status: New => Invalid

** Changed in: linux-bluefield (Ubuntu Jammy)
       Status: New => Fix Committed

https://bugs.launchpad.net/bugs/2028190

Title:
  kernel panic when using conntrack tcp pedit

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  * Explain the bug(s)

  When setting up conntrack offload with tcp pedit (test-ct-tcp-pedit.sh), a kernel panic is encountered.

  * Brief explanation of fixes

  In tc_setup_flow_action, the action needs to be properly assigned. This fixes the previous commit ("UBUNTU: SAUCE: net/sched: Provide act to offload action").

  * Kernel log

  [ 226.156222] Unable to handle kernel access to user memory outside uaccess routines at
  [ 226.177783] Mem abort info:
  [ 226.183408] ESR = 0x9604
  [ 226.190953] EC = 0x25: DABT (current EL), IL = 32 bits
  [ 226.201641] SET = 0, FnV = 0
  [ 226.207786] EA = 0, S1PTW = 0
  [ 226.214095] FSC = 0x04: level 0 translation fault
  [ 226.223906] Data abort info:
  [ 226.229695] ISV = 0, ISS = 0x0004
  [ 226.237410] CM = 0, WnR = 0
  [ 226.243372] user pgtable: 4k pages, 48-bit VAs, pgdp=000123f25000
  [ 226.256328] [0090] pgd=, p4d=
  [ 226.269984] Internal error: Oops: 9604 [#1] SMP
  [ 226.279779] Modules linked in: act_pedit act_ct nf_flow_table iptable_raw xt_CT xt_tcpudp bpfilter xt_comment xt_mark
  [ 226.279938] async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq raid1 raid0 multipath
  [ 226.544260] CPU: 2 PID: 4293 Comm: handler3 Tainted: G
  [ 226.565581] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.2.0.12795 Jun 30 2023
  [ 226.585497] pstate: a045 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  [ 226.599481] pc : tcf_action_update_stats+0x8/0xc4
  [ 226.608933] lr : mlx5e_tc_act_stats_fill_stats+0xf8/0x19c [mlx5_core]
  [ 226.622089] sp : 8e073130
  [ 226.628735] x29: 8e073130 x28: 0008 x27: 0020
  [ 226.643067] x26: ffe0 x25: 5913c62dfe71 x24: 5913c62dfe00
  [ 226.657398] x23: x22: x21: 5913c62dfe70
  [ 226.671730] x20: 5913c2fc3b00 x19: 5913f0058000 x18: 0014
  [ 226.686059] x17: b96d1a87 x16: c6320acb93e0 x15:
  [ 226.700390] x14: 0001 x13: x12: 0002
  [ 226.714720] x11: 7f7f7f7f7f7f7f7f x10: x9 : c631ce0b66ac
  [ 226.729052] x8 : 8e073130 x7 : x6 : 000d
  [ 226.743384] x5 : 0c62 x4 : 0001 x3 :
  [ 226.757715] x2 : x1 : x0 :
  [ 226.772047] Call trace:
  [ 226.776947] tcf_action_update_stats+0x8/0xc4
  [ 226.785695] mlx5e_tc_act_stats_fill_stats_flow+0x78/0xc0 [mlx5_core]
  [ 226.798833] mlx5e_stats_flower+0x394/0x3c0 [mlx5_core]
  [ 226.809502] mlx5e_rep_setup_tc_cls_flower+0x8c/0xa0 [mlx5_core]
  [ 226.821732] mlx5e_rep_setup_tc_cb+0x74/0xb0 [mlx5_core]
  [ 226.832549] tc_setup_cb_call+0xa4/0x160
  [ 226.840426] fl_hw_update_stats+0x98/0x164 [cls_flower]
  [ 226.850927] fl_dump.part.0+0x224/0x260 [cls_flower]
  [ 226.860891] fl_dump+0x20/0x34 [cls_flower]
  [ 226.869284] tcf_fill_node+0x164/0x244
  [ 226.876803] tfilter_notify+0xc0/0x140
  [ 226.884323] tc_new_tfilter+0x454/0x8bc
  [ 226.892018] rtnetlink_rcv_msg+0x2e8/0x3cc
  [ 226.900245] netlink_rcv_skb+0x64/0x130
  [ 226.907942] rtnetlink_rcv+0x20/0x30
  [ 226.915110] netlink_unicast+0x2ec/0x360
  [ 226.922977] netlink_sendmsg+0x278/0x490
  [ 226.930846] sock_sendmsg+0x5c/0x6c
  [ 226.937845] sys_sendmsg+0x290/0x2d4
  [ 226.945712] ___sys_sendmsg+0x84/0xd0
  [ 226.953059] __sys_sendmsg+0x70/0xd0
  [ 226.960229] __arm64_sys_sendmsg+0x2c/0x40
  [ 226.968447] invoke_syscall+0x78/0x100
  [ 226.975974] el0_svc_common.constprop.0+0x54/0x184
  [ 226.985587] do_el0_svc+0x30/0xac
  [ 226.992231] el0_svc+0x48/0x160
  [ 226.998528] el0t_64_sync_handler+0xa4/0x130
  [ 227.007094] el0t_64_sync+0x1a4/0x1a8
  [ 227.01] Code: 9407cf7e d503201f aa1e03e9 d503201f (f9404805)
  [ 227.026679] ---[ end trace 2aa44f8c6701f98e ]---
  [ 236.273308] Kernel panic - not syncing:
Oops: Fatal exception [ 236.284885] SMP: stopping secondary CPUs [ 236.292761] Kernel Offset: 0x463201de from 0x8800 [ 236.304996] PHYS_OFFSET: 0xa6ed4000 [ 236.313387] CPU features: 0x800804a1,2846 [ 236.322129] Memory Limit: none [
[Kernel-packages] [Bug 2028197] Re: rshim console truncates dmesg output due to tmfifo issue
** Also affects: linux-bluefield (Ubuntu Focal) Importance: Undecided Status: New ** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu) Status: New => Invalid ** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- https://bugs.launchpad.net/bugs/2028197 Title: rshim console truncates dmesg output due to tmfifo issue Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Focal: New Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] The mlxbf_tmfifo driver does not always schedule work for VIRTIO CONSOLE notifications. As a result, output on the rshim console is cut off, for example when running dmesg or when cat-ing a large text file to stdout. [Fix] * Make sure the TM_TX_LWM_IRQ bit is set whenever there is a notification of a virtio console event. [Test Case] * Compare the dmesg output from the rshim console with the output from an ssh session. Make sure both are the same. * Without the fix, the issue reproduces when the same dmesg comparison is run as above. [Regression Potential] The changes affect only virtio console event handling and have been fully tested. [Others] This issue is the same on Focal and Jammy. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2028197/+subscriptions
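The [Fix] described for bug 2028197 can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the mlxbf_tmfifo driver code: `struct fifo_sim`, `console_notify`, `work_handler`, and `TX_LWM_EVENT_BIT` are hypothetical names invented for illustration, and the real driver's data structures and work scheduling may differ. The idea it illustrates is that the notification path always marks the TX event as pending before scheduling work, so the work handler never skips a console flush.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the TM_TX_LWM_IRQ event bit named in the report. */
#define TX_LWM_EVENT_BIT 0u

/* Toy model of the FIFO state: a pending-event bitmask plus a counter that
 * stands in for "work has been scheduled". */
struct fifo_sim {
    uint32_t pend_events;
    unsigned int work_scheduled;
};

/* Notification path: mark the TX event pending, then schedule work.
 * If the bit were not set here, the handler below would see nothing to do. */
static void console_notify(struct fifo_sim *f)
{
    f->pend_events |= (1u << TX_LWM_EVENT_BIT);
    f->work_scheduled++;
}

/* Work handler: only services the console TX path when the event is pending.
 * Returns true if it consumed a pending event. */
static bool work_handler(struct fifo_sim *f)
{
    uint32_t mask = 1u << TX_LWM_EVENT_BIT;

    if (!(f->pend_events & mask))
        return false;           /* nothing pending: console data stays queued */

    f->pend_events &= ~mask;    /* consume the event */
    return true;                /* a real driver would drain the TX queue here */
}
```

In this model, calling `work_handler` without a prior `console_notify` does nothing, which mirrors the truncated-output symptom; after `console_notify`, the handler consumes exactly one pending event.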
[Kernel-packages] [Bug 2028309] Re: mlxbf-bootctl: Fix kernel panic due to buffer overflow
** Also affects: linux-bluefield (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux-bluefield (Ubuntu) Status: New => Invalid ** Changed in: linux-bluefield (Ubuntu Jammy) Status: New => Fix Committed -- https://bugs.launchpad.net/bugs/2028309 Title: mlxbf-bootctl: Fix kernel panic due to buffer overflow Status in linux-bluefield package in Ubuntu: Invalid Status in linux-bluefield source package in Jammy: Fix Committed Bug description: SRU Justification: [Impact] Running the following LTP (linux-test-project) script causes a kernel panic and a reboot of the DPU: ltp/testcases/bin/read_all -d /sys -q -r 10 The above test reads all directories and files under /sys. Reading the sysfs entry "large_icm" causes the kernel panic because an i2c read returns a garbage value, and that garbage value causes a buffer overflow in sprintf. [Fix] * Replace sprintf with snprintf, add the missing lock, and increase the buffer size to PAGE_SIZE. [Test Case] * Run from Linux: ltp/testcases/bin/read_all -d /sys -q -r 10 [Regression Potential] No known regression. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2028309/+subscriptions
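The sprintf-to-snprintf change in the [Fix] for bug 2028309 can be illustrated with a minimal userspace sketch. It is not the mlxbf-bootctl code: `show_value_bounded` is a hypothetical helper standing in for a sysfs show callback, and the untrusted string stands in for the garbage value returned by the i2c read. The locking part of the fix is not modelled here.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a sysfs "show" callback: formats an untrusted
 * value into a caller-supplied buffer of buf_size bytes and returns the
 * number of characters actually stored (excluding the terminating NUL).
 * With sprintf, an oversized value would write past the end of buf;
 * snprintf never writes more than buf_size bytes. */
static int show_value_bounded(char *buf, size_t buf_size, const char *value)
{
    int n;

    if (buf_size == 0)
        return 0;

    n = snprintf(buf, buf_size, "%s\n", value);
    if (n < 0)
        return -1;                      /* encoding error */

    /* snprintf returns the length it would have written; clamp it to what
     * actually fits so callers never see a count larger than the buffer. */
    if ((size_t)n >= buf_size)
        return (int)(buf_size - 1);

    return n;
}
```

For example, formatting a 16-character value into an 8-byte buffer stores only the first 7 characters plus the terminating NUL, instead of overrunning the buffer as sprintf would.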