[Kernel-packages] [Bug 2064163] Re: mlxbf-gige: Vitesse PHY stuck in a bad state during reboot test

2024-04-30 Thread Asmaa Mnebhi
This software workaround (SW WA) didn't work, so the HW team will have to review it.

** Changed in: linux-bluefield (Ubuntu)
   Status: New => Invalid

** Changed in: linux-bluefield (Ubuntu Jammy)
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2064163

Title:
  mlxbf-gige: Vitesse PHY stuck in a bad state during reboot test

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Invalid

Bug description:
  SRU Justification:

  [Impact]

  During the QA reboot test, the BF3 Vitesse PHY gets stuck in a bad state,
  resulting in no IP provisioning. The only way to recover is to power cycle.
  We might have found a software workaround (WA) to avoid getting into this
  state in the first place: suspend the PHY during graceful shutdown.
  Suspending the PHY means powering it down, i.e. setting bit 11 to 1 in
  register 0 of the PHY. This WA passed 1800 reboots on QA's setup.

  [Fix]

  * During reboot, the mlxbf_gige_shutdown() function makes a call to
    phy_stop(). phy_stop() calls phy_suspend().
  * Certain Linux PHY drivers, like the Vitesse PHY driver, don't support
    suspend() to power down the PHY during shutdown.
  * Our hardware also does not toggle the hard reset signal of the PHY during
    reboot.
  * Hence, when the PHY is in a bad state, it stays in its bad state until a
    power cycle.
  * We have found a way to prevent the PHY from entering this bad state by
    suspending the PHY in the case of reboot.

  [Test Case]

  * Do the reboot test (at least 2000 reboots): run 'reboot' from Linux.
  * Check that the oob_net0 interface is up and an IP address is assigned.
  * Please note that if the OOB doesn't get an IP address, try reloading the
    driver (rmmod/modprobe). If that solves the issue, it is a different bug.
    In the bug at stake, nothing recovers the OOB IP address except a power
    cycle.

  [Regression Potential]

  * Make sure the Redfish DHCP is still working during the reboot test.
  * Make sure the OOB gets an IP address.

  [Other]

  These changes were made in both the mlxbf-gige driver and UEFI.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2064163/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2064163] Re: mlxbf-gige: Vitesse PHY stuck in a bad state during reboot test

2024-04-29 Thread Asmaa Mnebhi
** Description changed:

  SRU Justification:
  
  [Impact]
  
- During the QA reboot test, the BF3 Vitesse PHY gets stuck in a bad
- state, resulting in no ip provisioning. The only way to recover is to
- toggle the PHY hard reset pin via GPIO17 (or powercycle cycle since it
- achieves just that). We might have found a software workaround to avoid
- getting in this state in the first place: suspend the PHY during
- graceful shutdown. Suspend the PHY = Power down = set bit 11 to 1 in reg
- 0 of the PHY. This WA passed 1800 reboots on QA's setup.
+ During the QA reboot test, the BF3 Vitesse PHY gets stuck in a bad state, 
resulting in no ip provisioning. The only way to recover is to powercycle.
+ We might have found a software workaround to avoid getting in this state in 
the first place: suspend the PHY during graceful shutdown. Suspend the PHY = 
Power down = set bit 11 to 1 in reg 0 of the PHY. This WA passed 1800 reboots 
on QA's setup.
  
  [Fix]
  
  * During reboot, the mlxbf_gige_shutdown() function makes a call to 
phy_stop(). phy_stop() calls phy_suspend().
  * Certain Linux PHY drivers, like the Vitesse PHY, don't support suspend() to 
power down the PHY during shutdown.
- * Our Hardware also does not toggle the hard reset signal of the PHY during 
reboot. 
+ * Our Hardware also does not toggle the hard reset signal of the PHY during 
reboot.
  * Hence, when the PHY is in a bad state, it stays in its bad state until 
powercycle.
  * We have found a way to prevent the PHY from entering this bad state by 
suspending the PHY in the case of reboot.
  
- 
  [Test Case]
  
- * do the reboot test (at least 2000 reboots): run 'reboot' from linux. 
+ * do the reboot test (at least 2000 reboots): run 'reboot' from linux.
  * Check that the oob_net0 interface is up and the ip is assigned.
- * please note that if the the OOB doesn't get an ip, try reloading the driver 
(rmmod/modprobe). it that solves the issue, that would be a different bug. In 
the bug at stake, nothing recovers the OOB ip except power cycle. 
+ * please note that if the the OOB doesn't get an ip, try reloading the driver 
(rmmod/modprobe). it that solves the issue, that would be a different bug. In 
the bug at stake, nothing recovers the OOB ip except power cycle.
  
  [Regression Potential]
  
  * Make sure the redfish DHCP is still working during the reboot test
- * Make sure the OOB gets an ip 
+ * Make sure the OOB gets an ip
  
  [Other]
  
  These changes were made both in the mlxbf-gige driver and UEFI


Title:
  mlxbf-gige: Vitesse PHY stuck in a bad state during reboot test

Status in linux-bluefield package in Ubuntu:
  New



[Kernel-packages] [Bug 2064163] [NEW] mlxbf-gige: Vitesse PHY stuck in a bad state during reboot test

2024-04-29 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

During the QA reboot test, the BF3 Vitesse PHY gets stuck in a bad
state, resulting in no IP provisioning. The only way to recover is to
toggle the PHY hard reset pin via GPIO17 (or to power cycle, since that
achieves just that). We might have found a software workaround (WA) to
avoid getting into this state in the first place: suspend the PHY during
graceful shutdown. Suspending the PHY means powering it down, i.e.
setting bit 11 to 1 in register 0 of the PHY. This WA passed 1800
reboots on QA's setup.
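
As a rough illustration of the register write mentioned above (a sketch, not the driver code): in the standard MII register layout, register 0 is the Basic Mode Control Register (BMCR) and bit 11 is its power-down bit, which Linux names BMCR_PDOWN (0x0800). The helper names below are hypothetical and only show the bit manipulation.

```python
# Sketch only: models the power-down bit of PHY register 0 (BMCR).
# The constant values follow the standard MII register layout; the helper
# names are hypothetical and are not taken from the mlxbf-gige driver.

MII_BMCR = 0x00        # register 0: Basic Mode Control Register
BMCR_PDOWN = 1 << 11   # bit 11: power down (0x0800)


def power_down(bmcr: int) -> int:
    """Return the 16-bit BMCR value with the power-down bit set."""
    return (bmcr | BMCR_PDOWN) & 0xFFFF


def power_up(bmcr: int) -> int:
    """Return the 16-bit BMCR value with the power-down bit cleared."""
    return bmcr & ~BMCR_PDOWN & 0xFFFF


def is_powered_down(bmcr: int) -> bool:
    """True if the power-down bit is set in the given BMCR value."""
    return bool(bmcr & BMCR_PDOWN)
```

For example, a BMCR value of 0x1140 (autonegotiation enabled, full duplex, 1000 Mb/s) becomes 0x1940 once the power-down bit is set.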

[Fix]

* During reboot, the mlxbf_gige_shutdown() function makes a call to
  phy_stop(). phy_stop() calls phy_suspend().
* Certain Linux PHY drivers, like the Vitesse PHY driver, don't support
  suspend() to power down the PHY during shutdown.
* Our hardware also does not toggle the hard reset signal of the PHY during
  reboot.
* Hence, when the PHY is in a bad state, it stays in its bad state until a
  power cycle.
* We have found a way to prevent the PHY from entering this bad state by
  suspending the PHY in the case of reboot.
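
The reasoning in the list above can be sketched as a toy model. It only mirrors the shape of the logic (stop, then an optional driver suspend callback, then an explicit power-down as the workaround idea). All class and function names are hypothetical stand-ins, not kernel APIs, and how the real driver implements the workaround is not shown in this report.

```python
# Toy model of the shutdown path described above. Names are hypothetical;
# this is not the mlxbf-gige or phylib code.
from dataclasses import dataclass
from typing import Callable, Optional

BMCR_PDOWN = 1 << 11  # power-down bit in PHY register 0


@dataclass
class Phy:
    bmcr: int = 0x1140
    # Driver-specific suspend callback; None models a PHY driver that
    # does not provide one.
    suspend_cb: Optional[Callable[["Phy"], None]] = None


def generic_suspend(phy: Phy) -> None:
    """Power the PHY down by setting bit 11 of register 0."""
    phy.bmcr |= BMCR_PDOWN


def phy_stop(phy: Phy) -> None:
    """Stop the PHY; it is powered down only if the driver has a callback."""
    if phy.suspend_cb is not None:
        phy.suspend_cb(phy)


def shutdown(phy: Phy, force_power_down: bool) -> None:
    """Model of a MAC driver shutdown hook.

    With force_power_down=True the hook powers the PHY down itself when
    the PHY driver did not, which is the idea behind the workaround.
    """
    phy_stop(phy)
    if force_power_down and not phy.bmcr & BMCR_PDOWN:
        generic_suspend(phy)
```

In this model, a PHY without a suspend callback stays powered across shutdown(phy, False) and is powered down only by shutdown(phy, True).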


[Test Case]

* Do the reboot test (at least 2000 reboots): run 'reboot' from Linux.
* Check that the oob_net0 interface is up and an IP address is assigned.
* Please note that if the OOB doesn't get an IP address, try reloading the
  driver (rmmod/modprobe). If that solves the issue, it is a different bug.
  In the bug at stake, nothing recovers the OOB IP address except a power
  cycle.
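
The "interface is up and an IP address is assigned" check could be automated along these lines. This is a minimal sketch that parses the JSON printed by `ip -j addr show dev oob_net0` (iproute2). The function name and the sample data are made up for illustration and are not part of the QA setup described here.

```python
# Sketch: decide whether an interface is up and has an IPv4 address,
# given the JSON output of `ip -j addr show dev <ifname>`.
# link_is_provisioned and SAMPLE are hypothetical illustration names.
import json


def link_is_provisioned(ip_json: str, ifname: str = "oob_net0") -> bool:
    """Return True if ifname reports operstate UP and has an IPv4 address."""
    for iface in json.loads(ip_json):
        if iface.get("ifname") != ifname:
            continue
        is_up = iface.get("operstate") == "UP"
        has_ipv4 = any(
            addr.get("family") == "inet" and bool(addr.get("local"))
            for addr in iface.get("addr_info", [])
        )
        return is_up and has_ipv4
    return False


# Made-up sample in the shape iproute2 emits (documentation address range).
SAMPLE = json.dumps([
    {
        "ifname": "oob_net0",
        "operstate": "UP",
        "addr_info": [{"family": "inet", "local": "192.0.2.10", "prefixlen": 24}],
    }
])
```

On a target system the input string would come from running the `ip` command, for example via subprocess.run([...], capture_output=True, text=True).stdout, once per reboot iteration.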

[Regression Potential]

* Make sure the Redfish DHCP is still working during the reboot test.
* Make sure the OOB gets an IP address.

[Other]

These changes were made in both the mlxbf-gige driver and UEFI.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New


Title:
  mlxbf-gige: Vitesse PHY stuck in a bad state during reboot test

Status in linux-bluefield package in Ubuntu:
  New



[Kernel-packages] [Bug 2062384] [NEW] mlxbf-gige: autonegotiation fails to complete on BF2

2024-04-18 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

During the reboot test, QA found an intermittent issue where the OOB link is
down. The link is down because the KSZ9031 PHY fails to complete
autonegotiation. Even under "normal" circumstances where autonegotiation
completes, it takes an abnormally long time to do so (on average, at least
8 seconds).

The hardware team and Microchip are involved in debugging this, but the root
cause is still unknown. In the meantime, we need to provide a software
workaround, since customers are starting to see this issue as well.

[Fix]

* Restart autonegotiation when it fails the first time.
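
The retry idea can be sketched as follows. This is only an illustration of "restart once if the first attempt does not complete". FakePhy, wait_for_autoneg and bring_up_link are hypothetical names and the poll counts are arbitrary; only the bit positions come from the standard MII register layout. On real hardware the restart write would normally also keep the autonegotiation-enable bit set.

```python
# Sketch of "restart autonegotiation when it fails the first time".
# All names are hypothetical; this is not the mlxbf-gige fix itself.

BMCR_ANRESTART = 1 << 9      # register 0, bit 9: restart autonegotiation
BMSR_ANEGCOMPLETE = 1 << 5   # register 1, bit 5: autonegotiation complete


class FakePhy:
    """Stand-in PHY whose autonegotiation completes only after a given
    number of restarts."""

    def __init__(self, restarts_needed: int) -> None:
        self.restarts_needed = restarts_needed
        self.restarts = 0

    def read_bmsr(self) -> int:
        done = self.restarts >= self.restarts_needed
        return BMSR_ANEGCOMPLETE if done else 0

    def write_bmcr(self, value: int) -> None:
        if value & BMCR_ANRESTART:
            self.restarts += 1


def wait_for_autoneg(phy, polls: int = 10) -> bool:
    """Poll the status register a bounded number of times."""
    return any(phy.read_bmsr() & BMSR_ANEGCOMPLETE for _ in range(polls))


def bring_up_link(phy, max_restarts: int = 1) -> bool:
    """Return True if autonegotiation completes, restarting it at most
    max_restarts times after a failed attempt."""
    if wait_for_autoneg(phy):
        return True
    for _ in range(max_restarts):
        phy.write_bmcr(BMCR_ANRESTART)
        if wait_for_autoneg(phy):
            return True
    return False
```

Bounding the number of restarts keeps link bring-up from looping forever on a PHY that never completes autonegotiation.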

[Test Case]

* On BF2, do the reboot test: 2000 loops.
* Check that the OOB link is up and an IP address is assigned.

[Regression Potential]

* no known regression.

[Other]
* Note that this issue is specific to BF2 hardware. The same Ethernet code is
  used for BF3 and we don't see any issues there. In fact, the link-up time on
  BF3 is <= 1 s, whereas on BF2 it is > 8 s.
* We have been aware of this issue for 2 years and have shared it with the
  PHY vendor and the hardware team, but no root cause has been identified.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

https://bugs.launchpad.net/bugs/2062384

Title:
  mlxbf-gige: autonegotiation fails to complete on BF2

Status in linux-bluefield package in Ubuntu:
  New



[Kernel-packages] [Bug 2058498] [NEW] pwr-mlxbf: support Bobcat graceful shutdown via gpio6

2024-03-20 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

The OCP3.0 project was discontinued, so all stale code related to it was
removed.
The HID MLNXBF29 is now used for triggering a graceful shutdown on the Bobcat
board. On the Bobcat board, the main board CPLD issues the shutdown request to
the DPU CPLD. The DPU CPLD issues that request to ARM via GPIO6, and ARM should
trigger a graceful shutdown and set the ARM boot progress to 6.


[Fix]

The HID MLNXBF29 is now used for triggering a graceful shutdown on the Bobcat
board. On the Bobcat board, the main board CPLD issues the shutdown request to
the DPU CPLD. The DPU CPLD issues that request to ARM via GPIO6, and ARM should
trigger a graceful shutdown and set the ARM boot progress to 6.


[Test Case]

* Test shutdown on Bobcat via CPLD command.

[Regression Potential]

* No known regression, as the OCP3.0 board was discontinued.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

https://bugs.launchpad.net/bugs/2058498

Title:
  pwr-mlxbf: support Bobcat graceful shutdown via gpio6

Status in linux-bluefield package in Ubuntu:
  New



[Kernel-packages] [Bug 2054845] Re: mlxbf-gige: support fixed phy for Bobcat

2024-02-23 Thread Asmaa Mnebhi
** Summary changed:

- mlxbf-gige: support bobcat
+ mlxbf-gige: support fixed phy for Bobcat

https://bugs.launchpad.net/bugs/2054845

Title:
  mlxbf-gige: support fixed phy for Bobcat

Status in linux-bluefield package in Ubuntu:
  New

Bug description:
  SRU Justification:

  [Impact]

  Bobcat is a new board which doesn't have an external PHY connected to
  the OOB. There are no MDIO buses involved and no PHY involved. The
  OOB is directly connected to the switch (MAC to MAC). So we need to
  use the Linux fixed PHY to emulate the MDIO behavior and load the
  Ethernet driver.

  [Fix]

  * Register a fixed PHY in the case of the Bobcat board.

  [Test Case]

  * Important: For testing on the Bobcat board, make sure the corresponding
    UEFI image is also loaded; otherwise, the OOB driver will fail to load.
    For other boards, the change is backward compatible.
  * Check that the mlxbf-gige driver is loaded on Bobcat
  * Check that it is using the fixed-phy via dmesg | grep PHY
  * Check that the OOB interface is pingable
  * Check that files can be copied over via the OOB interface
  * rmmod/modprobe
  * ifconfig up/down
  * reboot test

  [Regression Potential]

  * Redo all the OOB testing on other boards (Moonraker) and make sure
  the Bobcat changes don't trigger any regressions.



[Kernel-packages] [Bug 2054845] [NEW] mlxbf-gige: support bobcat

2024-02-23 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

Bobcat is a new board which doesn't have an external PHY connected to
the OOB. There are no MDIO buses involved and no PHY involved. The OOB
is directly connected to the switch (MAC to MAC). So we need to use the
Linux fixed PHY to emulate the MDIO behavior and load the Ethernet
driver.

[Fix]

* Register a fixed PHY in the case of the Bobcat board.
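
To illustrate what "emulating the MDIO behavior" means, here is a toy model of a fixed PHY: a software object that answers MII register reads with values synthesized from a constant link description, so the MAC driver sees a PHY whose link is always up. This is not the kernel's fixed PHY code. The 1000 Mb/s full-duplex link parameters are an assumption (the report does not state them), and all names are hypothetical apart from the standard MII bit values.

```python
# Toy model of a fixed (software-emulated) PHY. Names are hypothetical;
# the bit values follow the standard MII register layout.
from dataclasses import dataclass

MII_BMCR = 0x00               # register 0: control
MII_BMSR = 0x01               # register 1: status
BMCR_SPEED1000 = 0x0040       # speed select (1000 Mb/s)
BMCR_FULLDPLX = 0x0100        # full duplex
BMSR_LSTATUS = 0x0004         # link is up
BMSR_ANEGCOMPLETE = 0x0020    # autonegotiation complete


@dataclass(frozen=True)
class FixedLink:
    link_up: bool = True
    speed_1000: bool = True    # assumption: link speed not given in the report
    full_duplex: bool = True   # assumption: duplex not given in the report


def read_reg(link: FixedLink, reg: int) -> int:
    """Synthesize a register value from the fixed link description."""
    if reg == MII_BMCR:
        value = 0
        if link.speed_1000:
            value |= BMCR_SPEED1000
        if link.full_duplex:
            value |= BMCR_FULLDPLX
        return value
    if reg == MII_BMSR:
        return (BMSR_LSTATUS | BMSR_ANEGCOMPLETE) if link.link_up else 0
    return 0xFFFF  # registers not modelled by this toy read as all ones
```

Because the answers never change, no MDIO bus transaction is needed; the driver's usual PHY handling can run unmodified against the emulated registers.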

[Test Case]

* Important: For testing on the Bobcat board, make sure the corresponding
  UEFI image is also loaded; otherwise, the OOB driver will fail to load.
  For other boards, the change is backward compatible.
* Check that the mlxbf-gige driver is loaded on Bobcat
* Check that it is using the fixed-phy via dmesg | grep PHY
* Check that the OOB interface is pingable
* Check that files can be copied over via the OOB interface
* rmmod/modprobe
* ifconfig up/down
* reboot test

[Regression Potential]

* Redo all the OOB testing on other boards (Moonraker) and make sure the
Bobcat changes don't trigger any regressions.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New


Title:
  mlxbf-gige: support bobcat

Status in linux-bluefield package in Ubuntu:
  New



[Kernel-packages] [Bug 2041996] [NEW] pwr-mlxbf: Several bug fixes for focal

2023-10-31 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

Several changes need to be made to pwr-mlxbf in focal:
* There is a race condition between the gpio-mlxbf2.c driver being loaded and
  pwr-mlxbf.c being loaded
* When the module is removed, there is a panic due to a NULL pointer access
* Soft reset needs to be replaced by graceful reboot

[Fix]

* Fix the race condition between the gpio-mlxbf2.c driver being loaded and
  pwr-mlxbf.c being loaded
* Fix the panic due to a NULL pointer access when the driver is removed via rmmod
* Support graceful reboot instead of soft reset

[Test Case]

* all test cases are for BF2:
* trigger the gpio toggling from the BMC: ipmitool raw 0x32 0xA1 0x02
  This should trigger a graceful reboot of the DPU.
* rmmod/modprobe
* reboot test and make sure the driver is always loaded

[Regression Potential]

* Run the 100-reboot test and make sure that the driver is loaded with
no issues and that the GPIO graceful reboot works.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

https://bugs.launchpad.net/bugs/2041996

Title:
  pwr-mlxbf: Several bug fixes for focal

Status in linux-bluefield package in Ubuntu:
  New



[Kernel-packages] [Bug 2041995] [NEW] pwr-mlxbf: support graceful reboot instead of soft reset

2023-10-30 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

There is a new feature request to replace the soft reset with a graceful reboot.
We will use ACPI events triggered by the IRQ in the pwr-mlxbf driver to trigger
the graceful reboot.


[Fix]

* Change the handling of GPIO7 (BF2) and GPIO6 (BF3). This GPIO will trigger a
  graceful reboot instead of a forced reboot (soft reset). This is an ACPI
  event.

[Test Case]

* trigger the gpio toggling from the BMC: ipmitool raw 0x32 0xA1 0x02
  This should trigger a graceful reboot of the DPU.

[Regression Potential]

* make sure it works on BF2 and BF3

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

https://bugs.launchpad.net/bugs/2041995

Title:
  pwr-mlxbf: support graceful reboot instead of soft reset

Status in linux-bluefield package in Ubuntu:
  New



[Kernel-packages] [Bug 2038868] [NEW] gpio-mlxbf3: support valid mask

2023-10-09 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

After syncing the gpio-mlxbf3.c driver with the upstream version, we
dropped the use of the valid_mask variable. Kernels 6.2.0 and later don't
need it because valid_mask is populated by the core GPIO code.
The 5.15 kernel doesn't support that feature, so we still need to explicitly
define valid_mask like we did before.
This doesn't impact the functionality of the GPIO driver, but it is a security
issue: without valid_mask, access is not restricted to the GPIOs defined in
the ACPI table.

[Fix]

* define valid_mask and init_valid_mask

[Test Case]

* Make sure that the user (libgpiod) cannot access any GPIO other than
lines 0-4 on gpiochip0 and lines 22-23 on gpiochip1.
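
The effect of a valid mask can be sketched as a per-chip bitmap in which only the permitted lines have their bit set. The line ranges below are the ones named in this test case. The function and variable names are hypothetical, and this is not the gpio-mlxbf3 driver code.

```python
# Sketch: build a per-chip bitmask of usable GPIO lines and check requests
# against it. Names are hypothetical; line ranges come from the test case.

ALLOWED_LINES = {
    "gpiochip0": range(0, 5),    # lines 0..4
    "gpiochip1": range(22, 24),  # lines 22..23
}


def build_valid_mask(lines) -> int:
    """Return a bitmask with one bit set per allowed line offset."""
    mask = 0
    for line in lines:
        mask |= 1 << line
    return mask


VALID_MASK = {chip: build_valid_mask(lines) for chip, lines in ALLOWED_LINES.items()}


def line_is_valid(chip: str, line: int) -> bool:
    """True if the line's bit is set in the chip's valid mask."""
    return line >= 0 and bool((VALID_MASK.get(chip, 0) >> line) & 1)
```

With these ranges the masks are 0x1F for gpiochip0 and 0xC00000 for gpiochip1; a request for any other line is rejected.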

[Regression Potential]

no known regression.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

https://bugs.launchpad.net/bugs/2038868

Title:
  gpio-mlxbf3: support valid mask

Status in linux-bluefield package in Ubuntu:
  New



[Kernel-packages] [Bug 2035128] [NEW] mlxbf-gige: Enable the OOB port in mlxbf_gige_open

2023-09-11 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

At the moment, the OOB port is enabled in the mlxbf_gige_probe()
function. If mlxbf_gige_open() is not executed, this can cause pause
frames to increase when there is high background traffic. This results
in clogging the BMC port as well.

[Fix]

* Move enabling the OOB port to mlxbf_gige_open.
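
The ordering problem can be pictured with a toy lifecycle model. It is purely illustrative: the class and method names are hypothetical, and the real driver's data structures are not shown in this report. The point is only that a port enabled at probe time can accept frames before anything is set up to consume them, whereas enabling it in open ties the two together.

```python
# Toy lifecycle model: where the port gets enabled relative to open().
# Names are hypothetical; this is not the mlxbf-gige driver code.


class OobPortModel:
    def __init__(self, enable_in_probe: bool) -> None:
        self.enable_in_probe = enable_in_probe
        self.port_enabled = False
        self.rx_running = False

    def probe(self) -> None:
        """Device discovered; the old behavior enables the port here."""
        if self.enable_in_probe:
            self.port_enabled = True

    def open(self) -> None:
        """Interface brought up: receive processing starts, port enabled."""
        self.rx_running = True
        self.port_enabled = True

    def accepts_frames_without_consumer(self) -> bool:
        """True in the problematic state: port on, nobody draining it."""
        return self.port_enabled and not self.rx_running
```

In this model, the old ordering is in the problematic state after probe() alone, while the new ordering never is.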

[Test Case]

* Main test for this bug: Check that the BMC is always pingable while
pushing a BFB

Other tests:
* Check if the gige driver is loaded
* Check that the oob_net0 interface is up and operational
* Do the reboot test and powercycle test and check the oob_net0 interface again
* Push BFB multiple times and make sure the OOB is up and running

[Regression Potential]

Since we are moving code that hasn't moved since BF2, it is important to make
sure that there is no regression.
Make sure that the OOB interface is always up and pingable after the reboot
test, after the powercycle test, and after pushing a BFB.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

https://bugs.launchpad.net/bugs/2035128

Title:
  mlxbf-gige: Enable the OOB port in mlxbf_gige_open

Status in linux-bluefield package in Ubuntu:
  New



[Kernel-packages] [Bug 2033439] [NEW] Cherry-pick gpio and pinctrl drivers from upstream

2023-08-29 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

Cherry pick gpio-mlxbf3.c and pinctrl-mlxbf3.c patches from linux-next.

[Fix]

* Revert existing pinctrl-mlxbf3.c driver changes
* Revert existing gpio-mlxbf3.c driver changes
* Cherry-pick all the following commits from linux-next:
  gpio-mlxbf3.c:
  cd33f216d241520385a5166ae73a0771197a9f0b
  38a700efc51080c7184f71edbf5e49561da9754f
  1d2a22fa6d2511d5871d87c15b4fe7a944fe3b2a

  pinctrl-mlxbf3.c:
  d11f932808dc689717e409bbc81b5093e7902fc9
  743d3336029ffe2bb38e982a3b572ced243c6d43
  c0f84760b01e8d8b59e9e186a4f7fa8f081a4488
  69657e60b8a7faf83b583c658ec7ce1f5ece9eb3

[Test Case]

* All test cases are for BF3 only
* Check that the gpio-mlxbf3 driver gets loaded at boot time without issues
* Check that the pinctrl-mlxbf3 driver gets loaded at boot time without issues
* Use libgpiod to test read and write access to GPIO pins 0 through 4.
* rmmod and modprobe both drivers
* Check that pwr-mlxbf driver is loaded successfully
* Check that mlxbf-gige driver is loaded successfully and the irq is
initialized properly (dmesg | grep PHY)

[Regression Potential]

* We introduced a dependency of the gpio-mlxbf3 driver on pinctrl-mlxbf3, so we
need to make sure that this doesn't trigger any regressions when loading other
dependent drivers.
* Make sure that removing/reloading the drivers or restarting the DPU doesn't
cause any panic related to these drivers.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2033439

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2033439/+subscriptions




[Kernel-packages] [Bug 2028891] [NEW] HTTP boot grub issue

2023-07-27 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

We have tested PXE boot successfully on the BlueField using grubaa64.efi and
grub.cfg, but HTTP boot does not work.
During HTTP boot, UEFI is able to load grubaa64.efi but doesn’t load
grub.cfg and instead goes into the grub rescue shell.
From the grub shell, I tried to load the grub.cfg file manually, but I got the
error shown below.

grub> set
?=0
cmdpath=(http,192.168.200.1)
color_highlight=black/light-gray
color_normal=light-gray/black
feature_200_final=y
feature_all_video_module=y
feature_chainloader_bpb=y
feature_default_font_path=y
feature_menuentry_id=y
feature_menuentry_options=y
feature_nativedisk_cmd=y
feature_ntldr=y
feature_platform_search_hint=y
feature_timeout_style=y
grub_cpu=arm64
grub_netfs_type=efi
grub_platform=efi
lang=
locale_dir=
net_default_interface=efinet2
net_default_ip=192.168.200.2
net_default_mac=
net_default_server=192.168.200.1
net_efinet2_ip=192.168.200.2
net_efinet2_mac=
package_version=2.06-2ubuntu14.1
pager=
prefix=(http,192.168.200.1)/grub
pxe_default_server=192.168.200.1
root=http,192.168.200.1
secondary_locale_dir=

configfile (http,192.168.200.1)/grub/grub.cfg
This doesn't work: error: Fail to send a request! status=0x8002.

However, it works if I run:
configfile (tftp,192.168.200.1)/grub/grub.cfg
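
To rule out the HTTP server itself, the same file can be requested from another host on the network. The sketch below is an illustration that is not part of the original report: it turns a GRUB network path such as the ones above into an ordinary URL. The helper name `grub_path_to_url` is hypothetical, and it handles only the `(proto,server)/path` form shown in this report.

```python
import re

# Matches GRUB network paths of the form "(http,192.168.200.1)/grub/grub.cfg".
_GRUB_NET_PATH = re.compile(r"^\((?P<proto>http|tftp),(?P<server>[^)]+)\)(?P<path>/.*)$")


def grub_path_to_url(grub_path: str) -> str:
    """Convert "(proto,server)/path" into "proto://server/path"."""
    match = _GRUB_NET_PATH.match(grub_path)
    if match is None:
        raise ValueError(f"not a recognized GRUB network path: {grub_path!r}")
    return f"{match.group('proto')}://{match.group('server')}{match.group('path')}"
```

If the resulting HTTP URL can be fetched from another machine (for example with `curl -I`), the server is serving the file, and the failure is more likely on the GRUB or UEFI side.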

This forum thread discusses a similar issue, which points to a problem in grub:
https://groups.google.com/g/linux.debian.bugs.dist/c/CqfwbhAd-Xg

Could you please help investigate this? Or point me to the right person?


[Fix]

* Not yet known. Investigate grub.

[Test Case]

* HTTP boot via OOB
* HTTP boot via ConnectX

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2028891

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2028891/+subscriptions




[Kernel-packages] [Bug 2028309] [NEW] mlxbf-bootctl: Fix kernel panic due to buffer overflow

2023-07-20 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

Running the following LTP (linux-test-project) script causes
a kernel panic and a reboot of the DPU:
ltp/testcases/bin/read_all -d /sys -q -r 10

The above test reads all directories and files under /sys.
Reading the sysfs entry "large_icm" causes the kernel panic
due to a garbage value returned via i2c read. That garbage
value causes a buffer overflow in sprintf.


[Fix]

* Replace sprintf with snprintf. Also add the missing lock and
increase the buffer size to PAGE_SIZE.

[Test Case]

* Run from linux:
ltp/testcases/bin/read_all -d /sys -q -r 10
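
For readers without LTP at hand, the idea of the test (walk a tree and try to read every file) can be approximated as below. This is a rough sketch and not LTP's `read_all` implementation: the function name, the `max_bytes` limit and the return shape are assumptions made for illustration, and it has none of LTP's options, such as the repeat count. Per the report above, reading "large_icm" on an unfixed kernel can panic the DPU, so this should only be run on a test system.

```python
import os
from typing import Dict, Tuple


def read_all(root: str, max_bytes: int = 4096) -> Tuple[int, Dict[str, str]]:
    """Try to read up to `max_bytes` from every file under `root`.

    Returns (number of files read successfully, {path: error message}).
    Directory symlinks are not followed (os.walk default), which avoids
    loops in trees such as /sys.
    """
    files_read = 0
    errors: Dict[str, str] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    handle.read(max_bytes)
                files_read += 1
            except OSError as exc:
                # Many sysfs entries are write-only or return errors; record and go on.
                errors[path] = exc.strerror or str(exc)
    return files_read, errors
```

Calling `read_all("/sys")` as root and then checking that the system is still up and `dmesg` shows no panic mirrors the intent of the test case.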

[Regression Potential]

no known regression

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2028309

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2028309/+subscriptions




[Kernel-packages] [Bug 2022387] [NEW] mlxbf-gige: Fix intermittent no ip issue

2023-06-02 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

Although the link is up, and the PHY interrupt is cleared, there is no
ip assigned. Nothing is being transmitted, and nothing is received. The
RX error count keeps on increasing (check ifconfig oob_net0). After
several minutes, the RX error count stagnates and the oob finally gets
an ip and is pingable.

[Fix]

The issue is in the mlxbf_gige_rx_init function. As soon as the RX DMA is
enabled, the RX CI reaches the maximum of 128 and becomes equal to the RX PI.
The RX CI doesn't decrease since the code hasn't run phy_start yet. The solution
is to move the rx init after phy_start.
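
The stall described above can be illustrated with a toy ring model. This is not the driver code. It is a sketch under two stated assumptions: the hardware advances the consumer index (CI) for every buffer it fills until it catches up with the producer index (PI), and software decides whether work is pending by comparing the two indices modulo the ring size. The class and method names are hypothetical.

```python
RING_SIZE = 128  # "max 128" from the description above


class ToyRxRing:
    """Toy model of an RX ring; index semantics are assumptions, not driver facts."""

    def __init__(self, size: int = RING_SIZE) -> None:
        self.size = size
        self.pi = size    # producer index: all buffers posted at init
        self.ci = 0       # consumer index: buffers filled by hardware
        self.dropped = 0  # packets that arrived with no free buffer

    def hw_receive(self, packets: int) -> None:
        """Hardware side: fill one buffer per packet, or drop when none is free."""
        for _ in range(packets):
            if self.ci < self.pi:
                self.ci += 1
            else:
                self.dropped += 1

    def sw_poll(self) -> int:
        """Software side: process and re-post buffers while the indices differ mod size."""
        processed = 0
        while self.pi % self.size != self.ci % self.size:
            self.pi += 1
            processed += 1
        return processed
```

In this model, a ring that receives a few packets and is then polled drains normally. A ring that receives 128 or more packets before the first poll ends up with CI equal to PI, so the poll sees nothing to do while every further packet is dropped. That is the kind of stall that enabling RX only once the receive path is ready to run is meant to avoid.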

[Test Case]

* Check if the gige driver is loaded
* Check that the oob_net0 interface is up and pingable from an external host
* Do ~1000 resets and powercycles and check the oob_net0 interface again

[Regression Potential]

* No known regressions.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2022387

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2022387/+subscriptions




[Kernel-packages] [Bug 2022370] [NEW] mlxbf-gige: Fix kernel panic at shutdown

2023-06-02 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

We occasionally see a race condition (once every 350 reboots) where napi is
still running (mlxbf_gige_poll) while a shutdown has been initiated through
"reboot". Since mlxbf_gige_poll is still running, it tries to access a NULL
pointer and as a result causes a kernel panic.

[Fix]

The fix is to explicitly disable napi and dequeue it during shutdown.
mlxbf_gige_remove already calls:
unregister_netdev->unregister_netdevice->unregister_netdev_queue->
rollback_registered->rollback_registered_many->dev_close_many->
__dev_close_many->ndo_stop->mlxbf_gige_stop which stops napi

So use mlxbf_gige_remove in place of the existing shutdown logic.

[Test Case]

* Issue at least 1000 reboots from linux and make sure there is no panic
caused by the mlxbf-gige driver.

[Regression Potential]

* Since this issue is hard to reproduce, the fix hasn't been tested
thoroughly yet, so it needs several reboot loops to validate it.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2022370

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2022370/+subscriptions




[Kernel-packages] [Bug 2013383] [NEW] mlxbf-bootctl: support SMC call for setting ARM boot state

2023-03-30 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

Add a new SMC call which allows setting the ARM boot progress state to
"OS is up".

[Fix]

* Add a new SMC call in mlxbf-bootctl which allows setting the ARM boot
progress state to "OS is up".

[Test Case]

* Check that /sys/devices/platform/MLNXBF04:00/driver/os_up exists
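
A scripted version of this check could look like the following sketch. The helper name is hypothetical. The default path is the one quoted in the test case, and it is an assumption that the platform device name is the same on every board.

```python
import os

# Path quoted in the test case above.
OS_UP_PATH = "/sys/devices/platform/MLNXBF04:00/driver/os_up"


def sysfs_entry_exists(path: str = OS_UP_PATH) -> bool:
    """Return True if the given sysfs entry is present."""
    return os.path.exists(path)
```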

[Regression Potential]

* No known regression because it is just adding a sysfs entry. It is
backward compatible with older ATF images. If the SMC call is not
supported by ATF, it is ignored.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2013383

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2013383/+subscriptions




[Kernel-packages] [Bug 2012743] Re: Support Nvidia BlueField-3 GPIO driver and pin controller

2023-03-24 Thread Asmaa Mnebhi
Since these patches have been acked for upstreaming but are not yet added to
any upstream branch, we will wait another week before sending Canonical
the patches.
Patches needed:

- pinctrl: Introduce struct pinfunction and PINCTRL_PINFUNCTION() macro
- [PATCH v4] gpio: mmio: handle "ngpios" properly in bgpio_init()
- [PATCH v1] gpio: mmio: fix calculation of bgpio_bits
- [PATCH v6 1/2] gpio: mlxbf3: Add gpio driver support
- [PATCH v6 2/2] pinctrl: mlxbf3: Add pinctrl driver support

On top of the above, we need to revert the changes related to the
gpio-mlxbf3.c driver, add the Kconfigs for both the GPIO and pin controller
drivers, and add the pinctrl driver to the deb.

** Description changed:

  SRU Justification:
  
  [Impact]
  
  Support the BlueField-3 SoC GPIO driver for handling interrupts and providing 
the option to change the direction and value of a GPIO.
  Support the BlueField-3 SoC pin controller driver for allowing a select 
number of GPIO pins to be manipulated from userspace or the kernel.
  
- All these changes have been accepted for upstream but most of them are not 
yet in the tree/branches so
- we need to submit them as SAUCE. 
+ All these changes have been accepted for upstream but most of them are
+ not yet in the tree/branches.
  
  PLEASE NOTE: This change is dependent on changes done in the ACPI
  tables. So the UEFI image needs to be updated accordingly.
  
  [Fix]
  
  * Add support for the BlueField-3 SoC GPIO driver.
  This driver configures and handles GPIO interrupts. It also enables a user to 
manipulate certain GPIO pins via libgpiod tools or other kernel drivers.
  The usable pins are defined via the "gpio-reserved-ranges" property.
  
  * NVIDIA BlueField-3 SoC has a few pins that can be used as GPIOs or
  take the default hardware functionality. Add a driver for the pin
  muxing.
  
  * The following gpiolib commits are bug fixes and are required for the gpio 
driver to work:
  443a0a0f0cf4f432c7af6654b7f2f920d411d379
  
- Although the following have been accepted by maintainers, they are not 
present in any tree/branch yet so 
+ Although the following have been accepted by maintainers, they are not 
present in any tree/branch yet so
  these will be pushed as SAUCE for now:
  [PATCH v4] gpio: mmio: handle "ngpios" properly in bgpio_init()
  [PATCH v1] gpio: mmio: fix calculation of bgpio_bits
  
  [Test Case]
  
  * Check if the gpio-mlxbf3 driver is loaded
  * Check if the pinctrl-mlxbf3 driver is loaded
  * check if the mlxbf-gige driver is loaded
  * check if the pwr-mlxbf driver is loaded
  * Check that the oob_net0 interface is up and operational
  * Do reset and powercycle and check the oob_net0 interface again
  * Test power GPIO interrupt on BF3.
  
  [Regression Potential]
  
  * The Mellanox drivers could fail to be loaded.
  * The mlxbf-gige PHY interrupt or pwr-mlxbf interrupt could fail.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2012743


[Kernel-packages] [Bug 2012743] [NEW] Support Nvidia BlueField-3 GPIO driver and pin controller

2023-03-24 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

Support the BlueField-3 SoC GPIO driver for handling interrupts and providing 
the option to change the direction and value of a GPIO.
Support the BlueField-3 SoC pin controller driver for allowing a select number 
of GPIO pins to be manipulated from userspace or the kernel.

All these changes have been accepted for upstream, but most of them are not yet
in the tree/branches, so we need to submit them as SAUCE.

PLEASE NOTE: This change is dependent on changes done in the ACPI
tables. So the UEFI image needs to be updated accordingly.

[Fix]

* Add support for the BlueField-3 SoC GPIO driver.
This driver configures and handles GPIO interrupts. It also enables a user to 
manipulate certain GPIO pins via libgpiod tools or other kernel drivers.
The usable pins are defined via the "gpio-reserved-ranges" property.

* NVIDIA BlueField-3 SoC has a few pins that can be used as GPIOs or
take the default hardware functionality. Add a driver for the pin
muxing.

* The following gpiolib commits are bug fixes and are required for the gpio 
driver to work:
443a0a0f0cf4f432c7af6654b7f2f920d411d379

Although the following have been accepted by maintainers, they are not present
in any tree/branch yet, so these will be pushed as SAUCE for now:
[PATCH v4] gpio: mmio: handle "ngpios" properly in bgpio_init()
[PATCH v1] gpio: mmio: fix calculation of bgpio_bits

[Test Case]

* Check if the gpio-mlxbf3 driver is loaded
* Check if the pinctrl-mlxbf3 driver is loaded
* Check if the mlxbf-gige driver is loaded
* Check if the pwr-mlxbf driver is loaded
* Check that the oob_net0 interface is up and operational
* Do reset and powercycle and check the oob_net0 interface again
* Test power GPIO interrupt on BF3.

[Regression Potential]

* The Mellanox drivers could fail to be loaded.
* The mlxbf-gige PHY interrupt or pwr-mlxbf interrupt could fail.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2012743

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2012743/+subscriptions




[Kernel-packages] [Bug 2008833] [NEW] mlxbf-gige: Fix intermittent no ip issue

2023-02-28 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

Although the link is up, and the PHY interrupt is cleared, there is no
ip assigned. Nothing is being transmitted, and nothing is received. The
RX error count keeps on increasing (check ifconfig oob_net0). After
several minutes, the RX error count stagnates and the oob finally gets
an ip and is pingable.

[Fix]

The issue is in the mlxbf_gige_rx_init function. As soon as the RX DMA is
enabled, the RX CI reaches the maximum of 128 and becomes equal to the RX PI.
The RX CI doesn't decrease since the code hasn't run phy_start yet. The solution
is to move the rx init after phy_start.
This fix applies to both 5.4 and 5.15.

[Test Case]

* Check if the gige driver is loaded
* Check that the oob_net0 interface is up and pingable from an external host
* Do ~1000 resets and powercycles and check the oob_net0 interface again

[Regression Potential]

* No known regressions.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2008833

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2008833/+subscriptions




[Kernel-packages] [Bug 2007581] [NEW] gpio: Restrict usage of GPIO chip irq members before initialization

2023-02-16 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

GPIO chip irq members are exposed before they could be completely
initialized and this leads to race conditions.

One such issue was observed for the gc->irq.domain variable, which
was accessed through the pwr-mlxbf.c driver in gpiochip_to_irq() before
it could be initialized by gpiochip_add_irqchip(). This resulted in
a kernel NULL pointer dereference. This is a well-known issue in the Linux
community and was fixed via 2 commits:
5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320
and
06fb4ecfeac7e00d6704fa5ed19299f2fefb3cc9 (since the previous commit caused a
regression)

This race condition is intermittent and hard to reproduce.

[Fix]

* Cherry-pick 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320 to fix the bug in question
* Cherry-pick 06fb4ecfeac7e00d6704fa5ed19299f2fefb3cc9 to fix a regression
introduced by the previous commit
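
The general shape of this kind of fix can be shown with a toy model. This is not the kernel code. It is a sketch that assumes the race is closed by publishing an "initialized" flag only after the irq members are set up, and by having early callers back off and retry later instead of dereferencing a missing domain. All names here (`ToyGpioChip`, `ProbeDefer`, the irq numbering) are hypothetical.

```python
class ProbeDefer(Exception):
    """Stands in for a 'try again later' result returned to an early caller."""


class ToyGpioChip:
    def __init__(self) -> None:
        self.irq_domain = None        # not usable until add_irqchip() has run
        self.irq_initialized = False  # published last, after the domain exists

    def add_irqchip(self) -> None:
        """Set up the irq members, then mark them as safe to use."""
        self.irq_domain = {}          # maps GPIO offset -> irq number
        self.irq_initialized = True

    def to_irq(self, offset: int) -> int:
        """Map a GPIO offset to an irq number, refusing to run before init."""
        if not self.irq_initialized:
            raise ProbeDefer("irq members not initialized yet")
        return self.irq_domain.setdefault(offset, 100 + offset)
```

A consumer such as a dependent driver would catch the deferral and retry its setup later, rather than crash on the uninitialized domain.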

[Test Case]

* Check that the gpio-mlxbf2.c driver is loaded with no kernel panic
* Check that all drivers dependent on the gpio-mlxbf2.c driver are loaded
(mlxbf-gige and pwr-mlxbf)
* Do 5000 reboots to make sure this race condition no longer happens

[Regression Potential]

This could cause some regression with the use of gpio interrupts, so it is
important to test the dependent drivers mlxbf-gige and pwr-mlxbf. Trigger a
power reset interrupt to test pwr-mlxbf, and bring the oob_net0 interface
down and up to test mlxbf-gige.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2007581

Title:
   gpio: Restrict usage of GPIO chip irq members before initialization

Status in linux-bluefield package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2007581/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1995296] [NEW] mlx-bootctl: support icm carveout eeprom region read/write

2022-10-31 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

The BlueField-3 ICM carveout feature will enable NIC FW to bypass the SMMU
block to access DRAM memory. The amount of memory accessible by FW will be
controlled by ARM. This patch enables setting the size of the large ICM
carveout from userspace. The maximum size is 1TB, with a granularity of
128MB. The size is expressed in MB and is passed and printed in hex.
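
As a sketch of the size constraints described above, the following Python helper checks a requested size against the 1TB maximum and the 128MB granularity, and formats it in hex in units of MB. The function name and constants are hypothetical, and treating 0 as a valid size is an assumption; the sysfs attribute that would actually receive this value is not named here.

```python
ICM_GRANULARITY_MB = 128      # granularity stated in the description
ICM_MAX_MB = 1024 * 1024      # 1TB expressed in MB


def format_icm_size(size_mb):
    """Validate a carveout size in MB and return it as a hex string.

    Raises ValueError if the size is out of range or not a multiple of
    the granularity. Accepting 0 is an assumption of this sketch.
    """
    if size_mb < 0 or size_mb > ICM_MAX_MB:
        raise ValueError("size must be between 0 and 1TB (in MB)")
    if size_mb % ICM_GRANULARITY_MB != 0:
        raise ValueError("size must be a multiple of 128MB")
    return hex(size_mb)
```

For example, a 1GB request is 1024 MB, which format_icm_size(1024) renders as 0x400.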

[Fix]

* Support SMC calls to control the large ICM carveout size.

[Test Case]

* Testing is only valid on real BlueField-3 hardware.
* Set the region size from sysfs; NIC FW will then test that it has access to
  the entire region requested.

[Regression Potential]

* This code doesn't really have a negative impact on the functionality of the
  mlxbf-bootctl driver itself, but it could be a risk for the overall boot if
  not used/tested properly.
* The memory region requested is too large, and Linux cannot boot as a result.
  The memory allocated for the ICM carveout cannot be accessed by Linux.
* The ICM carveout size is not passed properly to NIC FW.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1995296

Title:
  mlx-bootctl: support icm carveout eeprom region read/write

Status in linux-bluefield package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1995296/+subscriptions




[Kernel-packages] [Bug 1994897] [NEW] Support BlueField-3 GPIO driver

2022-10-26 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

* Support GPIO driver for BlueField-3 SoCs.

[Fix]

* Support GPIO driver for BlueField-3 SoCs.
* Allow the user to alter a GPIO value when its direction is set to output
* Support configuring GPIOs as interrupts for dependent drivers such as
  pwr-mlxbf and mlxbf-gige

[Test Case]

* At the moment, this driver can only be tested internally on Palladium and
  via the ARM simulator.
* This driver is only loaded on BF3 platforms, so it has no impact on older
  platforms such as BF2 and BF1.

[Regression Potential]

* Code will need to be tested once real HW is ready (bringup)

[Other]
* This driver is in the process of being upstreamed. The code is likely to
  change depending on the feedback we receive from maintainers, but the
  upstreaming process is long and we need this code to be available before
  BlueField-3 bringup.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1994897

Title:
  Support BlueField-3 GPIO driver

Status in linux-bluefield package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1994897/+subscriptions




[Kernel-packages] [Bug 1991551] Re: i2c-mlxbf.c: Sync up driver with upstreaming

2022-10-03 Thread Asmaa Mnebhi
** Description changed:

  SRU Justification:
  
  [Impact]
  
  Several i2c-mlxbf.c patches have been upstreamed recently and are in
  linux-next at the moment. So revert all changes in both Focal (5.4) and
  Jammy (5.15) and cherry-pick all changes from linux-next master.
  
- IMPORTANT NOTE: please make sure you also load the latest UEFI (EDK2)
- firmware because there are dependencies.
+ IMPORTANT NOTE: The new linux i2c support for BF3 also REQUIRES
+ upgrading the UEFI firmware version. For BF1 and BF2 however, this
+ driver should be backward compatible with older UEFI versions.
  
  [Fix]
  
  * There is a total of 17 commits: 8 reverts, 8 cherry-picks from
  linux-next and one internal "UBUNTU: SAUCE:" to add the driver version.
  Following is the list of commits cherry-picked.
  
  * i2c: mlxbf: incorrect base address passed during io write
  From 

  
  * i2c: mlxbf: prevent stack overflow in mlxbf_i2c_smbus_start_transaction()
  From 

  
  * i2c: mlxbf: remove IRQF_ONESHOT
  From 

  
  * i2c: mlxbf: Fix frequency calculation
  From 

  
  * i2c: mlxbf: support lock mechanism
  From 

  
  * i2c: mlxbf: add multi slave functionality
  From 

  
  * i2c: mlxbf: support BlueField-3 SoC
  From 

  
  * i2c: mlxbf: remove device tree support
  From 

  
  * Also add the i2c driver version to keep track of the changes
  internally.
  
  [Test Case]
  
  * Check that the i2c driver loads without errors on BF1 and BF2 (dmesg, and 
check i2c1 sysfs is created)
  * Check that the i2c module can be removed and reloaded without errors.
  * Check that IPMB services work successfully on systems with BMC. This is the 
best way to test the i2c driver. Run ipmitool command from the BF: ipmitool mc 
info
  * Run ipmitool from the BMC as well: ipmitool -I ipmb mc info
+ * This driver should be backward compatible
+ * Make sure this driver is backward compatible with older UEFI version (only 
applicable for BF1 and BF2)
+ * Make sure this driver works for BF1, BF2 and BF3 with the latest UEFI 
version.
  
  [Regression Potential]
  
  * The i2c driver could fail to load because the UEFI firmware hasn't been 
upgraded
  * The i2c driver would fail to load due to a bug
  * IPMB code which utilises the i2c driver fails to work (ipmitool commands)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1991551

Title:
  i2c-mlxbf.c: Sync up driver with upstreaming

Status in linux-bluefield package in Ubuntu:
  New

Bug description:
  SRU Justification:

  [Impact]

  Several i2c-mlxbf.c patches have been upstreamed recently and are in
  linux-next at the moment. So revert all changes in both Focal (5.4)
  and Jammy (5.15) and cherry-pick all changes from linux-next master.

  IMPORTANT NOTE: The new linux i2c support for BF3 also REQUIRES
  upgrading the UEFI firmware version. For BF1 and BF2 however, this
  driver should be backward compatible with older UEFI versions.

  [Fix]

  * There is a total of 17 commits: 8 reverts, 8 cherry-picks from
  linux-next and one internal "UBUNTU: SAUCE:" to add the driver version.
  Following is the list of commits cherry-picked.

  * i2c: mlxbf: incorrect base address passed during io write
  From 


  * i2c: mlxbf: prevent stack overflow in mlxbf_i2c_smbus_start_transaction()
  From 


  * i2c: mlxbf: remove IRQF_ONESHOT
  From 


  * i2c: mlxbf: Fix frequency calculation
  From 


  * i2c: mlxbf: support lock mechanism
  From 

[Kernel-packages] [Bug 1991551] [NEW] i2c-mlxbf.c: Sync up driver with upstreaming

2022-10-03 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

Several i2c-mlxbf.c patches have been upstreamed recently and are in
linux-next at the moment. So revert all changes in both Focal (5.4) and
Jammy (5.15) and cherry-pick all changes from linux-next master.

IMPORTANT NOTE: please make sure you also load the latest UEFI (EDK2)
firmware because there are dependencies.

[Fix]

* There is a total of 17 commits: 8 reverts, 8 cherry-picks from
linux-next and one internal "UBUNTU: SAUCE:" to add the driver version.
Following is the list of commits cherry-picked.

* i2c: mlxbf: incorrect base address passed during io write
From

* i2c: mlxbf: prevent stack overflow in mlxbf_i2c_smbus_start_transaction()
From

* i2c: mlxbf: remove IRQF_ONESHOT
From

* i2c: mlxbf: Fix frequency calculation
From

* i2c: mlxbf: support lock mechanism
From

* i2c: mlxbf: add multi slave functionality
From

* i2c: mlxbf: support BlueField-3 SoC
From

* i2c: mlxbf: remove device tree support
From

* Also add the i2c driver version to keep track of the changes
internally.

[Test Case]

* Check that the i2c driver loads without errors on BF1 and BF2 (check dmesg,
  and check that the i2c1 sysfs entry is created)
* Check that the i2c module can be removed and reloaded without errors.
* Check that IPMB services work successfully on systems with a BMC. This is
  the best way to test the i2c driver. Run the ipmitool command from the BF:
  ipmitool mc info
* Run ipmitool from the BMC as well: ipmitool -I ipmb mc info

[Regression Potential]

* The i2c driver could fail to load because the UEFI firmware hasn't been
  upgraded
* The i2c driver could fail to load due to a bug
* IPMB code which utilises the i2c driver could fail to work (ipmitool commands)

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

** Description changed:

  SRU Justification:
  
  [Impact]
  
  Several i2c-mlxbf.c patches have been upstreamed recently and are in
  linux-next at the moment. So revert all changes in both Focal (5.4) and
  Jammy (5.15) and cherry-pick all changes from linux-next master.
  
  IMPORTANT NOTE: please make sure you also load the latest UEFI (EDK2)
  firmware because there are dependencies.
  
  [Fix]
  
+ * There is a total of 17 commits: 8 reverts, 8 cherry-picks from
+ linux-next and one internal "UBUNTU: SAUCE:" to add the driver version.
+ Following is the list of commits cherry-picked.
+ 
  * i2c: mlxbf: incorrect base address passed during io write
- From 

 
+ From 

  
  * i2c: mlxbf: prevent stack overflow in mlxbf_i2c_smbus_start_transaction()
- From 

 
+ From 

  
  * i2c: mlxbf: remove IRQF_ONESHOT
- From 

 
+ From 

  
  * i2c: mlxbf: Fix frequency calculation
- From 

 
+ From 

  
  * i2c: mlxbf: support lock mechanism
- From 

 
+ From 

  
  * i2c: mlxbf: add 

[Kernel-packages] [Bug 1980750] Re: pwr-mlxbf.c: Improve driver dependencies

2022-07-26 Thread Asmaa Mnebhi
** Description changed:

  SRU Justification:
  
  [Impact]
  
  Improve the driver dependency on the gpio driver. Move that dependency
  to the probe as instructed by maintainers. Flush if there is remaining
- work while the driver is removed.
+ work while the driver is removed. fix size for zero allocating memory.
  
  [Fix]
  
  * Instead of using SOFTDEP, return -EPROBE_DEFER if the gpio-mlxbf2.c driver 
is not loaded yet.
  * Flush work when driver is removed.
+ * fix size for zero allocating memory. 
+ * Upgrade the driver version to 1.1 due to all above changes
  
  [Test Case]
  
  * Test case only valid on Board with BMC on it:
  * Make sure driver is loaded and no errors in dmesg.
  * issue a power reset from the BMC via IPMI.
  
  [Regression Potential]
  
  Any of the test cases above could be impacted due to these changes.

** Summary changed:

- pwr-mlxbf.c: Improve driver dependencies
+ pwr-mlxbf.c: Improve driver dependencies and fix zero allocating memory size

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1980750

Title:
  pwr-mlxbf.c: Improve driver dependencies and fix zero allocating
  memory size

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Focal:
  In Progress

Bug description:
  SRU Justification:

  [Impact]

  Improve the driver dependency on the gpio driver. Move that dependency
  to the probe as instructed by maintainers. Flush any remaining work
  when the driver is removed. Fix the size used when zero-allocating memory.

  [Fix]

  * Instead of using SOFTDEP, return -EPROBE_DEFER if the gpio-mlxbf2.c
  driver is not loaded yet.
  * Flush work when the driver is removed.
  * Fix the size used when zero-allocating memory.
  * Upgrade the driver version to 1.1 because of all the changes above.
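
The effect of returning -EPROBE_DEFER can be illustrated with a small Python simulation (a sketch only, not how the kernel driver core is implemented): a driver whose dependency is not bound yet is kept pending and retried after other drivers bind, so the load order stops mattering. The function name and the dictionary describing dependencies are hypothetical.

```python
def probe_order(load_order, deps):
    """Simulate deferred probing.

    load_order: driver names in the order they were loaded.
    deps: maps a driver name to the drivers it needs bound first.
    Returns (bound, pending): drivers that bound, in order, and drivers
    that are still deferred because a dependency never appeared.
    """
    bound = []
    pending = list(load_order)
    progress = True
    while pending and progress:
        progress = False
        for drv in list(pending):
            if all(dep in bound for dep in deps.get(drv, ())):
                # Probe succeeds: every dependency is already bound.
                bound.append(drv)
                pending.remove(drv)
                progress = True
            # Otherwise the probe "returns -EPROBE_DEFER" and the
            # driver stays pending until the next pass.
    return bound, pending
```

With deps = {"pwr-mlxbf": ["gpio-mlxbf2"]}, loading pwr-mlxbf before gpio-mlxbf2 still ends with both bound, gpio-mlxbf2 first.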

  [Test Case]

  * The test case is only valid on a board with a BMC:
  * Make sure the driver is loaded and there are no errors in dmesg.
  * Issue a power reset from the BMC via IPMI.

  [Regression Potential]

  Any of the test cases above could be impacted due to these changes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1980750/+subscriptions




[Kernel-packages] [Bug 1982357] [NEW] i2c-mlxbf.c: fix wrong variable name

2022-07-20 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

We are using the wrong variable name priv->smbus->io instead of
priv->mst_cause->io. This could result in unexpected i2c behavior.

[Fix]

* replace priv->smbus->io with priv->mst_cause->io

[Test Case]

* Make sure the i2c-mlxbf.c driver is loaded and /dev/i2c-1 is created
* Check that ipmitool works from the BF to the BMC and from the BMC to the BF
  (this only applies to boards with a BMC)
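
The two checks above could be scripted roughly as follows. This is a minimal sketch: the helper names are hypothetical, and the default "mc info" arguments are borrowed from the ipmitool commands quoted in the Bug 1991551 test case rather than from this report.

```python
import os
import shutil
import subprocess


def i2c_node_present(path="/dev/i2c-1"):
    """Return True if the i2c device node exists."""
    return os.path.exists(path)


def ipmitool_ok(args=("mc", "info")):
    """Run ipmitool with the given arguments.

    Returns True or False depending on the exit status, or None if
    ipmitool is not installed on this machine.
    """
    exe = shutil.which("ipmitool")
    if exe is None:
        return None
    result = subprocess.run([exe, *args], capture_output=True, text=True)
    return result.returncode == 0
```

On the BF side one would call i2c_node_present() and ipmitool_ok(); from the BMC, ipmitool_ok(("-I", "ipmb", "mc", "info")) mirrors the other direction.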

[Regression Potential]

Any of the test cases above could be impacted due to these changes.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1982357

Title:
  i2c-mlxbf.c: fix wrong variable name

Status in linux-bluefield package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1982357/+subscriptions




[Kernel-packages] [Bug 1982225] [NEW] i2c-mlxbf.c: replace ioremap_cache with ioremap

2022-07-19 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

In later Linux kernel versions ioremap_cache is deprecated, so replace
it with ioremap.

[Fix]

* replace ioremap_nocache with ioremap

[Test Case]

* Make sure the i2c-mlxbf.c driver is loaded and /dev/i2c-1 is created
* Check that ipmitool works from the BF to the BMC and from the BMC to the BF
  (this only applies to boards with a BMC)

[Regression Potential]

Any of the test cases above could be impacted due to these changes.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1982225

Title:
  i2c-mlxbf.c:  replace ioremap_cache with ioremap

Status in linux-bluefield package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1982225/+subscriptions




[Kernel-packages] [Bug 1981105] Re: i2c-mlxbf.c: support lock mechanism

2022-07-19 Thread Asmaa Mnebhi
** Description changed:

  SRU Justification:
  
  [Impact]
  
- Support the I2C lock mechanism, otherwise there could be unexpected
- behavior when an i2c bus is accessed by several entities like the linux
- driver, ATF driver and UEFI driver. Replace ioremap_cache with ioremap
- since it is deprecated in later kernels.
+ Support the I2C lock mechanism, otherwise there could be unexpected behavior 
when an i2c bus is accessed by several entities like the linux driver, ATF 
driver and UEFI driver. Make sure to pick up the ATF/UEFI image to accompany 
this change
+ because at boot time ATF will ensure that the lock is released.
  
  [Fix]
  
  * Support lock and unlock
  * replace ioremap_nocache with ioremap
  
  [Test Case]
  
  * Make sure the i2c-mlxbf.c driver is loaded and /dev/i2c-1 is created
  * check that ipmitool from the BF->BMC and from the BMC->BF work (this only 
applies on boards with a BMC of course)
  
  [Regression Potential]
  
  Any of the test cases above could be impacted due to these changes.
+ Make sure you load the latest ATF/UEFI image to accompany this change.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1981105

Title:
  i2c-mlxbf.c: support lock mechanism

Status in linux-bluefield package in Ubuntu:
  New

Bug description:
  SRU Justification:

  [Impact]

  Support the I2C lock mechanism; otherwise there could be unexpected
  behavior when an i2c bus is accessed by several entities such as the Linux
  driver, the ATF driver and the UEFI driver. Make sure to pick up the
  ATF/UEFI image that accompanies this change, because at boot time ATF
  ensures that the lock is released.

  [Fix]

  * Support lock and unlock
  * replace ioremap_nocache with ioremap

  [Test Case]

  * Make sure the i2c-mlxbf.c driver is loaded and /dev/i2c-1 is created
  * Check that ipmitool works from the BF to the BMC and from the BMC to the
  BF (this only applies to boards with a BMC)

  [Regression Potential]

  Any of the test cases above could be impacted due to these changes.
  Make sure you load the latest ATF/UEFI image to accompany this change.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1981105/+subscriptions




[Kernel-packages] [Bug 1981105] Re: i2c-mlxbf.c: support lock mechanism

2022-07-08 Thread Asmaa Mnebhi
** Description changed:

  SRU Justification:
  
  [Impact]
  
  Support the I2C lock mechanism, otherwise there could be unexpected
- behavior when the i2c driver is accessed by several entities like the
- linux driver, ATF driver and UEFI driver.
+ behavior when an i2c bus is accessed by several entities like the linux
+ driver, ATF driver and UEFI driver. Replace ioremap_cache with ioremap
+ since it is deprecated in later kernels.
  
  [Fix]
  
  * Support lock and unlock
  * replace ioremap_nocache with ioremap
  
  [Test Case]
  
  * Make sure the i2c-mlxbf.c driver is loaded and /dev/i2c-1 is created
  * check that ipmitool from the BF->BMC and from the BMC->BF work (this only 
applies on boards with a BMC of course)
  
  [Regression Potential]
  
  Any of the test cases above could be impacted due to these changes.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1981105

Title:
  i2c-mlxbf.c: support lock mechanism

Status in linux-bluefield package in Ubuntu:
  New

Bug description:
  SRU Justification:

  [Impact]

  Support the I2C lock mechanism, otherwise there could be unexpected
  behavior when an i2c bus is accessed by several entities like the
  linux driver, ATF driver and UEFI driver. Replace ioremap_cache with
  ioremap since it is deprecated in later kernels.

  [Fix]

  * Support lock and unlock
  * replace ioremap_nocache with ioremap

  [Test Case]

  * Make sure the i2c-mlxbf.c driver is loaded and /dev/i2c-1 is created
  * Check that ipmitool works from the BF to the BMC and from the BMC to the
  BF (this only applies to boards with a BMC)

  [Regression Potential]

  Any of the test cases above could be impacted due to these changes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1981105/+subscriptions




[Kernel-packages] [Bug 1981105] [NEW] i2c-mlxbf.c: support lock mechanism

2022-07-08 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

Support the I2C lock mechanism, otherwise there could be unexpected
behavior when the i2c driver is accessed by several entities like the
linux driver, ATF driver and UEFI driver.

[Fix]

* Support lock and unlock
* replace ioremap_nocache with ioremap

[Test Case]

* Make sure the i2c-mlxbf.c driver is loaded and /dev/i2c-1 is created
* Check that ipmitool works from the BF to the BMC and from the BMC to the BF
  (this only applies to boards with a BMC)

[Regression Potential]

Any of the test cases above could be impacted due to these changes.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1981105

Title:
  i2c-mlxbf.c: support lock mechanism

Status in linux-bluefield package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1981105/+subscriptions




[Kernel-packages] [Bug 1969233] [NEW] mlxbf-gige: sync up with upstreamed version

2022-04-15 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

At the moment, the mlxbf-gige driver is broken in the Ubuntu image because it
is out of date. This change in the gpio-mlxbf2.c driver broke it:
https://code.launchpad.net/~asmaam/ubuntu/+source/linux-bluefield/+git/version-seeds/+merge/417771
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1965017

We need to pull the corresponding upstreamed mlxbf-gige driver changes
because there is now a dependency on the gpio-mlxbf2.c driver.

[Fix]

* Cherry-pick the following change from the master branch:
  6c2a6ddca763271fa583e22bce10c2805c1ea9f6
* I also took this chance to pull the following change from upstream, since
  it was marked as SAUCE:
  ee8a9600b5391f434905c46bec7f77d34505083e

[Test Case]

* Check if oob_net0 interface is up
* Test that the oob interface is pingable and can copy files.
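
The first check can be automated by reading the interface's operstate attribute from sysfs, as in this minimal Python sketch. The helper names are hypothetical, and the sysfs_root parameter exists only so the function can be pointed at a test directory. It does not cover the ping or file-copy test.

```python
from pathlib import Path


def read_operstate(ifname, sysfs_root="/sys/class/net"):
    """Return the operstate string for an interface, or None if the
    interface (or its operstate attribute) cannot be read."""
    path = Path(sysfs_root) / ifname / "operstate"
    try:
        return path.read_text().strip()
    except OSError:
        return None


def interface_is_up(ifname="oob_net0", sysfs_root="/sys/class/net"):
    """Return True only if the interface reports operstate 'up'."""
    return read_operstate(ifname, sysfs_root) == "up"
```

Calling interface_is_up() after each reboot or driver reload gives a quick pass/fail signal before moving on to the connectivity tests.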

[Regression Potential]

Every time I port upstreamed changes to the Canonical branch, I have to
revert a bunch of changes (that are not upstream) and put them back after
cherry-picking. A potential regression is that some of these changes end up
missing or non-functional, so we need to test them all.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1969233

Title:
  mlxbf-gige: sync up with upstreamed version

Status in linux-bluefield package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1969233/+subscriptions




[Kernel-packages] [Bug 1965017] [NEW] Sync up gpio interrupt handling with upstreamed version

2022-03-15 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

Although the gpio-mlxbf2.c driver has already been upstreamed, it did not
include GPIO interrupt handling. We recently upstreamed the latter and, as
requested by maintainers, moved all GPIO interrupt code from the mlxbf-gige
driver to gpio-mlxbf2.c to conform to Linux standards.
IMPORTANT: during testing, make sure the latest UEFI (bootloader) is loaded
on top of these changes; otherwise both the gpio driver and the mlxbf-gige
driver will fail to load.

[Fix]

* Reverted 6 commits related to the gpio-mlxbf2 driver so that the Canonical
  changes are in sync with what is upstreamed.
* Cherry-picked 11 gpio-mlxbf2.c commits from the Linux master branch. 10
  commits are minor commits added by other maintainers. The 11th
  cherry-picked commit is the one adding proper interrupt support in the
  gpio-mlxbf2.c driver. The upstreamed version of the GPIO driver added back
  the dependency between the mlxbf-gige driver and the gpio-mlxbf2 driver.
* Added one commit on top of the above to add the driver version and fix the
  SPDX licence identifier.
* Updated the UEFI ACPI table to reflect the above changes (so the bootloader
  and the Linux drivers need to be in sync).
* Following the generic way of writing a Linux gpio driver, we created a
  separate driver for power handling (low power and reboot) called
  pwr-mlxbf.c. This driver has not been upstreamed yet but will take care of
  the GPIO7 software reset and the OCP3.0 GPIO low power mode. Previously all
  this code was integrated within the gpio-mlxbf2.c driver.

[Test Case]

* oob_net0 coming up after several SW_RESET or reboot
* oob_net0 coming up after several powercycles
* oob_net0 coming up after pushing a new Ubuntu/CentOS/Yocto
* rmmod/modprobe mlxbf_gige several times
* Test that GPIO7 reset still works on BlueSphere-like boards


[Regression Potential]

Any of the test cases above could be impacted due to these changes.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1965017

Title:
  Sync up gpio interrupt handling with upstreamed version

Status in linux-bluefield package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1965017/+subscriptions




[Kernel-packages] [Bug 1964984] [NEW] Fix OOB handling RX packets in heavy traffic

2022-03-15 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

This is reproducible on systems which already have heavy background
traffic. On top of that, the user issues one of the two docker pulls below:
docker pull nvcr.io/ea-doca-hbn/hbn/hbn:latest
OR
docker pull gitlab-master.nvidia.com:5005/dl/dgx/tritonserver:22.02-py3-qa

The second one is a very large container (17 GB).

When the docker pull runs, the OOB interface stops responding to ping, and
the docker pull either stalls for a very long time (more than 3 minutes) or
times out.

[Fix]

* Update the RX_CQE_CI before updating the RX_PI to avoid a race condition
in which we wrongly inform HW that there is space for the WQE.
* Disable the RX DMA while we are handling incoming packets to avoid overflow.
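
The ordering in the first bullet can be illustrated with a toy model. The sketch below is not the mlxbf-gige driver code and does not claim to reproduce the hardware's exact behavior: the ring depth, the `ToyRxRing` class, and the rule the device uses to decide whether a slot is free are all assumptions made for illustration. It only shows why publishing RX_PI before RX_CQE_CI can let a device write into a slot that software has not finished with.

```python
from dataclasses import dataclass

RING_SIZE = 4  # hypothetical ring depth, chosen only for the example


@dataclass
class ToyRxRing:
    """Toy model of an RX ring shared by software and a device (assumed semantics)."""
    rx_pi: int = RING_SIZE   # buffers software has posted, as seen by the device
    rx_cqe_ci: int = 0       # completions software has consumed, as seen by the device
    hw_written: int = 0      # packets the device has written so far

    def device_try_write(self) -> str:
        """Device receives a packet; returns 'no-space', 'ok' or 'clobber'."""
        if self.hw_written >= self.rx_pi:
            return "no-space"            # device believes no buffer is posted
        # The slot is still in use if software has not consumed its completion.
        busy = self.hw_written - self.rx_cqe_ci >= RING_SIZE
        self.hw_written += 1
        return "clobber" if busy else "ok"


def handle_one_packet(ring: ToyRxRing, ci_first: bool) -> str:
    """Software consumes one packet; a device write races between the two updates."""
    if ci_first:
        ring.rx_cqe_ci += 1              # publish the consumed completion first
        raced = ring.device_try_write()
        ring.rx_pi += 1                  # then advertise the replenished buffer
    else:
        ring.rx_pi += 1                  # advertise space first (the racy order)
        raced = ring.device_try_write()
        ring.rx_cqe_ci += 1
    return raced


def full_ring() -> ToyRxRing:
    """A ring in which the device has already filled every posted buffer."""
    return ToyRxRing(rx_pi=RING_SIZE, rx_cqe_ci=0, hw_written=RING_SIZE)
```

In this model, `handle_one_packet(full_ring(), ci_first=False)` reports `clobber`, while `handle_one_packet(full_ring(), ci_first=True)` reports `no-space`: with the consumer index published first, the device never sees space that software has not actually freed.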

[Test Case]

* Created a script which loops 200 times and does a docker pull in each loop:
docker pull nvcr.io/ea-doca-hbn/hbn/hbn:latest
OR
docker pull gitlab-master.nvidia.com:5005/dl/dgx/tritonserver:22.02-py3-qa
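
The loop described above could be scripted roughly as follows. This is a sketch, not the script that was actually used: the function name `pull_loop`, the injectable `runner` parameter, and the choice to count non-zero exit codes are assumptions. The image name and the 200 iterations come from the test case.

```python
import subprocess
from typing import Callable, Optional, Sequence

IMAGE = "nvcr.io/ea-doca-hbn/hbn/hbn:latest"  # one of the two images named above


def pull_loop(image: str, iterations: int = 200,
              runner: Optional[Callable[[Sequence[str]], int]] = None) -> int:
    """Run `docker pull` repeatedly and return the number of failed pulls."""
    if runner is None:
        # Default runner: invoke the docker CLI and report its exit code.
        def runner(cmd: Sequence[str]) -> int:
            return subprocess.run(list(cmd), check=False).returncode
    failures = 0
    for i in range(iterations):
        rc = runner(["docker", "pull", image])
        if rc != 0:
            failures += 1
            print(f"iteration {i}: docker pull exited with {rc}")
    return failures
```

Calling `pull_loop(IMAGE)` runs the 200 pulls and returns the failure count. Because docker caches pulled layers, a real script would probably need to remove the image between iterations to generate traffic every time; the bug report does not say how this was handled.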

[Regression Potential]

* This could result in slower packet handling since we are disabling/enabling
the DMA periodically.
* Although this fix has been tested by the people who opened the bug, QA needs
to test it thoroughly to make sure the issue is no longer reproducible.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1964984

Title:
  Fix OOB handling RX packets in heavy traffic

Status in linux-bluefield package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1964984/+subscriptions




[Kernel-packages] [Bug 1934923] Re: Sync up mlxbf-gige driver with upstreamed version

2021-07-09 Thread Asmaa Mnebhi
** Description changed:

  SRU Justification:
  
  [Impact]
  
  The mlxbf-gige driver has just been upstreamed so linux-bluefield needs to be 
synced up with what we have upstreamed.
- IMPORTANT: during testing phase, make sure the latest UEFI (bootloader) is 
loaded on top of these changes.
+ IMPORTANT: during testing, make sure the latest UEFI (bootloader) is loaded 
on top of these changes, otherwise both the gpio driver and mlxbf-gige driver 
will fail to load.
  
  [Fix]
  
- * Cleaned up the gige driver as instructed by maintainers
- * removed dependency between the mlxbf-gige driver and gpio-mlxbf2 driver
- * updated the UEFI ACPI table to reflect the above changes
+ * reverted 20 commits related to the mlxbf-gige driver and 1 commit related 
to gpio-mlxbf2 driver since there are dependencies between them.
+ * Cherry-picked f92e1869d74e1acc6551256eb084a1c14a054e19 from net-next 
branch. The upstreamed version of the GPIO driver removed the dependency 
between the mlxbf-gige driver and gpio-mlxbf2 driver.
+ * added code that was left out of the upstreamed version, and re-added code 
that got reverted in gpio-mlxbf2.c
+ * updated the UEFI ACPI table to reflect the above changes (so the bootloader 
and the linux drivers need to be in sync)
  
  [Test Case]
  
  * oob_net0 coming up after several SW_RESET or reboot
  * oob_net0 coming up after several powercycles
  * oob_net0 coming up after pushing a new Ubuntu/CentOS/Yocto
  * rmmod/modprobe mlxbf_gige several times
  * OOB PXE boot multiple times from UEFI menu
  * automate OOB PXE boot and do reboot
  * automate OOB PXE boot and do powercycle
  * Test that GPIO7 reset still works on BlueSphere-like boards
  
  [Regression Potential]
  
- Any of the test cases above could be impacted due to these new changes.
+ Any of the test cases above could be impacted due to these changes.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1934923

Title:
  Sync up mlxbf-gige driver with upstreamed version

Status in linux-bluefield package in Ubuntu:
  New

Bug description:
  SRU Justification:

  [Impact]

  The mlxbf-gige driver has just been upstreamed so linux-bluefield needs to be 
synced up with what we have upstreamed.
  IMPORTANT: during testing, make sure the latest UEFI (bootloader) is loaded 
on top of these changes, otherwise both the gpio driver and mlxbf-gige driver 
will fail to load.

  [Fix]

  * reverted 20 commits related to the mlxbf-gige driver and 1 commit related 
to gpio-mlxbf2 driver since there are dependencies between them.
  * Cherry-picked f92e1869d74e1acc6551256eb084a1c14a054e19 from net-next 
branch. The upstreamed version of the GPIO driver removed the dependency 
between the mlxbf-gige driver and gpio-mlxbf2 driver.
  * added code that was left out of the upstreamed version, and re-added code 
that got reverted in gpio-mlxbf2.c
  * updated the UEFI ACPI table to reflect the above changes (so the bootloader 
and the linux drivers need to be in sync)

  [Test Case]

  * oob_net0 coming up after several SW_RESET or reboot
  * oob_net0 coming up after several powercycles
  * oob_net0 coming up after pushing a new Ubuntu/CentOS/Yocto
  * rmmod/modprobe mlxbf_gige several times
  * OOB PXE boot multiple times from UEFI menu
  * automate OOB PXE boot and do reboot
  * automate OOB PXE boot and do powercycle
  * Test that GPIO7 reset still works on BlueSphere-like boards

  [Regression Potential]

  Any of the test cases above could be impacted due to these changes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1934923/+subscriptions



[Kernel-packages] [Bug 1934923] Re: Sync up mlxbf-gige driver with upstreamed version

2021-07-07 Thread Asmaa Mnebhi
** Description changed:

  SRU Justification:
  
  [Impact]
  
- The mlxbf-gige driver has just been upstreamed so linux-bluefield needs
- to be synced up with what we have upstreamed.
+ The mlxbf-gige driver has just been upstreamed so linux-bluefield needs to be 
synced up with what we have upstreamed.
+ IMPORTANT: during testing phase, make sure the latest UEFI (bootloader) is 
loaded on top of these changes.
  
  [Fix]
  
  * Cleaned up the gige driver as instructed by maintainers
  * removed dependency between the mlxbf-gige driver and gpio-mlxbf2 driver
  * updated the UEFI ACPI table to reflect the above changes
  
  [Test Case]
  
  * oob_net0 coming up after several SW_RESET or reboot
  * oob_net0 coming up after several powercycles
  * oob_net0 coming up after pushing a new Ubuntu/CentOS/Yocto
  * rmmod/modprobe mlxbf_gige several times
  * OOB PXE boot multiple times from UEFI menu
  * automate OOB PXE boot and do reboot
  * automate OOB PXE boot and do powercycle
  * Test that GPIO7 reset still works on BlueSphere-like boards
  
  [Regression Potential]
  
  Any of the test cases above could be impacted due to these new changes.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1934923

Title:
  Sync up mlxbf-gige driver with upstreamed version

Status in linux-bluefield package in Ubuntu:
  New

Bug description:
  SRU Justification:

  [Impact]

  The mlxbf-gige driver has just been upstreamed so linux-bluefield needs to be 
synced up with what we have upstreamed.
  IMPORTANT: during testing phase, make sure the latest UEFI (bootloader) is 
loaded on top of these changes.

  [Fix]

  * Cleaned up the gige driver as instructed by maintainers
  * removed dependency between the mlxbf-gige driver and gpio-mlxbf2 driver
  * updated the UEFI ACPI table to reflect the above changes

  [Test Case]

  * oob_net0 coming up after several SW_RESET or reboot
  * oob_net0 coming up after several powercycles
  * oob_net0 coming up after pushing a new Ubuntu/CentOS/Yocto
  * rmmod/modprobe mlxbf_gige several times
  * OOB PXE boot multiple times from UEFI menu
  * automate OOB PXE boot and do reboot
  * automate OOB PXE boot and do powercycle
  * Test that GPIO7 reset still works on BlueSphere-like boards

  [Regression Potential]

  Any of the test cases above could be impacted due to these new
  changes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1934923/+subscriptions



[Kernel-packages] [Bug 1934923] [NEW] Sync up mlxbf-gige driver with upstreamed version

2021-07-07 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

The mlxbf-gige driver has just been upstreamed so linux-bluefield needs
to be synced up with what we have upstreamed.

[Fix]

* Cleaned up the gige driver as instructed by maintainers
* removed dependency between the mlxbf-gige driver and gpio-mlxbf2 driver
* updated the UEFI ACPI table to reflect the above changes

[Test Case]

* oob_net0 coming up after several SW_RESET or reboot
* oob_net0 coming up after several powercycles
* oob_net0 coming up after pushing a new Ubuntu/CentOS/Yocto
* rmmod/modprobe mlxbf_gige several times
* OOB PXE boot multiple times from UEFI menu
* automate OOB PXE boot and do reboot
* automate OOB PXE boot and do powercycle
* Test that GPIO7 reset still works on BlueSphere-like boards

[Regression Potential]

Any of the test cases above could be impacted due to these new changes.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1934923

Title:
  Sync up mlxbf-gige driver with upstreamed version

Status in linux-bluefield package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1934923/+subscriptions



[Kernel-packages] [Bug 1934304] [NEW] i2c-mlxbf.c: prevent stack overflow in mlxbf_i2c_smbus_start_transaction

2021-07-01 Thread Asmaa Mnebhi
Public bug reported:

SRU Justification:

[Impact]

There could be a stack overflow in mlxbf_i2c_smbus_start_transaction().
memcpy() is called in a loop while the upper bound of 'operation->length' is
not checked and 'data_idx' keeps incrementing.

More details:
The operation length is verified by the caller functions, so it cannot exceed
I2C_SMBUS_BLOCK_MAX bytes (32 bytes) for each operation that is part of the
write. The data_desc array is 128 bytes in size. So a request which
consists of 4 writes of 32 bytes each can potentially trigger an off-by-one or
off-by-two overflow: the first byte of data_desc is used by addr, which
effectively decreases the available data_desc buffer size by one. Functions
like mlx_smbus_i2c_block_func() that prepare the request also set the length
of the first write operation to one and store the command id there, so the
space left in data_desc decreases by one more byte, making it two bytes less
than expected.

[Fix]

* Add a check on 'operation->length' and 'data_idx' and return an error if
the upper bound is reached.
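
The arithmetic in the details above, and the kind of check the fix describes, can be modeled in a few lines. This is a Python sketch for illustration, not the C code of i2c-mlxbf.c: the names `overflow_bytes` and `pack_writes` are hypothetical, and the off-by-two scenario used below (a one-byte command followed by four 32-byte writes) is an assumption about how that case arises. The sizes 128 and 32 come from the description.

```python
from typing import Iterable, List

DATA_DESC_SIZE = 128      # size of the data_desc buffer, per the description
I2C_SMBUS_BLOCK_MAX = 32  # per-operation length limit, per the description


def overflow_bytes(lengths: Iterable[int]) -> int:
    """Bytes written past data_desc when no bound is checked (0 if it fits)."""
    needed = 1 + sum(lengths)  # one byte is taken by the address
    return max(0, needed - DATA_DESC_SIZE)


def pack_writes(addr: int, writes: List[bytes]) -> bytes:
    """Copy write payloads after the address byte, refusing to overrun the buffer."""
    data_desc = bytearray(DATA_DESC_SIZE)
    data_desc[0] = addr & 0xFF
    data_idx = 1
    for payload in writes:
        if len(payload) > I2C_SMBUS_BLOCK_MAX:
            raise ValueError("operation exceeds I2C_SMBUS_BLOCK_MAX")
        # The check the fix describes: stop before data_idx passes the end.
        if data_idx + len(payload) > DATA_DESC_SIZE:
            raise ValueError("request does not fit in data_desc")
        data_desc[data_idx:data_idx + len(payload)] = payload
        data_idx += len(payload)
    return bytes(data_desc[:data_idx])
```

Here `overflow_bytes([32, 32, 32, 32])` is 1 and `overflow_bytes([1, 32, 32, 32, 32])` is 2, matching the off-by-one and off-by-two cases, and `pack_writes` raises an error for either request instead of copying past the end of the buffer.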

[Test Case]

* Test the i2c-mlxbf.c driver using IPMB functionality.

[Regression Potential]

This fix returns a negative value to indicate that a transaction has
failed, so it will catch more transaction failures.

** Affects: linux-bluefield (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/1934304

Title:
  i2c-mlxbf.c: prevent stack overflow in
  mlxbf_i2c_smbus_start_transaction

Status in linux-bluefield package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1934304/+subscriptions
