https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=273263

--- Comment #5 from Joerg Pulz <joerg.p...@frm2.tum.de> ---
I have more or less solved this, or at least found a way to work around the
issue.
I also now know why all of this happened in the first place.

Some background:
The 27xx and 28xx based controllers feature two firmware images in flash, a
primary and a secondary one.
While the 27xx based controllers still share some amount of information between
both firmware images (e.g. NVRAM, NVMe parameters, ...) the 28xx based
controllers even have separate areas for those.
In penguin land the driver knows and handles this correctly as firmware loading
there is somewhat different compared to ours.
We just send an MBOX command to get the controller to do some magic stuff to
load the firmware from flash without ever thinking about primary/secondary
firmware images. This works quite well if either
  - primary/secondary firmware versions in flash are equal
or 
  - the primary firmware image is set as active.
The penguin driver is very different in this area as it checks for valid
primary/secondary firmware images and detects which one is the active one.
Firmware loading is done by reading the active firmware directly from flash and
writing it into the controller's RAM. Depending on the active firmware the
correct NVRAM (and other) areas are used too.
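
The selection logic described above can be sketched roughly as follows. This
is a minimal Python model, not the actual Linux driver code: the FlashImage
type, its fields, the nvram_area labels and the fallback to any valid image
when none is marked active are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlashImage:
    """Hypothetical model of one firmware image slot in controller flash."""
    name: str                # "primary" or "secondary"
    version: Optional[str]   # None models an empty slot
    active: bool             # slot is marked active in flash
    nvram_area: str          # per-image parameter area (separate on 28xx)

    @property
    def valid(self) -> bool:
        # Assumption: a slot counts as valid if it holds any firmware at all.
        return self.version is not None


def select_image(primary: FlashImage, secondary: FlashImage) -> FlashImage:
    """Return the valid slot marked active, else the first valid slot."""
    candidates = [img for img in (primary, secondary) if img.valid]
    if not candidates:
        raise ValueError("no valid firmware image in flash")
    for img in candidates:
        if img.active:
            return img
    # Assumption: fall back to the first valid slot if none is marked active.
    return candidates[0]


# The failing controller from this PR: 8.8.231 primary, 9.8.2 secondary (active).
primary = FlashImage("primary", "8.8.231", active=False, nvram_area="nvram-primary")
secondary = FlashImage("secondary", "9.8.2", active=True, nvram_area="nvram-secondary")
chosen = select_image(primary, secondary)
print(chosen.name, chosen.version, chosen.nvram_area)
```

The point of the sketch is that the image and its parameter area are chosen
together, so the loaded firmware and the NVRAM area in use always belong to
the same slot.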

With all this in mind, back to the original PR:

Arne Steinkamm kindly provided two controllers out of a random system to me.
The first one had RISC firmware 8.8.231 in the primary and no firmware at all
in the secondary flash area (verified with the vendor's firmware flash utility).
This controller was detected and initialized without problems.
The second controller was shown with firmware 9.8.2 in UEFI but our driver
reported 8.8.231 and initialization failed - as reported here.
After looking at this controller with the vendor's firmware flash utility
(tried the EFI and the Linux one) it turned out that there was firmware 8.8.231
in the primary and 9.8.2 in the secondary area. The secondary area was marked
active.
Both Lenovo branded controllers were sold in this state, so it seems one can't
trust that newly sold controllers work out of the box.
I got the most recent firmware update package from Marvell/QLogic only to find
out that the Lenovo branded controllers can't be flashed with this one.
Luckily I was able to find a Lenovo firmware update package for Windows that
contained the firmware as a separate file usable with the vendor's EFI and
Linux flash utilities. This was exactly the RISC firmware version 9.8.2 that
was already in the second controller's secondary firmware area.
After flashing the card, it showed RISC firmware version 9.8.2 in the primary
area and the primary area activated.
Starting FreeBSD the driver reported firmware 9.8.2 and initialization was
successful.
Just for testing, I flashed the firmware again, which resulted in both firmware
regions containing the same version but with the secondary area active.
In this state the driver again reported firmware 9.8.2 and initialization was
successful.
For verification I took the first controller, which had no secondary firmware
at all, and flashed it, resulting in primary firmware 8.8.231, secondary
firmware 9.8.2, and the secondary firmware active.
In this state I instantly got the same initialization error as with the
second card before, as reported here.
After flashing this controller a second time (primary and secondary at version
9.8.2 with primary active) everything was fine and working again.
As a side note: I found no way to change the active firmware image without
flashing (neither the EFI nor the Linux flash utility provides an option for
this). This puts the whole concept of two firmware images in question.

I provided all of this, together with the required firmware and flash utility
files and some documentation, to Arne Steinkamm. With those he was able to get
another machine with previously failing controllers to work correctly.

Final advice for isp(4) with 27xx and 28xx based controllers:
To work around the driver's shortcomings in handling primary/secondary firmware
images, one has to make sure that either:
1. the primary firmware is active or
2. the primary and secondary firmware contain the same version, regardless
of which one is active.
This alone makes the controllers work, but the second option is problematic
with 28xx based ones, where NVRAM and so on are specific to the active firmware.

So to be on the really safe side, either use option one or make sure primary
and secondary firmware are the same and primary is active for the time being.
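
The advice above boils down to a small decision rule. The following Python
sketch encodes it and replays flash states observed in this PR. The function
name classify() and its result strings are made up for illustration, and the
sketch assumes the primary slot always holds firmware.

```python
from typing import Optional


def classify(primary: Optional[str], secondary: Optional[str], active: str) -> str:
    """Classify a flash state according to the isp(4) advice above."""
    if active not in ("primary", "secondary"):
        raise ValueError("active must be 'primary' or 'secondary'")
    if active == "primary":
        # Option 1: the primary image is active.
        return "safe"
    if primary is not None and primary == secondary:
        # Option 2: same version in both slots, secondary active.
        # Problematic on 28xx, where NVRAM etc. follow the active image.
        return "works, with caveat"
    return "expected to fail"


# Flash states reported in this PR: (primary, secondary, active, observed).
observed = [
    ("8.8.231", "9.8.2", "secondary", "initialization failed"),
    ("9.8.2", "9.8.2", "secondary", "initialization successful"),
    ("9.8.2", "9.8.2", "primary", "initialization successful"),
]

for prim, sec, act, result in observed:
    print(f"{prim}/{sec} active={act}: {classify(prim, sec, act)} ({result})")
```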

One can use either the Marvell/QLogic EFI (preferred) or Linux utility to
verify or achieve this. If firmware flashing is required, one can try the
vendor's native firmware or may have to use specific firmware from one's
hardware vendor (e.g. Lenovo, Dell, ...).

This whole situation should be documented in the BUGS section of the isp(4)
manpage for all supported FreeBSD versions (at least CURRENT, 14-STABLE and
13-STABLE) that support at least 27xx based controllers.
