On Tue, May 19, 2026 at 12:20 AM ron minnich <[email protected]> wrote:
> There's a reason I started the RISC-V port, a year ago.
>
> background: I wrote the first SBI after the berkeley one, in 2014. I
> put a lot of work into making it small, in every sense: space, time,
> complexity.
>
> The current standard, OpenSBI, is big in every sense: space, time, complexity.

My sense was that they started out trying to do something analogous to
DEC's "PALcode" for Alpha, but it's metastasized into something closer
to UEFI in scope. Is that basically correct?

> So I intend to show an alternate path, starting with this port.
>
> As part of main(), the kernel will do 3 things:
> 1) wipe out the opensbi firmware from RAM
> 2) replace it with a minimal, simple, version which does not even have
>     its own memory.  This version will hark back to an earlier day, when
>     all sbi did was fiddle bits in registers, at the direction of the kernel
> 3) all higher level functions will be done in the kernel or even a server

This raises the immediate question: if the _kernel_ is overwriting SBI,
then do you need SBI at all once the kernel is loaded? Is the kernel
still making use of SBI after that point?
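
For anyone who hasn't looked at the interface: per the SBI spec, the
kernel puts an extension ID in a7, a function ID in a6, arguments in
a0 onward, executes ecall, and gets an (error, value) pair back in a0
and a1. Below is a minimal sketch, in plain C so it builds anywhere,
of the "fiddle bits in registers" style of handler. It is my guess at
the general shape, not Ron's design: the function name sbicall and the
mtimecmp variable (standing in for the memory-mapped timer compare
register) are hypothetical; only the numeric constants come from the
spec.

```c
#include <stdint.h>

/* Standard values from the RISC-V SBI specification. */
enum {
	SBI_SUCCESS           = 0,
	SBI_ERR_NOT_SUPPORTED = -2,
	SBI_EXT_TIME          = 0x54494D45,	/* "TIME" extension ID */
	SBI_TIME_SET_TIMER    = 0,		/* function ID within TIME */
};

/* An SBI call returns an (error, value) pair in a0 and a1. */
struct sbiret {
	long error;
	long value;
};

/* Hypothetical stand-in for the machine timer compare register. */
static volatile uint64_t mtimecmp;

/*
 * Hypothetical M-mode handler body. On real hardware eid, fid and arg0
 * would arrive in a7, a6 and a0 after the kernel executes ecall.
 * It keeps no state of its own: it writes one register and returns.
 */
struct sbiret
sbicall(unsigned long eid, unsigned long fid, unsigned long arg0)
{
	struct sbiret r = { SBI_ERR_NOT_SUPPORTED, 0 };

	if(eid == SBI_EXT_TIME && fid == SBI_TIME_SET_TIMER){
		mtimecmp = arg0;	/* program the next timer interrupt */
		r.error = SBI_SUCCESS;
	}
	return r;
}
```

Anything the handler does not recognize is simply refused, which is
where "all higher level functions" would fall back to the kernel.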

> I am fairly sure I can do this, having done something like it before.
>
> I intend to show that risc-v does not need OpenSBI, UEFI, or ACPI.
>
> I do think there's a way out of the firmware mess we're in; I don't
> expect it to ever happen on x86 or ARM.

It can be done, and what we did at Oxide is an existence proof
(https://rfd.shared.oxide.computer/rfd/0241). Whether it can be done
_generally_ is an open question. It only works at Oxide because of the
luxurious flexibility we built ourselves by co-designing the hardware
and software.

> Even though I did do an experiment, years ago, with moving SMM to
> kernel space: 
> https://docs.google.com/presentation/d/1ECTPzrt4hT37Ef729GAXh0o9lMvx4MJrkJxN88-k9Ps/edit?usp=sharing
>
> I've fought and lost this SMM battle for 27 years. Why do vendors like
> it? Because UEFI/ACPI/SMM are a way to implement vendor lockin. I'm
> too old for this nonsense. RISC-V offers a way out and I'm going to
> try to take it before things get much worse.

I think it's slightly more complex than just lock-in.  Everyone is
over everyone else's barrel, the system has found some kind of
uncomfortable equilibrium at a local maximum, and the fear is that if
anyone tries to shift, everyone's spines will snap.

I know you know all of this, Ron, but others may not, so indulge me a
bit: a BIOS (by which I mean a general category, including things like
UEFI and OpenBoot) has three primary uses:

1. It provides a platform-independent, least-common-denominator way to do IO;
2. It provides a place for vendors to put all of their device-specific
platform enablement code;
3. Related to (2), it addresses the boot problem.

(1) is rather obvious. The concept of a BIOS from the CP/M and DOS
days meant that simple early OSes could offload the nuts and bolts of
driving IO devices to code stored in ROM, freeing up precious RAM for
user programs. It's also helpful for things like standalone boot
loaders, where you need a minimum of functionality to load the
actual OS, which (presumably) has real device drivers (cf the 9front
loaders).

The other two are a little less obvious.

Starting a modern CPU is a bit like how I imagine starting a container
ship: you don't just press a button and boom, the ship's main engines
are going and the screws are turning and you're ready to get under
way. Rather, you press a button, and that engages a small electric
starter motor, which starts something analogous to a car engine, that
in turn drives a pump that starts building compression in the main
ship's engines, primes the fuel system, and so on.  Only after all of
that is ready can you start the _actual_ engines.

Similarly, when you press the power button on a recent machine, it's
not like the main CPUs immediately come out of reset and start trying
to boot the OS; instead, a little microcontroller or FPGA or CPLD or
something springs to life driven by firmware that lives in an onboard
flash part, and that starts on a long list of tasks including power
sequencing, ramping up the voltage regulators, turning on the DRAM
controllers, applying early power to the DIMMs and the main SoC so
they can charge up internal capacitors and start speaking whatever
protocol (e.g., the DIMMs probably speak I2C to identify themselves),
turning _off_ things that are not electrically present, and so on.
Eventually you're far enough along you can signal the main CPU to
start, and another little embedded controller inside of that thing
starts loading the BIOS out of flash and arranging for the big cores
to start; after all of that, the main CPU comes out of reset, and
now you're running BIOS code. But wait, we're not done: depending on
the CPU/vendor, DRAM may still need to be trained and the full BIOS
image loaded, the IO buses might need to be configured and trained,
configuration needs to be loaded from non-volatile storage, the CPUs
are enumerated, tables built describing the system for the actual OS,
and so on. Once all of that is done, the system can try booting from
the configured devices, and assuming that's successful, only now are
you running your actual system software; by this point, many millions
(if not billions) of instructions have already been retired.
Critically, much of the pre-OS work depends on the exact configuration
of CPU, board, and devices.

I mentioned the "boot problem": If the OS is going to be loaded from
some device at the far end of a PCIe link that has to be set up and
initialized by software running on the main CPU cores (as it is on AMD
EPYC SoCs, for example) then you can't boot it until _that's_ been
done, and _something_ has to do that work. The solution so far has
been to shove it into UEFI.

Similarly, operating systems have abdicated much of the responsibility
for actually running the machine, giving it over to the BIOS, which
doesn't actually go away after you've booted. For example, Ron
mentioned SMM on x86: this was meant to patch over the mismatch
between old OSes and newer machines so that firmware could emulate old
systems for compatibility with old software. Windows 95 doesn't have an
xhci stack for the HID devices on your workstation? No problem; BIOS
implements an emulation layer that makes those look like a PS/2
keyboard and mouse, and runs it in SMM mode: Win95 is none the wiser,
and users are happy because their software runs. But if you've got an
OS that _does_ have drivers for that stuff, and doesn't want it, you
can't get rid of it: oh well, can't make all the people happy all the
time. And since we've got system services in UEFI, we can stick all of
the deeply platform-specific C-state and power management stuff in
there; the OS doesn't have to do the super fiddly bits.

Why do OSes do all of this? Because otherwise, they have to have
precise details of every machine they may run on, and drivers for all
of the parts that the user cares about. The BIOS spares them that.

The upshot is that unless you're tying the version of the OS tightly
to hardware, you need some kind of interface between the machine and
the OS; interfaces in that abstract sense are good. The problem is
that the concrete interfaces we have, and their implementations, are
bad; they were not well-designed, and instead (as Mothy Roscoe said at
OSDI'21) they have "congealed" into something big, heavy, and utterly
opaque.

In the other thread, Vester mentioned this nightmare world where the
firmware actually runs the machine and you can't get around it. Well,
we're already there, and have been for quite some time.

Btw, I mentioned we worked around this at Oxide, and that's true. We
are using EPYC CPUs, but we have no UEFI and we do not use any AGESA
code; the x86 cores come out of reset running Oxide-authored code from
the first instruction. However, we are still reliant on some AMD
firmware blobs to drive the ASP (née PSP) which does DRAM training and
is responsible for loading the "BIOS image" out of flash. The trick we
use is to package up our code to look like the BIOS, as far as the ASP
is concerned: it assumes it's loading AGESA but it's really loading
the Oxide OS. But obviously that's not going to work generally; first,
we have a fair bit of code in the OS that is specific to the
microarchitectures we explicitly support (and in some cases, even the
silicon spin, if we need to work around chip errata). Second, since we
designed the boards, we know exactly how the IO devices are lined out
to the PHYs on individual pins on the CPU: that's something that the
third-party board vendors modify AGESA for, but not something you can
(easily) tell the OS or a boot loader in a far more loosely coupled
environment.

        - Dan C.




> Anyway, there are three of us now working on the port, we've gotten to
> the point of 'root is from' working over 9p, and we're going to rebase
> on 9front tip in the next week or so. At that point, I hope to start
> working on the simple SBI.

------------------------------------------
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/T0ca6c03d6fda29a7-Mf1b066a5bdc642765ae17514