On Wed, Jan 07, 2026 at 09:50:07PM -0500, Dan Cross wrote:
> On Wed, Jan 7, 2026 at 3:12 PM ron minnich <[email protected]> wrote:
> > I don't much like IOMMU on x86, they are just awful to program, and come
> > with a performance hit. They're also here to stay. If you can disable
> > x2apic on your amd, then you should be able to turn off iommu I believe?
>
> Two independent things, really. You can put the LAPIC into x2 mode as
> long as it's supported; you only need the IOMMU if you have more than
> 255 CPU threads and want to serve external interrupts on the high
> numbered ones (or if the APIC ID space is sparse, which it often is if
> you have >255 CPUs or a two socket system that supports large
> core-counts).
>
> And even then, you're really only using the IOMMU's interrupt
> remapping functionality; you don't _have_ to do the page-level
> translation stuff for IO devices if you don't want to.

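To make the condition above concrete, here is a minimal C sketch of the check a kernel might do at boot. It assumes that, without interrupt remapping, an interrupt message can only name an 8-bit destination APIC ID; the helper name needs_intr_remap and the constant MAX_DIRECT_APIC_ID are hypothetical, not taken from any existing kernel. The real limit may be lower, since 0xFF is the broadcast ID in xAPIC mode.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: without interrupt remapping, the destination field of
 * an interrupt message is 8 bits wide, so larger APIC IDs cannot be
 * targeted directly.
 */
enum { MAX_DIRECT_APIC_ID = 0xFF };

/*
 * Hypothetical helper: report whether any CPU has an APIC ID that
 * does not fit in an 8-bit destination field.  The CPU count alone
 * is not enough, because the APIC ID space may be sparse.
 */
bool
needs_intr_remap(const uint32_t *apicid, size_t ncpu)
{
	size_t i;

	for(i = 0; i < ncpu; i++)
		if(apicid[i] > MAX_DIRECT_APIC_ID)
			return true;
	return false;
}
```

Note that the test is on the IDs, not on the count: a two-socket machine with few threads but sparse APIC IDs would still answer true here.
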
With the hardware I bought for testing, whether or not the IOMMU is
used, the firmware offers no way to disable it (and if I'm not
mistaken, even the UEFI EDK doesn't provide a means to do so), and it
is enabled by default. So even to use none of the IOMMU's
functionality, code has to be added to the kernel to reinitialize the
IOMMU mapping to a no-op. Hence the question about IOMMU vs Plan9/Nix:
on recent x86-like hardware, for example, the IOMMU will have to be
dealt with, if only to neutralize it.

> > On Wed, Jan 7, 2026 at 9:58 AM <[email protected]> wrote:
> >>
> >> Concerning IOMMU, what is the general position regarding it? I had
> >> tested an AMD64 8 physical cores (for Nix eventually) but ran into
> >> problems regarding USB because IOMMU is not handled and apparently
> >> recent firmware enable it by default so one has to put it out the
> >> way---in my case, this is USB that was put out of the way, but it is
> >> a PITA (keyboard...). So how does its usage fit regarding Plan9, Nix?
> >>
> >> On Tue, Jan 06, 2026 at 03:13:45PM -0800, ron minnich wrote:
> >> > Something Richard Miller said has got stuck in my head.
> >> >
> >> > I am wondering about memory layout on riscv64.
> >> >
> >> > The old tradition of "kernel at bottom of physical, top of virtual" is
> >> > something we've always done.
> >> >
> >> > But do we have to? There are good riscv reasons to flip this. M mode, for
> >> > example, has no virtual addressing, and it would be useful (to say the
> >> > least) to have kernel and M mode have a common set of addresses.
> >> >
> >> > Further, the PMP registers, which can be used to manage/limit physical
> >> > memory accesses, only gate addresses: they don't come with an offset. If
> >> > we had kernel addresses that were 0-based and identity mapped, then the
> >> > addresses would be the same for kernel, M mode, PMP, and IOMMU.
> >> >
> >> > A convenience of the "kernel in high memory" was the fact that an
> >> > immediate 32-bit number, e.g. 0x80000000, sign extends to
> >> > 0xffffffff_80000000, such that you can address a KVA with a 32-bit
> >> > immediate, and a UVA with a 32-bit immediate, as long as it is
> >> > below 0x80000000.
> >> >
> >> > But this comes with a headache: KVA breaks into a 2G region and "the
> >> > rest", so you end up with two kernel VA ranges. It's annoying at least.
> >> > The sign extend hack was convenient when 2G was a lot of memory, but
> >> > after that, it's a bit of a pain.
> >> >
> >> > Finally, FWIW, loading a risc-v register with 0x4000_0000_0000_0000 is
> >> > one instruction, so having a big number for a base virtual address is
> >> > not the issue it is on amd64 (amd64 is, in many ways, a 32-bit
> >> > architecture with a 64-bit RAX -- it is SO WEIRD, but it had to be to
> >> > make the heroic move to 64 bits).
> >> >
> >> > So, the proposal: on riscv64, kernel addresses are 0 to (1<<62)-1, and
> >> > user addresses are 1<<62 to (1<<63)-1. This means valid addresses are
> >> > always int64, but will never be negative; we can keep using u64int. 62
> >> > bits of address space ought to be enough for everybody.
> >> >
> >> > Again, this makes a lot of RISC-V things easier. It would make it much
> >> > easier to use kernel addresses for M mode code, because no translation
> >> > would be needed, and a bunch of other risc-v mechanisms would be
> >> > similarly simplified.
> >> >
> >> > I'm not sure the toolchain can handle having user text start at
> >> > 0x4000_0000_0000_0000, but maybe it's not so hard: for gvisor, back in
> >> > 2014, Russ added the code to 6l to allow us to link text in very high
> >> > memory. So it has to be doable. The code's there in go1.4 :-)
> >> >
> >> > Comments?
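
For reference, the sign-extension trick and the proposed split can be sketched in a few lines of C. This is only an illustration under stated assumptions: it reads the proposal as kernel addresses in [0, 1<<62) and user addresses in [1<<62, 1<<63), which is what "never negative as int64" implies, and the names sext32, UBASE, UTOP, iskaddr and isuaddr are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sign-extend a 32-bit immediate to 64 bits, the way a 32-bit
 * immediate operand is widened by the hardware.  Written with
 * unsigned arithmetic only, so it does not depend on how the
 * compiler converts out-of-range values to signed types.
 */
uint64_t
sext32(uint32_t imm)
{
	return ((uint64_t)imm ^ 0x80000000ULL) - 0x80000000ULL;
}

/*
 * Assumed reading of the proposal (hypothetical names):
 * kernel VAs are [0, UBASE), user VAs are [UBASE, UTOP).
 */
#define UBASE	(1ULL << 62)
#define UTOP	(1ULL << 63)

/* A kernel address is anything below the user base. */
bool
iskaddr(uint64_t va)
{
	return va < UBASE;
}

/* A user address lies in the upper half of the non-negative range. */
bool
isuaddr(uint64_t va)
{
	return va >= UBASE && va < UTOP;
}
```

With this layout a single unsigned comparison classifies an address, and there is no second "low 2G" kernel window to special-case as there is with the sign-extended 0x80000000 scheme.
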
> >>
> >> --
> >> Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> >> http://www.kergis.com/
> >> http://kertex.kergis.com/
> >> Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C

--
Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf6e0b1b3f80df821-M89246f7bae7e5344376fca4f
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
