Re: [PATCH v5 1/4] riscv: Move kernel mapping to vmalloc zone

Alex Ghiti Wed, 22 Jul 2020 22:39:35 -0700



On 7/21/20 at 7:36 PM, Palmer Dabbelt wrote:

On Tue, 21 Jul 2020 16:11:02 PDT (-0700), b...@kernel.crashing.org wrote:
On Tue, 2020-07-21 at 14:36 -0400, Alex Ghiti wrote:
> > I guess I don't understand why this is necessary at all.  Specifically: why
> > can't we just relocate the kernel within the linear map?  That would let the
> > bootloader put the kernel wherever it wants, modulo the physical memory size
> > we support.  We'd need to handle the regions that are coupled to the kernel's
> > execution address, but we could just put them in an explicit memory region
> > which is what we should probably be doing anyway.
>
> Virtual relocation in the linear mapping requires moving the kernel
> physically too. Zong implemented this physical move in his KASLR RFC
> patchset, which is cumbersome since finding an available physical spot
> is harder than just selecting a virtual range in the vmalloc range.
>
> In addition, having the kernel mapping in the linear mapping prevents
> the use of hugepage for the linear mapping resulting in performance loss
> (at least for the GB that encompasses the kernel).
>
> Why do you find this "ugly" ? The vmalloc region is just a bunch of
> available virtual addresses to whatever purpose we want, and as noted by
> Zong, arm64 uses the same scheme.
I don't get it :-)

At least on powerpc we move the kernel in the linear mapping and it
works fine with huge pages, what is your problem there ? You rely on
punching small-page size holes in there ?

That was my original suggestion, and I'm not actually sure it's invalid.  It
would mean that both the kernel's physical and virtual addresses are set by the
bootloader, which may or may not be workable if we want to have an sv48+sv39
kernel.  My initial approach to sv48+sv39 kernels would be to just throw away
the sv39 memory on sv48 kernels, which would preserve the linear map but mean
that there is no single physical address that's accessible for both.  That
would require some coordination between the bootloader and the kernel as to
where it should be loaded, but maybe there's a better way to design the linear
map.  Right now we have a bunch of unwritten rules about where things need to
be loaded, which is a recipe for disaster.

We could copy the kernel around, but I'm not sure I really like that idea.  We
do zero the BSS right now, so it's not like we entirely rely on the bootloader
to set up the kernel image, but with the hart race boot scheme we have right
now we'd at least need to leave a stub sitting around.  Maybe we just throw
away SBI v0.1, though, that's why we called it all legacy in the first place.

My bigger worry is that anything that involves running the kernel at arbitrary
virtual addresses means we need a PIC kernel, which means every global symbol
needs an indirection.  That's probably not so bad for shared libraries, but the
kernel has a lot of global symbols.  PLT references probably aren't so scary,
as we have an incoherent instruction cache so the virtual function predictor
isn't that hard to build, but making all global data accesses GOT-relative
seems like a disaster for performance.  This fixed-VA thing really just exists
so we don't have to be full-on PIC.

In theory I think we could just get away with pretending that medany is PIC,
which I believe works as long as the data and text offset stays constant, you
don't have any symbols between 2GiB and -2GiB (as those may stay fixed, even in
medany), and you deal with GP accordingly (which should work itself out in the
current startup code).  We rely on this for some of the early boot code (and
will soon for kexec), but that's a very controlled code base and we've already
had some issues.  I'd be much more comfortable adding an explicit semi-PIC code
model, as I tend to miss something when doing these sorts of things and then we
could at least add it to the GCC test runs and guarantee it actually works.
Not really sure I want to deal with that, though.  It would, however, be the
only way to get random virtual addresses during kernel execution.
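
To make the "medany as PIC" argument concrete, here is a minimal sketch of why
a PC-relative reference survives a move as long as text and data move together.
It is only an illustration: the addresses, the 1GiB displacement used in the
usage note, and the helper names (pcrel_offset, fits_pcrel32,
resolve_at_runtime) are all hypothetical, and the range check is an
approximation of what a 32-bit PC-relative pair can reach.

```c
#include <stdint.h>

/* Hypothetical link-time addresses, chosen only for illustration. */
#define LINK_PC  0xffffffe000001000ULL  /* instruction making the reference */
#define LINK_SYM 0xffffffe000801234ULL  /* global symbol being referenced   */

/* Offset that a PC-relative sequence would encode at link time. */
static inline int64_t pcrel_offset(uint64_t pc, uint64_t sym)
{
    return (int64_t)(sym - pc);
}

/* Approximate reachability check: the offset must fit in 32 signed bits
 * (roughly +/-2GiB). This is a simplification, not the exact ISA range. */
static inline int fits_pcrel32(int64_t off)
{
    return off >= INT32_MIN && off <= INT32_MAX;
}

/* Address the running code computes: current PC plus the encoded offset.
 * Nothing here depends on where the image was linked. */
static inline uint64_t resolve_at_runtime(uint64_t run_pc, int64_t off)
{
    return run_pc + (uint64_t)off;
}
```

If the whole image is displaced by some delta, say 0x40000000, then
resolve_at_runtime(LINK_PC + delta, pcrel_offset(LINK_PC, LINK_SYM)) yields
LINK_SYM + delta, i.e. the symbol's new address, with no relocation applied.
Anything resolved as an absolute address would not follow the move, which is
the caveat about fixed symbols above.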
At least in the old days, there were a number of assumptions that
the kernel text/data/bss resides in the linear mapping.

Ya, it terrified me as well.  Alex says arm64 puts the kernel in the vmalloc
region, so assuming that's the case it must be possible.  I didn't get that
from reading the arm64 port (I guess it's no secret that pretty much all I do
is copy their code)


See https://elixir.bootlin.com/linux/latest/source/arch/arm64/mm/mmu.c#L615.

If you change that you need to ensure that it's still physically
contiguous and you'll have to tweak __va and __pa, which might induce
extra overhead.

I'm operating under the assumption that we don't want to add an additional load
to virt2phys conversions.  arm64 bends over backwards to avoid the load, and
I'm assuming they have a reason for doing so.  Of course, if we're PIC then
maybe performance just doesn't matter, but I'm not sure I want to just give up.
Distros will probably build the sv48+sv39 kernels as soon as they show up, even
if there's no sv48 hardware for a while.
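
For reference, a rough sketch of what "tweaking __va and __pa" could look like
once the kernel image no longer lives in the linear map. This is not the code
from the patch: every constant and name below (PAGE_OFFSET value,
KERNEL_LINK_ADDR, PHYS_RAM_BASE, KERNEL_LOAD_PA, sketch_pa, sketch_va) is
hypothetical, and it assumes the kernel mapping sits above the linear map so a
single comparison can tell the two apart.

```c
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
#define PAGE_OFFSET      0xffffffe000000000ULL /* start of the linear map     */
#define KERNEL_LINK_ADDR 0xffffffff80000000ULL /* kernel image virtual base   */
#define PHYS_RAM_BASE    0x0000000080000000ULL /* PA mapped at PAGE_OFFSET    */
#define KERNEL_LOAD_PA   0x0000000080200000ULL /* PA where the image was put  */

/* In a real kernel these offsets are only known at boot, so each use is a
 * variable load; they are compile-time constants here to keep this simple. */
static const uint64_t va_pa_offset        = PAGE_OFFSET - PHYS_RAM_BASE;
static const uint64_t va_kernel_pa_offset = KERNEL_LINK_ADDR - KERNEL_LOAD_PA;

/* Virtual-to-physical: one extra compare to pick the right offset. */
static inline uint64_t sketch_pa(uint64_t va)
{
    if (va >= KERNEL_LINK_ADDR)
        return va - va_kernel_pa_offset;  /* address inside the kernel image */
    return va - va_pa_offset;             /* address inside the linear map   */
}

/* Physical-to-virtual always returns the linear-map alias. */
static inline uint64_t sketch_va(uint64_t pa)
{
    return pa + va_pa_offset;
}
```

The extra branch and second offset in sketch_pa are the kind of overhead being
discussed; whether it matters in practice is exactly the open question.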
