On Thu, 1 Dec 2022 at 09:07, Ard Biesheuvel <a...@kernel.org> wrote:
>
> On Thu, 1 Dec 2022 at 08:15, chenxiang (M) <chenxian...@hisilicon.com> wrote:
> >
> > Hi Ard,
> >
> >
> > On 2022/11/30 16:18, Ard Biesheuvel wrote:
> > > On Wed, 30 Nov 2022 at 08:53, Marc Zyngier <m...@kernel.org> wrote:
> > >> On Wed, 30 Nov 2022 02:52:35 +0000,
> > >> "chenxiang (M)" <chenxian...@hisilicon.com> wrote:
> > >>> Hi,
> > >>>
> > >>> We boot the VM using following commands (with nvdimm on) (qemu
> > >>> version 6.1.50, kernel 6.0-r4):
> > >> How relevant is the presence of the nvdimm? Do you observe the failure
> > >> without this?
> > >>
> > >>> qemu-system-aarch64 -machine
> > >>> virt,kernel_irqchip=on,gic-version=3,nvdimm=on -kernel
> > >>> /home/kernel/Image -initrd /home/mini-rootfs/rootfs.cpio.gz -bios
> > >>> /root/QEMU_EFI.FD -cpu host -enable-kvm -net none -nographic -m
> > >>> 2G,maxmem=64G,slots=3 -smp 4 -append 'rdinit=init console=ttyAMA0
> > >>> ealycon=pl0ll,0x90000000 pcie_ports=native pciehp.pciehp_debug=1'
> > >>> -object memory-backend-ram,id=ram1,size=10G -device
> > >>> nvdimm,id=dimm1,memdev=ram1 -device ioh3420,id=root_port1,chassis=1
> > >>> -device vfio-pci,host=7d:01.0,id=net0,bus=root_port1
> > >>>
> > >>> Then in VM we insmod a module, vmalloc error occurs as follows (kernel
> > >>> 5.19-rc4 is normal, and the issue is still on kernel 6.1-rc4):
> > >>>
> > >>> estuary:/$ insmod /lib/modules/$(uname -r)/hnae3.ko
> > >>> [ 8.186563] vmap allocation for size 20480 failed: use
> > >>> vmalloc=<size> to increase size
> > >> Have you tried increasing the vmalloc size to check that this is
> > >> indeed the problem?
> > >>
> > >> [...]
> > >>
> > >>> We git bisect the code, and find the patch c5a89f75d2a ("arm64: kaslr:
> > >>> defer initialization to initcall where permitted").
> > >> I guess you mean commit fc5a89f75d2a instead, right?
> > >>
> > >>> Do you have any idea about the issue?
> > >> I sort of suspect that the nvdimm gets vmap-ed and consumes a large
> > >> portion of the vmalloc space, but you give very little information
> > >> that could help here...
> > >>
> > > Ouch. I suspect what's going on here: that patch defers the
> > > randomization of the module region, so that we can decouple it from
> > > the very early init code.
> > >
> > > Obviously, it is happening too late now, and the randomized module
> > > region is overlapping with a vmalloc region that is in use by the time
> > > the randomization occurs.
> > >
> > > Does the below fix the issue?
> >
> > The issue still occurs, but the change seems to decrease the
> > probability: before, it occurred almost every time; after the change,
> > I tried 2-3 times and it occurred.
> > But when I change "subsys_initcall" back to "core_initcall", I tested
> > more than 20 times, and it is still OK.
> >
>
> Thank you for confirming. I will send out a patch today.
>

...but before I do that, could you please check whether the change
below fixes your issue as well?

diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
index 6ccc7ef600e7c1e1..c8c205b630da1951 100644
--- a/arch/arm64/kernel/kaslr.c
+++ b/arch/arm64/kernel/kaslr.c
@@ -20,7 +20,11 @@
 #include <asm/sections.h>
 #include <asm/setup.h>
 
-u64 __ro_after_init module_alloc_base;
+/*
+ * Set a reasonable default for module_alloc_base in case
+ * we end up running with module randomization disabled.
+ */
+u64 __ro_after_init module_alloc_base = (u64)_etext - MODULES_VSIZE;
 u16 __initdata memstart_offset_seed;
 
 struct arm64_ftr_override kaslr_feature_override __initdata;
@@ -30,12 +34,6 @@ static int __init kaslr_init(void)
 	u64 module_range;
 	u32 seed;
 
-	/*
-	 * Set a reasonable default for module_alloc_base in case
-	 * we end up running with module randomization disabled.
-	 */
-	module_alloc_base = (u64)_etext - MODULES_VSIZE;
-
 	if (kaslr_feature_override.val & kaslr_feature_override.mask & 0xf) {
 		pr_info("KASLR disabled on command line\n");
 		return 0;
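
For anyone following along, here is a rough userspace sketch of what the diff above changes: the default for the base moves from an assignment inside the initcall to a static initializer, so the variable holds a usable value before any initcall has run. This is only an illustration, not kernel code. FAKE_ETEXT, FAKE_MODULES_VSIZE, fake_kaslr_init() and base_is_usable() are made-up stand-ins with arbitrary values, and the assumption that something reads the base before the initcall runs is mine; the thread does not confirm it.

```c
#include <stdint.h>

/* Hypothetical stand-ins for _etext and MODULES_VSIZE; values are arbitrary. */
#define FAKE_ETEXT         UINT64_C(0xffff800008a00000)
#define FAKE_MODULES_VSIZE UINT64_C(0x08000000)

/* Old scheme: the variable stays zero until the initcall assigns it. */
uint64_t base_set_in_initcall;

/* New scheme: the default is in place from the very start. */
uint64_t base_set_statically = FAKE_ETEXT - FAKE_MODULES_VSIZE;

/* Models the default assignment that the diff removes from kaslr_init(). */
void fake_kaslr_init(void)
{
	base_set_in_initcall = FAKE_ETEXT - FAKE_MODULES_VSIZE;
}

/* Models an assumed early consumer: a zero base is treated as not set. */
int base_is_usable(uint64_t base)
{
	return base != 0;
}
```

Calling base_is_usable() on both variables before fake_kaslr_init() shows the difference: only the statically initialized one is non-zero at that point, and after the call both hold the same default.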