Re: RESEND: Re: Problem booting a PowerBook G4 Aluminum after commit cd08f109 with CONFIG_VMAP_STACK=y
On 15/02/2020 at 03:42, Larry Finger wrote:
> Christophe,
>
> On 2/14/20 1:35 PM, Christophe Leroy wrote:
>> --- a/arch/powerpc/kernel/head_32.S
>> +++ b/arch/powerpc/kernel/head_32.S
>> @@ -270,6 +270,9 @@ __secondary_hold_acknowledge:
>>   * pointer when we take an exception from supervisor mode.)
>>   *    -- paulus.
>>   */
>> +#ifdef CONFIG_PPC_CHRP
>> +1:    b    machine_check_in_rtas
>> +#endif
>>      . = 0x200
>>      DO_KVM  0x200
>>  MachineCheck:
>> @@ -290,12 +293,9 @@ MachineCheck:
>>  7:    EXCEPTION_PROLOG_2
>>      addi    r3,r1,STACK_FRAME_OVERHEAD
>>  #ifdef CONFIG_PPC_CHRP
>> -    bne    cr1,1f
>> +    bne    cr1,1b
>>  #endif
>>      EXC_XFER_STD(0x200, machine_check_exception)
>> -#ifdef CONFIG_PPC_CHRP
>> -1:    b    machine_check_in_rtas
>> -#endif

I'll need to make it a bit different, because it shoehorns into your config but won't fit if CONFIG_KVM_BOOK3S_32 is added.

>>
>>  /* Data access exception. */
>>      . = 0x300
>
> With the above changes and all the other patches applied, the machine finally boots. It is so bloody slow that it takes a long time to do anything, but you finally got all the places that needed patches.
>
> I really lost track of how many bugs were fixed in the process, but I can now put that old box aside until time for v5.7.0-rc1. As you can tell, it only gets used to verify that PPC32 is working on real G4 hardware. It has no real value for any other function.

Yes, I don't have a G4 myself, but this is so intertwined with other code for the powerpc 83xx that we can't avoid the changes impacting the G4 and other hash-MMU based PPC32, although the changes I'm making are not targeted at those platforms in the first place. And as the 83xx is a 603 core, it is non-hash, so none of the hash-related things can be verified. Plus there are all those small parts like power saving, RTAS, etc., which are more platform-specific. And checking with all possible options is also not easy.

VMAP_STACK was a really challenging feature. I'm happy it made its way into mainline though.

> Thanks for the help,

Thanks to you for testing and for your patience.

Christophe
Re: RESEND: Re: Problem booting a PowerBook G4 Aluminum after commit cd08f109 with CONFIG_VMAP_STACK=y
Christophe,

On 2/14/20 1:35 PM, Christophe Leroy wrote:
> --- a/arch/powerpc/kernel/head_32.S
> +++ b/arch/powerpc/kernel/head_32.S
> @@ -270,6 +270,9 @@ __secondary_hold_acknowledge:
>   * pointer when we take an exception from supervisor mode.)
>   *    -- paulus.
>   */
> +#ifdef CONFIG_PPC_CHRP
> +1:    b    machine_check_in_rtas
> +#endif
>      . = 0x200
>      DO_KVM  0x200
>  MachineCheck:
> @@ -290,12 +293,9 @@ MachineCheck:
>  7:    EXCEPTION_PROLOG_2
>      addi    r3,r1,STACK_FRAME_OVERHEAD
>  #ifdef CONFIG_PPC_CHRP
> -    bne    cr1,1f
> +    bne    cr1,1b
>  #endif
>      EXC_XFER_STD(0x200, machine_check_exception)
> -#ifdef CONFIG_PPC_CHRP
> -1:    b    machine_check_in_rtas
> -#endif
>
>  /* Data access exception. */
>      . = 0x300

With the above changes and all the other patches applied, the machine finally boots. It is so bloody slow that it takes a long time to do anything, but you finally got all the places that needed patches.

I really lost track of how many bugs were fixed in the process, but I can now put that old box aside until time for v5.7.0-rc1. As you can tell, it only gets used to verify that PPC32 is working on real G4 hardware. It has no real value for any other function.

Thanks for the help,

Larry
Re: RESEND: Re: Problem booting a PowerBook G4 Aluminum after commit cd08f109 with CONFIG_VMAP_STACK=y
On 02/14/2020 06:24 PM, Larry Finger wrote:
> On 2/14/20 12:24 AM, Christophe Leroy wrote:
>> Did you try with the patch at https://patchwork.ozlabs.org/patch/1237387/ ?
>
> Christophe,
>
> When I apply that patch, there is an error at
>
> --- a/arch/powerpc/kernel/head_32.S
> +++ b/arch/powerpc/kernel/head_32.S
> @@ -301,6 +301,39 @@ MachineCheck:
>      . = 0x300
>      DO_KVM  0x300
>  DataAccess:
>
> It complains about "an attempt to move .org backwards".

Argh!

> When I change the 0x300 to 0x310 in two places, it builds OK. Is that OK?

No, you can't do that.

The following should solve it for your case.

---
diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
index 32875afb3319..f9941b766f63 100644
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -270,6 +270,9 @@ __secondary_hold_acknowledge:
  * pointer when we take an exception from supervisor mode.)
  *    -- paulus.
  */
+#ifdef CONFIG_PPC_CHRP
+1:    b    machine_check_in_rtas
+#endif
     . = 0x200
     DO_KVM  0x200
 MachineCheck:
@@ -290,12 +293,9 @@ MachineCheck:
 7:    EXCEPTION_PROLOG_2
     addi    r3,r1,STACK_FRAME_OVERHEAD
 #ifdef CONFIG_PPC_CHRP
-    bne    cr1,1f
+    bne    cr1,1b
 #endif
     EXC_XFER_STD(0x200, machine_check_exception)
-#ifdef CONFIG_PPC_CHRP
-1:    b    machine_check_in_rtas
-#endif

 /* Data access exception. */
     . = 0x300
---

Christophe
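Editorial note: the ".org backwards" error follows from the fixed layout of the exception vectors. The machine check handler starts at 0x200 and the data access handler at 0x300, so everything emitted after `. = 0x200` has to fit in 0x100 bytes. The C sketch below only illustrates that constraint. The helper names are hypothetical and the thread does not give the real handler sizes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fixed vector offsets, taken from the head_32.S context in the patch. */
#define VEC_MACHINE_CHECK 0x200u
#define VEC_DATA_ACCESS   0x300u

/*
 * Hypothetical helper: true when `size` bytes of code placed at `start`
 * end at or before `next`, i.e. when a following ". = next" would not
 * have to move the location counter backwards.
 */
static bool fits_before(uint32_t start, uint32_t size, uint32_t next)
{
    return size <= next - start;
}

/* Hypothetical helper: bytes by which the code overruns `next` (0 if it fits). */
static uint32_t overrun(uint32_t start, uint32_t size, uint32_t next)
{
    return fits_before(start, size, next) ? 0 : size - (next - start);
}
```

With an assumed handler size of 0x104 bytes, `overrun(VEC_MACHINE_CHECK, 0x104, VEC_DATA_ACCESS)` is 4, the size of one PowerPC instruction. The patch moves a one-instruction stub (`b machine_check_in_rtas`) in front of `. = 0x200`, whereas changing 0x300 to 0x310 would move the data access handler away from the address the CPU vectors to.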
Re: RESEND: Re: Problem booting a PowerBook G4 Aluminum after commit cd08f109 with CONFIG_VMAP_STACK=y
On 2/14/20 12:24 AM, Christophe Leroy wrote:
> Did you try with the patch at https://patchwork.ozlabs.org/patch/1237387/ ?

Christophe,

When I apply that patch, there is an error at

--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -301,6 +301,39 @@ MachineCheck:
     . = 0x300
     DO_KVM  0x300
 DataAccess:

It complains about "an attempt to move .org backwards". When I change the 0x300 to 0x310 in two places, it builds OK. Is that OK?

Larry
Re: RESEND: Re: Problem booting a PowerBook G4 Aluminum after commit cd08f109 with CONFIG_VMAP_STACK=y
On 2/14/20 12:24 AM, Christophe Leroy wrote:
> Did you try with the patch at https://patchwork.ozlabs.org/patch/1237387/ ?

Christophe,

When I apply that patch, there is an error at

--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -301,6 +301,39 @@ MachineCheck:
     . = 0x300
     DO_KVM  0x300
 DataAccess:

It complains about "an attempt to move .org backwards".

Larry
Re: RESEND: Re: Problem booting a PowerBook G4 Aluminum after commit cd08f109 with CONFIG_VMAP_STACK=y
On 14/02/2020 at 07:24, Christophe Leroy wrote:
> Larry,
>
> On 14/02/2020 at 00:09, Larry Finger wrote:
>> Christophe,
>>
>> With this patch, it gets further. Sometime after the boot process tries to start process init, it crashes with an "unable to read data at 0x000157a0" error, with a faulting address of 0xc001683c. The screenshot is attached and the gzipped vmlinux is at http://www.lwfinger.com/download/vmlinux2.gz. The patches that were applied for this kernel are also attached.
>
> Did you try with the patch at https://patchwork.ozlabs.org/patch/1237387/ ?
>
> I see the problem happens in kprobe_handler(). Can you try without CONFIG_KPROBES?

In fact, you hit two bugs. The first one is due to CONFIG_VMAP_STACK. The second one has always existed (at least since the kernel source tree has been in git).

The first bug is in the function enter_rtas(), which tries to read data on the stack using the linear physical address translation. This cannot be used with a VM stack: the function must re-enable data MMU translation to access data on the stack.

The second bug is in the kprobe_handler() function, which does:

    if (*addr != BREAKPOINT_INSTRUCTION)

addr is the address where the 'trap' happened. When a trap happens with the MMU disabled, addr contains the physical address of the trap. kprobe_handler() tries to read the instruction using that physical address while the MMU is enabled, so you get a bad access, either because the address is not mapped or because access to userspace is not allowed.

Due to the first bug, you get a 'machine check', and as current->thread.rtas_sp has not been cleared yet, the machine check handler jumps to 'machine_check_in_rtas'. machine_check_in_rtas does a trap, which in turn triggers the second bug.

Once the first bug is fixed, the second one should not pop up.

Can you test the patch at https://patchwork.ozlabs.org/patch/1237929/ that fixes the first bug?

Christophe
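Editorial note: the address mix-up behind the second bug can be sketched in a few lines of C. This is an illustration only, not the kernel's fix. It assumes the usual PPC32 KERNELBASE of 0xc0000000 and reads the reported data address 0x000157a0 as the physical address of the trap. All helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: default PPC32 kernel base; lowmem is mapped linearly from here. */
#define KERNELBASE 0xc0000000u

/* Linear-map translation: only meaningful for lowmem physical addresses. */
static uint32_t phys_to_virt_linear(uint32_t phys)
{
    return phys + KERNELBASE;
}

/* True when an address lies in the kernel's virtual range. */
static bool looks_like_kernel_virt(uint32_t addr)
{
    return addr >= KERNELBASE;
}

/*
 * Hypothetical helper: given the address recorded for a trap, return the
 * address that can be dereferenced with the MMU on. A trap taken with the
 * MMU off records a physical address, which must be translated first.
 */
static uint32_t trap_addr_for_mmu_on(uint32_t nip)
{
    return looks_like_kernel_virt(nip) ? nip : phys_to_virt_linear(nip);
}
```

Under these assumptions, `trap_addr_for_mmu_on(0x000157a0)` gives 0xc00157a0, while an address that is already virtual, such as 0xc001683c, is returned unchanged. The same linear translation is what enter_rtas() can no longer rely on for the stack, since a vmalloc'ed stack is not part of the linear map.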
Re: RESEND: Re: Problem booting a PowerBook G4 Aluminum after commit cd08f109 with CONFIG_VMAP_STACK=y
Larry,

On 14/02/2020 at 00:09, Larry Finger wrote:
> Christophe,
>
> With this patch, it gets further. Sometime after the boot process tries to start process init, it crashes with an "unable to read data at 0x000157a0" error, with a faulting address of 0xc001683c. The screenshot is attached and the gzipped vmlinux is at http://www.lwfinger.com/download/vmlinux2.gz. The patches that were applied for this kernel are also attached.

Did you try with the patch at https://patchwork.ozlabs.org/patch/1237387/ ?

I see the problem happens in kprobe_handler(). Can you try without CONFIG_KPROBES?

Christophe
Re: RESEND: Re: Problem booting a PowerBook G4 Aluminum after commit cd08f109 with CONFIG_VMAP_STACK=y
On 02/13/2020 02:28 PM, Larry Finger wrote:
> On 2/11/20 1:23 PM, Christophe Leroy wrote:
>> Can you send me a picture of that "BUG: Unable to handle kernel data access" with all the register values etc., together with the matching vmlinux?
>>
>> The first thing is to identify where we are when that happens. That means seeing what is at 0xc0013674. This can be done with 'ppc-linux-objdump -d vmlinux' (or whatever your PPC objdump is named) to get the function code.
>>
>> Then we need to understand how we reach that function and why it tries to access a physical address.
>>
>> Another thing I'm thinking about, not necessarily related to that problem: some buggy drivers do DMA from the stack. This doesn't work anymore with CONFIG_VMAP_STACK. Most of them can be detected with CONFIG_DEBUG_VIRTUAL, so you should activate it.
>
> Christophe,
>
> The previous send of this message failed because the attached vmlinux was too large.
>
> I have gone about as far as I can in debugging the problem. Setting CONFIG_DEBUG_VIRTUAL made no difference. Attached are the final screenshot and the patches that I have applied. You already have the gzipped vmlinux.

This screenshot makes more sense with the vmlinux you provided: the problem is at 0xc00136dc. That's in function power_save_ppc32_restore() in arch/powerpc/kernel/idle_6xx.S.

   c00136c0 :
   c00136c0:  81 2b 00 a0   lwz     r9,160(r11)
   c00136c4:  91 2b 00 90   stw     r9,144(r11)
   c00136c8:  39 60 00 00   li      r11,0
   c00136cc:  7d 30 fa a6   mfspr   r9,1008
   c00136d0:  75 29 00 40   andis.  r9,r9,64
   c00136d4:  41 82 00 18   beq     c00136ec
   c00136d8:  3d 2b 00 7c   addis   r9,r11,124
>> c00136dc:  81 29 92 5c   lwz     r9,-28068(r9)
   c00136e0:  7d 36 fb a6   mtspr   1014,r9
   c00136e4:  7c 00 04 ac   hwsync
   c00136e8:  4c 00 01 2c   isync
   c00136ec:  3d 2b 00 7c   addis   r9,r11,124
   c00136f0:  81 29 92 60   lwz     r9,-28064(r9)
   c00136f4:  7d 31 fb a6   mtspr   1009,r9
   c00136f8:  48 00 19 c4   b       c00150bc
   c00136fc:  00 00 00 00   .long 0x0

Can you try the change below? (It won't work anymore without CONFIG_VMAP_STACK; I will fix it properly later, once you confirm it is OK.)
diff --git a/arch/powerpc/kernel/idle_6xx.S b/arch/powerpc/kernel/idle_6xx.S
index 0ffdd18b9f26..7be8a0f3fac8 100644
--- a/arch/powerpc/kernel/idle_6xx.S
+++ b/arch/powerpc/kernel/idle_6xx.S
@@ -166,7 +166,7 @@ BEGIN_FTR_SECTION
     mfspr    r9,SPRN_HID0
     andis.   r9,r9,HID0_NAP@h
     beq      1f
-    addis    r9,r11,(nap_save_msscr0-KERNELBASE)@ha
+    addis    r9,r11,nap_save_msscr0@ha
     lwz      r9,nap_save_msscr0@l(r9)
     mtspr    SPRN_MSSCR0, r9
     sync
@@ -174,7 +174,7 @@ BEGIN_FTR_SECTION
 1:
 END_FTR_SECTION_IFSET(CPU_FTR_NAP_DISABLE_L2_PR)
 BEGIN_FTR_SECTION
-    addis    r9,r11,(nap_save_hid1-KERNELBASE)@ha
+    addis    r9,r11,nap_save_hid1@ha
     lwz      r9,nap_save_hid1@l(r9)
     mtspr    SPRN_HID1, r9
 END_FTR_SECTION_IFSET(CPU_FTR_DUAL_PLL_750FX)

Thanks
Christophe
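Editorial note: the effect of dropping `-KERNELBASE` can be checked with the `@ha`/`@l` arithmetic. The C sketch below is a worked example under two assumptions: KERNELBASE is 0xc0000000, and nap_save_msscr0 sits at 0xc07b925c. That address is inferred from the disassembly earlier in this thread (`addis r9,r11,124` followed by `lwz r9,-28068(r9)` with r11 set to 0), not stated in it. The helper names are hypothetical.

```c
#include <stdint.h>

#define KERNELBASE 0xc0000000u  /* assumption: default PPC32 kernel base */

/* @ha: high 16 bits, adjusted because the low half is sign-extended. */
static uint16_t ha(uint32_t addr)
{
    return (uint16_t)((addr + 0x8000u) >> 16);
}

/* @l: low 16 bits, interpreted as a signed 16-bit displacement. */
static int16_t lo(uint32_t addr)
{
    int32_t v = (int32_t)(addr & 0xffffu);
    if (v >= 0x8000)
        v -= 0x10000;
    return (int16_t)v;
}

/* Effective address of "addis rX,rBase,hi ; lwz rX,disp(rX)". */
static uint32_t effective_address(uint32_t base, uint16_t hi, int16_t disp)
{
    return base + ((uint32_t)hi << 16) + (uint32_t)(int32_t)disp;
}
```

With the inferred symbol address, `ha(sym - KERNELBASE)` is 124 and `lo(sym)` is -28068, matching the disassembly, so the old code loads from 0x007b925c, a physical address. With the patch, `effective_address(0, ha(sym), lo(sym))` is 0xc07b925c, the virtual address, which is presumably what the load needs when it runs with data translation enabled.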