Re: [PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return
On Wed, Feb 6, 2019 at 3:44 PM Michael Ellerman wrote:
>
> Balbir Singh writes:
> > On Tue, Feb 5, 2019 at 10:24 PM Michael Ellerman wrote:
> >> Balbir Singh writes:
> >> > On Sat, Feb 2, 2019 at 12:14 PM Balbir Singh wrote:
> >> >> On Tue, Jan 22, 2019 at 10:57:21AM -0500, Joe Lawrence wrote:
> >> >> > From: Nicolai Stange
> >> >> >
> >> >> > [...]
> >> >> >
> >> >> > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> >> >> > index 435927f549c4..a2c168b395d2 100644
> >> >> > --- a/arch/powerpc/kernel/entry_64.S
> >> >> > +++ b/arch/powerpc/kernel/entry_64.S
> >> >> > @@ -1002,6 +1002,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
> >> >> >         ld      r2,_NIP(r1)
> >> >> >         mtspr   SPRN_SRR0,r2
> >> >> >
> >> >> > +       /*
> >> >> > +        * Leaving a stale exception_marker on the stack can
Re: [PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return
Balbir Singh writes:
> On Tue, Feb 5, 2019 at 10:24 PM Michael Ellerman wrote:
>> Balbir Singh writes:
>> > On Sat, Feb 2, 2019 at 12:14 PM Balbir Singh wrote:
>> >> On Tue, Jan 22, 2019 at 10:57:21AM -0500, Joe Lawrence wrote:
>> >> > From: Nicolai Stange
>> >> >
>> >> > [...]
>> >> >
>> >> > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> >> > index 435927f549c4..a2c168b395d2 100644
>> >> > --- a/arch/powerpc/kernel/entry_64.S
>> >> > +++ b/arch/powerpc/kernel/entry_64.S
>> >> > @@ -1002,6 +1002,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>> >> >         ld      r2,_NIP(r1)
>> >> >         mtspr   SPRN_SRR0,r2
>> >> >
>> >> > +       /*
>> >> > +        * Leaving a stale exception_marker on the stack can confuse
>> >> > +        * the reliable stack unwinder later on. Clear it.
>> >> > +        */
>> >> > +       li      r2,0
>> >> > +       std     r2,STACK_FRAME_OVERHEAD-16(r1)
>> >> > +
>> >>
>> >> Could you please double check, r4 is already 0 at this point
>> >> IIUC. So the change might be a simple
>> >>
>> >> std r4,STACK_FRAME_OVERHEAD-16(r1)
>> >>
Re: [PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return
On Tue, Feb 5, 2019 at 10:24 PM Michael Ellerman wrote:
>
> Balbir Singh writes:
> > On Sat, Feb 2, 2019 at 12:14 PM Balbir Singh wrote:
> >>
> >> On Tue, Jan 22, 2019 at 10:57:21AM -0500, Joe Lawrence wrote:
> >> > From: Nicolai Stange
> >> >
> >> > [...]
> >> >
> >> > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> >> > index 435927f549c4..a2c168b395d2 100644
> >> > --- a/arch/powerpc/kernel/entry_64.S
> >> > +++ b/arch/powerpc/kernel/entry_64.S
> >> > @@ -1002,6 +1002,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
> >> >         ld      r2,_NIP(r1)
> >> >         mtspr   SPRN_SRR0,r2
> >> >
> >> > +       /*
> >> > +        * Leaving a stale exception_marker on the stack can confuse
> >> > +        * the reliable stack unwinder later on. Clear it.
> >> > +        */
> >> > +       li      r2,0
> >> > +       std     r2,STACK_FRAME_OVERHEAD-16(r1)
> >> > +
> >>
> >> Could you please double check, r4 is already 0 at this point
> >> IIUC. So the change might be a simple
> >>
> >> std r4,STACK_FRAME_OVERHEAD-16(r1)
> >>
> >
> > r4 is not 0, sorry for the noise
>
> Isn't it?

It is, I seem to be reading the wrong bits and confused myself, had to re-read
Re: [PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return
Balbir Singh writes:
> On Sat, Feb 2, 2019 at 12:14 PM Balbir Singh wrote:
>>
>> On Tue, Jan 22, 2019 at 10:57:21AM -0500, Joe Lawrence wrote:
>> > From: Nicolai Stange
>> >
>> > [...]
>> >
>> > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> > index 435927f549c4..a2c168b395d2 100644
>> > --- a/arch/powerpc/kernel/entry_64.S
>> > +++ b/arch/powerpc/kernel/entry_64.S
>> > @@ -1002,6 +1002,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>> >         ld      r2,_NIP(r1)
>> >         mtspr   SPRN_SRR0,r2
>> >
>> > +       /*
>> > +        * Leaving a stale exception_marker on the stack can confuse
>> > +        * the reliable stack unwinder later on. Clear it.
>> > +        */
>> > +       li      r2,0
>> > +       std     r2,STACK_FRAME_OVERHEAD-16(r1)
>> > +
>>
>> Could you please double check, r4 is already 0 at this point
>> IIUC. So the change might be a simple
>>
>> std r4,STACK_FRAME_OVERHEAD-16(r1)
>>
>
> r4 is not 0, sorry for the noise

Isn't it?

cheers
Re: [PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return
On Sat, Feb 2, 2019 at 12:14 PM Balbir Singh wrote:
>
> On Tue, Jan 22, 2019 at 10:57:21AM -0500, Joe Lawrence wrote:
> > From: Nicolai Stange
> >
> > [...]
> >
> > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> > index 435927f549c4..a2c168b395d2 100644
> > --- a/arch/powerpc/kernel/entry_64.S
> > +++ b/arch/powerpc/kernel/entry_64.S
> > @@ -1002,6 +1002,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
> >         ld      r2,_NIP(r1)
> >         mtspr   SPRN_SRR0,r2
> >
> > +       /*
> > +        * Leaving a stale exception_marker on the stack can confuse
> > +        * the reliable stack unwinder later on. Clear it.
> > +        */
> > +       li      r2,0
> > +       std     r2,STACK_FRAME_OVERHEAD-16(r1)
> > +
>
> Could you please double check, r4 is already 0 at this point
> IIUC. So the change might be a simple
>
> std r4,STACK_FRAME_OVERHEAD-16(r1)
>

r4 is not 0, sorry for the noise

Balbir
Re: [PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return
On Tue, Jan 22, 2019 at 10:57:21AM -0500, Joe Lawrence wrote:
> From: Nicolai Stange
>
> The ppc64 specific implementation of the reliable stacktracer,
> save_stack_trace_tsk_reliable(), bails out and reports an "unreliable
> trace" whenever it finds an exception frame on the stack. Stack frames
> are classified as exception frames if the STACK_FRAME_REGS_MARKER magic,
> as written by exception prologues, is found at a particular location.
>
> However, as observed by Joe Lawrence, it is possible in practice that
> non-exception stack frames can alias with prior exception frames and thus,
> that the reliable stacktracer can find a stale STACK_FRAME_REGS_MARKER on
> the stack. It in turn falsely reports an unreliable stacktrace and blocks
> any live patching transition to finish. Said condition lasts until the
> stack frame is overwritten/initialized by function call or other means.
>
> In principle, we could mitigate this by making the exception frame
> classification condition in save_stack_trace_tsk_reliable() stronger:
> in addition to testing for STACK_FRAME_REGS_MARKER, we could also take into
> account that for all exceptions executing on the kernel stack
> - their stack frames's backlink pointers always match what is saved
>   in their pt_regs instance's ->gpr[1] slot and that
> - their exception frame size equals STACK_INT_FRAME_SIZE, a value
>   uncommonly large for non-exception frames.
>
> However, while these are currently true, relying on them would make the
> reliable stacktrace implementation more sensitive towards future changes in
> the exception entry code. Note that false negatives, i.e. not detecting
> exception frames, would silently break the live patching consistency model.
>
> Furthermore, certain other places (diagnostic stacktraces, perf, xmon)
> rely on STACK_FRAME_REGS_MARKER as well.
>
> Make the exception exit code clear the on-stack STACK_FRAME_REGS_MARKER
> for those exceptions running on the "normal" kernel stack and returning
> to kernelspace: because the topmost frame is ignored by the reliable stack
> tracer anyway, returns to userspace don't need to take care of clearing
> the marker.
>
> Furthermore, as I don't have the ability to test this on Book 3E or
> 32 bits, limit the change to Book 3S and 64 bits.
>
> Finally, make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
> PPC_BOOK3S_64 for documentation purposes. Before this patch, it depended
> on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN implies
> PPC_BOOK3S_64, there's no functional change here.
>
> Fixes: df78d3f61480 ("powerpc/livepatch: Implement reliable stack tracing for the consistency model")
> Reported-by: Joe Lawrence
> Signed-off-by: Nicolai Stange
> Signed-off-by: Joe Lawrence
> ---
>  arch/powerpc/Kconfig           | 2 +-
>  arch/powerpc/kernel/entry_64.S | 7 +++
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 2890d36eb531..73bf87b1d274 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -220,7 +220,7 @@ config PPC
>         select HAVE_PERF_USER_STACK_DUMP
>         select HAVE_RCU_TABLE_FREE if SMP
>         select HAVE_REGS_AND_STACK_ACCESS_API
> -       select HAVE_RELIABLE_STACKTRACE if PPC64 && CPU_LITTLE_ENDIAN
> +       select HAVE_RELIABLE_STACKTRACE if PPC_BOOK3S_64 && CPU_LITTLE_ENDIAN
>         select HAVE_SYSCALL_TRACEPOINTS
>         select HAVE_VIRT_CPU_ACCOUNTING
>         select HAVE_IRQ_TIME_ACCOUNTING
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 435927f549c4..a2c168b395d2 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -1002,6 +1002,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>         ld      r2,_NIP(r1)
>         mtspr   SPRN_SRR0,r2
>
> +       /*
> +        * Leaving a stale exception_marker on the stack can confuse
> +        * the reliable stack unwinder later on. Clear it.
> +        */
> +       li      r2,0
> +       std     r2,STACK_FRAME_OVERHEAD-16(r1)
> +

Could you please double check, r4 is already 0 at this point
IIUC. So the change might be a simple

std r4,STACK_FRAME_OVERHEAD-16(r1)

Balbir
Re: [PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return
Nicolai Stange writes:
> Michael Ellerman writes:
>
>> Joe Lawrence writes:
>>> From: Nicolai Stange
>>>
>>> [...]
>>>
>>> Finally, make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
>>> PPC_BOOK3S_64 for documentation purposes. Before this patch, it depended
>>> on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN implies
>>> PPC_BOOK3S_64, there's no functional change here.
>>
>> That has nothing to do with the fix and should really be in a separate
>> patch.
>>
>> I can split it when applying.
>
> If you don't mind, that would be nice! Or simply drop that
> chunk... Otherwise, let me know if I shall send a split v2 for this
> patch [1/4] only.

No worries, I split it out:

commit a50d3250d7ae34c561177a1f9cfb79816fcbcff1
Author:     Nicolai Stange
AuthorDate: Thu Jan 31 16:41:50 2019 +1100
Commit:     Michael Ellerman
CommitDate: Thu Jan 31 16:43:29 2019 +1100

    powerpc/64s: Make reliable stacktrace dependency clearer

    Make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
    PPC_BOOK3S_64 for documentation purposes. Before this patch, it depended
    on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN implies
    PPC_BOOK3S_64, there's no functional change here.

    Signed-off-by: Nicolai Stange
    Signed-off-by: Joe Lawrence
    [mpe: Split out of larger patch]
    Signed-off-by: Michael Ellerman

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 2890d36eb531..73bf87b1d274 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -220,7 +220,7 @@ config PPC
        select HAVE_PERF_USER_STACK_DUMP
        select HAVE_RCU_TABLE_FREE if SMP
        select HAVE_REGS_AND_STACK_ACCESS_API
-       select HAVE_RELIABLE_STACKTRACE if PPC64 && CPU_LITTLE_ENDIAN
+       select HAVE_RELIABLE_STACKTRACE if PPC_BOOK3S_64 && CPU_LITTLE_ENDIAN
        select HAVE_SYSCALL_TRACEPOINTS
        select HAVE_VIRT_CPU_ACCOUNTING
        select HAVE_IRQ_TIME_ACCOUNTING

cheers
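The "no functional change" claim in the split-out commit can be checked by
brute force over the three Kconfig symbols. The following is a sketch in
plain C rather than Kconfig. It relies on the implication stated in the
commit message (CPU_LITTLE_ENDIAN implies PPC_BOOK3S_64) and on one further
assumption that the thread does not state, namely that PPC_BOOK3S_64 in
turn implies PPC64. All function names are hypothetical.

```c
#include <stdbool.h>

/* Condition before the change: PPC64 && CPU_LITTLE_ENDIAN. */
static bool cond_before(bool ppc64, bool book3s_64, bool little_endian)
{
    (void)book3s_64;
    return ppc64 && little_endian;
}

/* Condition after the change: PPC_BOOK3S_64 && CPU_LITTLE_ENDIAN. */
static bool cond_after(bool ppc64, bool book3s_64, bool little_endian)
{
    (void)ppc64;
    return book3s_64 && little_endian;
}

/* A combination is reachable only if it respects both implications. */
static bool config_is_reachable(bool ppc64, bool book3s_64, bool little_endian)
{
    if (little_endian && !book3s_64)
        return false;   /* stated in the commit message */
    if (book3s_64 && !ppc64)
        return false;   /* assumption, not stated in the thread */
    return true;
}

/* Counts reachable configurations where the two conditions differ. */
static int count_disagreements(void)
{
    int disagreements = 0;

    for (int bits = 0; bits < 8; bits++) {
        bool ppc64 = (bits & 1) != 0;
        bool book3s_64 = (bits & 2) != 0;
        bool little_endian = (bits & 4) != 0;

        if (!config_is_reachable(ppc64, book3s_64, little_endian))
            continue;
        if (cond_before(ppc64, book3s_64, little_endian) !=
            cond_after(ppc64, book3s_64, little_endian))
            disagreements++;
    }
    return disagreements;
}
```

Under those two implications the conditions agree on every reachable
configuration, which matches the commit message's description of the change
as documentation only.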
Re: [PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return
Michael Ellerman writes:

> Joe Lawrence writes:
>> From: Nicolai Stange
>>
>> The ppc64 specific implementation of the reliable stacktracer,
>> save_stack_trace_tsk_reliable(), bails out and reports an "unreliable
>> trace" whenever it finds an exception frame on the stack. Stack frames
>> are classified as exception frames if the STACK_FRAME_REGS_MARKER magic,
>> as written by exception prologues, is found at a particular location.
>>
>> However, as observed by Joe Lawrence, it is possible in practice that
>> non-exception stack frames can alias with prior exception frames and thus,
>> that the reliable stacktracer can find a stale STACK_FRAME_REGS_MARKER on
>> the stack. It in turn falsely reports an unreliable stacktrace and blocks
>> any live patching transition to finish. Said condition lasts until the
>> stack frame is overwritten/initialized by function call or other means.
>>
>> In principle, we could mitigate this by making the exception frame
>> classification condition in save_stack_trace_tsk_reliable() stronger:
>> in addition to testing for STACK_FRAME_REGS_MARKER, we could also take into
>> account that for all exceptions executing on the kernel stack
>> - their stack frames's backlink pointers always match what is saved
>>   in their pt_regs instance's ->gpr[1] slot and that
>> - their exception frame size equals STACK_INT_FRAME_SIZE, a value
>>   uncommonly large for non-exception frames.
>>
>> However, while these are currently true, relying on them would make the
>> reliable stacktrace implementation more sensitive towards future changes in
>> the exception entry code. Note that false negatives, i.e. not detecting
>> exception frames, would silently break the live patching consistency model.
>>
>> Furthermore, certain other places (diagnostic stacktraces, perf, xmon)
>> rely on STACK_FRAME_REGS_MARKER as well.
>>
>> Make the exception exit code clear the on-stack STACK_FRAME_REGS_MARKER
>> for those exceptions running on the "normal" kernel stack and returning
>> to kernelspace: because the topmost frame is ignored by the reliable stack
>> tracer anyway, returns to userspace don't need to take care of clearing
>> the marker.
>>
>> Furthermore, as I don't have the ability to test this on Book 3E or
>> 32 bits, limit the change to Book 3S and 64 bits.
>>
>> Finally, make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
>> PPC_BOOK3S_64 for documentation purposes. Before this patch, it depended
>> on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN implies
>> PPC_BOOK3S_64, there's no functional change here.
>
> That has nothing to do with the fix and should really be in a separate
> patch.
>
> I can split it when applying.

If you don't mind, that would be nice! Or simply drop that
chunk... Otherwise, let me know if I shall send a split v2 for this
patch [1/4] only.

Thanks,

Nicolai

--
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)
Re: [PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return
Joe Lawrence writes:
> From: Nicolai Stange
>
> The ppc64 specific implementation of the reliable stacktracer,
> save_stack_trace_tsk_reliable(), bails out and reports an "unreliable
> trace" whenever it finds an exception frame on the stack. Stack frames
> are classified as exception frames if the STACK_FRAME_REGS_MARKER magic,
> as written by exception prologues, is found at a particular location.
>
> However, as observed by Joe Lawrence, it is possible in practice that
> non-exception stack frames can alias with prior exception frames and thus,
> that the reliable stacktracer can find a stale STACK_FRAME_REGS_MARKER on
> the stack. It in turn falsely reports an unreliable stacktrace and blocks
> any live patching transition to finish. Said condition lasts until the
> stack frame is overwritten/initialized by function call or other means.
>
> In principle, we could mitigate this by making the exception frame
> classification condition in save_stack_trace_tsk_reliable() stronger:
> in addition to testing for STACK_FRAME_REGS_MARKER, we could also take into
> account that for all exceptions executing on the kernel stack
> - their stack frames's backlink pointers always match what is saved
>   in their pt_regs instance's ->gpr[1] slot and that
> - their exception frame size equals STACK_INT_FRAME_SIZE, a value
>   uncommonly large for non-exception frames.
>
> However, while these are currently true, relying on them would make the
> reliable stacktrace implementation more sensitive towards future changes in
> the exception entry code. Note that false negatives, i.e. not detecting
> exception frames, would silently break the live patching consistency model.
>
> Furthermore, certain other places (diagnostic stacktraces, perf, xmon)
> rely on STACK_FRAME_REGS_MARKER as well.
>
> Make the exception exit code clear the on-stack STACK_FRAME_REGS_MARKER
> for those exceptions running on the "normal" kernel stack and returning
> to kernelspace: because the topmost frame is ignored by the reliable stack
> tracer anyway, returns to userspace don't need to take care of clearing
> the marker.
>
> Furthermore, as I don't have the ability to test this on Book 3E or
> 32 bits, limit the change to Book 3S and 64 bits.
>
> Finally, make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
> PPC_BOOK3S_64 for documentation purposes. Before this patch, it depended
> on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN implies
> PPC_BOOK3S_64, there's no functional change here.

That has nothing to do with the fix and should really be in a separate
patch.

I can split it when applying.

cheers

> Fixes: df78d3f61480 ("powerpc/livepatch: Implement reliable stack tracing for the consistency model")
> Reported-by: Joe Lawrence
> Signed-off-by: Nicolai Stange
> Signed-off-by: Joe Lawrence
> ---
>  arch/powerpc/Kconfig           | 2 +-
>  arch/powerpc/kernel/entry_64.S | 7 +++
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 2890d36eb531..73bf87b1d274 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -220,7 +220,7 @@ config PPC
>  	select HAVE_PERF_USER_STACK_DUMP
>  	select HAVE_RCU_TABLE_FREE		if SMP
>  	select HAVE_REGS_AND_STACK_ACCESS_API
> -	select HAVE_RELIABLE_STACKTRACE		if PPC64 && CPU_LITTLE_ENDIAN
> +	select HAVE_RELIABLE_STACKTRACE		if PPC_BOOK3S_64 && CPU_LITTLE_ENDIAN
>  	select HAVE_SYSCALL_TRACEPOINTS
>  	select HAVE_VIRT_CPU_ACCOUNTING
>  	select HAVE_IRQ_TIME_ACCOUNTING
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 435927f549c4..a2c168b395d2 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -1002,6 +1002,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>  	ld	r2,_NIP(r1)
>  	mtspr	SPRN_SRR0,r2
>
> +	/*
> +	 * Leaving a stale exception_marker on the stack can confuse
> +	 * the reliable stack unwinder later on. Clear it.
> +	 */
> +	li	r2,0
> +	std	r2,STACK_FRAME_OVERHEAD-16(r1)
> +
>  	ld	r0,GPR0(r1)
>  	ld	r2,GPR2(r1)
>  	ld	r3,GPR3(r1)
> --
> 2.20.1
[PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return
From: Nicolai Stange

The ppc64 specific implementation of the reliable stacktracer,
save_stack_trace_tsk_reliable(), bails out and reports an "unreliable
trace" whenever it finds an exception frame on the stack. Stack frames
are classified as exception frames if the STACK_FRAME_REGS_MARKER magic,
as written by exception prologues, is found at a particular location.

However, as observed by Joe Lawrence, it is possible in practice that
non-exception stack frames can alias with prior exception frames and thus,
that the reliable stacktracer can find a stale STACK_FRAME_REGS_MARKER on
the stack. It in turn falsely reports an unreliable stacktrace and blocks
any live patching transition to finish. Said condition lasts until the
stack frame is overwritten/initialized by function call or other means.

In principle, we could mitigate this by making the exception frame
classification condition in save_stack_trace_tsk_reliable() stronger:
in addition to testing for STACK_FRAME_REGS_MARKER, we could also take into
account that for all exceptions executing on the kernel stack
- their stack frames's backlink pointers always match what is saved
  in their pt_regs instance's ->gpr[1] slot and that
- their exception frame size equals STACK_INT_FRAME_SIZE, a value
  uncommonly large for non-exception frames.

However, while these are currently true, relying on them would make the
reliable stacktrace implementation more sensitive towards future changes in
the exception entry code. Note that false negatives, i.e. not detecting
exception frames, would silently break the live patching consistency model.

Furthermore, certain other places (diagnostic stacktraces, perf, xmon)
rely on STACK_FRAME_REGS_MARKER as well.

Make the exception exit code clear the on-stack STACK_FRAME_REGS_MARKER
for those exceptions running on the "normal" kernel stack and returning
to kernelspace: because the topmost frame is ignored by the reliable stack
tracer anyway, returns to userspace don't need to take care of clearing
the marker.

Furthermore, as I don't have the ability to test this on Book 3E or
32 bits, limit the change to Book 3S and 64 bits.

Finally, make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
PPC_BOOK3S_64 for documentation purposes. Before this patch, it depended
on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN implies
PPC_BOOK3S_64, there's no functional change here.

Fixes: df78d3f61480 ("powerpc/livepatch: Implement reliable stack tracing for the consistency model")
Reported-by: Joe Lawrence
Signed-off-by: Nicolai Stange
Signed-off-by: Joe Lawrence
---
 arch/powerpc/Kconfig           | 2 +-
 arch/powerpc/kernel/entry_64.S | 7 +++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 2890d36eb531..73bf87b1d274 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -220,7 +220,7 @@ config PPC
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_RCU_TABLE_FREE		if SMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
-	select HAVE_RELIABLE_STACKTRACE		if PPC64 && CPU_LITTLE_ENDIAN
+	select HAVE_RELIABLE_STACKTRACE		if PPC_BOOK3S_64 && CPU_LITTLE_ENDIAN
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_VIRT_CPU_ACCOUNTING
 	select HAVE_IRQ_TIME_ACCOUNTING
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 435927f549c4..a2c168b395d2 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -1002,6 +1002,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	ld	r2,_NIP(r1)
 	mtspr	SPRN_SRR0,r2

+	/*
+	 * Leaving a stale exception_marker on the stack can confuse
+	 * the reliable stack unwinder later on. Clear it.
+	 */
+	li	r2,0
+	std	r2,STACK_FRAME_OVERHEAD-16(r1)
+
 	ld	r0,GPR0(r1)
 	ld	r2,GPR2(r1)
 	ld	r3,GPR3(r1)
--
2.20.1