Hi,

I wanted to ask about the status of the patch. Let us know if there are any other steps we can take.

Kind regards
Lorena

On Fri 2020-12-18 16:44:21, Daniel Thompson wrote:
> Hi Stefan
>
> On Mon, Dec 14, 2020 at 03:13:12PM +0100, Stefan Saecherl wrote:
> > The problem is that breakpoints that are set early (e.g. via kgdbwait)
> > cannot be deleted after boot completed (to be precise after mark_rodata_ro
> > ran).
> >
> > When setting a breakpoint early there are executable pages that are
> > writable so the copy_to_kernel_nofault call in kgdb_arch_set_breakpoint
> > succeeds and the breakpoint is saved as type BP_BREAKPOINT.
> >
> > Later in the boot write access to these pages is restricted. So when
> > removing the breakpoint the copy_to_kernel_nofault call in
> > kgdb_arch_remove_breakpoint is destined to fail and the breakpoint removal
> > fails. So after copy_to_kernel_nofault failed try to text_poke_kgdb which
> > can work around nonwriteability.
> >
> > One thing to consider when doing this is that code can go away during boot
> > (e.g. .init.text). Previously kgdb_arch_remove_breakpoint handled this case
> > gracefully by just having copy_to_kernel_nofault fail but if one then calls
> > text_poke_kgdb the system dies due to the BUG_ON we moved out of
> > __text_poke. To avoid this __text_poke now returns an error in case of a
> > nonpresent code page and the error is handled at call site.
> >
> > Checkpatch complains about two uses of BUG_ON but the new code should not
> > trigger BUG_ON in cases where the old didn't.
> >
> > Co-developed-by: Lorena Kretzschmar <qy15s...@cip.cs.fau.de>
> > Signed-off-by: Lorena Kretzschmar <qy15s...@cip.cs.fau.de>
> > Signed-off-by: Stefan Saecherl <stefan.saech...@fau.de>
>
> I took this to be a gap in the kgdbtest suite so I added a couple of
> tests that cover this area. Before this patch they failed now they
> pass (at least they do for ARCH=x86).
>
> I don't see any new failures either, so:
>
> Tested-by: Daniel Thompson <daniel.thomp...@linaro.org>
>
>
> Daniel.
>
>
> > ---
> >  arch/x86/kernel/alternative.c | 16 +++++++----
> >  arch/x86/kernel/kgdb.c        | 54 ++++++++++++++++++++++++----------
> >  2 files changed, 48 insertions(+), 22 deletions(-)
> >
> > diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> > index 2400ad62f330..0f145d837885 100644
> > --- a/arch/x86/kernel/alternative.c
> > +++ b/arch/x86/kernel/alternative.c
> > @@ -878,11 +878,9 @@ static void *__text_poke(void *addr, const void *opcode, size_t len)
> >  		if (cross_page_boundary)
> >  			pages[1] = virt_to_page(addr + PAGE_SIZE);
> >  	}
> > -	/*
> > -	 * If something went wrong, crash and burn since recovery paths are not
> > -	 * implemented.
> > -	 */
> > -	BUG_ON(!pages[0] || (cross_page_boundary && !pages[1]));
> > +
> > +	if (!pages[0] || (cross_page_boundary && !pages[1]))
> > +		return ERR_PTR(-EFAULT);
> >
> >  	/*
> >  	 * Map the page without the global bit, as TLB flushing is done with
> > @@ -976,7 +974,13 @@ void *text_poke(void *addr, const void *opcode, size_t len)
> >  {
> >  	lockdep_assert_held(&text_mutex);
> >
> > -	return __text_poke(addr, opcode, len);
> > +	addr = __text_poke(addr, opcode, len);
> > +	/*
> > +	 * If something went wrong, crash and burn since recovery paths are not
> > +	 * implemented.
> > +	 */
> > +	BUG_ON(IS_ERR(addr));
> > +	return addr;
> >  }
> >
> >  /**
> > diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
> > index ff7878df96b4..e98c9c43db7c 100644
> > --- a/arch/x86/kernel/kgdb.c
> > +++ b/arch/x86/kernel/kgdb.c
> > @@ -731,6 +731,7 @@ void kgdb_arch_set_pc(struct pt_regs *regs, unsigned long ip)
> >  int kgdb_arch_set_breakpoint(struct kgdb_bkpt *bpt)
> >  {
> >  	int err;
> > +	void *addr;
> >
> >  	bpt->type = BP_BREAKPOINT;
> >  	err = copy_from_kernel_nofault(bpt->saved_instr, (char *)bpt->bpt_addr,
> > @@ -747,8 +748,14 @@ int kgdb_arch_set_breakpoint(struct kgdb_bkpt *bpt)
> >  	 */
> >  	if (mutex_is_locked(&text_mutex))
> >  		return -EBUSY;
> > -	text_poke_kgdb((void *)bpt->bpt_addr, arch_kgdb_ops.gdb_bpt_instr,
> > -		       BREAK_INSTR_SIZE);
> > +
> > +	addr = text_poke_kgdb((void *)bpt->bpt_addr, arch_kgdb_ops.gdb_bpt_instr,
> > +				BREAK_INSTR_SIZE);
> > +	/* This should never trigger because the above call to copy_from_kernel_nofault
> > +	 * already succeeded.
> > +	 */
> > +	BUG_ON(IS_ERR(addr));
> > +
> >  	bpt->type = BP_POKE_BREAKPOINT;
> >
> >  	return 0;
> > @@ -756,21 +763,36 @@ int kgdb_arch_set_breakpoint(struct kgdb_bkpt *bpt)
> >
> >  int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
> >  {
> > -	if (bpt->type != BP_POKE_BREAKPOINT)
> > -		goto knl_write;
> > -	/*
> > -	 * It is safe to call text_poke_kgdb() because normal kernel execution
> > -	 * is stopped on all cores, so long as the text_mutex is not locked.
> > -	 */
> > -	if (mutex_is_locked(&text_mutex))
> > -		goto knl_write;
> > -	text_poke_kgdb((void *)bpt->bpt_addr, bpt->saved_instr,
> > -		       BREAK_INSTR_SIZE);
> > -	return 0;
> > +	void *addr;
> > +	int err;
> >
> > -knl_write:
> > -	return copy_to_kernel_nofault((char *)bpt->bpt_addr,
> > -				  (char *)bpt->saved_instr, BREAK_INSTR_SIZE);
> > +	if (bpt->type == BP_POKE_BREAKPOINT) {
> > +		if (mutex_is_locked(&text_mutex)) {
> > +			err = copy_to_kernel_nofault((char *)bpt->bpt_addr,
> > +							(char *)bpt->saved_instr,
> > +							BREAK_INSTR_SIZE);
> > +		} else {
> > +			/*
> > +			 * It is safe to call text_poke_kgdb() because normal kernel execution
> > +			 * is stopped on all cores, so long as the text_mutex is not locked.
> > +			 */
> > +			addr = text_poke_kgdb((void *)bpt->bpt_addr,
> > +						bpt->saved_instr,
> > +						BREAK_INSTR_SIZE);
> > +			err = PTR_ERR_OR_ZERO(addr);
> > +		}
> > +	} else {
> > +		err = copy_to_kernel_nofault((char *)bpt->bpt_addr,
> > +						(char *)bpt->saved_instr,
> > +						BREAK_INSTR_SIZE);
> > +		if (err == -EFAULT && !mutex_is_locked(&text_mutex)) {
> > +			addr = text_poke_kgdb((void *)bpt->bpt_addr,
> > +						bpt->saved_instr,
> > +						BREAK_INSTR_SIZE);
> > +			err = PTR_ERR_OR_ZERO(addr);
> > +		}
> > +	}
> > +	return err;
> >  }
> >
> >  const struct kgdb_arch arch_kgdb_ops = {
> > --
> > 2.20.1
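
For readers following the thread, the fallback described in the commit message can be sketched in user space. This is only an illustration under stated assumptions: struct fake_page, direct_write(), poke_write() and remove_breakpoint() are hypothetical stand-ins invented for this sketch, and ERR_PTR(), IS_ERR() and PTR_ERR_OR_ZERO() are simplified re-implementations of the kernel helpers of the same names, not the kernel code itself. It models only the BP_BREAKPOINT branch of the patched kgdb_arch_remove_breakpoint().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers live in the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? (int)(intptr_t)ptr : 0;
}

/* Hypothetical model of one code page holding a single instruction byte. */
struct fake_page {
	bool present;		/* false models freed code such as .init.text */
	bool writable;		/* false models text after mark_rodata_ro ran */
	unsigned char insn;
};

/* Models copy_to_kernel_nofault(): fails on read-only or missing pages. */
static int direct_write(struct fake_page *pg, unsigned char insn)
{
	if (!pg->present || !pg->writable)
		return -EFAULT;
	pg->insn = insn;
	return 0;
}

/*
 * Models text_poke_kgdb() after the patch: write protection is bypassed,
 * but a missing page is reported as an error pointer instead of a crash.
 */
static void *poke_write(struct fake_page *pg, unsigned char insn)
{
	if (!pg->present)
		return ERR_PTR(-EFAULT);
	pg->insn = insn;
	return &pg->insn;
}

/* Try the plain write first, then fall back to poking if that faulted. */
static int remove_breakpoint(struct fake_page *pg, unsigned char saved,
			     bool text_mutex_locked)
{
	int err = direct_write(pg, saved);

	if (err == -EFAULT && !text_mutex_locked)
		err = PTR_ERR_OR_ZERO(poke_write(pg, saved));
	return err;
}
```

In this model, calling remove_breakpoint() on a present but read-only page restores the saved byte through the poke path, while a non-present page yields -EFAULT and leaves the caller to handle the error, which is the behaviour the commit message describes for code that has gone away during boot.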