On Tue, Aug 20, 2019 at 7:20 AM Waldemar Kozaczuk <[email protected]> wrote:
> The commit > https://github.com/cloudius-systems/osv/commit/f5cc12d56dd986d2c7982c5738b1f859702b07fb > addressed the issue #993 to relax handling of missing symbols in > relocate_pltgot() > when loading ELF objects with BIND_NOW. This patch on other hand attempts > to close > the gap of handling missing symbols this time in relocate_rela() during > loading phase > of ELF objects by dynamic linker. > Do you have an understanding *why* issue #1023 is different from anything we've seen before, where the function is relocated by relocate_rela() and not relocate_pltgot()? My guess was that this function is somehow accessed as a *variable* (e.g., its pointer is copied or something) instead of as a real function call, but this was just a guess... > The solution is similar to what we do in relocate_pltgot() > - let missing symbol get ignored and hope they will not be used. > However in this case we cannot rely on lazy symbol resolution > logic that would catch missing symbol later when used and abort. > Instead we place an indicator of missing symbol (special invalid address) > which would trigger page fault when symbol accessed. This indicator allows > us to distinguish missing symbol used scenario from other page fault > related ones. > This unfortunately does not tell us which missing symbol was used. > In future we could place indicator + offset where offset would point > to a name of the missing symbol in some missing symbols table. > > Fixes #1023 > Did you test this on an application (like the rust example #1023) where before the application couldn't run, and now it can? Or does the application really try to use the invalid pointer, and crash at that point? > > Signed-off-by: Waldemar Kozaczuk <[email protected]> > --- > arch/x64/arch-elf.cc | 20 ++++++++++++++++---- > arch/x64/arch-elf.hh | 2 ++ > arch/x64/mmu.cc | 4 ++++ > 3 files changed, 22 insertions(+), 4 deletions(-) > > diff --git a/arch/x64/arch-elf.cc b/arch/x64/arch-elf.cc > index ebe40996..4c513150 100644 > --- a/arch/x64/arch-elf.cc > +++ b/arch/x64/arch-elf.cc > @@ -70,16 +70,28 @@ bool object::arch_relocate_rela(u32 type, u32 sym, > void *addr, > memcpy(addr, sm.relocated_addr(), sm.size()); > break; > } > - case R_X86_64_64: > - *static_cast<void**>(addr) = symbol(sym).relocated_addr() + > addend; > + case R_X86_64_64: { > + auto _sym = symbol(sym, true); > + if (_sym.symbol) { > + *static_cast<void**>(addr) = _sym.relocated_addr() + addend; > + } else { > + *static_cast<void**>(addr) = MISSING_SYMBOL_INDICATOR; > + } > break; > + } > case R_X86_64_RELATIVE: > *static_cast<void**>(addr) = _base + addend; > break; > case R_X86_64_JUMP_SLOT: > - case R_X86_64_GLOB_DAT: > - *static_cast<void**>(addr) = symbol(sym).relocated_addr(); > + case R_X86_64_GLOB_DAT: { > + auto _sym = symbol(sym, true); > + if (_sym.symbol) { > + *static_cast<void**>(addr) = _sym.relocated_addr(); > + } else { > + *static_cast<void**>(addr) = MISSING_SYMBOL_INDICATOR; > + } > break; > + } > // The next 3 types are intended to relocate symbols of thread local > variables > // defined with __thread modifier > // > diff --git a/arch/x64/arch-elf.hh b/arch/x64/arch-elf.hh > index 1811ceb5..b4c63967 100644 > --- a/arch/x64/arch-elf.hh > +++ b/arch/x64/arch-elf.hh > @@ -34,6 +34,8 @@ enum { > R_X86_64_IRELATIVE = 37, // word64 indirect(B + A) > }; > > +void * const MISSING_SYMBOL_INDICATOR = (void*)0xffffeeeeddddcccc; > Can you please remind me why this is an invalid pointer? Does it not have enough f's in the beginning to be a valid pointer? + > /* for pltgot relocation */ > #define ARCH_JUMP_SLOT R_X86_64_JUMP_SLOT > > diff --git a/arch/x64/mmu.cc b/arch/x64/mmu.cc > index 2f1ba5e2..441b6c45 100644 > --- a/arch/x64/mmu.cc > +++ b/arch/x64/mmu.cc > @@ -6,6 +6,7 @@ > */ > > #include "arch-cpu.hh" > +#include "arch-elf.hh" > #include <osv/debug.hh> > #include <osv/sched.hh> > #include <osv/mmu.hh> > @@ -28,6 +29,9 @@ void page_fault(exception_frame *ef) > if (!pc) { > abort("trying to execute null pointer"); > } > + if (pc == MISSING_SYMBOL_INDICATOR) { > + abort("trying to execute missing symbol"); > Do you have a test case where you actually see this message? Because I wonder if the invalid pointer actually gets *executed* (so pc = ...) - it is also possible the pointer get followed, not executed. I think "pc" isn't the general indication of where the page fault happened. // The following code may sleep. So let's verify the fault did not > happen > // when preemption was disabled, or interrupts were disabled. > assert(sched::preemptable()); > -- > 2.20.1 > > -- > You received this message because you are subscribed to the Google Groups > "OSv Development" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To view this discussion on the web visit > https://groups.google.com/d/msgid/osv-dev/20190820042043.25133-1-jwkozaczuk%40gmail.com > . > -- You received this message because you are subscribed to the Google Groups "OSv Development" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/CANEVyjs%2Bx-hKp30u2Zgoq1a3k%2B0zgKZYiP5JX8D2B%2BQUPZodrA%40mail.gmail.com.
