On Tue, Aug 20, 2019 at 7:20 AM Waldemar Kozaczuk <[email protected]>
wrote:

> The commit
> https://github.com/cloudius-systems/osv/commit/f5cc12d56dd986d2c7982c5738b1f859702b07fb
> addressed the issue #993 to relax handling of missing symbols in
> relocate_pltgot()
> when loading ELF objects with BIND_NOW. This patch on other hand attempts
> to close
> the gap of handling missing symbols this time in relocate_rela() during
> loading phase
> of ELF objects by dynamic linker.
>

Do you have an understanding *why* issue #1023 is different from anything
we've seen
before, where the function is relocated by relocate_rela() and not
relocate_pltgot()?
My guess was that this function is somehow accessed as a *variable* (e.g.,
its pointer is
copied or something) instead of as a real function call, but this was just
a guess...


> The solution is similar to what we do in relocate_pltgot()
> - let missing symbol get ignored and hope they will not be used.
> However in this case we cannot rely on lazy symbol resolution
> logic that would catch missing symbol later when used and abort.
> Instead we place an indicator of missing symbol (special invalid address)
> which would trigger page fault when symbol accessed. This indicator allows
> us to distinguish missing symbol used scenario from other page fault
> related ones.
> This unfortunately does not tell us which missing symbol was used.
> In future we could place indicator + offset where offset would point
> to a name of the missing symbol in some missing symbols table.
>
> Fixes #1023
>

Did you test this on an application (like the rust example #1023) where
before the
application couldn't run, and now it can? Or does the application really
try to use
the invalid pointer, and crash at that point?




>
> Signed-off-by: Waldemar Kozaczuk <[email protected]>
> ---
>  arch/x64/arch-elf.cc | 20 ++++++++++++++++----
>  arch/x64/arch-elf.hh |  2 ++
>  arch/x64/mmu.cc      |  4 ++++
>  3 files changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x64/arch-elf.cc b/arch/x64/arch-elf.cc
> index ebe40996..4c513150 100644
> --- a/arch/x64/arch-elf.cc
> +++ b/arch/x64/arch-elf.cc
> @@ -70,16 +70,28 @@ bool object::arch_relocate_rela(u32 type, u32 sym,
> void *addr,
>          memcpy(addr, sm.relocated_addr(), sm.size());
>          break;
>      }
> -    case R_X86_64_64:
> -        *static_cast<void**>(addr) = symbol(sym).relocated_addr() +
> addend;
> +    case R_X86_64_64: {
> +        auto _sym = symbol(sym, true);
> +        if (_sym.symbol) {
> +            *static_cast<void**>(addr) = _sym.relocated_addr() + addend;
> +        } else {
> +            *static_cast<void**>(addr) = MISSING_SYMBOL_INDICATOR;
> +        }
>          break;
> +    }
>      case R_X86_64_RELATIVE:
>          *static_cast<void**>(addr) = _base + addend;
>          break;
>      case R_X86_64_JUMP_SLOT:
> -    case R_X86_64_GLOB_DAT:
> -        *static_cast<void**>(addr) = symbol(sym).relocated_addr();
> +    case R_X86_64_GLOB_DAT: {
> +        auto _sym = symbol(sym, true);
> +        if (_sym.symbol) {
> +            *static_cast<void**>(addr) = _sym.relocated_addr();
> +        } else {
> +            *static_cast<void**>(addr) = MISSING_SYMBOL_INDICATOR;
> +        }
>          break;
> +    }
>      // The next 3 types are intended to relocate symbols of thread local
> variables
>      // defined with __thread modifier
>      //
> diff --git a/arch/x64/arch-elf.hh b/arch/x64/arch-elf.hh
> index 1811ceb5..b4c63967 100644
> --- a/arch/x64/arch-elf.hh
> +++ b/arch/x64/arch-elf.hh
> @@ -34,6 +34,8 @@ enum {
>      R_X86_64_IRELATIVE = 37, //  word64 indirect(B + A)
>  };
>
> +void * const MISSING_SYMBOL_INDICATOR = (void*)0xffffeeeeddddcccc;
>

Can you please remind me why this is an invalid pointer? Does it not have
enough f's in the beginning to be a valid pointer?

+
>  /* for pltgot relocation */
>  #define ARCH_JUMP_SLOT R_X86_64_JUMP_SLOT
>
> diff --git a/arch/x64/mmu.cc b/arch/x64/mmu.cc
> index 2f1ba5e2..441b6c45 100644
> --- a/arch/x64/mmu.cc
> +++ b/arch/x64/mmu.cc
> @@ -6,6 +6,7 @@
>   */
>
>  #include "arch-cpu.hh"
> +#include "arch-elf.hh"
>  #include <osv/debug.hh>
>  #include <osv/sched.hh>
>  #include <osv/mmu.hh>
> @@ -28,6 +29,9 @@ void page_fault(exception_frame *ef)
>      if (!pc) {
>          abort("trying to execute null pointer");
>      }
> +    if (pc == MISSING_SYMBOL_INDICATOR) {
> +        abort("trying to execute missing symbol");
>

Do you have a test case where you actually see this message?
Because I wonder if the invalid pointer actually gets *executed* (so pc =
...) - it is also possible the pointer get followed, not executed. I think
"pc" isn't the general indication of where the page fault happened.


     // The following code may sleep. So let's verify the fault did not
> happen
>      // when preemption was disabled, or interrupts were disabled.
>      assert(sched::preemptable());
> --
> 2.20.1
>
> --
> You received this message because you are subscribed to the Google Groups
> "OSv Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/osv-dev/20190820042043.25133-1-jwkozaczuk%40gmail.com
> .
>

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/osv-dev/CANEVyjs%2Bx-hKp30u2Zgoq1a3k%2B0zgKZYiP5JX8D2B%2BQUPZodrA%40mail.gmail.com.

Reply via email to