On Sun, Aug 25, 2019 at 8:37 AM Waldek Kozaczuk <[email protected]>
wrote:

> +void * const MISSING_SYMBOL_INDICATOR = (void*)0xffffeeeeddddcccc;
>
>>
>>>> Can you please remind me why this is an invalid pointer? Does it not
>>>> have enough f's in the beginning to be a valid pointer?
>>>>
>>> I tried to pick something that would never be a valid pointer. Is there
>>> a better address I should use?
>>>
>>
>> Although x86_64 supports 64-bit pointers, its designers knew that
>> applications don't really need - yet - to address the full 2^64 bytes of
>> memory. So the processor allows you to choose how many bits you *really*
>> want. The default is 48 bits, allowing you to address 256 TiB of memory,
>> which is quite enough by today's standards. This uses 4-level page tables.
>> You can also choose 57 bits, and 5-level page tables (
>> https://en.wikipedia.org/wiki/Intel_5-level_paging) but OSv neither does
>> this nor wants it.
>>
>> When 48-bit addresses are supported, valid addresses are so-called
>> "canonical" addresses, where all the highest bits (bits 47 through 63)
>> must be the same. For 48 bits it means the last legal positive address is
>> 0x00007FFFFFFFFFFF, while the most negative address is 0xFFFF800000000000.
>> If you take an address which is not canonical in that sense, for example,
>> 0xF000000000000000, then this address is completely illegal. If you take an
>> address which is canonical, but simply not in the page table, you get a
>> regular page fault - this is what happens with the "0" address. But with
>> such an address you need to make sure nothing ever allocates it.
>>
>> You chose 0xffffeeeeddddcccc. This is in fact a legal canonical address.
>> It may be possible that our malloc() (see mempool.cc) allocates it.
>> Whether it does, I don't know.
>> What happens if you use a non-canonical address? I'm not even sure if you
>> get the normal #PF or something else like #GP.
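To make the canonical rule above concrete, here is a minimal sketch, assuming the 48-bit (4-level paging) mode described above. It is not OSv code, and the name is_canonical_48 is made up for illustration. It checks whether bits 47 through 63 are all equal. As far as I know, the x86-64 manuals document a #GP rather than a #PF for a non-canonical access, but that is worth verifying on real hardware or an emulator.

```cpp
#include <cstdint>

// Hypothetical helper (not part of OSv): true if `addr` is canonical under
// 48-bit virtual addressing, i.e. bits 47..63 are all zeros or all ones.
constexpr bool is_canonical_48(std::uint64_t addr) {
    const std::uint64_t top = addr >> 47;  // the 17 highest bits
    return top == 0 || top == 0x1FFFF;
}

// The addresses mentioned in this thread:
static_assert(is_canonical_48(0x00007FFFFFFFFFFFULL), "last positive address");
static_assert(is_canonical_48(0xFFFF800000000000ULL), "most negative address");
static_assert(!is_canonical_48(0xF000000000000000ULL), "non-canonical example");
static_assert(is_canonical_48(0xFFFFEEEEDDDDCCCCULL), "the proposed indicator");
```

The last assertion shows that the proposed 0xffffeeeeddddcccc is canonical, so in principle it could be mapped.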
>>
> Do you suggest using a different address then? Or maybe a different
> mechanism? The bottom line is that if I only ignore the missing symbol in
> *arch_relocate_rela* and it unfortunately does get used later, the user
> will in most cases get a regular "0" address page fault, which would not
> tell them that a missing symbol was used. Thoughts?
>

Yes, I would choose an address we can actually be sure is illegal - either
a non-canonical address as I explained above (but please verify if you
still get a #PF and not a #GP), or a canonical address which we know for
some reason is not allocated.
E.g., we could allocate a single page and mprotect() it to have zero
permissions, and use that address.
We could also do the same without allocating a physical page at all - just
mark the *virtual* address taken, without actually allocating any physical
page. This would require more elaborate coding, so it is probably not worth
the extra effort just to save 4K of memory, but you could leave it as a FIXME.
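
A minimal sketch of the first option (one page with zero permissions), written against the plain POSIX mmap()/mprotect() calls as they behave on Linux. The helper name reserve_inaccessible_page is hypothetical, and inside the OSv kernel one would more likely go through the internal mmu interfaces instead, so treat this only as an illustration of the idea:

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of OSv): map one anonymous page and then
// drop all permissions on it, so that any read, write or instruction fetch
// inside the page faults. Returns nullptr on failure.
inline void* reserve_inaccessible_page() {
    const long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0) {
        return nullptr;
    }
    const std::size_t len = static_cast<std::size_t>(page_size);
    void* page = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) {
        return nullptr;
    }
    if (mprotect(page, len, PROT_NONE) != 0) {
        munmap(page, len);
        return nullptr;
    }
    return page;
}
```

The returned address is canonical and page-aligned, and because the mapping is owned by us nothing else can be allocated there.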

If you really want to spend a long time on this feature (you probably
shouldn't), you can even make each variable point to a different page (or
different positions in the same page, but watch out for the possibility of
someone accessing an offset into a variable..) so we can print a different
error message for each of the missing variables.
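
A rough sketch of the one-page-per-symbol idea, assuming a contiguous reserved range of inaccessible pages starting at some base address and a 4K page size. All the names here (poison_address, missing_symbol_index, no_symbol) are made up for illustration and do not exist in OSv:

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::uint64_t page_size = 4096;  // assumed page size
constexpr std::size_t no_symbol = static_cast<std::size_t>(-1);

// Hypothetical: the fake address handed out for the index-th missing symbol.
constexpr std::uint64_t poison_address(std::uint64_t base, std::size_t index) {
    return base + index * page_size;
}

// Hypothetical: map a faulting address back to the missing symbol's index,
// or no_symbol if the address is outside the reserved range. Any offset
// within a symbol's page resolves to that symbol, which covers accesses to
// a field inside a variable.
constexpr std::size_t missing_symbol_index(std::uint64_t base,
                                           std::size_t count,
                                           std::uint64_t fault_addr) {
    if (fault_addr < base) {
        return no_symbol;
    }
    const std::uint64_t index = (fault_addr - base) / page_size;
    return index < count ? static_cast<std::size_t>(index) : no_symbol;
}
```

The fault handler could then use the index to look up the symbol's name in a table filled in at relocation time.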




>
>>
>>>> +
>>>>>  /* for pltgot relocation */
>>>>>  #define ARCH_JUMP_SLOT R_X86_64_JUMP_SLOT
>>>>>
>>>>> diff --git a/arch/x64/mmu.cc b/arch/x64/mmu.cc
>>>>> index 2f1ba5e2..441b6c45 100644
>>>>> --- a/arch/x64/mmu.cc
>>>>> +++ b/arch/x64/mmu.cc
>>>>> @@ -6,6 +6,7 @@
>>>>>   */
>>>>>
>>>>>  #include "arch-cpu.hh"
>>>>> +#include "arch-elf.hh"
>>>>>  #include <osv/debug.hh>
>>>>>  #include <osv/sched.hh>
>>>>>  #include <osv/mmu.hh>
>>>>> @@ -28,6 +29,9 @@ void page_fault(exception_frame *ef)
>>>>>      if (!pc) {
>>>>>          abort("trying to execute null pointer");
>>>>>      }
>>>>> +    if (pc == MISSING_SYMBOL_INDICATOR) {
>>>>> +        abort("trying to execute missing symbol");
>>>>>
>>>>
>>>> Do you have a test case where you actually see this message?
>>>>
>>> Yes. I was able to manually create tests that trigger this
>>> missing-symbol scenario with the R_X86_64_GLOB_DAT and R_X86_64_64
>>> relocation types. In the latter case, I used the graalvm app, which uses
>>> mprotect, and I manually misspelled the symbol name so that it would not
>>> be found, and it was handled properly. I also conducted a similar test for
>>> R_X86_64_GLOB_DAT.
>>>
>>
>> I didn't understand how you recreated this. Did you have a test case
>> where the code really used the function relocated via R_X86_64_GLOB_DAT?
>>
>>
>>>
>>>> Because I wonder if the invalid pointer actually gets *executed* (so pc
>>>> = ...) - it is also possible the pointer gets followed, not executed. I
>>>> think "pc" isn't the general indication of where the page fault happened.
>>>>
>>> Not sure I understand what you are saying here. It did happen in the 2
>>> test scenarios I ran. I see that this particular page fault logic looks at
>>> the RIP value, which only applies to function execution, right? Do we have
>>> another page fault handler for when data is actually read or written,
>>> where we should supply similar logic?
>>>
>>
>> When you get a page fault, one of two things happened: One
>> option is that the instruction tried to access some memory address which
>> isn't mapped, or is mapped with the wrong permissions (e.g., the
>> instruction tried to write into memory mapped read-only). In this case, we
>> have the broken address in the cr2 register. Another option is that the
>> *instruction* itself could not be executed - because the current pc
>> (program counter) points to memory not mapped as executable - or - often -
>> pc is 0, which means that someone tried to *execute* the null pointer.
>>
>> Because this last case happens commonly, we have a special message for it
>> in our page_fault() handler. But it only happens if someone tries to
>> *execute* a null pointer. If you try to read or write from a null pointer,
>> you won't get that message.
>>
>> Similarly, your test for pc will only catch attempts to execute this fake
>> address - not reading from or writing to it. To catch the latter you would
>> need to also check "addr" (i.e., cr2).
>>
>> In all the examples we discussed so far, these relocations were used for
>> functions, so presumably if ever used, these addresses will indeed be
>> *executed*, so your test would be good enough. But if this relocation is
>> used for a variable, then this address might be read or written, not just
>> executed. Moreover, if the faulting address is "close" to the fake address
>> (e.g., in the same page), it is probably an indication that someone tried
>> to read or write a field inside that variable, instead of its first byte.
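The distinction above between executing the fake address and reading or writing near it can be sketched as a small classification function. This is only an illustration under stated assumptions: the names (fault_kind, classify_fault) are made up and are not OSv's page_fault() code, a 4K page size is assumed, and the indicator is assumed to be a canonical, unmapped address so that a #PF is actually delivered.

```cpp
#include <cstdint>

// Hypothetical classification of a page fault (not OSv code).
enum class fault_kind {
    null_execute,            // pc == 0: someone executed a null pointer
    missing_symbol_execute,  // pc == indicator: missing function was called
    missing_symbol_access,   // cr2 is in the indicator's page: data access
    other
};

constexpr std::uint64_t page_size = 4096;  // assumed page size

constexpr std::uint64_t page_base(std::uint64_t addr) {
    return addr & ~(page_size - 1);
}

// `pc` is the faulting instruction pointer, `fault_addr` is the value of cr2.
constexpr fault_kind classify_fault(std::uint64_t pc,
                                    std::uint64_t fault_addr,
                                    std::uint64_t indicator) {
    if (pc == 0) {
        return fault_kind::null_execute;
    }
    if (pc == indicator) {
        return fault_kind::missing_symbol_execute;
    }
    // Same-page check also catches accesses to a field inside the variable.
    if (page_base(fault_addr) == page_base(indicator)) {
        return fault_kind::missing_symbol_access;
    }
    return fault_kind::other;
}
```

Checking the whole page rather than the exact address is what lets the handler report an access at an offset into a missing variable.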
>>
>>
>>>>      // The following code may sleep. So let's verify the fault did not
>>>>> happen
>>>>>      // when preemption was disabled, or interrupts were disabled.
>>>>>      assert(sched::preemptable());
>>>>> --
>>>>> 2.20.1
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "OSv Development" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/osv-dev/20190820042043.25133-1-jwkozaczuk%40gmail.com
>>>>> .
>>>>>

