On Tue, Jul 7, 2020 at 2:44 PM Deepak Krishnan <[email protected]> wrote:
> Thank you for the response. > I agree - whatever I had in my email would not cover all cases here - was > just a quick and dirty fix I tried while debugging. > Since that code hasn't changed in years, I ended up second guessing - if I > was missing something very obvious in my understanding of it and eventually > decided to report it. > The patch you suggested should fix it for all cases - I tested it too, it > fixes my case. > > Thanks. I will send the proper patch soon. > Thanks > Deepak > > On Tue, 7 Jul 2020 at 00:31, Waldek Kozaczuk <[email protected]> wrote: > >> Hi, >> >> On Saturday, July 4, 2020 at 10:50:21 AM UTC-4, Deepak Krishnan wrote: >>> >>> Hello all >>> Quite new to OSv, just started playing around with OSv recently and I >>> came across this. I had a problem where when I do `scripts/run.py -V -c 1` >>> on a few C++ test binaries, I was getting errors and eventually a failure >>> like so: >>> >>> ``` >>> could not load f(YYDX\XA^A^(f.f.fE >>> ignoring missing symbol >>> ignoring missing symbol tA >>> ... >>> failed looking up symbol UHH!33333333H)HH!!HHHH!HD >>> ``` >>> >>> which seemed odd since it looks like it was getting junk values for >>> symbol names. On debugging a bit, I found that the .strtab section of the >>> failing library was somehow getting overwritten from some arbitrary >>> address. On inspecting the memory, I could see an ELF header there which >>> made me think that the libraries are getting loaded at overlapping memory >>> regions - which eventually led me to debugging the load part - specifically >>> the elf::object::set_base function where it calculates the _base and >>> _end. Here is what I found in from gdb for the problematic library: >>> >>> (gdb) info locals >>> p = {p_type = 1, p_flags = 5, p_offset = 0, p_vaddr = 0, p_paddr = 0, >>> p_filesz = 110200, p_memsz = 110200, p_align = 4096} >>> q = {p_type = 6, p_flags = 4, p_offset = 598016, p_vaddr = 122880, >>> p_paddr = 122880, p_filesz = 448, p_memsz = 448, p_align = 8} >>> (gdb) p _ phdrs >>> $8 = std::vector of length 8, capacity 8 = { >>> {p_type = 6, p_flags = 4, p_offset = 598016, p_vaddr = 122880, p_paddr = >>> 122880, p_filesz = 448, p_memsz = 448, p_align = 8}, >>> {p_type = 1, p_flags = 5, p_offset = 0, p_vaddr = 0, p_paddr = 0, >>> p_filesz = 110200, p_memsz = 110200, p_align = 4096}, >>> {p_type = 1685382481, p_flags = 6, p_offset = 0, p_vaddr = 0, p_paddr = >>> 0, p_filesz = 0, p_memsz = 0, p_align = 16}, >>> {p_type = 1685382480, p_flags = 4, p_offset = 107228, p_vaddr = 107228, >>> p_paddr = 107228, p_filesz = 2972, p_memsz = 2972, p_align = 4}, >>> {p_type = 1, p_flags = 6, p_offset = 112560, p_vaddr = 116656, p_paddr = >>> 116656, p_filesz = 3736, p_memsz = 4073, p_align = 4096}, >>> {p_type = 1685382482, p_flags = 6, p_offset = 112560, p_vaddr = 116656, >>> p_paddr = 116656, p_filesz = 2128, p_memsz = 2128, p_align = 16}, >>> {p_type = 2, p_flags = 6, p_offset = 113648, p_vaddr = 117744, p_paddr = >>> 117744, p_filesz = 592, p_memsz = 592, p_align = 8}, >>> {p_type = 1, p_flags = 6, p_offset = 598016, p_vaddr = 122880, p_paddr = >>> 122880, p_filesz = 22584, p_memsz = 22584, p_align = 4096} >>> } >>> (gdb) >>> >>> As I understand, 'q' should be the PT_LOAD header with the highest >>> p_vaddr but note that it is pointing to the PT_PHDR section - as both of >>> them has the same p_vaddr but the PT_PHDR obviously comes first in the >>> program headers and std:min_element starts comparison starting the first >>> element. That causes it to set the next offset to a lower address and seems >>> to be the cause of the issue. I can get past it with this patch: >>> >>> diff --git a/core/elf.cc b/core/elf.cc >>> index ffb16004..af5dca02 100644 >>> --- a/core/elf.cc >>> +++ b/core/elf.cc >>> @@ -354,7 +354,7 @@ void object::set_base(void* base) >>> auto q = std::min_element(_phdrs.begin(), _phdrs.end(), >>> [](Elf64_Phdr a, Elf64_Phdr b) >>> { return a.p_type == PT_LOAD >>> - && a.p_vaddr > b.p_vaddr; }); >>> + && a.p_vaddr >= b.p_vaddr; }); >>> >>> I am compiling the test binaries and OSv with gcc-7.3.0. Is that a real >>> bug or am I missing something very obvious? >>> >> It seems that you indeed have found an old bug which seems to cause a >> harm when PT_LOAD and non-PT_LOAD headers share the same p_vaddr maybe. >> Thanks for reporting it. >> >> I am also not 100% sure what the exact intention behind what p and q >> should be but the fact that q ends up being a non-PT_LOAD headers seems to >> prove there is a bug. I think the issue lies in the fact that we are >> filtering and comparing at the same time. And while your patch might fix >> your specific case it may not be 100% correct either. I think we should >> separate filtering from comparing. My proposal is this: >> >> diff --git a/core/elf.cc b/core/elf.cc >> index ffb16004..61537756 100644 >> --- a/core/elf.cc >> +++ b/core/elf.cc >> @@ -347,14 +347,17 @@ static bool intersects_with_kernel(Elf64_Addr >> elf_addr) >> >> void object::set_base(void* base) >> { >> - auto p = std::min_element(_phdrs.begin(), _phdrs.end(), >> + std::vector<Elf64_Phdr> pt_load_headers; >> + std::copy_if(_phdrs.begin(), _phdrs.end(), std::back_inserter( >> pt_load_headers), >> + [](Elf64_Phdr p) >> + { return p.p_type == PT_LOAD; }); >> + >> + auto p = std::min_element(pt_load_headers.begin(), pt_load_headers. >> end(), >> [](Elf64_Phdr a, Elf64_Phdr b) >> - { return a.p_type == PT_LOAD >> - && a.p_vaddr < b.p_vaddr; }); >> - auto q = std::min_element(_phdrs.begin(), _phdrs.end(), >> + { return a.p_vaddr < b.p_vaddr; }); >> + auto q = std::max_element(pt_load_headers.begin(), pt_load_headers. >> end(), >> [](Elf64_Phdr a, Elf64_Phdr b) >> - { return a.p_type == PT_LOAD >> - && a.p_vaddr > b.p_vaddr; }); >> + { return a.p_vaddr < b.p_vaddr; }); >> >> if (!is_core() && is_non_pie_executable()) { >> // Verify non-PIE executable does not collide with the kernel >> >> Does it fix you case? >> >> I wonder what others think about. I also wonder if we should add a member >> variable or a helper method that will give us pt_load_headers as we filter >> PT_LOAD segments in other places in core/elf.cc >> >> Waldek >> >> >>> Thanks >>> Deepak >>> >> -- >> You received this message because you are subscribed to the Google Groups >> "OSv Development" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> To view this discussion on the web visit >> https://groups.google.com/d/msgid/osv-dev/6035c111-a678-4a3f-b844-725c425e3bf9o%40googlegroups.com >> <https://groups.google.com/d/msgid/osv-dev/6035c111-a678-4a3f-b844-725c425e3bf9o%40googlegroups.com?utm_medium=email&utm_source=footer> >> . >> > -- You received this message because you are subscribed to the Google Groups "OSv Development" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/CAL9cFfNSRKMZ1vVHizudq7N12WKjuUhi-aM2LJEYCrYC7VVJog%40mail.gmail.com.
