Re: [Lldb-commits] FreeBSD kernel debugging fixes

2017-09-21 Thread Koropoff, Brian via lldb-commits
I've created a Phabricator review.

DWARF sections in executables/dylibs seem to specify 0 as the section address, 
and lldb depends on the abbreviation addresses in the debug_info section being 
0-relative.
My synthetic file address logic explicitly excludes non-program (e.g. symtab) 
and debug sections to avoid disturbing this arrangement.  It also doesn't touch 
non-.o files,
and neither does the relocation logic.
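
For readers following along, the exclusion rule could be sketched roughly as below. This is only an illustration and not the code in the review: SectionInfo and NeedsSyntheticFileAddress are hypothetical names, and treating SHF_ALLOC as the marker of a "program" section is an assumption on my part.

```cpp
#include <cstdint>
#include <string>

// Constants taken from the ELF specification.
constexpr uint16_t kET_REL = 1;       // e_type of a relocatable (.o) file
constexpr uint64_t kSHF_ALLOC = 0x2;  // section occupies memory at run time

// Hypothetical, simplified view of one ELF section header.
struct SectionInfo {
  std::string name;
  uint64_t flags;  // sh_flags
};

// Returns true when a section should receive a synthetic file address.
// Assumption: "program" sections are the ones with SHF_ALLOC set. Debug
// sections and .symtab normally lack the flag; the name check is an extra
// guard so DWARF offsets stay 0-relative in any case.
bool NeedsSyntheticFileAddress(uint16_t e_type, const SectionInfo &sec) {
  if (e_type != kET_REL)
    return false;  // linked executables and shared objects are left alone
  if ((sec.flags & kSHF_ALLOC) == 0)
    return false;  // non-program sections such as .symtab
  if (sec.name.compare(0, 6, ".debug") == 0)
    return false;  // debug sections keep section address 0
  return true;
}
```

With this predicate a .text section in a .o file qualifies, while .symtab, .debug_info, and every section of a linked binary are skipped.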

-- Brian


Re: [Lldb-commits] FreeBSD kernel debugging fixes

2017-09-21 Thread Greg Clayton via lldb-commits
I am worried that this will adversely affect the normal DWARF that is in ELF 
files. What does typical DWARF look like when all of the file addresses in 
DWARF and the symtab are set correctly to unique virtual addresses? No 
relocations on any addresses? If all "file addresses" are set to unique offsets 
in each section, we don't need relocations. If there are no relocations on the 
DWARF, then we are probably ok.

Using .o files is tricky indeed as you are finding out. Are your modifications 
enabled by the fact that the object file is a .o file? As I mentioned above, we 
don't want any relocations applied to linked binaries. We do special things 
with .o files that we load in mach-o to ensure the segments are all happy, so 
there is definitely extra work required.
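
As an illustration of gating on "is this a .o file", the check can be made from the ELF header alone. The sketch below is an assumption about how such a test might look, not lldb code; IsRelocatableELF is a hypothetical helper, and the offsets come from the ELF specification (e_ident is 16 bytes, EI_DATA is byte 5, e_type is the 16-bit field that follows e_ident).

```cpp
#include <cstddef>
#include <cstdint>

// Returns true if the buffer starts with an ELF header whose e_type is
// ET_REL (1), i.e. the file is a relocatable object rather than a linked
// executable or shared object.
bool IsRelocatableELF(const uint8_t *data, size_t size) {
  if (size < 18)
    return false;  // need e_ident (16 bytes) plus e_type (2 bytes)
  if (data[0] != 0x7f || data[1] != 'E' || data[2] != 'L' || data[3] != 'F')
    return false;  // not an ELF file
  uint16_t e_type;
  if (data[5] == 1)  // ELFDATA2LSB: little-endian
    e_type = static_cast<uint16_t>(data[16] | (data[17] << 8));
  else if (data[5] == 2)  // ELFDATA2MSB: big-endian
    e_type = static_cast<uint16_t>((data[16] << 8) | data[17]);
  else
    return false;  // unknown data encoding
  return e_type == 1;  // ET_REL
}
```

Any relocation or synthetic-address step guarded by a check like this would never run for linked binaries.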

Please submit these patches via reviews.llvm.org web site. For details please 
see:

https://llvm.org/docs/Phabricator.html

Then it is easier to look at the changes because we can see the surrounding 
code. Once that happens we can start looking at the patches and offering 
inlined comments and have a discussion on any needed changes.

Greg Clayton


Re: [Lldb-commits] FreeBSD kernel debugging fixes

2017-09-20 Thread Koropoff, Brian via lldb-commits
Jason,

I'm performing address to symbol resolution after setting load addresses for 
all sections.  It correctly identifies the module
and section where the address resides, and in many cases gives correct results. 
 However, because it translates the
load address to a file address before indexing into the symtab, overlapping 
file addresses for sections in the same module
can cause the wrong name to be returned, such as returning a symbol in the bss 
section even though the address is
in the text section.
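
A toy model of the failure, for illustration only. It is not lldb's Symtab: Section, Symbol, BuildIndex and Lookup are hypothetical names, and a plain std::map stands in for the real file-address index. The point is just that one per-module index keyed by file address cannot tell sections apart when they all start at 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct Section {
  std::string name;
  uint64_t file_addr;  // sh_addr from the section header
};

struct Symbol {
  std::string name;
  size_t section_index;
  uint64_t offset;  // offset within the containing section
};

using FileAddrIndex = std::map<uint64_t, std::string>;

// Builds one index per module, keyed by file address.
FileAddrIndex BuildIndex(const std::vector<Section> &sections,
                         const std::vector<Symbol> &symbols) {
  FileAddrIndex index;
  for (const Symbol &sym : symbols) {
    uint64_t file_addr = sections[sym.section_index].file_addr + sym.offset;
    index.emplace(file_addr, sym.name);  // on collision the first entry wins
  }
  return index;
}

// Resolves (section, offset) by first converting to a file address.
std::string Lookup(const FileAddrIndex &index,
                   const std::vector<Section> &sections, size_t section_index,
                   uint64_t offset) {
  auto it = index.find(sections[section_index].file_addr + offset);
  return it == index.end() ? std::string() : it->second;
}
```

With .text and .bss both at file address 0, a lookup for offset 0x10 in .text can come back with a .bss symbol that happens to sit at the same offset. Giving .bss a distinct base makes the same lookup return the .text symbol.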

An alternative way to fix this would be to split m_file_addr_to_index into 
per-section maps, but that doesn't solve the
problem of ResolveFileAddress being unusable, or the general expectation within 
lldb that file addresses uniquely
identify something within a module.

-- Brian

Re: [Lldb-commits] FreeBSD kernel debugging fixes

2017-09-20 Thread Jason Molenda via lldb-commits
Right, we always record symbol addresses as the offset to the section that 
contains them.  The Address class in lldb is used everywhere for this.  The 
Target has a SectionLoadList which tells us where each Section is loaded in 
memory -- this is how you translate an Address object to a load address.
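
To make the section-plus-offset idea concrete, here is a deliberately simplified sketch. It is not the real Address or SectionLoadList class: SectionOffsetAddress and ToySectionLoadList are hypothetical, sections are identified by name only, and the toy returns an invalid address for unloaded sections instead of modelling any fallback to file addresses.

```cpp
#include <cstdint>
#include <map>
#include <string>

constexpr uint64_t kInvalidAddress = UINT64_MAX;

// A symbol address recorded as "offset within a section"; it never changes
// when the module is loaded or moved.
struct SectionOffsetAddress {
  std::string section;
  uint64_t offset;
};

// Per-target table of where each section currently lives in memory.
class ToySectionLoadList {
public:
  void SetSectionLoadAddress(const std::string &section, uint64_t load_addr) {
    m_load_addrs[section] = load_addr;
  }

  // Translates a section-offset address into a load address.
  uint64_t GetLoadAddress(const SectionOffsetAddress &addr) const {
    auto it = m_load_addrs.find(addr.section);
    if (it == m_load_addrs.end())
      return kInvalidAddress;  // section has not been given a load address
    return it->second + addr.offset;
  }

private:
  std::map<std::string, uint64_t> m_load_addrs;
};
```

Moving a module only touches the load list; the recorded symbol addresses stay as they are, which is the separation described above.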

When sections have not been given their load addresses yet, lldb will treat 
file addresses == load addresses.  Which sounds like what you're seeing.  So we 
are often in the situation where an address->symbol lookup results in multiple 
symbols being matched; they are all overlapping at this point.

As soon as the sections are given load addresses in the Target, then this 
overlapping problem is resolved.

gdb didn't have the difference between load address and file address and we had 
to play games with shuffling things around to arbitrary addresses so they don't 
overlap.  (and from what I can recall, changing the load address of a binary in 
lldb meant going through the symbol table to update all the addresses -- we 
wanted to separate the symbol table addresses from the load addresses in a 
given target, so we came up with this system.)


Are you trying to do address->symbol resolution before you know where the 
binaries are actually loaded in the address space?  Or are you missing the part 
that sets the load addresses for the sections in the Target?  I suspect it's 
the latter.



Re: [Lldb-commits] FreeBSD kernel debugging fixes

2017-09-20 Thread Koropoff, Brian via lldb-commits
Jason,

I'm setting the load addresses appropriately for all sections in my script.  
The problem is that the symbol map
is internally indexed by the "file address", which is the virtual address that 
the ELF section asks to
be loaded at, regardless of what the actual load address turns out to be:

https://github.com/llvm-mirror/lldb/blob/master/source/Symbol/Symtab.cpp#L878

Symbol lookup proceeds via the file address:

https://github.com/llvm-mirror/lldb/blob/master/source/Core/Module.cpp#L510

From what I can gather, the use of file addresses is to avoid needing to 
recompute the symtab
when a load address is changed.  This implementation detail means that file 
addresses
must be non-overlapping even if the load addresses are correctly set.  The 
generation of synthetic
file addresses has the added benefit of permitting offline symbolication by 
(module, file address) pair
without needing to know the load map, which appears to be an intended use case, 
e.g.
SBModule::ResolveFileAddress():

https://github.com/llvm-mirror/lldb/blob/master/include/lldb/API/SBModule.h#L120

Regards,
Brian Koropoff
Dell EMC


Re: [Lldb-commits] FreeBSD kernel debugging fixes

2017-09-20 Thread Jason Molenda via lldb-commits
Regarding the overlapping files -- when lldb first loads multiple binaries (but 
does not have a running process), it doesn't know where to set the load 
addresses of these binaries so they are all 0-based (or if they have a 
specified load address in the object file, at that address).

We rely on the dynamic linker on the system to tell us where libc.so is, and 
then we update the target's section load list with that address.

For macOS kernel debugging, we have a DynamicLoaderDarwinKernel that knows how 
to load all the modules ("kexts") at the correct addresses for the program.  
The user can also do this manually in command line lldb, like

target modules add 
target modules load -f  -s 

but it is correct behavior that in the absence of being told where the binaries 
are loaded in memory, lldb will load them all at their base address, often 0 in 
the modern days of pic code.


I haven't looked at the patch, but a long time ago I did a hack that sounds 
similar to yours for gdb, where I would assign binaries random addresses until 
we had connected to a live process & learned where they should be.  So address 
-> symbol resolution would work.  It never worked great and we decided to avoid 
doing that in lldb.


___
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] FreeBSD kernel debugging fixes

2017-09-20 Thread Koropoff, Brian via lldb-commits
Greetings.  I'm submitting a few patches that resolve issues I
encountered when using lldb to symbolicate FreeBSD kernel backtraces.
The problems mostly centered around FreeBSD kernel modules actually
being relocatable (.o) ELF files.

The major problems:

- Relocations were not being applied to the DWARF debug info despite
  there being code to do this.  Several issues prevented it from working:

  * Relocations are computed at the same time as the symbol table, but
in the case of split debug files, symbol table parsing always
redirects to the primary object file, meaning that relocations
would never be applied in the debug file.

  * There's actually no guarantee that the symbol table has been
parsed yet when trying to parse debug information.

  * When actually applying relocations, it will segfault because the
object files are not mapped with MAP_PRIVATE and PROT_WRITE.

- LLDB returned invalid results when performing ordinary
  address-to-symbol resolution. It turned out that the addresses
  specified in the section headers were all 0, so LLDB believed all the
  sections had overlapping "file addresses" and would sometimes
  return a symbol from the wrong section.

I rearranged some of the symbol table parsing code to ensure
relocations would get applied consistently and added manual calls to
make sure it happens before trying to use DWARF info, but it feels
kind of hacky.  I'm open to suggestions for refactoring it.

I solved the file address problem by computing synthetic addresses for
the sections in object files so that they would not overlap in LLDB's
lookup maps.
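
Roughly, the synthetic address computation amounts to laying the sections out back to back. The sketch below illustrates that idea under my own assumptions and is not the patch itself: SectionHeader and AssignSyntheticFileAddresses are hypothetical names, and the layout policy (sequential, respecting sh_addralign) is one plausible choice among several.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified section header.
struct SectionHeader {
  std::string name;
  uint64_t addr;       // sh_addr, all 0 in a relocatable file
  uint64_t size;       // sh_size
  uint64_t addralign;  // sh_addralign, 0 or 1 means no constraint
};

// Gives every section in the list its own aligned address range starting at
// `base`, so that file addresses no longer overlap. Zero-sized sections end
// up sharing an address with the section that follows them.
void AssignSyntheticFileAddresses(std::vector<SectionHeader> &sections,
                                  uint64_t base = 0) {
  uint64_t next = base;
  for (SectionHeader &sec : sections) {
    uint64_t align = sec.addralign ? sec.addralign : 1;
    next = (next + align - 1) / align * align;  // round up to the alignment
    sec.addr = next;
    next += sec.size;
  }
}
```

For example, a 0x123-byte .text (align 16), a 0x40-byte .data (align 8) and a 0x100-byte .bss (align 32) would land at 0x0, 0x128 and 0x180 respectively.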

With all these changes I'm able to successfully symbolicate backtraces
that pass through FreeBSD kernel modules.  Let me know if there is a
better/cleaner way to achieve any of these fixes.

--

Brian Koropoff
Dell EMC
From d923e9ff178c10f2b0c8747d09e8d13aa950df72 Mon Sep 17 00:00:00 2001
From: Brian Koropoff 
Date: Wed, 20 Sep 2017 10:44:08 -0700
Subject: [PATCH 1/4] ObjectFile:ELF: use private memory mappings

This is necessary to be able to apply relocations to DWARF info.
---
 source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp b/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
index 0f67ab5..e2f986b 100644
--- a/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
+++ b/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
@@ -405,7 +405,7 @@ ObjectFile *ObjectFileELF::CreateInstance(const lldb::ModuleSP &module_sp,
   lldb::offset_t length) {
   if (!data_sp) {
 data_sp =
-DataBufferLLVM::CreateSliceFromPath(file->GetPath(), length, file_offset);
+DataBufferLLVM::CreateSliceFromPath(file->GetPath(), length, file_offset, true);
 if (!data_sp)
   return nullptr;
 data_offset = 0;
@@ -423,7 +423,7 @@ ObjectFile *ObjectFileELF::CreateInstance(const lldb::ModuleSP &module_sp,
   // Update the data to contain the entire file if it doesn't already
   if (data_sp->GetByteSize() < length) {
 data_sp =
-DataBufferLLVM::CreateSliceFromPath(file->GetPath(), length, file_offset);
+DataBufferLLVM::CreateSliceFromPath(file->GetPath(), length, file_offset, true);
 if (!data_sp)
   return nullptr;
 data_offset = 0;
-- 
2.7.5

From 698b13b8fc43036e9df034363fa8a61b4707718e Mon Sep 17 00:00:00 2001
From: Brian Koropoff 
Date: Wed, 30 Aug 2017 18:49:16 -0700
Subject: [PATCH 2/4] ObjectFile:ELF: ensure relocations are done for split
 debug symbols

---
 source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp | 27 ++---
 source/Plugins/ObjectFile/ELF/ObjectFileELF.h   |  7 ++-
 2 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp b/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
index e2f986b..f6ea67e 100644
--- a/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
+++ b/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
@@ -815,7 +815,8 @@ ObjectFileELF::ObjectFileELF(const lldb::ModuleSP &module_sp,
 : ObjectFile(module_sp, file, file_offset, length, data_sp, data_offset),
   m_header(), m_uuid(), m_gnu_debuglink_file(), m_gnu_debuglink_crc(0),
   m_program_headers(), m_section_headers(), m_dynamic_symbols(),
-  m_filespec_ap(), m_entry_point_address(), m_arch_spec() {
+  m_filespec_ap(), m_entry_point_address(), m_arch_spec(),
+  m_did_relocations(false) {
   if (file)
 m_file = *file;
   ::memset(&m_header, 0, sizeof(m_header));
@@ -2797,7 +2798,8 @@ unsigned ObjectFileELF::RelocateSection(
 }
 
 unsigned ObjectFileELF::RelocateDebugSections(const ELFSectionHeader *rel_hdr,
-  user_id_t rel_id) {
+  user_id_t rel_id,
+  lldb_private::Symtab *thetab) {