[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-13 Thread Greg Clayton via Phabricator via lldb-commits
clayborg added a comment.

In D68655#4048009 , @jasonmolenda 
wrote:

> In D68655#4047126 , @labath wrote:
>
>> In D68655#4045895 , @jasonmolenda 
>> wrote:
>>
>>> Both have to be written by the dwarf linker to be correct, but only the 
>>> former is written ONLY by the dwarf linker.
>>
>> I don't think that's right:
>>
>>   $ clang -c -x c - -o - -gdwarf-aranges -g <<<"void f(){}" | llvm-readelf - 
>> --sections | grep aranges
>> [ 6] .debug_arangesPROGBITS a0 30 00 
>>  0   0  1
>> [ 7] .rela.debug_aranges RELA   000380 30 18 
>>   I 20   6  8
>
> Ah thanks Pavel, clang on Darwin doesn't emit this in .o files.  So in this 
> case, the .o file has a DW_TAG_compile_unit with a DW_AT_ranges and a 
> .debug_aranges, both generated pre-link-time by the compiler.  And Greg is 
> saying that the DW_AT_ranges list in the final executable will be better than 
> the .debug_aranges?  I don't know how strongly he asserts this - today if a 
> .debug_aranges has an entry, we don't look at DW_TAG_compile_unit's 
> DW_AT_ranges, or create the ranges ourself via the line table; the 
> debug_aranges is trusted above the other methods if it includes CU.

It isn't better or worse and it varies per compiler. I believe, but I don't 
remember for sure, that DW_AT_ranges only currently has code address ranges (no 
data, so no globals or statics). So if a compiler in fact does include data in 
the .debug_aranges, it would be better to use .debug_aranges.

I tried to add a verifier into llvm-dwarfdump to catch when either 
.debug_aranges or DW_AT_ranges on the CU didn't contain some address ranges 
that were in the CU with:

https://reviews.llvm.org/D136395

The problem is the debugger has issues if either of these tables are incorrect, 
so you would think a verifier would be a good thing since if the address isn't 
in either .debug_aranges or DW_AT_ranges, we won't ever find a DIE for a 
function or global variable. I had seen bugs where we have .debug_aranges that 
didn't include all ranges for all DW_TAG_subprogram DIEs, and also cases where 
DW_AT_ranges didn't have some ranges. This usually happens when someone tries 
to LTO or optimize a binary and the DWARF didn't have a dedicate entry for a 
function, so it would drop address ranges from the .debug_aranges or 
DW_AT_ranges. Since everything non Darwin uses "concatenate + relocate" in the 
linker, image if you have a .debug_aranges entry that had [0x1000-0x2000), and 
you had DWARF with two functions that fall into those ranges: [0x1000-0x1500) 
for "func1" and [0x1500-0x2000) for "func2", and those functions get optimized 
into different places in the binary (not right next to each other), then there 
is no valid relocation to can apply to the range [0x1000-0x2000) that will make 
the table continue to work after linking. You will end up relocating the 0x1000 
address correctly for "func1", and if you apply the relocation for the end 
address of 0x2000, you can end up setting it to the end address of 0x2000, 
which could be less that the new address for the start of "func1". So lots of 
stuff goes wrong in the linking phase, especially when LTO is involved.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-13 Thread Greg Clayton via Phabricator via lldb-commits
clayborg added a comment.

In D68655#4045873 , @jasonmolenda 
wrote:

> I know this is all moot because the dSYM-specific patch landed, but I am 
> curious about this part,
>
> In D68655#4045561 , @clayborg wrote:
>
>> 
>
>
>
>> Different things are included in DW_AT_ranges, like address ranges for 
>> global and static variables. .debug_aranges only has functions, no globals 
>> or statics, so if you are trying to find a global variable by address, you 
>> can't rely on .debug_aranges. Nothing in the DWARF spec states things 
>> clearly enough for different compilers to know what to include in 
>> .debug_aranges  and the compiler uint DW_AT_ranges.
>
> The standard says,
>
> "Each descriptor is a triple consisting of a segment selector, the beginning 
> address within that segment of a range of text or data covered by some entry 
> owned by the corresponding compilation unit, followed by the non-zero length 
> of that range"
>
> It is pretty clear on the point that any part of the address space that can 
> be attributed to a compile_unit can be included in the debug_aranges range 
> list - if only code is included, that's a choice of the aranges writer.  lldb 
> itself, if debug_aranges is missing or doesn't include a CU, steps through 
> the line table concatenating addresses for all the line entries in the CU (v. 
> DWARFCompileUnit::BuildAddressRangeTable ) - it doesn't include data.  (it 
> will also use DW_AT_ranges from the compile_unit but I worry more about this 
> being copied verbatim from the .o file into the linked DWARF than I worry 
> about debug_aranges, personally)

Sorry, looks like the spec is pretty clear. From what I remember David Blakie 
saying, clang won't put globals or static (no data, just code) into the 
.debug_aranges. GCC might do this differently. I thought that LLDB would try to 
use the following in order:

- .debug_aranges for a CU if available
- fall back to DW_AT_ranges
- fall back to line tables only if DW_AT_ranges is not there

> In a DW_TAG_compile_unit, the DW_AT_ranges that the compiler puts in the .o 
> file isn't relevant to the final linked debug information, unless it included 
> the discrete range of every item which might be linked into the final binary, 
> and the DWARF linker only copies the range entries that were in the final 
> binary in the linked dwarf DW_AT_ranges range list.   Or if the dwarf linker 
> knows how to scan the DWARF compile_unit like lldb does, concatenating line 
> table entries or DW_TAG_subprogram's.

The Darwin workflow has a smart DWARF linker, so yes, it will regenerate only 
what is needed for the final output in dsymutil. The compiler have been trained 
to not emit .debug_aranges for Darwin triples because we know we don't need 
them since dsymutil will regenerate from the final output only what is needed.

All other platforms use standard linker technology where they will concatenate 
all DWARF sections to create the final DWARF sections and then apply 
relocations. So if you have a DW_AT_ranges in a CU that has 100 addresses in a 
.o file, and the final output only uses 10 functions and dead strips 90 
functions, the final .debug_aranges for that CU will still contain ranges for 
all functions, 10 of which will have had relocations applied and will have 
valid address ranges, and 90 others will start at the sentinel addresss of zero 
or -1 to indicate they should have been dead stripped, but there can be tons of 
wasted data that isn't needed in each .debug_aranges CU data.

> If any dwarf linker is doing all of this work to construct an accurate & 
> comprehensive compile_unit DW_AT_ranges, why couldn't it be trusted to also 
> have an identical/accurate dwarf_aranges for this cu?  I don't see the 
> difference.

Only dsymutil does this as this is the only smart DWARF linker at the moment! 
See above comment as to why. We do have llvm-dwarfutil now for ELF files that 
can optimize DWARF after it has been linked to remove all of this extra junk 
and unique types. llvm-dwarfutil uses the same DWARF linking engine as 
dsymutil. But all other systems are concatenate all sections, then apply 
relocations, and anything that doesn't have a relocation gets a sentinel 
address applied as the relocation.

In D68655#4045895 , @jasonmolenda 
wrote:

> I guess to say it shorter.  If I have a dwarf_aranges, that means the dwarf 
> linker created it.  And if it created it, surely its at least based off of 
> the subprogram address ranges or the line table -- that is, the text address 
> ranges.

True for Darwin and dsymutil yes.

> If I have a DW_TAG_compile_unit DW_AT_ranges, either the compiler (to the .o 
> file) created it, in which case I really am suspicious of those ranges 
> because the compiler can't know which symbols will end up in the final 
> executable, and the addresses in the ranges were simply 

[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-12 Thread Pavel Labath via Phabricator via lldb-commits
labath added a comment.

In D68655#4048009 , @jasonmolenda 
wrote:

> In D68655#4047126 , @labath wrote:
>
>> In D68655#4045895 , @jasonmolenda 
>> wrote:
>>
>>> Both have to be written by the dwarf linker to be correct, but only the 
>>> former is written ONLY by the dwarf linker.
>>
>> I don't think that's right:
>>
>>   $ clang -c -x c - -o - -gdwarf-aranges -g <<<"void f(){}" | llvm-readelf - 
>> --sections | grep aranges
>> [ 6] .debug_arangesPROGBITS a0 30 00 
>>  0   0  1
>> [ 7] .rela.debug_aranges RELA   000380 30 18 
>>   I 20   6  8
>
> Ah thanks Pavel, clang on Darwin doesn't emit this in .o files.

It does. It just doesn't do it by default -- same as on linux (but not on PS4 I 
believe). You have to use the -gdwarf-aranges flag explicitly (which, quite 
possibly, noone does),

> So in this case, the .o file has a DW_TAG_compile_unit with a DW_AT_ranges 
> and a .debug_aranges, both generated pre-link-time by the compiler.

Yes.

> And Greg is saying that the DW_AT_ranges list in the final executable will be 
> better than the .debug_aranges?  I don't know how strongly he asserts this - 
> today if a .debug_aranges has an entry, we don't look at 
> DW_TAG_compile_unit's DW_AT_ranges, or create the ranges ourself via the line 
> table; the debug_aranges is trusted above the other methods if it includes CU.

That I don't know. Personally, I would expect the two to be equivalent, except 
that the debug_aranges could theoretically be faster to parse (and takes up 
more space).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-12 Thread Jason Molenda via Phabricator via lldb-commits
jasonmolenda added a comment.

In D68655#4047126 , @labath wrote:

> In D68655#4045895 , @jasonmolenda 
> wrote:
>
>> Both have to be written by the dwarf linker to be correct, but only the 
>> former is written ONLY by the dwarf linker.
>
> I don't think that's right:
>
>   $ clang -c -x c - -o - -gdwarf-aranges -g <<<"void f(){}" | llvm-readelf - 
> --sections | grep aranges
> [ 6] .debug_arangesPROGBITS a0 30 00  
> 0   0  1
> [ 7] .rela.debug_aranges RELA   000380 30 18  
>  I 20   6  8

Ah thanks Pavel, clang on Darwin doesn't emit this in .o files.  So in this 
case, the .o file has a DW_TAG_compile_unit with a DW_AT_ranges and a 
.debug_aranges, both generated pre-link-time by the compiler.  And Greg is 
saying that the DW_AT_ranges list in the final executable will be better than 
the .debug_aranges?  I don't know how strongly he asserts this - today if a 
.debug_aranges has an entry, we don't look at DW_TAG_compile_unit's 
DW_AT_ranges, or create the ranges ourself via the line table; the 
debug_aranges is trusted above the other methods if it includes CU.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-12 Thread Pavel Labath via Phabricator via lldb-commits
labath added a comment.

In D68655#4045895 , @jasonmolenda 
wrote:

> Both have to be written by the dwarf linker to be correct, but only the 
> former is written ONLY by the dwarf linker.

I don't think that's right:

  $ clang -c -x c - -o - -gdwarf-aranges -g <<<"void f(){}" | llvm-readelf - 
--sections | grep aranges
[ 6] .debug_arangesPROGBITS a0 30 00
  0   0  1
[ 7] .rela.debug_aranges RELA   000380 30 18   
I 20   6  8


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-11 Thread Jason Molenda via Phabricator via lldb-commits
jasonmolenda added a comment.

I guess to say it shorter.  If I have a dwarf_aranges, that means the dwarf 
linker created it.  And if it created it, surely its at least based off of the 
subprogram address ranges or the line table -- that is, the text address 
ranges.  If I have a DW_TAG_compile_unit DW_AT_ranges, either the compiler (to 
the .o file) created it, in which case I really am suspicious of those ranges 
because the compiler can't know which symbols will end up in the final 
executable, and the addresses in the ranges were simply translated to the final 
executable address equivalents.  Or it was rewritten by a dwarf linker that 
parsed the DWARF and knew how to correctly calculate the addresses that 
correspond to that compile unit.

if anything, I would trust a dwarf_aranges entry for a CU before I would trust 
the CU's DW_AT_ranges list.  Both have to be written by the dwarf linker to be 
correct, but only the former is written ONLY by the dwarf linker.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-11 Thread Jason Molenda via Phabricator via lldb-commits
jasonmolenda added a comment.

I know this is all moot because the dSYM-specific patch landed, but I am 
curious about this part,

In D68655#4045561 , @clayborg wrote:

> 



> Different things are included in DW_AT_ranges, like address ranges for global 
> and static variables. .debug_aranges only has functions, no globals or 
> statics, so if you are trying to find a global variable by address, you can't 
> rely on .debug_aranges. Nothing in the DWARF spec states things clearly 
> enough for different compilers to know what to include in .debug_aranges  and 
> the compiler uint DW_AT_ranges.

The standard says,

"Each descriptor is a triple consisting of a segment selector, the beginning 
address within that segment of a range of text or data covered by some entry 
owned by the corresponding compilation unit, followed by the non-zero length of 
that range"

It is pretty clear on the point that any part of the address space that can be 
attributed to a compile_unit can be included in the debug_aranges range list - 
if only code is included, that's a choice of the aranges writer.  lldb itself, 
if debug_aranges is missing or doesn't include a CU, steps through the line 
table concatenating addresses for all the line entries in the CU (v. 
DWARFCompileUnit::BuildAddressRangeTable ) - it doesn't include data.  (it will 
also use DW_AT_ranges from the compile_unit but I worry more about this being 
copied verbatim from the .o file into the linked DWARF than I worry about 
debug_aranges, personally)

In a DW_TAG_compile_unit, the DW_AT_ranges that the compiler puts in the .o 
file isn't relevant to the final linked debug information, unless it included 
the discrete range of every item which might be linked into the final binary, 
and the DWARF linker only copies the range entries that were in the final 
binary in the linked dwarf DW_AT_ranges range list.   Or if the dwarf linker 
knows how to scan the DWARF compile_unit like lldb does, concatenating line 
table entries or DW_TAG_subprogram's.

If any dwarf linker is doing all of this work to construct an accurate & 
comprehensive compile_unit DW_AT_ranges, why couldn't it be trusted to also 
have an identical/accurate dwarf_aranges for this cu?  I don't see the 
difference.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-11 Thread Greg Clayton via Phabricator via lldb-commits
clayborg added a comment.

In D68655#4045604 , @JDevlieghere 
wrote:

> In D68655#4045561 , @clayborg wrote:
>
>> But we still need know if we have a dSYM file or not, because if not, we 
>> can't use .debug_aranges as .o files don't have them, but they also don't 
>> claim to be of type ObjectFile::eTypeDebugInfo.
>
> I came to the same realization and fixed that in 
> https://reviews.llvm.org/rG9b737f148d88501a0a778e1adacf342108286bb0

nice! approved that one as the detection is solid.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-11 Thread Jonas Devlieghere via Phabricator via lldb-commits
JDevlieghere added a comment.

In D68655#4045561 , @clayborg wrote:

> But we still need know if we have a dSYM file or not, because if not, we 
> can't use .debug_aranges as .o files don't have them, but they also don't 
> claim to be of type ObjectFile::eTypeDebugInfo.

I came to the same realization and fixed that in 
https://reviews.llvm.org/rG9b737f148d88501a0a778e1adacf342108286bb0


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-11 Thread Greg Clayton via Phabricator via lldb-commits
clayborg added a comment.

In D68655#4045461 , @jasonmolenda 
wrote:

> In D68655#4045423 , @clayborg wrote:
>
>> There have also been some bugs in .debug_aranges and some folks want to get 
>> rid of .debug_aranges all together and rely only upon the 
>> DW_TAG_compile_unit's DW_AT_ranges. Tons of details in this patch:
>>
>> https://reviews.llvm.org/D136395
>>
>> So long story short, some people believe we should not produce or use 
>> .debug_aranges anymore, and want LLDB changed to ignore it.
>
> I'm honestly curious how the DW_TAG_compile_unit's DW_AT_ranges (or 
> DW_AT_high/low_pc) would be authoritative, but the ranges for the CU in 
> debug_aranges would be incorrect.  If the linker is allowed to reorder 
> functions (does this not happen on linux?), then the compiler can't know the 
> address ranges of the CU, only the linker (or post-linker) can know what 
> address ranges are owned by this compile unit.  So the linker had to create 
> the DW_TAG_compile_unit's DW_AT_ranges list after linking is complete, and 
> the linker is also the part of the toolchain that creates debug_aranges.  If 
> it is creating an incorrect debug_aranges entry, how can it possibly do the 
> correct thing when rewriting the DW_TAG_compile_unit's DW_AT_ranges?

They are all the same from the linker perspective: both .debug_aranges (for 
accelerator table) and .debug_ranges (for DW_AT_ranges) have relocations on 
them. As linkers move stuff around they apply the relocations and they all stay 
up to date. Anything without a relocation has its start address zeroed out.

> On Darwin, we usually get a DW_AT_high/low_pc on the compile_unit in the .o 
> files, where all of the functions are contiguous.  When dsymutil links the 
> debug information into the dSYM, it includes the address ranges of every 
> DW_TAG_subprogram that was included in the final executable in the 
> debug_aranges.  It doesn't modify DW_AT_high/low_pc in the compile_unit, it 
> merely updates the addresses to their linked addresses.  But lldb doesn't use 
> those for anything - either it trusts the debug_aranges entries, or lldb 
> scans all of the subprograms itself to construct the aranges (same thing 
> dsymutil did).

dsymutil generates correct .debug_aranges by looking at the final DWARF. Note 
that there are no .debug_aranges sections in .o files since we don't use them, 
they only get produced if we have a dSYM for darwin.

> tl;dr I don't see what using DW_AT_ranges in the compile_unit accomplishes 
> versus debug_aranges.

Different things are included in DW_AT_ranges, like address ranges for global 
and static variables. .debug_aranges only has functions, no globals or statics, 
so if you are trying to find a global variable by address, you can't rely on 
.debug_aranges. Nothing in the DWARF spec states things clearly enough for 
different compilers to know what to include in .debug_aranges  and the compiler 
uint DW_AT_ranges. Each compiler (GCC and clang) do their own thing and often 
do things differently. So the other way to add functionality like this would be 
to check the target triple and check for Apple as the vendor maybe. But we 
still need know if we have a dSYM file or not, because if not, we can't use 
.debug_aranges as .o files don't have them, but they also don't claim to be of 
type ObjectFile::eTypeDebugInfo.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-11 Thread Jason Molenda via Phabricator via lldb-commits
jasonmolenda added a comment.

In D68655#4045423 , @clayborg wrote:

> There have also been some bugs in .debug_aranges and some folks want to get 
> rid of .debug_aranges all together and rely only upon the 
> DW_TAG_compile_unit's DW_AT_ranges. Tons of details in this patch:
>
> https://reviews.llvm.org/D136395
>
> So long story short, some people believe we should not produce or use 
> .debug_aranges anymore, and want LLDB changed to ignore it.

I'm honestly curious how the DW_TAG_compile_unit's DW_AT_ranges (or 
DW_AT_high/low_pc) would be authoritative, but the ranges for the CU in 
debug_aranges would be incorrect.  If the linker is allowed to reorder 
functions (does this not happen on linux?), then the compiler can't know the 
address ranges of the CU, only the linker (or post-linker) can know what 
address ranges are owned by this compile unit.  So the linker had to create the 
DW_TAG_compile_unit's DW_AT_ranges list after linking is complete, and the 
linker is also the part of the toolchain that creates debug_aranges.  If it is 
creating an incorrect debug_aranges entry, how can it possibly do the correct 
thing when rewriting the DW_TAG_compile_unit's DW_AT_ranges?

On Darwin, we usually get a DW_AT_high/low_pc on the compile_unit in the .o 
files, where all of the functions are contiguous.  When dsymutil links the 
debug information into the dSYM, it includes the address ranges of every 
DW_TAG_subprogram that was included in the final executable in the 
debug_aranges.  It doesn't modify DW_AT_high/low_pc in the compile_unit, it 
merely updates the addresses to their linked addresses.  But lldb doesn't use 
those for anything - either it trusts the debug_aranges entries, or lldb scans 
all of the subprograms itself to construct the aranges (same thing dsymutil 
did).

tl;dr I don't see what using DW_AT_ranges in the compile_unit accomplishes 
versus debug_aranges.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-11 Thread Greg Clayton via Phabricator via lldb-commits
clayborg added a comment.

In D68655#4045420 , @clayborg wrote:

> SymbolVendorELF.cpp is using this as well:
>
>   dsym_objfile_sp->SetType(ObjectFile::eTypeDebugInfo);
>
> They must have copied and pasted some code. So not sure this is safe enough 
> as is to implement. There are also a few places that try to check for a dSYM 
> file using this kind of method, might be nice to add a method, that is done 
> right, that detects if we really have a dSYM or not in ObjectFile.h/.cpp. 
> Wasm also sets the file type to debug info and so does breakpad.

The best way to check for dSYM would be to read the bytes from the mach-o 
header that specify MH_DSYM from the data for the object file.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-11 Thread Greg Clayton via Phabricator via lldb-commits
clayborg added a comment.

There have also been some bugs in .debug_aranges and some folks want to get rid 
of .debug_aranges all together and rely only upon the DW_TAG_compile_unit's 
DW_AT_ranges. Tons of details in this patch:

https://reviews.llvm.org/D136395

So long story short, some people believe we should not produce or use 
.debug_aranges anymore, and want LLDB changed to ignore it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-11 Thread Greg Clayton via Phabricator via lldb-commits
clayborg added a comment.

SymbolVendorELF.cpp is using this as well:

  dsym_objfile_sp->SetType(ObjectFile::eTypeDebugInfo);

They must have copied and pasted some code. So not sure this is safe enough as 
is to implement. There are also a few places that try to check for a dSYM file 
using this kind of method, might be nice to add a method, that is done right, 
that detects if we really have a dSYM or not in ObjectFile.h/.cpp. Wasm also 
sets the file type to debug info and so does breakpad.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-09 Thread Jonas Devlieghere via Phabricator via lldb-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG8b259fe573e1: [lldb] Trust the arange accelerator tables in 
dSYMs (authored by JDevlieghere).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

Files:
  lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp


Index: lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
===
--- lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
+++ lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
@@ -53,13 +53,20 @@
   }
 
   // Manually build arange data for everything that wasn't in .debug_aranges.
-  const size_t num_units = GetNumUnits();
-  for (size_t idx = 0; idx < num_units; ++idx) {
-DWARFUnit *cu = GetUnitAtIndex(idx);
-
-dw_offset_t offset = cu->GetOffset();
-if (cus_with_data.find(offset) == cus_with_data.end())
-  cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+  // Skip this step for dSYMs as we can trust dsymutil to have emitted complete
+  // aranges.
+  const bool is_dsym =
+  m_dwarf.GetObjectFile() &&
+  m_dwarf.GetObjectFile()->GetType() == ObjectFile::eTypeDebugInfo;
+  if (!is_dsym) {
+const size_t num_units = GetNumUnits();
+for (size_t idx = 0; idx < num_units; ++idx) {
+  DWARFUnit *cu = GetUnitAtIndex(idx);
+
+  dw_offset_t offset = cu->GetOffset();
+  if (cus_with_data.find(offset) == cus_with_data.end())
+cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+}
   }
 
   const bool minimize = true;


Index: lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
===
--- lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
+++ lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
@@ -53,13 +53,20 @@
   }
 
   // Manually build arange data for everything that wasn't in .debug_aranges.
-  const size_t num_units = GetNumUnits();
-  for (size_t idx = 0; idx < num_units; ++idx) {
-DWARFUnit *cu = GetUnitAtIndex(idx);
-
-dw_offset_t offset = cu->GetOffset();
-if (cus_with_data.find(offset) == cus_with_data.end())
-  cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+  // Skip this step for dSYMs as we can trust dsymutil to have emitted complete
+  // aranges.
+  const bool is_dsym =
+  m_dwarf.GetObjectFile() &&
+  m_dwarf.GetObjectFile()->GetType() == ObjectFile::eTypeDebugInfo;
+  if (!is_dsym) {
+const size_t num_units = GetNumUnits();
+for (size_t idx = 0; idx < num_units; ++idx) {
+  DWARFUnit *cu = GetUnitAtIndex(idx);
+
+  dw_offset_t offset = cu->GetOffset();
+  if (cus_with_data.find(offset) == cus_with_data.end())
+cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+}
   }
 
   const bool minimize = true;
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-09 Thread Jason Molenda via Phabricator via lldb-commits
jasonmolenda added a comment.

I agree with landing this change.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2023-01-09 Thread Jonas Devlieghere via Phabricator via lldb-commits
JDevlieghere updated this revision to Diff 487509.
JDevlieghere added a comment.

> Apparently for very small / gapless programs, clang emits single-value 
> DW_AT_low_pc/high_pc in the compile unit, and dsymutil does not rewrite that 
> into DW_AT_ranges.

Although this statement is correct, it shouldn't affect this patch. Even when 
the compile units uses DW_AT_low_pc/high_pc, dsymutil still emits the correct 
aranges. Without a counter-example we should go ahead with this patch.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

Files:
  lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp


Index: lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
===
--- lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
+++ lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
@@ -53,13 +53,20 @@
   }
 
   // Manually build arange data for everything that wasn't in .debug_aranges.
-  const size_t num_units = GetNumUnits();
-  for (size_t idx = 0; idx < num_units; ++idx) {
-DWARFUnit *cu = GetUnitAtIndex(idx);
-
-dw_offset_t offset = cu->GetOffset();
-if (cus_with_data.find(offset) == cus_with_data.end())
-  cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+  // Skip this step for dSYMs as we can trust dsymutil to have emitted complete
+  // aranges.
+  const bool is_dsym =
+  m_dwarf.GetObjectFile() &&
+  m_dwarf.GetObjectFile()->GetType() == ObjectFile::eTypeDebugInfo;
+  if (!is_dsym) {
+const size_t num_units = GetNumUnits();
+for (size_t idx = 0; idx < num_units; ++idx) {
+  DWARFUnit *cu = GetUnitAtIndex(idx);
+
+  dw_offset_t offset = cu->GetOffset();
+  if (cus_with_data.find(offset) == cus_with_data.end())
+cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+}
   }
 
   const bool minimize = true;


Index: lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
===
--- lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
+++ lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
@@ -53,13 +53,20 @@
   }
 
   // Manually build arange data for everything that wasn't in .debug_aranges.
-  const size_t num_units = GetNumUnits();
-  for (size_t idx = 0; idx < num_units; ++idx) {
-DWARFUnit *cu = GetUnitAtIndex(idx);
-
-dw_offset_t offset = cu->GetOffset();
-if (cus_with_data.find(offset) == cus_with_data.end())
-  cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+  // Skip this step for dSYMs as we can trust dsymutil to have emitted complete
+  // aranges.
+  const bool is_dsym =
+  m_dwarf.GetObjectFile() &&
+  m_dwarf.GetObjectFile()->GetType() == ObjectFile::eTypeDebugInfo;
+  if (!is_dsym) {
+const size_t num_units = GetNumUnits();
+for (size_t idx = 0; idx < num_units; ++idx) {
+  DWARFUnit *cu = GetUnitAtIndex(idx);
+
+  dw_offset_t offset = cu->GetOffset();
+  if (cus_with_data.find(offset) == cus_with_data.end())
+cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+}
   }
 
   const bool minimize = true;
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2019-10-10 Thread Adrian Prantl via Phabricator via lldb-commits
aprantl added a comment.

I wouldn't call it a bug, but if we changed dsymutil to always rewrite 
DW_AT_low_pc into DW_AT_ranges, we could trust the .debug_ranges section to be 
complete.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655



___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2019-10-10 Thread Greg Clayton via Phabricator via lldb-commits
clayborg added a comment.

In D68655#1700626 , @aprantl wrote:

> I had to revert this. Apparently for very small / gapless programs, clang 
> emits single-value DW_AT_low_pc/high_pc in the compile unit, and dsymutil 
> does not rewrite that into DW_AT_ranges. Would be a nice speedup though.


So is this a bug in dsymutil? It is also legal for compilers to do this. I have 
also seen a lot of DWARF that has issues where the DW_AT_ranges for a 
DW_TAG_compile_unit is not complete and is missing functions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655



___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2019-10-08 Thread Adrian Prantl via Phabricator via lldb-commits
aprantl added a comment.

I had to revert this. Apparently for very small / gapless programs, clang emits 
single-value DW_AT_low_pc/high_pc in the compile unit, and dsymutil does not 
rewrite that into DW_AT_ranges. Would be a nice speedup though.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655



___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2019-10-08 Thread Adrian Prantl via Phabricator via lldb-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG6399db2f6fd6: Trust the arange accelerator tables in dSYMs 
(authored by aprantl).
Herald added a project: LLDB.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655

Files:
  lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp


Index: lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
===
--- lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
+++ lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
@@ -54,13 +54,19 @@
 
   // Manually build arange data for everything that wasn't in the
   // .debug_aranges table.
-  const size_t num_units = GetNumUnits();
-  for (size_t idx = 0; idx < num_units; ++idx) {
-DWARFUnit *cu = GetUnitAtIndex(idx);
-
-dw_offset_t offset = cu->GetOffset();
-if (cus_with_data.find(offset) == cus_with_data.end())
-  cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+  //
+  // This step is skipped for dSYMs and other debug-info-only
+  // objects, which are always trusted to have a complete table.
+  auto *obj = m_dwarf.GetObjectFile();
+  if (!obj || obj->GetType() != ObjectFile::eTypeDebugInfo) {
+const size_t num_units = GetNumUnits();
+for (size_t idx = 0; idx < num_units; ++idx) {
+  DWARFUnit *cu = GetUnitAtIndex(idx);
+
+  dw_offset_t offset = cu->GetOffset();
+  if (cus_with_data.find(offset) == cus_with_data.end())
+cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+}
   }
 
   const bool minimize = true;


Index: lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
===
--- lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
+++ lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
@@ -54,13 +54,19 @@
 
   // Manually build arange data for everything that wasn't in the
   // .debug_aranges table.
-  const size_t num_units = GetNumUnits();
-  for (size_t idx = 0; idx < num_units; ++idx) {
-DWARFUnit *cu = GetUnitAtIndex(idx);
-
-dw_offset_t offset = cu->GetOffset();
-if (cus_with_data.find(offset) == cus_with_data.end())
-  cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+  //
+  // This step is skipped for dSYMs and other debug-info-only
+  // objects, which are always trusted to have a complete table.
+  auto *obj = m_dwarf.GetObjectFile();
+  if (!obj || obj->GetType() != ObjectFile::eTypeDebugInfo) {
+const size_t num_units = GetNumUnits();
+for (size_t idx = 0; idx < num_units; ++idx) {
+  DWARFUnit *cu = GetUnitAtIndex(idx);
+
+  dw_offset_t offset = cu->GetOffset();
+  if (cus_with_data.find(offset) == cus_with_data.end())
+cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+}
   }
 
   const bool minimize = true;
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2019-10-08 Thread Adrian Prantl via Phabricator via lldb-commits
aprantl added a comment.

In D68655#1700314 , @jasonmolenda 
wrote:

> Nice fix.  The real cost of computing the aranges manually is expanding all 
> the line tables and calling GetContiguousFileAddressRanges on them.  I wonder 
> if avoiding the full linetable expansion here is shifting that to a different 
> place for this operation.


No, the total user CPU time reported by Instruments goes from ~7000ms -> 
~3500ms, but the wall clock from ~6s -> ~5s (both ±.5s).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655



___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2019-10-08 Thread Jason Molenda via Phabricator via lldb-commits
jasonmolenda accepted this revision.
jasonmolenda added a comment.
This revision is now accepted and ready to land.

Nice fix.  The real cost of computing the aranges manually is expanding all the 
line tables and calling GetContiguousFileAddressRanges on them.  I wonder if 
avoiding the full linetable expansion here is shifting that to a different 
place for this operation.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68655/new/

https://reviews.llvm.org/D68655



___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D68655: Trust the arange accelerator tables in dSYMs

2019-10-08 Thread Adrian Prantl via Phabricator via lldb-commits
aprantl created this revision.
aprantl added reviewers: jasonmolenda, jingham, friss.
Herald added a subscriber: JDevlieghere.
Herald added a reviewer: JDevlieghere.

When ingesting aranges from a dSYM it makes sense to always trust the contents 
of the accelerator table since it always comes from dsymutil. According to 
Instruments, skipping the decoding of all CU DIEs to get at the DW_AT_ranges 
attribute removes ~3.5 seconds from setting a breakpoint by file/line when 
debugging clang with a dSYM. Interestingly on the wall clock the speedup is 
less noticeable, but still present.

rdar://problem/56057688


https://reviews.llvm.org/D68655

Files:
  lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp


Index: lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
===
--- lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
+++ lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
@@ -54,13 +54,19 @@
 
   // Manually build arange data for everything that wasn't in the
   // .debug_aranges table.
-  const size_t num_units = GetNumUnits();
-  for (size_t idx = 0; idx < num_units; ++idx) {
-DWARFUnit *cu = GetUnitAtIndex(idx);
-
-dw_offset_t offset = cu->GetOffset();
-if (cus_with_data.find(offset) == cus_with_data.end())
-  cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+  //
+  // This step is skipped for dSYMs and other debug-info-only
+  // objects, which are always trusted to have a complete table.
+  auto *obj = m_dwarf.GetObjectFile();
+  if (!obj || obj->GetType() != ObjectFile::eTypeDebugInfo) {
+const size_t num_units = GetNumUnits();
+for (size_t idx = 0; idx < num_units; ++idx) {
+  DWARFUnit *cu = GetUnitAtIndex(idx);
+
+  dw_offset_t offset = cu->GetOffset();
+  if (cus_with_data.find(offset) == cus_with_data.end())
+cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+}
   }
 
   const bool minimize = true;


Index: lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
===
--- lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
+++ lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
@@ -54,13 +54,19 @@
 
   // Manually build arange data for everything that wasn't in the
   // .debug_aranges table.
-  const size_t num_units = GetNumUnits();
-  for (size_t idx = 0; idx < num_units; ++idx) {
-DWARFUnit *cu = GetUnitAtIndex(idx);
-
-dw_offset_t offset = cu->GetOffset();
-if (cus_with_data.find(offset) == cus_with_data.end())
-  cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+  //
+  // This step is skipped for dSYMs and other debug-info-only
+  // objects, which are always trusted to have a complete table.
+  auto *obj = m_dwarf.GetObjectFile();
+  if (!obj || obj->GetType() != ObjectFile::eTypeDebugInfo) {
+const size_t num_units = GetNumUnits();
+for (size_t idx = 0; idx < num_units; ++idx) {
+  DWARFUnit *cu = GetUnitAtIndex(idx);
+
+  dw_offset_t offset = cu->GetOffset();
+  if (cus_with_data.find(offset) == cus_with_data.end())
+cu->BuildAddressRangeTable(m_cu_aranges_up.get());
+}
   }
 
   const bool minimize = true;
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits