[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)

2024-05-06 Thread Greg Clayton via llvm-branch-commits


@@ -424,19 +439,13 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
 auto LastLE = FI.OptLineTable->last();
 if (LastLE && LastLE->File == FileIdx && LastLE->Line == Row.Line)
 continue;
+
 // Only push a row if it isn't an end sequence. End sequence markers are
 // included for the last address in a function or the last contiguous
 // address in a sequence.
-if (Row.EndSequence) {

clayborg wrote:

> I'm not quite following the collection of thoughts here, they seem disjoint 
> to me, so trying to discuss:
> 
> > We used to not break out on Row.EndSequence
> 
> I don't understand the relationship between end_sequence and functions with 
> discontiguous ranges - could you describe this connection in more detail?

If a `DW_TAG_subprogram` has N discontiguous ranges, we will create N 
FunctionInfo objects, one for each range. We request the line table entries 
separately for each range in the `DW_TAG_subprogram`'s `DW_AT_ranges` 
attribute. So if there are end sequences in there, we try to keep going. If 
there is an end_sequence among those entries, it probably indicates a bug in 
the line table.
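
A minimal sketch of the per-range split described above, to make the data flow concrete. The types `AddressRange`, `LineRow`, and `FunctionRecord` and both helper functions are hypothetical stand-ins, not the real GSYM or DWARF classes, and the row filter is a simplification of what the real lookup does.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; these are not the real GSYM or DWARF types.
struct AddressRange {
  uint64_t Start; // inclusive
  uint64_t End;   // exclusive
};

struct LineRow {
  uint64_t Address;
  bool EndSequence;
};

struct FunctionRecord {
  AddressRange Range;
  std::vector<uint32_t> LineRows; // row indexes that belong to Range only
};

// Collect the indexes of rows whose address lies in [Range.Start, Range.End).
std::vector<uint32_t> rowsForRange(const std::vector<LineRow> &Rows,
                                   const AddressRange &Range) {
  assert(Range.Start < Range.End && "empty or inverted range");
  std::vector<uint32_t> Result;
  for (uint32_t I = 0; I < Rows.size(); ++I)
    if (Rows[I].Address >= Range.Start && Rows[I].Address < Range.End)
      Result.push_back(I);
  return Result;
}

// Create one record per discontiguous range of the same subprogram, and
// request the rows for each range independently.
std::vector<FunctionRecord>
splitFunction(const std::vector<LineRow> &Rows,
              const std::vector<AddressRange> &Ranges) {
  std::vector<FunctionRecord> Records;
  for (const AddressRange &R : Ranges)
    Records.push_back({R, rowsForRange(Rows, R)});
  return Records;
}
```

With two ranges such as `[0x1000, 0x1008)` and `[0x2000, 0x2004)`, this yields two records, and neither one sees rows that belong to the other range.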

> 
> > as it allows functions to have discontiguous ranges.
> 
> Agreed with @pogo59, I believe both Bolt and Propeller can create 
> discontiguous address ranges for a function (but you'll see DW_AT_ranges on 
> the subprogram)

Yes, and as I mentioned above, we create an individual FunctionInfo object for 
each range and only request the line table entries for that range.

> 
> > I was assuming that if we asked for the rows for a given address range we 
> > wouldn't get all entries if two merged functions with different line table 
> > entries were found, but that assumption might not be correct?
> 
> Yeah, looking at the implementation of `lookupAddressRangeImpl` it finds a 
> single sequence that starts closest to the start address, then adds all rows 
> within that sequence that cover the range requested. So, no, it won't 
> retrieve addresses/ranges from multiple sequences.



https://github.com/llvm/llvm-project/pull/90535
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)

2024-05-06 Thread David Blaikie via llvm-branch-commits


@@ -424,19 +439,13 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
 auto LastLE = FI.OptLineTable->last();
 if (LastLE && LastLE->File == FileIdx && LastLE->Line == Row.Line)
 continue;
+
 // Only push a row if it isn't an end sequence. End sequence markers are
 // included for the last address in a function or the last contiguous
 // address in a sequence.
-if (Row.EndSequence) {

dwblaikie wrote:

I'm not quite following the collection of thoughts here, they seem disjoint to 
me, so trying to discuss:

> We used to not break out on Row.EndSequence 

I don't understand the relationship between end_sequence and functions with 
discontiguous ranges - could you describe this connection in more detail?

> as it allows functions to have discontiguous ranges. 

Agreed with @pogo59, I believe both Bolt and Propeller can create discontiguous 
address ranges for a function (but you'll see DW_AT_ranges on the subprogram)

> I was assuming that if we asked for the rows for a given address range we 
> wouldn't get all entries if two merged functions with different line table 
> entries were found, but that assumption might not be correct?

Yeah, looking at the implementation of `lookupAddressRangeImpl` it finds a 
single sequence that starts closest to the start address, then adds all rows 
within that sequence that cover the range requested. So, no, it won't retrieve 
addresses/ranges from multiple sequences.
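
To illustrate the behaviour described here, a simplified model of a single-sequence range lookup might look like the following. This is only a sketch of the idea as stated in this comment; `Row`, `Sequence`, and `lookupRange` are hypothetical names, and the real `lookupAddressRangeImpl` differs in its details.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for line table rows and sequences.
struct Row {
  uint64_t Address;
  bool EndSequence;
};

struct Sequence {
  uint64_t LowPC;    // inclusive
  uint64_t HighPC;   // exclusive
  uint32_t FirstRow; // index of the first row in the sequence
  uint32_t EndRow;   // one past the last row (the end_sequence row)
};

// Return the row indexes covering [Start, Start + Size), taken from the first
// sequence that contains Start. Later sequences are never consulted, even if
// they cover the same addresses.
std::vector<uint32_t> lookupRange(const std::vector<Row> &Rows,
                                  const std::vector<Sequence> &Seqs,
                                  uint64_t Start, uint64_t Size) {
  assert(Size > 0 && "lookup needs a non-empty range");
  const uint64_t End = Start + Size;
  for (const Sequence &S : Seqs) {
    if (Start < S.LowPC || Start >= S.HighPC)
      continue;
    std::vector<uint32_t> Result;
    for (uint32_t I = S.FirstRow; I < S.EndRow; ++I) {
      if (Rows[I].Address >= End)
        break;
      // A row covers every address up to the next row's address.
      bool CoversStart = I + 1 < S.EndRow && Rows[I + 1].Address > Start;
      if (Rows[I].Address >= Start || CoversStart)
        Result.push_back(I);
    }
    return Result; // Stop after the first matching sequence.
  }
  return {};
}
```

In this model, if two folded functions each contribute a sequence for the same address range, only the rows of the first sequence are ever returned.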

https://github.com/llvm/llvm-project/pull/90535


[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)

2024-05-01 Thread Paul T Robinson via llvm-branch-commits


@@ -424,19 +439,13 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
 auto LastLE = FI.OptLineTable->last();
 if (LastLE && LastLE->File == FileIdx && LastLE->Line == Row.Line)
 continue;
+
 // Only push a row if it isn't an end sequence. End sequence markers are
 // included for the last address in a function or the last contiguous
 // address in a sequence.
-if (Row.EndSequence) {

pogo59 wrote:

>  allows functions to have discontiguous ranges. Not sure if that happens.

I think it can; Bolt and/or Propeller put each basic block in its own section, 
IIRC.

https://github.com/llvm/llvm-project/pull/90535


[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)

2024-05-01 Thread Greg Clayton via llvm-branch-commits


@@ -321,7 +321,10 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
   StartAddress, object::SectionedAddress::UndefSection};
 
 
-  if (!CUI.LineTable->lookupAddressRange(SecAddress, RangeSize, RowVector)) {
+  // end_sequence markers can be located at RangeSize position,
+  // lookupAddressRange search up to RangeSize not inclusive, to include
+  // end_sequence markers it is necessary to lookup until RangeSize+1
+  if (!CUI.LineTable->lookupAddressRange(SecAddress, RangeSize + 1, RowVector)) {

clayborg wrote:

If a line table has two functions that share the same address range within the 
same line table, this call currently will return only matches from the first 
sequence that contains an address. So we won't get all rows from all sequences 
that match. I checked the 
`DWARFDebugLine::LineTable::lookupAddressRangeImpl(...)` function to verify.
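
As a side note on the `RangeSize + 1` in the hunk above, the off-by-one can be modelled with a half-open range over sorted row addresses. This is a sketch under the assumption that the lookup treats the range as `[Start, Start + Size)`, as the patch comment says; `rowsInRange` and its parameters are hypothetical and not part of the LLVM API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count the rows whose address lies in the half-open range
// [Start, Start + Size). SortedAddrs is a hypothetical sorted list of row
// addresses for one sequence; the last entry models the end_sequence row.
std::size_t rowsInRange(const std::vector<uint64_t> &SortedAddrs,
                        uint64_t Start, uint64_t Size) {
  assert(std::is_sorted(SortedAddrs.begin(), SortedAddrs.end()));
  auto First = std::lower_bound(SortedAddrs.begin(), SortedAddrs.end(), Start);
  auto Last = std::lower_bound(First, SortedAddrs.end(), Start + Size);
  return static_cast<std::size_t>(Last - First);
}
```

For a function covering `[0x1000, 0x1008)` with rows at `0x1000`, `0x1004`, and an end_sequence row at `0x1008`, a size of 8 selects two rows and a size of 9 selects all three, which is the effect the patch relies on.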

https://github.com/llvm/llvm-project/pull/90535


[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)

2024-05-01 Thread Greg Clayton via llvm-branch-commits

https://github.com/clayborg edited 
https://github.com/llvm/llvm-project/pull/90535


[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)

2024-05-01 Thread Greg Clayton via llvm-branch-commits


@@ -424,19 +439,13 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
 auto LastLE = FI.OptLineTable->last();
 if (LastLE && LastLE->File == FileIdx && LastLE->Line == Row.Line)
 continue;
+
 // Only push a row if it isn't an end sequence. End sequence markers are
 // included for the last address in a function or the last contiguous
 // address in a sequence.
-if (Row.EndSequence) {

clayborg wrote:

We used to not break out on Row.EndSequence as it allows functions to have 
discontiguous ranges. Not sure if that happens. I was assuming that if we asked 
for the rows for a given address range we wouldn't get all entries if two 
merged functions with different line table entries were found, but that 
assumption might not be correct?

https://github.com/llvm/llvm-project/pull/90535


[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)

2024-05-01 Thread Greg Clayton via llvm-branch-commits


@@ -354,6 +357,18 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
   for (uint32_t RowIndex : RowVector) {
 // Take file number and line/column from the row.
 const DWARFDebugLine::Row &Row = CUI.LineTable->Rows[RowIndex];
+
+// TODO(avillega): With this conditional, functions folded by `icf`
+// optimizations will only include 1 of all the folded functions. There is
+// not a clear path forward to have the information of all folded functions
+// in gsym.
+if (Row.EndSequence) {
+  // End sequence markers are included for the last address
+  // in a function or the last contiguos address in a sequence.
+  break;
+}
+
+

clayborg wrote:

What if a function is split into two discontiguous ranges? This will lose the 
line table information for any subsequent discontiguous ranges since we break 
out of the loop here.
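
The concern can be shown with a toy loop. This sketch assumes, for illustration only, that the rows of two discontiguous ranges arrive in one vector; `Row` and both functions are hypothetical and much simpler than the real conversion code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified line table row.
struct Row {
  uint64_t Address;
  uint32_t Line;
  bool EndSequence;
};

// Stops at the first end_sequence row, like the `break` in the hunk above.
std::vector<uint32_t> linesUntilEndSequence(const std::vector<Row> &Rows) {
  std::vector<uint32_t> Lines;
  for (const Row &R : Rows) {
    if (R.EndSequence)
      break; // Everything after this row is dropped.
    Lines.push_back(R.Line);
  }
  return Lines;
}

// Skips end_sequence rows but keeps processing the rows that follow.
std::vector<uint32_t> linesSkippingEndSequence(const std::vector<Row> &Rows) {
  std::vector<uint32_t> Lines;
  for (const Row &R : Rows) {
    if (R.EndSequence)
      continue;
    Lines.push_back(R.Line);
  }
  return Lines;
}
```

Given rows for lines 10 and 11 in a first range, an end_sequence, and then a row for line 12 in a second range, the first function returns only lines 10 and 11, while the second also returns line 12.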

https://github.com/llvm/llvm-project/pull/90535


[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)

2024-05-01 Thread Greg Clayton via llvm-branch-commits

https://github.com/clayborg commented:

Not sure this makes sense after checking the code for 
`DWARFDebugLine::LineTable::lookupAddressRangeImpl(...)`. If a line table has 
multiple sequences that contain an address, it will find the first sequence 
that contains the address and then return the rows for the function from that 
sequence only.

What is the effect of this change on the test case? Does it change the final 
line table in the GSYM file?

https://github.com/llvm/llvm-project/pull/90535


[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)

2024-04-29 Thread Andres Villegas via llvm-branch-commits

avillega wrote:

I think I can accomplish the same behaviour as 
https://github.com/llvm/llvm-project/pull/89703, which requires a change to the 
DWARF APIs, without actually changing them.


https://github.com/llvm/llvm-project/pull/90535


[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)

2024-04-29 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-debuginfo

Author: Andres Villegas (avillega)


Changes

Workaround for #46494.
This change includes debug_line end_sequence rows when converting
the function line tables. Including the end_sequence rows makes
it possible to handle some edge cases, such as ICF (identical code
folding) optimizations.


---

Patch is 23.47 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/90535.diff


2 Files Affected:

- (modified) llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp (+21-12) 
- (added) llvm/test/tools/llvm-gsymutil/X86/elf-dwarf-icf.yaml (+564) 


``diff
diff --git a/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp b/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
index ff6b560d11726b..d12dce510e663e 100644
--- a/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
+++ b/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
@@ -321,7 +321,10 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
   StartAddress, object::SectionedAddress::UndefSection};
 
 
-  if (!CUI.LineTable->lookupAddressRange(SecAddress, RangeSize, RowVector)) {
+  // end_sequence markers can be located at RangeSize position,
+  // lookupAddressRange search up to RangeSize not inclusive, to include
+  // end_sequence markers it is necessary to lookup until RangeSize+1
+  if (!CUI.LineTable->lookupAddressRange(SecAddress, RangeSize + 1, RowVector)) {
 // If we have a DW_TAG_subprogram but no line entries, fall back to using
 // the DW_AT_decl_file an d DW_AT_decl_line if we have both attributes.
 std::string FilePath = Die.getDeclFile(
@@ -354,6 +357,18 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
   for (uint32_t RowIndex : RowVector) {
 // Take file number and line/column from the row.
 const DWARFDebugLine::Row &Row = CUI.LineTable->Rows[RowIndex];
+
+// TODO(avillega): With this conditional, functions folded by `icf`
+// optimizations will only include 1 of all the folded functions. There is
+// not a clear path forward to have the information of all folded functions
+// in gsym.
+if (Row.EndSequence) {
+  // End sequence markers are included for the last address
+  // in a function or the last contiguos address in a sequence.
+  break;
+}
+
+
 std::optional<uint32_t> OptFileIdx =
 CUI.DWARFToGSYMFileIndex(Gsym, Row.File);
 if (!OptFileIdx) {
@@ -411,7 +426,7 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
   else
 Out.Report("Non-monotonically increasing addresses",
[&](raw_ostream &OS) {
- OS << "error: line table has addresses that do not "
+ OS << "warning: line table has addresses that do not "
 << "monotonically increase:\n";
  for (uint32_t RowIndex2 : RowVector)
CUI.LineTable->Rows[RowIndex2].dump(OS);
@@ -424,19 +439,13 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
 auto LastLE = FI.OptLineTable->last();
 if (LastLE && LastLE->File == FileIdx && LastLE->Line == Row.Line)
 continue;
+
 // Only push a row if it isn't an end sequence. End sequence markers are
 // included for the last address in a function or the last contiguous
 // address in a sequence.
-if (Row.EndSequence) {
-  // End sequence means that the next line entry could have a lower address
-  // that the previous entries. So we clear the previous row so we don't
-  // trigger the line table error about address that do not monotonically
-  // increase.
-  PrevRow = DWARFDebugLine::Row();
-} else {
-  FI.OptLineTable->push(LE);
-  PrevRow = Row;
-}
+FI.OptLineTable->push(LE);
+PrevRow = Row;
+
   }
   // If not line table rows were added, clear the line table so we don't encode
   // on in the GSYM file.
diff --git a/llvm/test/tools/llvm-gsymutil/X86/elf-dwarf-icf.yaml b/llvm/test/tools/llvm-gsymutil/X86/elf-dwarf-icf.yaml
new file mode 100644
index 00..0e1e507179057b
--- /dev/null
+++ b/llvm/test/tools/llvm-gsymutil/X86/elf-dwarf-icf.yaml
@@ -0,0 +1,564 @@
+## Test loading an ELF file with DWARF with icf (identical code folding) 
+## optimizations.
+## First we make the ELF file from yaml,
+## then we convert the ELF file to GSYM, then we do lookups on the newly
+## created GSYM, and finally we dump the entire GSYM.
+##
+## The elf file corresponds to this c program:
+## int f() {
+##   return 1;
+## }
+## 
+## int g() {
+##   return 1;
+## }
+## 
+## int main() {
+##   f();
+##   g();
+##   return 0;
+## }
+
+# RUN: yaml2obj %s -o %t
+# RUN: llvm-gsymutil --convert %t --out-file=%t.gsym 2>&1 | FileCheck %s --check-prefix=CONVERT --dump-input=always
+
+# CONVERT-NOT: warning: line table has addresses that do not monotonically increase
+# CONVERT: Input file: {{.*\.yaml\.tmp}}
+# CONVERT: Output file (x86_64): {{.*\.yaml\.tmp\.gsym}}
+# CONVERT: Loaded 2 functions from 

[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)

2024-04-29 Thread Andres Villegas via llvm-branch-commits

https://github.com/avillega created 
https://github.com/llvm/llvm-project/pull/90535

Workaround for #46494.
This change includes debug_line end_sequence rows when converting
the function line tables. Including the end_sequence rows makes
it possible to handle some edge cases, such as ICF (identical code
folding) optimizations.


