[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)
@@ -424,19 +439,13 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
     auto LastLE = FI.OptLineTable->last();
     if (LastLE && LastLE->File == FileIdx && LastLE->Line == Row.Line)
       continue;
+
     // Only push a row if it isn't an end sequence. End sequence markers are
     // included for the last address in a function or the last contiguous
     // address in a sequence.
-    if (Row.EndSequence) {

clayborg wrote:

> I'm not quite following the collection of thoughts here, they seem disjoint
> to me, so trying to discuss:
>
> > We used to not break out on Row.EndSequence
>
> I don't understand the relationship between end_sequence and functions with
> discontiguous ranges - could you describe this connection in more detail?

If a `DW_TAG_subprogram` has N discontiguous ranges, we will create N FunctionInfo objects, one for each range. We will request the line table entries for each range in the `DW_TAG_subprogram`'s `DW_AT_ranges` attribute. So if there are end sequences in there, we try to keep going. If there is an end_sequence in the line table this is probably the indication of a bug in the line tables.

> > as it allows functions to have discontiguous ranges.
>
> Agreed with @pogo59, I believe both Bolt and Propeller can create
> discontiguous address ranges for a function (but you'll see DW_AT_ranges on
> the subprogram)

Yes, and as I mention above, we create individual FunctionInfo objects for each range and only request the line table entries for each individual range.

> > I was assuming that if we asked for the rows for a given address range we
> > wouldn't get all entries if two merged functions with different line table
> > entries were found, but that assumption might not be correct?
>
> Yeah, looking at the implementation of `lookupAddressRangeImpl` it finds a
> single sequence that starts closest to the start address, then adds all rows
> within that sequence that cover the range requested. So, no, it won't
> retrieve addresses/ranges from multiple sequences.

https://github.com/llvm/llvm-project/pull/90535
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
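The single-sequence lookup behaviour described above can be sketched with a small stand-alone model. This is not the LLVM implementation: `Row`, `Sequence`, `lookupSingleSequence` and `foldedFunctions` are hypothetical names invented for illustration, and the model assumes the lookup simply takes the first sequence that contains the start address. The sample data imitates two functions folded onto the same address range, as icf might produce.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-ins for line table data (assumptions, not the LLVM types).
struct Row {
  uint64_t Address;
  uint32_t Line;
  bool EndSequence;
};

struct Sequence {
  uint64_t LowPC;  // first address covered by the sequence
  uint64_t HighPC; // address of the end_sequence row (exclusive bound)
  std::vector<Row> Rows;
};

// Hypothetical model: return the rows in [Start, Start + Size) taken from the
// first sequence that contains Start. Later sequences are never consulted.
std::vector<Row> lookupSingleSequence(const std::vector<Sequence> &Seqs,
                                      uint64_t Start, uint64_t Size) {
  assert(Size > 0 && "expected a non-empty address range");
  std::vector<Row> Result;
  for (const Sequence &Seq : Seqs) {
    if (Start < Seq.LowPC || Start >= Seq.HighPC)
      continue;
    for (const Row &R : Seq.Rows)
      if (R.Address >= Start && R.Address < Start + Size)
        Result.push_back(R);
    break; // stop after the first matching sequence
  }
  return Result;
}

// Two sequences covering the same addresses, as if f() and g() were folded.
std::vector<Sequence> foldedFunctions() {
  return {
      {0x1000, 0x1010, {{0x1000, 1, false}, {0x1008, 2, false}, {0x1010, 2, true}}},
      {0x1000, 0x1010, {{0x1000, 5, false}, {0x1008, 6, false}, {0x1010, 6, true}}},
  };
}
```

In this model, `lookupSingleSequence(foldedFunctions(), 0x1000, 0x10)` yields only the rows for lines 1 and 2. The rows of the second sequence are never returned, which matches the limitation discussed in the thread.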
[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)
@@ -424,19 +439,13 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
     auto LastLE = FI.OptLineTable->last();
     if (LastLE && LastLE->File == FileIdx && LastLE->Line == Row.Line)
       continue;
+
     // Only push a row if it isn't an end sequence. End sequence markers are
     // included for the last address in a function or the last contiguous
     // address in a sequence.
-    if (Row.EndSequence) {

dwblaikie wrote:

I'm not quite following the collection of thoughts here, they seem disjoint to me, so trying to discuss:

> We used to not break out on Row.EndSequence

I don't understand the relationship between end_sequence and functions with discontiguous ranges - could you describe this connection in more detail?

> as it allows functions to have discontiguous ranges.

Agreed with @pogo59, I believe both Bolt and Propeller can create discontiguous address ranges for a function (but you'll see DW_AT_ranges on the subprogram)

> I was assuming that if we asked for the rows for a given address range we
> wouldn't get all entries if two merged functions with different line table
> entries were found, but that assumption might not be correct?

Yeah, looking at the implementation of `lookupAddressRangeImpl` it finds a single sequence that starts closest to the start address, then adds all rows within that sequence that cover the range requested. So, no, it won't retrieve addresses/ranges from multiple sequences.

https://github.com/llvm/llvm-project/pull/90535
[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)
@@ -424,19 +439,13 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
     auto LastLE = FI.OptLineTable->last();
     if (LastLE && LastLE->File == FileIdx && LastLE->Line == Row.Line)
       continue;
+
     // Only push a row if it isn't an end sequence. End sequence markers are
     // included for the last address in a function or the last contiguous
     // address in a sequence.
-    if (Row.EndSequence) {

pogo59 wrote:

> allows functions to have discontiguous ranges. Not sure if that happens.

I think it can, Bolt and/or Propeller put each basic block in its own section IIRC.

https://github.com/llvm/llvm-project/pull/90535
[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)
@@ -321,7 +321,10 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
       StartAddress, object::SectionedAddress::UndefSection};

-  if (!CUI.LineTable->lookupAddressRange(SecAddress, RangeSize, RowVector)) {
+  // end_sequence markers can be located at RangeSize position,
+  // lookupAddressRange search up to RangeSize not inclusive, to include
+  // end_sequence markers it is necessary to lookup until RangeSize+1
+  if (!CUI.LineTable->lookupAddressRange(SecAddress, RangeSize + 1, RowVector)) {

clayborg wrote:

If a line table has two functions that share the same address range within the same line table, this call currently will return only matches from the first sequence that contains an address. So we won't get all rows from all sequences that match. I checked the `DWARFDebugLine::LineTable::lookupAddressRangeImpl(...)` function to verify.

https://github.com/llvm/llvm-project/pull/90535
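The `RangeSize + 1` change in the hunk above relies on the lookup treating the requested range as half-open, as the patch comment states. The sketch below models only that arithmetic, under the assumption that rows are selected when their address lies in `[Start, Start + Size)`. It does not call the LLVM API; `Row`, `rowsInRange` and `exampleRows` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for a line table row (an assumption, not the LLVM type).
struct Row {
  uint64_t Address;
  uint32_t Line;
  bool EndSequence;
};

// Hypothetical helper: indexes of rows whose address lies in the half-open
// interval [Start, Start + Size).
std::vector<uint32_t> rowsInRange(const std::vector<Row> &Rows, uint64_t Start,
                                  uint64_t Size) {
  assert(Start + Size >= Start && "address range must not overflow");
  const uint64_t End = Start + Size; // exclusive upper bound
  std::vector<uint32_t> Result;
  for (uint32_t I = 0; I < Rows.size(); ++I)
    if (Rows[I].Address >= Start && Rows[I].Address < End)
      Result.push_back(I);
  return Result;
}

// A function occupying [0x1000, 0x1010); its end_sequence row sits at 0x1010,
// one past the last byte of the function.
std::vector<Row> exampleRows() {
  return {{0x1000, 1, false}, {0x1008, 2, false}, {0x1010, 2, true}};
}
```

With a size of `0x10` the helper returns two rows and misses the end_sequence row at `0x1010`. With `0x10 + 1` it returns all three, which is the effect the patch is after.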
[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)
https://github.com/clayborg edited https://github.com/llvm/llvm-project/pull/90535
[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)
@@ -424,19 +439,13 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
     auto LastLE = FI.OptLineTable->last();
     if (LastLE && LastLE->File == FileIdx && LastLE->Line == Row.Line)
       continue;
+
     // Only push a row if it isn't an end sequence. End sequence markers are
     // included for the last address in a function or the last contiguous
     // address in a sequence.
-    if (Row.EndSequence) {

clayborg wrote:

We used to not break out on Row.EndSequence as it allows functions to have discontiguous ranges. Not sure if that happens. I was assuming that if we asked for the rows for a given address range we wouldn't get all entries if two merged functions with different line table entries were found, but that assumption might not be correct?

https://github.com/llvm/llvm-project/pull/90535
[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)
@@ -354,6 +357,18 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
   for (uint32_t RowIndex : RowVector) {
     // Take file number and line/column from the row.
     const DWARFDebugLine::Row &Row = CUI.LineTable->Rows[RowIndex];
+
+    // TODO(avillega): With this conditional, functions folded by `icf`
+    // optimizations will only include 1 of all the folded functions. There is
+    // not a clear path forward to have the information of all folded functions
+    // in gsym.
+    if (Row.EndSequence) {
+      // End sequence markers are included for the last address
+      // in a function or the last contiguos address in a sequence.
+      break;
+    }
+
+

clayborg wrote:

What if a function is split into two discontiguous ranges? This will lose the line table information for any subsequent discontiguous ranges since we break out of the loop here.

https://github.com/llvm/llvm-project/pull/90535
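The concern about `break` can be illustrated with a small model that compares two ways of handling an end_sequence row inside the loop. Everything here is hypothetical (`Row`, `EndSequencePolicy`, `collectLines`, `rowsWithInteriorEndSequence`), and the sample input assumes a row vector that contains an end_sequence row before its last entry. Whether the real lookup ever returns such a vector is exactly what the thread is debating.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for a line table row (an assumption, not the LLVM type).
struct Row {
  uint64_t Address;
  uint32_t Line;
  bool EndSequence;
};

// Break: stop at the first end_sequence row, as in the patch under review.
// Skip:  drop end_sequence rows but keep processing the rows after them.
enum class EndSequencePolicy { Break, Skip };

// Hypothetical helper: the source lines that survive under each policy.
std::vector<uint32_t> collectLines(const std::vector<Row> &Rows,
                                   EndSequencePolicy Policy) {
  assert(!Rows.empty() && "expected at least one row");
  std::vector<uint32_t> Lines;
  for (const Row &R : Rows) {
    if (R.EndSequence) {
      if (Policy == EndSequencePolicy::Break)
        break;
      continue;
    }
    Lines.push_back(R.Line);
  }
  return Lines;
}

// Two address-discontiguous pieces, each terminated by its own end_sequence.
std::vector<Row> rowsWithInteriorEndSequence() {
  return {{0x1000, 10, false},
          {0x1008, 10, true},
          {0x2000, 11, false},
          {0x2008, 11, true}};
}
```

Under `Break` only line 10 is kept, while `Skip` keeps lines 10 and 11. In this model, then, breaking out early drops every row that follows the first end_sequence.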
[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)
https://github.com/clayborg commented:

Not sure this makes sense after checking the code for `DWARFDebugLine::LineTable::lookupAddressRangeImpl(...)`. If a line table has multiple sequences that contain an address, it will find the first sequence that contains the address and then return the rows for the function. What is the effect of this change on the test case? Does it change the final line table in the GSYM file?

https://github.com/llvm/llvm-project/pull/90535
[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)
avillega wrote:

I think I can accomplish the same behaviour as https://github.com/llvm/llvm-project/pull/89703, which requires a change to the DWARF APIs, without actually changing those APIs.

https://github.com/llvm/llvm-project/pull/90535
[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)
llvmbot wrote:

@llvm/pr-subscribers-debuginfo

Author: Andres Villegas (avillega)

Changes

Work around for #46494. This change adds debug_line end_sequence rows when converting the function line tables. By including the end_sequence it is possible to handle some edge cases like icf optimizations.

---

Patch is 23.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/90535.diff

2 Files Affected:

- (modified) llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp (+21-12)
- (added) llvm/test/tools/llvm-gsymutil/X86/elf-dwarf-icf.yaml (+564)

```diff
diff --git a/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp b/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
index ff6b560d11726b..d12dce510e663e 100644
--- a/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
+++ b/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
@@ -321,7 +321,10 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
       StartAddress, object::SectionedAddress::UndefSection};
 
-  if (!CUI.LineTable->lookupAddressRange(SecAddress, RangeSize, RowVector)) {
+  // end_sequence markers can be located at RangeSize position,
+  // lookupAddressRange search up to RangeSize not inclusive, to include
+  // end_sequence markers it is necessary to lookup until RangeSize+1
+  if (!CUI.LineTable->lookupAddressRange(SecAddress, RangeSize + 1, RowVector)) {
     // If we have a DW_TAG_subprogram but no line entries, fall back to using
     // the DW_AT_decl_file an d DW_AT_decl_line if we have both attributes.
     std::string FilePath = Die.getDeclFile(
@@ -354,6 +357,18 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
   for (uint32_t RowIndex : RowVector) {
     // Take file number and line/column from the row.
     const DWARFDebugLine::Row &Row = CUI.LineTable->Rows[RowIndex];
+
+    // TODO(avillega): With this conditional, functions folded by `icf`
+    // optimizations will only include 1 of all the folded functions. There is
+    // not a clear path forward to have the information of all folded functions
+    // in gsym.
+    if (Row.EndSequence) {
+      // End sequence markers are included for the last address
+      // in a function or the last contiguos address in a sequence.
+      break;
+    }
+
+
     std::optional<uint32_t> OptFileIdx =
         CUI.DWARFToGSYMFileIndex(Gsym, Row.File);
     if (!OptFileIdx) {
@@ -411,7 +426,7 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
       else
         Out.Report("Non-monotonically increasing addresses",
                    [&](raw_ostream &OS) {
-                     OS << "error: line table has addresses that do not "
+                     OS << "warning: line table has addresses that do not "
                         << "monotonically increase:\n";
                      for (uint32_t RowIndex2 : RowVector)
                        CUI.LineTable->Rows[RowIndex2].dump(OS);
@@ -424,19 +439,13 @@ static void convertFunctionLineTable(OutputAggregator &Out, CUInfo &CUI,
     auto LastLE = FI.OptLineTable->last();
     if (LastLE && LastLE->File == FileIdx && LastLE->Line == Row.Line)
       continue;
+
     // Only push a row if it isn't an end sequence. End sequence markers are
     // included for the last address in a function or the last contiguous
     // address in a sequence.
-    if (Row.EndSequence) {
-      // End sequence means that the next line entry could have a lower address
-      // that the previous entries. So we clear the previous row so we don't
-      // trigger the line table error about address that do not monotonically
-      // increase.
-      PrevRow = DWARFDebugLine::Row();
-    } else {
-      FI.OptLineTable->push(LE);
-      PrevRow = Row;
-    }
+    FI.OptLineTable->push(LE);
+    PrevRow = Row;
+
   }
   // If not line table rows were added, clear the line table so we don't encode
   // on in the GSYM file.
diff --git a/llvm/test/tools/llvm-gsymutil/X86/elf-dwarf-icf.yaml b/llvm/test/tools/llvm-gsymutil/X86/elf-dwarf-icf.yaml
new file mode 100644
index 00..0e1e507179057b
--- /dev/null
+++ b/llvm/test/tools/llvm-gsymutil/X86/elf-dwarf-icf.yaml
@@ -0,0 +1,564 @@
+## Test loading an ELF file with DWARF with icf (identical code folding)
+## optimizations.
+## First we make the ELF file from yaml,
+## then we convert the ELF file to GSYM, then we do lookups on the newly
+## created GSYM, and finally we dump the entire GSYM.
+##
+## The elf file corresponds to this c program:
+## int f() {
+##   return 1;
+## }
+##
+## int g() {
+##   return 1;
+## }
+##
+## int main() {
+##   f();
+##   g();
+##   return 0;
+## }
+
+# RUN: yaml2obj %s -o %t
+# RUN: llvm-gsymutil --convert %t --out-file=%t.gsym 2>&1 | FileCheck %s --check-prefix=CONVERT --dump-input=always
+
+# CONVERT-NOT: warning: line table has addresses that do not monotonically increase
+# CONVERT: Input file: {{.*\.yaml\.tmp}}
+# CONVERT: Output file (x86_64): {{.*\.yaml\.tmp\.gsym}}
+# CONVERT: Loaded 2 functions from
```
[llvm-branch-commits] [GSYM] Include end_sequence debug_line rows in Dwarf transform (PR #90535)
https://github.com/avillega created https://github.com/llvm/llvm-project/pull/90535

Work around for #46494. This change adds debug_line end_sequence rows when converting the function line tables. By including the end_sequence it is possible to handle some edge cases like icf optimizations.