[lldb-dev] new tool (core2yaml) + a new top-level library (Formats)

Pavel Labath via lldb-dev Tue, 05 Mar 2019 08:32:14 -0800

Hello all,

I have just posted a large-ish patch series for review (D58971, D58973, D58975, D58976), and I want to use this opportunity to draw more attention to it and highlight various bikeshedding opportunities^H^H^Htopics for discussion :).

The new tool is called core2yaml, and its goal is to fill the gap in the testing story for core files. As you might know, at present, the only way to test core file parsing code (*) is to check in an opaque binary blob and have the debugger open that. This presents a couple of challenges:

- it's really hard to review what is inside the core file
- one has to jump through various hoops to create a "small" core file

This tool fixes both issues by enabling one to check in text files with human-readable content. The yaml files can also be easily edited to prune out the content which is not relevant for the test. While that's not my goal at present, I am hoping that this will one day enable us to write self-contained tests for the unwinder, as the core file can be used to synthesize (or capture&reduce) interesting unwinder scenarios.

Since I also needed to find a home for the new code I was writing, I thought this would be a good opportunity to create a new library for various stuff. The goals I was trying to achieve are:

- make the yaml code a library. The reason for that is that we have a number of unittests using checked in binaries, and I thought it would be nice to be able to convert those to use yaml representation as well.
- make the existing minidump parsing code more easily accessible. The parsing code currently lives in source/Plugins/Process/minidump, and it is impossible to use without pulling in the rest of lldb (which the tool doesn't need).

The solution I came up with here is a new "Formats" library. I chose a fairly generic name, because I realized that we have code for (de)serializing a bunch of small formats, which don't really have a good place to live in. Currently I needed a parser for linux /proc/PID/maps files and minidump files, but I am hoping that a generic name would enable us to one day move the gdb-remote protocol code there (which is also currently buried in some plugin code, which makes it hard to depend on from lldb-server), as well as the future debug-info-server, if it ever comes into existence.
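
To give a feel for the kind of small, dependency-free parser that would fit such a library, here is a minimal sketch of parsing one line of a linux /proc/PID/maps file (fields: address range, permissions, offset, device, inode, optional path). It is not the code from the patch series; MapsEntry, parseHex and parseMapsLine are hypothetical names made up for this illustration, and it uses only the standard library rather than llvm or lldb types.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical record for one mapping; not an lldb type.
struct MapsEntry {
  uint64_t start = 0;
  uint64_t end = 0;
  bool readable = false;
  bool writable = false;
  bool executable = false;
  std::string path; // empty for anonymous mappings
};

// Parses an unprefixed hexadecimal number of at most 16 digits.
inline bool parseHex(const std::string &text, uint64_t &value) {
  if (text.empty() || text.size() > 16)
    return false;
  value = 0;
  for (char c : text) {
    uint64_t digit;
    if (c >= '0' && c <= '9')
      digit = c - '0';
    else if (c >= 'a' && c <= 'f')
      digit = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F')
      digit = c - 'A' + 10;
    else
      return false;
    value = (value << 4) | digit;
  }
  return true;
}

// Parses a line such as
//   00400000-0040b000 r-xp 00000000 08:01 1234 /bin/cat
// Returns false (leaving `entry` untouched) if the line is malformed.
inline bool parseMapsLine(const std::string &line, MapsEntry &entry) {
  std::istringstream in(line);
  std::string range, perms, offset, dev, inode;
  if (!(in >> range >> perms >> offset >> dev >> inode))
    return false;

  const std::string::size_type dash = range.find('-');
  if (dash == std::string::npos || perms.size() < 3)
    return false;

  MapsEntry result;
  if (!parseHex(range.substr(0, dash), result.start) ||
      !parseHex(range.substr(dash + 1), result.end))
    return false;

  result.readable = perms[0] == 'r';
  result.writable = perms[1] == 'w';
  result.executable = perms[2] == 'x';

  // The path is optional and may contain spaces, so take the rest of the
  // line after skipping the padding. If nothing is left, path stays empty.
  std::getline(in >> std::ws, result.path);

  entry = result;
  return true;
}
```

A caller would feed the file to parseMapsLine one line at a time and skip (or report) the lines for which it returns false. The point of the sketch is only that code like this needs nothing beyond basic utilities, which is why it can live below the rest of lldb.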


Discussion topic #1: The library name and scope.

There are lots of other ways this could be organized. One of the names I considered was "BinaryFormat" for symmetry with llvm, but then I chose to drop the "Binary" part as it seemed to me we have plenty of non-binary formats as well. As for its dependencies, I currently have it depending on Utility and nothing else (as far as lldb libraries go). I can imagine using some Host code might be useful there too, but I would like to avoid any other lldb dependencies right now. Another question is whether this should be a single library or a bunch of smaller ones. I chose a single library now because the things I initially plan to put there are fairly small (/proc/pid/maps parser is 200 LOC), but I can see how we may want to create sub-libraries for things that grow big (the debug-info server code might turn out to be one of those) or that have some additional dependencies.


Discussion topic #2: tool name and scope

A case could be made to integrate this functionality into the llvm yaml2obj utilities. Here I chose not to do that because the minidump format is not at all implemented in llvm, and I do not see a use case for it to be implemented/moved there. A stronger case could be made to put the elf core code there, since llvm already supports reading elf files. While originally being in favour of that, I eventually adopted the view that doing this in lldb would be better because:

- it would bring more symmetry with minidumps

- it would enable us to do fine-grained yamlization for things that we care about (e.g., registers), which is something that would probably be uninteresting to the rest of llvm.

Discussion topic #3: Use of .def files in lldb. In one of the patches I create a .def textual header to be used for avoiding repetitive code when dealing with various constants. This is fairly common practice in llvm, but would be a first in lldb.
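
For readers who have not met the .def idiom, here is a minimal sketch of the pattern, not the header from the patch. In llvm the list of constants normally sits in its own .def file that is #included several times, each time with a different definition of a HANDLE_* macro; to keep the sketch in a single file, the list is folded into a macro (HYPOTHETICAL_STREAM_TYPES) instead. All names here are made up for illustration, and the list is a small excerpt of minidump stream type values rather than the full set.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for the contents of a hypothetical "StreamTypes.def" file.
// With a real .def file, each use below would be an #include of that
// file, placed after defining HANDLE_STREAM_TYPE.
#define HYPOTHETICAL_STREAM_TYPES(HANDLE_STREAM_TYPE)                          \
  HANDLE_STREAM_TYPE(ThreadList, 3)                                            \
  HANDLE_STREAM_TYPE(ModuleList, 4)                                            \
  HANDLE_STREAM_TYPE(MemoryList, 5)                                            \
  HANDLE_STREAM_TYPE(SystemInfo, 7)

// Expansion 1: the enum itself.
enum class StreamType : uint32_t {
#define AS_ENUMERATOR(NAME, VALUE) NAME = VALUE,
  HYPOTHETICAL_STREAM_TYPES(AS_ENUMERATOR)
#undef AS_ENUMERATOR
};

// Expansion 2: enum value to name, e.g. for yaml output.
inline std::string streamTypeName(StreamType type) {
  switch (type) {
#define AS_CASE(NAME, VALUE)                                                   \
  case StreamType::NAME:                                                       \
    return #NAME;
    HYPOTHETICAL_STREAM_TYPES(AS_CASE)
#undef AS_CASE
  }
  return "Unknown";
}

// Expansion 3: name to enum value, e.g. for yaml input.
inline bool parseStreamType(const std::string &name, StreamType &out) {
#define AS_LOOKUP(NAME, VALUE)                                                 \
  if (name == #NAME) {                                                         \
    out = StreamType::NAME;                                                    \
    return true;                                                               \
  }
  HYPOTHETICAL_STREAM_TYPES(AS_LOOKUP)
#undef AS_LOOKUP
  return false;
}

// Small self-check: a name produced by streamTypeName parses back.
inline void checkRoundTrip(StreamType type) {
  StreamType parsed = type;
  bool ok = parseStreamType(streamTypeName(type), parsed);
  assert(ok && parsed == type);
  (void)ok;
}
```

The benefit is that adding a constant means touching one line of the list, and the enum, the printer and the parser cannot drift apart. The cost is the usual one for macro-heavy code: it is harder to grep for and to step through.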

Discussion topic #4: Overlap with "process plugin dump". This tool has some overlap with the given command for minidump files, which also provides a textual description of minidump files. In case we are ok with tweaking the interface of that command slightly (and ok with some yaml artefacts in its output), it should be possible to reimplement that command on top of the yaml serialization library.


Discussion topic #5: Anything else I haven't thought of.

regards,
pavel

(*) This is not entirely true for MachO core files, where yaml2obj is already able to convert the core files into text form. However, it is definitely true for ELF and minidump core files, and even the MachO yaml form isn't that well suited for manual viewing or reduction.
