> On Nov 29, 2018, at 2:02 PM, Zachary Turner via Phabricator
> <revi...@reviews.llvm.org> wrote:
>
> zturner added a comment.
>
> In D53368#1313238 <https://reviews.llvm.org/D53368#1313238>, @labath wrote:
>
>> In D53368#1313145 <https://reviews.llvm.org/D53368#1313145>, @zturner wrote:
>>
>>> In D53368#1313124 <https://reviews.llvm.org/D53368#1313124>, @labath wrote:
>>>
>>>> I've recently started looking at adding a new symbol file format (breakpad
>>>> symbols). While researching the best way to achieve that, I started
>>>> comparing the operation of PDB and DWARF symbol files. I noticed a very
>>>> important difference there, and I think that is the cause of our problems
>>>> here. In the DWARF implementation, a symbol file is an overlay on top of
>>>> an object file - it takes the data contained by the object file and
>>>> presents it in a more structured way.
>>>>
>>>> However, that is not the case with PDB (both implementations). These take
>>>> the debug information from a completely different file, which is not
>>>> backed by an ObjectFile instance, and then present that. Since the
>>>> SymbolFile interface requires them to be backed by an object file, they
>>>> both pretend they are backed by the original EXE file, but in reality the
>>>> data comes from elsewhere.
>>>
>>>
>>> Don't DWARF DWP files work this way as well? How is support for this
>>> implemented in LLDB?
>>
>>
>> There are some similarities, but DWP is a bit different. The main difference
>> is that the DWP file is still an ELF (or whatever) file, so we still have a
>> ObjectFile sitting below the symbol file. The other difference is that in
>> case of DWP we still have a significant chunk of debug information present
>> in the main executable (mainly various offsets that need to be applied to
>> the unlinked debug info in the dwo/dwp files), so you can still very well
>> say that the symbol file is reading information from the main executable.
>> What DWARF does in this case is it creates a main SymbolFileDWARF for
>> reading data from the main object file, and then a bunch of inner
>> SymbolFileDWARFDwo/Dwp instances which read data from the other files. There
>> are plenty of things to not like here as well, but at least this maintains
>> the property that each symbol file sits on top of the object file from which
>> it reads the data from. (and symtab doesn't go into the dwp file, so there
>> are no issues with that).
>>
>>>> I am asking this because now I am facing a choice in how to implement
>>>> breakpad symbols. I could go the PDB way, and read the symbols without an
>>>> intervening object file, or I could create an ObjectFileBreakpad and then
>>>> (possibly) a SymbolFileBreakpad sitting on top of that.
>>>
>>> What if `SymbolFile` interface provided a new method such as `GetSymtab()`
>>> while `ObjectFile` provides a method called `HasExternalSymtab()`. When
>>> you call `ObjectFilePECOFF::GetSymtab()`, it could first check if
>>> `HasExternalSymtab()` is true, and if so it could call the SymbolFile
>>> plugin and return that
>>
>> I don't think this would be good because there's no way for the PECOFF file
>> to know if we will have a PDB file on top of it.
>
>
> I'm actually starting to wonder even if `GetSymtab()` should be part of
> `ObjectFile`. The first thing it does is get the Module and then start
> calling a bunch of stuff on the Module interface. Perhaps the place to start
> is comparing the Module and ObjectFile interfaces and seeing if the existing
> APIs make the most sense being moved up to Module. If everything was on
> Module then the Module has everything it needs to go to the SymbolVendor and
> find a PDB file.

I would vote against moving anything into the module. Object files have their own symbol tables, and an object file needs to be able to find a symbol that it created. We really don't want to abstract this away, since any time we delve further into an object file we might need to dig up a symbol by its original symbol table index.

So the cleanest design, in my opinion, is one where each object file can have its own symbol table and the module uses the symbol vendor to promote the best information up to the user. Symbols can come from one object file, from an external debug info object file, or from Breakpad, but each of those files should be able to have its own notion of its own symbols.

Greg

>
>
> CHANGES SINCE LAST ACTION
>   https://reviews.llvm.org/D53368/new/
>
> https://reviews.llvm.org/D53368
>
>
>

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
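
For readers following the thread, the layering argued for in the reply (each file keeps its own symbol table, and the module asks the symbol vendor for the best answer) could be sketched roughly as below. This is only an illustration: `Symbol`, `Symtab`, `ObjectFile`, `SymbolVendor` and `Module` here are simplified, hypothetical stand-ins that borrow the names used in the discussion, not the actual LLDB classes or their signatures. The "first file in preference order wins" policy is also an assumption made for the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; none of these match the real LLDB API.
struct Symbol {
  std::string name;
  std::uint64_t address;
};

// Per-file symbol table. Symbols stay addressable by their original index.
class Symtab {
public:
  // Appends a symbol and returns its index within this table.
  std::size_t Add(Symbol symbol) {
    symbols_.push_back(std::move(symbol));
    return symbols_.size() - 1;
  }
  // Returns the symbol at the given original index, or nullptr if out of range.
  const Symbol *AtIndex(std::size_t index) const {
    return index < symbols_.size() ? &symbols_[index] : nullptr;
  }
  // Linear search by name; returns nullptr when the name is unknown.
  const Symbol *FindByName(const std::string &name) const {
    for (const Symbol &symbol : symbols_)
      if (symbol.name == name)
        return &symbol;
    return nullptr;
  }

private:
  std::vector<Symbol> symbols_;
};

// Every file (executable, external debug info, Breakpad, ...) owns a Symtab.
class ObjectFile {
public:
  explicit ObjectFile(std::string description)
      : description_(std::move(description)) {}
  Symtab &GetSymtab() { return symtab_; }
  const Symtab &GetSymtab() const { return symtab_; }
  const std::string &GetDescription() const { return description_; }

private:
  std::string description_;
  Symtab symtab_;
};

// Knows all contributing files, ordered from most to least preferred
// (the ordering policy is an assumption of this sketch).
class SymbolVendor {
public:
  void AddObjectFile(std::shared_ptr<ObjectFile> file) {
    files_.push_back(std::move(file));
  }
  // Returns the match from the most preferred file that has one.
  const Symbol *FindBestSymbol(const std::string &name) const {
    for (const auto &file : files_)
      if (const Symbol *symbol = file->GetSymtab().FindByName(name))
        return symbol;
    return nullptr;
  }

private:
  std::vector<std::shared_ptr<ObjectFile>> files_;
};

// The module owns no symbols itself; it only forwards to the vendor.
class Module {
public:
  explicit Module(SymbolVendor vendor) : vendor_(std::move(vendor)) {}
  const Symbol *FindSymbol(const std::string &name) const {
    return vendor_.FindBestSymbol(name);
  }

private:
  SymbolVendor vendor_;
};
```

In this arrangement a caller that only wants "the best symbol named X" goes through `Module::FindSymbol`, while code working inside a particular file can still call `GetSymtab().AtIndex(i)` on that file and get back exactly the symbol it created.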