[Lldb-commits] [PATCH] D54216: [NativePDB] Improve support for reconstructing a clang AST from PDB debug info

2018-11-08 Thread Zachary Turner via Phabricator via lldb-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL346429: [NativePDB] Higher fidelity reconstruction of AST 
from Debug Info. (authored by zturner, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D54216?vs=172989=173198#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D54216

Files:
  lldb/trunk/lit/SymbolFile/NativePDB/Inputs/ast-reconstruction.lldbinit
  lldb/trunk/lit/SymbolFile/NativePDB/Inputs/function-types-classes.lldbinit
  lldb/trunk/lit/SymbolFile/NativePDB/Inputs/globals-classes.lldbinit
  lldb/trunk/lit/SymbolFile/NativePDB/ast-reconstruction.cpp
  lldb/trunk/lit/SymbolFile/NativePDB/function-types-classes.cpp
  lldb/trunk/lit/SymbolFile/NativePDB/global-classes.cpp
  lldb/trunk/source/Plugins/SymbolFile/NativePDB/PdbUtil.cpp
  lldb/trunk/source/Plugins/SymbolFile/NativePDB/PdbUtil.h
  lldb/trunk/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
  lldb/trunk/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
  llvm/trunk/include/llvm/DebugInfo/CodeView/TypeRecord.h

Index: llvm/trunk/include/llvm/DebugInfo/CodeView/TypeRecord.h
===
--- llvm/trunk/include/llvm/DebugInfo/CodeView/TypeRecord.h
+++ llvm/trunk/include/llvm/DebugInfo/CodeView/TypeRecord.h
@@ -429,6 +429,10 @@
 return (Options & ClassOptions::ForwardReference) != ClassOptions::None;
   }
 
+  bool containsNestedClass() const {
+return (Options & ClassOptions::ContainsNestedClass) != ClassOptions::None;
+  }
+
   bool isScoped() const {
 return (Options & ClassOptions::Scoped) != ClassOptions::None;
   }
Index: lldb/trunk/lit/SymbolFile/NativePDB/ast-reconstruction.cpp
===
--- lldb/trunk/lit/SymbolFile/NativePDB/ast-reconstruction.cpp
+++ lldb/trunk/lit/SymbolFile/NativePDB/ast-reconstruction.cpp
@@ -0,0 +1,131 @@
+// clang-format off
+// REQUIRES: lld
+
+// Test various interesting cases for AST reconstruction.
+// RUN: clang-cl /Z7 /GS- /GR- /c /Fo%t.obj -- %s
+// RUN: lld-link /DEBUG /nodefaultlib /entry:main /OUT:%t.exe /PDB:%t.pdb -- %t.obj
+// RUN: env LLDB_USE_NATIVE_PDB_READER=1 lldb -f %t.exe -s \
+// RUN: %p/Inputs/ast-reconstruction.lldbinit 2>&1 | FileCheck %s
+
+// Test trivial versions of each tag type.
+class TrivialC {};
+struct TrivialS {};
+union TrivialU {};
+enum TrivialE {TE_A};
+
+// Test reconstruction of DeclContext hierarchies.
+namespace A {
+  namespace B {
+template
+struct C {
+  T ABCMember;
+};
+
+// Let's try a template specialization with a different implementation
+template<>
+struct C {
+  void *ABCSpecializationMember;
+};
+  }
+
+  // Let's make sure we can distinguish classes and namespaces.  Also let's try
+  // a non-type template parameter.
+  template
+  struct C {
+class D {
+  int ACDMember = 0;
+  C *CPtr = nullptr;
+};
+  };
+
+  struct D {
+// Let's make a nested class with the same name as another nested class
+// elsewhere, and confirm that they appear in the right DeclContexts in
+// the AST.
+struct E {
+  int ADDMember;
+};
+  };
+}
+
+
+// Let's try an anonymous namespace.
+namespace {
+  template
+  struct Anonymous {
+int AnonymousMember;
+// And a nested class within an anonymous namespace
+struct D {
+  int AnonymousDMember;
+};
+  };
+}
+
+TrivialC TC;
+TrivialS TS;
+TrivialU TU;
+TrivialE TE;
+
+A::B::C ABCInt;
+A::B::C ABCFloat;
+A::B::C ABCVoid;
+
+A::C<0> AC0;
+A::C<-1> ACNeg1;
+
+A::C<0>::D AC0D;
+A::C<-1>::D ACNeg1D;
+A::D AD;
+A::D::E ADE;
+
+// FIXME: Anonymous namespaces aren't working correctly.
+Anonymous AnonInt;
+Anonymous> AnonABCVoid;
+Anonymous>::D AnonABCVoidD;
+
+// FIXME: Enum size isn't being correctly determined.
+// FIXME: Can't read memory for variable values.
+
+// CHECK: (TrivialC) TC = {}
+// CHECK: (TrivialS) TS = {}
+// CHECK: (TrivialU) TU = {}
+// CHECK: (TrivialE) TE = 
+// CHECK: (A::B::C) ABCInt = (ABCMember = )
+// CHECK: (A::B::C) ABCFloat = (ABCMember = )
+// CHECK: (A::B::C) ABCVoid = (ABCSpecializationMember = )
+// CHECK: (A::C<0>) AC0 = {}
+// CHECK: (A::C<-1>) ACNeg1 = {}
+// CHECK: (A::C<0>::D) AC0D = (ACDMember = , CPtr = )
+// CHECK: (A::C<-1>::D) ACNeg1D = (ACDMember = , CPtr = )
+// CHECK: (A::D) AD = {}
+// CHECK: (A::D::E) ADE = (ADDMember = )
+// CHECK: Dumping clang ast for 1 modules.
+// CHECK: TranslationUnitDecl {{.*}}
+// CHECK: |-CXXRecordDecl {{.*}} class TrivialC definition
+// CHECK: |-CXXRecordDecl {{.*}} struct TrivialS definition
+// CHECK: |-CXXRecordDecl {{.*}} union TrivialU definition
+// CHECK: |-EnumDecl {{.*}} TrivialE
+// CHECK: |-NamespaceDecl {{.*}} A
+// CHECK: | |-NamespaceDecl {{.*}} B
+// CHECK: | | |-CXXRecordDecl {{.*}} struct C definition
+// CHECK: | | | `-FieldDecl {{.*}} ABCMember 'int'
+// CHECK: | | |-CXXRecordDecl {{.*}} struct C 

[Lldb-commits] [PATCH] D54216: [NativePDB] Improve support for reconstructing a clang AST from PDB debug info

2018-11-08 Thread Aleksandr Urakov via Phabricator via lldb-commits
aleksandr.urakov added a comment.

In https://reviews.llvm.org/D54216#1291852, @zturner wrote:

> I checked on clang.pdb.  For my local build of LLVM this about 780MB.  It's 
> quite slow in debug build (14 seconds for `ParseSectionContribs` and 60 
> seconds for `PreprocessTpiStream`), but in release build the combined total 
> is less than 2 seconds for both function calls


I think that less than 2 seconds for a 780MB PDB in release is very good! Thank 
you!


https://reviews.llvm.org/D54216



___
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D54216: [NativePDB] Improve support for reconstructing a clang AST from PDB debug info

2018-11-08 Thread Zachary Turner via Phabricator via lldb-commits
zturner added a comment.

In https://reviews.llvm.org/D54216#1291453, @aleksandr.urakov wrote:

> Looks good, thank you!
>
> The only question is performance, haven't you checked how much time takes the 
> preprocessing on huge PDBs? Intuitively it seems that it shouldn't take too 
> much time (n*log(n) where n is the count of LF_NESTTYPE records), but may be 
> you have checked this?


I checked on clang.pdb.  For my local build of LLVM this about 780MB.  It's 
quite slow in debug build (14 seconds for `ParseSectionContribs` and 60 seconds 
for `PreprocessTpiStream`), but in release build the combined total is less 
than 2 seconds for both function calls


https://reviews.llvm.org/D54216



___
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D54216: [NativePDB] Improve support for reconstructing a clang AST from PDB debug info

2018-11-08 Thread Aleksandr Urakov via Phabricator via lldb-commits
aleksandr.urakov accepted this revision.
aleksandr.urakov added a comment.
This revision is now accepted and ready to land.

Looks good, thank you!

The only question is performance, haven't you checked how much time takes the 
preprocessing on huge PDBs? Intuitively it seems that it shouldn't take too 
much time (n*log(n) where n is the count of LF_NESTTYPE records), but may be 
you have checked this?


https://reviews.llvm.org/D54216



___
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [PATCH] D54216: [NativePDB] Improve support for reconstructing a clang AST from PDB debug info

2018-11-07 Thread Zachary Turner via Phabricator via lldb-commits
zturner created this revision.
zturner added reviewers: aleksandr.urakov, labath, lemo.
Herald added subscribers: erik.pilkington, JDevlieghere, aprantl.

This is an alternative to https://reviews.llvm.org/D54053 which uses a 
different approach.  The goal of both is the same - to be able to improve the 
quality of the AST that we reconstruct when parsing the debug info.  
https://reviews.llvm.org/D54053 attempts to address this by demangling the 
unique name of each type, and using the structure of the demangler's AST to try 
to reconstruct a clang AST.

However, there are some complications with this approach.  The two biggest ones 
are:

a) The mangling does not always provide enough information to disambiguate 
between two types, depending on where it occurs in the mangling.  
b) The mangling provides no way to differentiate outer classes from outer 
namespaces, so in `A::B::C`, we don't know if `A` and `B` are (class, class), 
(namespace, namespace), or (namespace, class).

b) sounds like it could be an unimportant distinction, but since LLDB works by 
gradually building up an AST over time that grows as more and more debug info 
is parsed, you can very quickly end up in a situation where there are 
ambiguities in your AST.  For example, you may decide that `B` is probably a 
namespace, so you create a `NamespaceDecl` for it in the AST, and then later 
someone instatiates a variable of type `A::B` and you have precise debug info 
telling you it's a class.  This will create two decls at the same scope in the 
AST hierarchy with the same name, causing ambiguities and these will slowly 
build up over time leading to instability.

The approach here is based off of the observation that the PDB contains 
information about nested classes in the parent -> child direction, just not the 
other way around.  That is to say, if you have code such as: `struct A { struct 
B {}; };`  Then the debug info record for `A` will tell you that it contains a 
nested type call `B`, along with an index for the full definition of `B` in the 
debug info.  The problem we are facing all along is that if someone declares a 
variable of type `A::B`, they need the reverse mapping, and PDB doesn't offer 
that.

So, the simple solution employed here is to simply pre-process all types up 
front and build the reverse mapping.  This gives us perfect information about 
class hierarchy, and allows us to precisely determine if a part of a scope is a 
namespace (specifically, it will have no parent in the reverse mapping).

But we can even re-purpose this pre-processing step for other things down the 
line.  For example, we may wish to find all types name `Foo`, but maybe `Foo` 
is a template and the instantion is `Foo`.  We could use this 
pre-processing step to build this kind of hash table.  And many other things as 
well.

Note that the idea of demangling a type and using the structured demangler AST 
is not totally abandoned.  For example, if you have a template instantiation 
named `Foo`, the patch here will simply create a class with the name 
`Foo`.  In other words, we make no attempt to parse template parameters 
and create the appropriate instantiations in the AST.

We also do not yet handle scoped classes (i.e. classes that are defined inside 
the body of a funtion).  But we can handle those later.

Note that I started adding a new kind of test, an ast test.  I even retrofitted 
existing tests with ast testing functionality.  I think this is a useful 
testing strategy to ensure we are generating correct ASTs from debug info.


https://reviews.llvm.org/D54216

Files:
  lldb/lit/SymbolFile/NativePDB/Inputs/ast-reconstruction.lldbinit
  lldb/lit/SymbolFile/NativePDB/Inputs/function-types-classes.lldbinit
  lldb/lit/SymbolFile/NativePDB/Inputs/globals-classes.lldbinit
  lldb/lit/SymbolFile/NativePDB/ast-reconstruction.cpp
  lldb/lit/SymbolFile/NativePDB/function-types-classes.cpp
  lldb/lit/SymbolFile/NativePDB/global-classes.cpp
  lldb/source/Plugins/SymbolFile/NativePDB/PdbUtil.cpp
  lldb/source/Plugins/SymbolFile/NativePDB/PdbUtil.h
  lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
  lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
  llvm/include/llvm/DebugInfo/CodeView/TypeRecord.h

Index: llvm/include/llvm/DebugInfo/CodeView/TypeRecord.h
===
--- llvm/include/llvm/DebugInfo/CodeView/TypeRecord.h
+++ llvm/include/llvm/DebugInfo/CodeView/TypeRecord.h
@@ -429,6 +429,10 @@
 return (Options & ClassOptions::ForwardReference) != ClassOptions::None;
   }
 
+  bool containsNestedClass() const {
+return (Options & ClassOptions::ContainsNestedClass) != ClassOptions::None;
+  }
+
   bool isScoped() const {
 return (Options & ClassOptions::Scoped) != ClassOptions::None;
   }
Index: lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
===
---