[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-30 Thread Haojian Wu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG9f865246a817: [ASTMatcher] Fix a performance regression: 
memorize the child match. (authored by hokein).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82771/new/

https://reviews.llvm.org/D82771

Files:
  clang/lib/ASTMatchers/ASTMatchFinder.cpp


Index: clang/lib/ASTMatchers/ASTMatchFinder.cpp
===
--- clang/lib/ASTMatchers/ASTMatchFinder.cpp
+++ clang/lib/ASTMatchers/ASTMatchFinder.cpp
@@ -45,7 +45,9 @@
 
 enum class MatchType {
   Ancestors,
-  Descendants
+
+  Descendants,
+  Child,
 };
 
 // We use memoization to avoid running the same matcher on the same
@@ -452,8 +454,7 @@
   BoundNodesTreeBuilder *Builder, int MaxDepth,
   TraversalKind Traversal, BindKind Bind) {
 // For AST-nodes that don't have an identity, we can't memoize.
-// When doing a single-level match, we don't need to memoize
-if (!Node.getMemoizationData() || !Builder->isComparable() || MaxDepth == 
1)
+if (!Node.getMemoizationData() || !Builder->isComparable())
   return matchesRecursively(Node, Matcher, Builder, MaxDepth, Traversal,
 Bind);
 
@@ -463,7 +464,8 @@
 // Note that we key on the bindings *before* the match.
 Key.BoundNodes = *Builder;
 Key.Traversal = Ctx.getParentMapContext().getTraversalKind();
-Key.Type = MatchType::Descendants;
+// Memoize result even doing a single-level match, it might be expensive.
+Key.Type = MaxDepth == 1 ? MatchType::Child : MatchType::Descendants;
 MemoizationMap::iterator I = ResultCache.find(Key);
 if (I != ResultCache.end()) {
   *Builder = I->second.Nodes;
@@ -700,8 +702,10 @@
 BoundNodesTreeBuilder *Builder,
 AncestorMatchMode MatchMode) {
 // For AST-nodes that don't have an identity, we can't memoize.
-// When doing a single-level match, we don't need to memoize
-if (!Builder->isComparable() || MatchMode == 
AncestorMatchMode::AMM_ParentOnly)
+// When doing a single-level match, we don't need to memoize because
+// ParentMap (in ASTContext) already memoizes the result.
+if (!Builder->isComparable() ||
+MatchMode == AncestorMatchMode::AMM_ParentOnly)
   return matchesAncestorOfRecursively(Node, Ctx, Matcher, Builder,
   MatchMode);
 


Index: clang/lib/ASTMatchers/ASTMatchFinder.cpp
===
--- clang/lib/ASTMatchers/ASTMatchFinder.cpp
+++ clang/lib/ASTMatchers/ASTMatchFinder.cpp
@@ -45,7 +45,9 @@
 
 enum class MatchType {
   Ancestors,
-  Descendants
+
+  Descendants,
+  Child,
 };
 
 // We use memoization to avoid running the same matcher on the same
@@ -452,8 +454,7 @@
   BoundNodesTreeBuilder *Builder, int MaxDepth,
   TraversalKind Traversal, BindKind Bind) {
 // For AST-nodes that don't have an identity, we can't memoize.
-// When doing a single-level match, we don't need to memoize
-if (!Node.getMemoizationData() || !Builder->isComparable() || MaxDepth == 1)
+if (!Node.getMemoizationData() || !Builder->isComparable())
   return matchesRecursively(Node, Matcher, Builder, MaxDepth, Traversal,
 Bind);
 
@@ -463,7 +464,8 @@
 // Note that we key on the bindings *before* the match.
 Key.BoundNodes = *Builder;
 Key.Traversal = Ctx.getParentMapContext().getTraversalKind();
-Key.Type = MatchType::Descendants;
+// Memoize result even doing a single-level match, it might be expensive.
+Key.Type = MaxDepth == 1 ? MatchType::Child : MatchType::Descendants;
 MemoizationMap::iterator I = ResultCache.find(Key);
 if (I != ResultCache.end()) {
   *Builder = I->second.Nodes;
@@ -700,8 +702,10 @@
 BoundNodesTreeBuilder *Builder,
 AncestorMatchMode MatchMode) {
 // For AST-nodes that don't have an identity, we can't memoize.
-// When doing a single-level match, we don't need to memoize
-if (!Builder->isComparable() || MatchMode == AncestorMatchMode::AMM_ParentOnly)
+// When doing a single-level match, we don't need to memoize because
+// ParentMap (in ASTContext) already memoizes the result.
+if (!Builder->isComparable() ||
+MatchMode == AncestorMatchMode::AMM_ParentOnly)
   return matchesAncestorOfRecursively(Node, Ctx, Matcher, Builder,
   MatchMode);
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-30 Thread Manuel Klimek via Phabricator via cfe-commits
klimek accepted this revision.
klimek added a comment.
This revision is now accepted and ready to land.

LG




Comment at: clang/lib/ASTMatchers/ASTMatchFinder.cpp:468
+// Memoize result even doing a single-level match, it might be expensive.
+Key.Type = MaxDepth == 1 ? MatchType::Child : MatchType::Descendants;
 MemoizationMap::iterator I = ResultCache.find(Key);

sammccall wrote:
> sammccall wrote:
> > This line looks a lot like a cache-poisoning bug, until realizing that 
> > MaxDepth is never actually used as an integer, it's a boolean - 1 means 
> > non-recursive and 0 means recursive. (Note that the recursive call passes 
> > MaxDepth, not MaxDepth-1).
> > I guess we should just fix the type, actually bounding the depth doesn't 
> > seem likely to be done anytime soon.
> er, 1 means non-recursive and anything else means recursive...
That sounds like a historical reason - feel free to change :) (I don't even 
remember how that came to be).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82771/new/

https://reviews.llvm.org/D82771



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-30 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

(I can't quite follow the original reason for disabling memoization in the 
non-recursive case, so this seems reasonable but going to leave that to Manuel 
who knew the argument)




Comment at: clang/lib/ASTMatchers/ASTMatchFinder.cpp:468
+// Memoize result even doing a single-level match, it might be expensive.
+Key.Type = MaxDepth == 1 ? MatchType::Child : MatchType::Descendants;
 MemoizationMap::iterator I = ResultCache.find(Key);

This line looks a lot like a cache-poisoning bug, until realizing that MaxDepth 
is never actually used as an integer, it's a boolean - 1 means non-recursive 
and 0 means recursive. (Note that the recursive call passes MaxDepth, not 
MaxDepth-1).
I guess we should just fix the type, actually bounding the depth doesn't seem 
likely to be done anytime soon.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82771/new/

https://reviews.llvm.org/D82771



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-30 Thread Sam McCall via Phabricator via cfe-commits
sammccall added inline comments.



Comment at: clang/lib/ASTMatchers/ASTMatchFinder.cpp:468
+// Memoize result even doing a single-level match, it might be expensive.
+Key.Type = MaxDepth == 1 ? MatchType::Child : MatchType::Descendants;
 MemoizationMap::iterator I = ResultCache.find(Key);

sammccall wrote:
> This line looks a lot like a cache-poisoning bug, until realizing that 
> MaxDepth is never actually used as an integer, it's a boolean - 1 means 
> non-recursive and 0 means recursive. (Note that the recursive call passes 
> MaxDepth, not MaxDepth-1).
> I guess we should just fix the type, actually bounding the depth doesn't seem 
> likely to be done anytime soon.
er, 1 means non-recursive and anything else means recursive...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82771/new/

https://reviews.llvm.org/D82771



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-30 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D82771#2121924 , @hokein wrote:

> In D82771#2120214 , @klimek wrote:
>
> > In what situation are we calling child matchers repeatedly with the same 
> > matcher on the same node?
>
>
> I guess a pattern like 
> https://github.com/llvm/llvm-project/blob/master/clang-tools-extra/clang-tidy/readability/ContainerSizeEmptyCheck.cpp#L29-L40?


Ah, so this is when we go hasDeclaration.

I'm wondering whether what we really want to memoize are matchers that go 
across the AST, like hasType, hasDeclaration, etc, where we can expect a linear 
number of nodes hitting that  matcher.

hasChild specifically seems weird to memoize when we don't memoize other has 
matchers.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82771/new/

https://reviews.llvm.org/D82771



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-30 Thread Haojian Wu via Phabricator via cfe-commits
hokein updated this revision to Diff 274347.
hokein added a comment.

Address comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82771/new/

https://reviews.llvm.org/D82771

Files:
  clang/lib/ASTMatchers/ASTMatchFinder.cpp


Index: clang/lib/ASTMatchers/ASTMatchFinder.cpp
===
--- clang/lib/ASTMatchers/ASTMatchFinder.cpp
+++ clang/lib/ASTMatchers/ASTMatchFinder.cpp
@@ -45,7 +45,9 @@
 
 enum class MatchType {
   Ancestors,
-  Descendants
+
+  Descendants,
+  Child,
 };
 
 // We use memoization to avoid running the same matcher on the same
@@ -452,8 +454,7 @@
   BoundNodesTreeBuilder *Builder, int MaxDepth,
   TraversalKind Traversal, BindKind Bind) {
 // For AST-nodes that don't have an identity, we can't memoize.
-// When doing a single-level match, we don't need to memoize
-if (!Node.getMemoizationData() || !Builder->isComparable() || MaxDepth == 
1)
+if (!Node.getMemoizationData() || !Builder->isComparable())
   return matchesRecursively(Node, Matcher, Builder, MaxDepth, Traversal,
 Bind);
 
@@ -463,7 +464,8 @@
 // Note that we key on the bindings *before* the match.
 Key.BoundNodes = *Builder;
 Key.Traversal = Ctx.getParentMapContext().getTraversalKind();
-Key.Type = MatchType::Descendants;
+// Memoize result even doing a single-level match, it might be expensive.
+Key.Type = MaxDepth == 1 ? MatchType::Child : MatchType::Descendants;
 MemoizationMap::iterator I = ResultCache.find(Key);
 if (I != ResultCache.end()) {
   *Builder = I->second.Nodes;
@@ -700,8 +702,10 @@
 BoundNodesTreeBuilder *Builder,
 AncestorMatchMode MatchMode) {
 // For AST-nodes that don't have an identity, we can't memoize.
-// When doing a single-level match, we don't need to memoize
-if (!Builder->isComparable() || MatchMode == 
AncestorMatchMode::AMM_ParentOnly)
+// When doing a single-level match, we don't need to memoize because
+// ParentMap (in ASTContext) already memoizes the result.
+if (!Builder->isComparable() ||
+MatchMode == AncestorMatchMode::AMM_ParentOnly)
   return matchesAncestorOfRecursively(Node, Ctx, Matcher, Builder,
   MatchMode);
 


Index: clang/lib/ASTMatchers/ASTMatchFinder.cpp
===
--- clang/lib/ASTMatchers/ASTMatchFinder.cpp
+++ clang/lib/ASTMatchers/ASTMatchFinder.cpp
@@ -45,7 +45,9 @@
 
 enum class MatchType {
   Ancestors,
-  Descendants
+
+  Descendants,
+  Child,
 };
 
 // We use memoization to avoid running the same matcher on the same
@@ -452,8 +454,7 @@
   BoundNodesTreeBuilder *Builder, int MaxDepth,
   TraversalKind Traversal, BindKind Bind) {
 // For AST-nodes that don't have an identity, we can't memoize.
-// When doing a single-level match, we don't need to memoize
-if (!Node.getMemoizationData() || !Builder->isComparable() || MaxDepth == 1)
+if (!Node.getMemoizationData() || !Builder->isComparable())
   return matchesRecursively(Node, Matcher, Builder, MaxDepth, Traversal,
 Bind);
 
@@ -463,7 +464,8 @@
 // Note that we key on the bindings *before* the match.
 Key.BoundNodes = *Builder;
 Key.Traversal = Ctx.getParentMapContext().getTraversalKind();
-Key.Type = MatchType::Descendants;
+// Memoize result even doing a single-level match, it might be expensive.
+Key.Type = MaxDepth == 1 ? MatchType::Child : MatchType::Descendants;
 MemoizationMap::iterator I = ResultCache.find(Key);
 if (I != ResultCache.end()) {
   *Builder = I->second.Nodes;
@@ -700,8 +702,10 @@
 BoundNodesTreeBuilder *Builder,
 AncestorMatchMode MatchMode) {
 // For AST-nodes that don't have an identity, we can't memoize.
-// When doing a single-level match, we don't need to memoize
-if (!Builder->isComparable() || MatchMode == AncestorMatchMode::AMM_ParentOnly)
+// When doing a single-level match, we don't need to memoize because
+// ParentMap (in ASTContext) already memoizes the result.
+if (!Builder->isComparable() ||
+MatchMode == AncestorMatchMode::AMM_ParentOnly)
   return matchesAncestorOfRecursively(Node, Ctx, Matcher, Builder,
   MatchMode);
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-30 Thread Haojian Wu via Phabricator via cfe-commits
hokein marked an inline comment as done.
hokein added a comment.

In D82771#2120214 , @klimek wrote:

> In what situation are we calling child matchers repeatedly with the same 
> matcher on the same node?


I guess a pattern like 
https://github.com/llvm/llvm-project/blob/master/clang-tools-extra/clang-tidy/readability/ContainerSizeEmptyCheck.cpp#L29-L40?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82771/new/

https://reviews.llvm.org/D82771



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-29 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/ASTMatchers/ASTMatchFinder.cpp:467
 Key.Traversal = Ctx.getParentMapContext().getTraversalKind();
-Key.Type = MatchType::Descendants;
+// Memorize result even doing a single-level match, it might be expensive.
+Key.Type = MaxDepth == 1 ? MatchType::Child : MatchType::Descendants;

Here and below: s/memorize/memoize/


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82771/new/

https://reviews.llvm.org/D82771



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-29 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In what situation are we calling child matchers repeatedly with the same 
matcher on the same node?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82771/new/

https://reviews.llvm.org/D82771



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-29 Thread Haojian Wu via Phabricator via cfe-commits
hokein created this revision.
hokein added reviewers: klimek, sammccall.
Herald added a project: clang.

D80025  introduced a performance regression: 
in some cases, it makes
clang-tidy readability-container-size-empty ~80x slower (running on an internal
huge TU, before that patch 12s vs after 950s).

after this patch, we go back to 12s.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82771

Files:
  clang/lib/ASTMatchers/ASTMatchFinder.cpp


Index: clang/lib/ASTMatchers/ASTMatchFinder.cpp
===
--- clang/lib/ASTMatchers/ASTMatchFinder.cpp
+++ clang/lib/ASTMatchers/ASTMatchFinder.cpp
@@ -45,7 +45,9 @@
 
 enum class MatchType {
   Ancestors,
-  Descendants
+
+  Descendants,
+  Child,
 };
 
 // We use memoization to avoid running the same matcher on the same
@@ -452,8 +454,7 @@
   BoundNodesTreeBuilder *Builder, int MaxDepth,
   TraversalKind Traversal, BindKind Bind) {
 // For AST-nodes that don't have an identity, we can't memoize.
-// When doing a single-level match, we don't need to memoize
-if (!Node.getMemoizationData() || !Builder->isComparable() || MaxDepth == 
1)
+if (!Node.getMemoizationData() || !Builder->isComparable())
   return matchesRecursively(Node, Matcher, Builder, MaxDepth, Traversal,
 Bind);
 
@@ -463,7 +464,8 @@
 // Note that we key on the bindings *before* the match.
 Key.BoundNodes = *Builder;
 Key.Traversal = Ctx.getParentMapContext().getTraversalKind();
-Key.Type = MatchType::Descendants;
+// Memorize result even doing a single-level match, it might be expensive.
+Key.Type = MaxDepth == 1 ? MatchType::Child : MatchType::Descendants;
 MemoizationMap::iterator I = ResultCache.find(Key);
 if (I != ResultCache.end()) {
   *Builder = I->second.Nodes;
@@ -700,8 +702,10 @@
 BoundNodesTreeBuilder *Builder,
 AncestorMatchMode MatchMode) {
 // For AST-nodes that don't have an identity, we can't memoize.
-// When doing a single-level match, we don't need to memoize
-if (!Builder->isComparable() || MatchMode == 
AncestorMatchMode::AMM_ParentOnly)
+// When doing a single-level match, we don't need to memoize because
+// ParentMap (in ASTContext) already memorizes the result.
+if (!Builder->isComparable() ||
+MatchMode == AncestorMatchMode::AMM_ParentOnly)
   return matchesAncestorOfRecursively(Node, Ctx, Matcher, Builder,
   MatchMode);
 


Index: clang/lib/ASTMatchers/ASTMatchFinder.cpp
===
--- clang/lib/ASTMatchers/ASTMatchFinder.cpp
+++ clang/lib/ASTMatchers/ASTMatchFinder.cpp
@@ -45,7 +45,9 @@
 
 enum class MatchType {
   Ancestors,
-  Descendants
+
+  Descendants,
+  Child,
 };
 
 // We use memoization to avoid running the same matcher on the same
@@ -452,8 +454,7 @@
   BoundNodesTreeBuilder *Builder, int MaxDepth,
   TraversalKind Traversal, BindKind Bind) {
 // For AST-nodes that don't have an identity, we can't memoize.
-// When doing a single-level match, we don't need to memoize
-if (!Node.getMemoizationData() || !Builder->isComparable() || MaxDepth == 1)
+if (!Node.getMemoizationData() || !Builder->isComparable())
   return matchesRecursively(Node, Matcher, Builder, MaxDepth, Traversal,
 Bind);
 
@@ -463,7 +464,8 @@
 // Note that we key on the bindings *before* the match.
 Key.BoundNodes = *Builder;
 Key.Traversal = Ctx.getParentMapContext().getTraversalKind();
-Key.Type = MatchType::Descendants;
+// Memorize result even doing a single-level match, it might be expensive.
+Key.Type = MaxDepth == 1 ? MatchType::Child : MatchType::Descendants;
 MemoizationMap::iterator I = ResultCache.find(Key);
 if (I != ResultCache.end()) {
   *Builder = I->second.Nodes;
@@ -700,8 +702,10 @@
 BoundNodesTreeBuilder *Builder,
 AncestorMatchMode MatchMode) {
 // For AST-nodes that don't have an identity, we can't memoize.
-// When doing a single-level match, we don't need to memoize
-if (!Builder->isComparable() || MatchMode == AncestorMatchMode::AMM_ParentOnly)
+// When doing a single-level match, we don't need to memoize because
+// ParentMap (in ASTContext) already memorizes the result.
+if (!Builder->isComparable() ||
+MatchMode == AncestorMatchMode::AMM_ParentOnly)
   return matchesAncestorOfRecursively(Node, Ctx, Matcher, Builder,
   MatchMode);
 
___