[PATCH] D131707: [analyzer][NFC] Cache the result of getLocationType in TypedValueRegion

2022-08-30 Thread Gabor Marton via Phabricator via cfe-commits
martong added a comment.

> So I think the most valuable optimizations are low-level optimizations to 
> `ImmutableMap`. There were a few suggestions on the mailing list to use 
> something more modern than the AVL trees under the hood but I don't think 
> authors found much success with those.

Back then I was experimenting 

 by replacing some of our `ImmutableMap` instances with an alternative 
implementation. I've started with `Environment::ExprBindings` because profiling 
showed that as the most frequently used. I've chosen `immer::map` as an 
alternative immutable map implementation. It seemed to be very promising 
because of

- the applied Relaxed Radix Balanced Tree implementation 

- efficient batch manipulations 
 .

Results: I could make the LIT tests pass, except two checks. The performance 
was not promising though. The run-time was similar that we had with the AVL 
trees, however, the memory consumption was 5-10x higher! Consequently, I've 
abandoned the prototype. But, I think an **//efficient batch manipulation 
implementation with the existing AVL based//** `ImmutableMap` might be worth to 
experiment with further.

> - From high-level perspective almost half the time is spent in mark-and-sweep 
> garbage collection in `ExprEngine::removeDead()`. Which can't really be cut 
> as running `removeDead()` less often only increases the analysis time as it 
> makes the maps larger and therefore slower to access. And it's also hard to 
> optimize because the procedure is very complicated and fragile and very few 
> people actually understand how it works.

Yes, this is also aligned with my measurements. There are analyzer options to 
manipulate when the `removeDead` is called, before every Stmt, or before every 
basic block, or never. Actually, the last option is meaningless, because 
renders both the engine and many checkers faulty. The `before every basic 
block` option might still be worth to experiment with, but I could not find 
reasonable arguments to change to that yet (I cannot recall, but some lit tests 
were failing miserably with that option too).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131707/new/

https://reviews.llvm.org/D131707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D131707: [analyzer][NFC] Cache the result of getLocationType in TypedValueRegion

2022-08-15 Thread Gábor Horváth via Phabricator via cfe-commits
xazax.hun added a comment.

In D131707#3724747 , @NoQ wrote:

> So I think the most valuable optimizations are low-level optimizations to 
> `ImmutableMap`. There were a few suggestions on the mailing list to use 
> something more modern than the AVL trees under the hood but I don't think 
> authors found much success with those.

I am not sure how optimized our `ImmutableMap` is, I would not be surprised if 
there were some small gains here and there. I don't remember the entire 
discussion, but I think some of the mentioned alternative data structures are:

- Patricia Trees, see C. Okasaki, A. Gill. Fast Mergeable Integer Maps. In 
Workshop on ML (1998)
- Fingre Trees, see https://en.wikipedia.org/wiki/Finger_tree
- Abandon the concept of trees and implement an immutable HashMap

Just wanted to dump all of this, in case there are some takers :)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131707/new/

https://reviews.llvm.org/D131707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D131707: [analyzer][NFC] Cache the result of getLocationType in TypedValueRegion

2022-08-15 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

Hmm at a glance I disagree with the FIXME, this doesn't look like a valuable 
optimization. This operation is typically constant-time anyway so it's probably 
not worth the memory impact and the added complexity.

The easiest way to roughly estimate impact is to take `sqlite3.c` (which is a 
famous large-ish project that gets compiled as a single huge file to ensure 
that optimizations have access to the entire program) and measure time and 
memory with UNIX `time`. Not very representative but very easy to execute. If 
it shows promising results I'm open to further investigation in this area.

I also recommend consulting a performance profiler before implementing 
optimizations, to avoid premature optimizations. IIUC the picture hasn't 
changed much in the recent years:

- From low-level perspective the majority of time is spent accessing and 
manipulating `ImmutableMap`s and other immutable data structures in the program 
state;
- From high-level perspective almost half the time is spent in mark-and-sweep 
garbage collection in `ExprEngine::removeDead()`. Which can't really be cut as 
running `removeDead()` less often only increases the analysis time as it makes 
the maps larger and therefore slower to access. And it's also hard to optimize 
because the procedure is very complicated and fragile and very few people 
actually understand how it works.

So I think the most valuable optimizations are low-level optimizations to 
`ImmutableMap`. There were a few suggestions on the mailing list to use 
something more modern than the AVL trees under the hood but I don't think 
authors found much success with those.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131707/new/

https://reviews.llvm.org/D131707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D131707: [analyzer][NFC] Cache the result of getLocationType in TypedValueRegion

2022-08-15 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a subscriber: balazske.
steakhal added a comment.

In D131707#3723403 , @ASDenysPetrov 
wrote:

> @steakhal
> Thank you for the review.
> Could you please do testing for me, since I'm on Windows and have no prepared 
> testing environment as I know you do.

ATM it seems like I don't have the bandwidth. What I meant is choosing the 
right projects to test etc.
Maybe @balazske could give some helping hands. AFAIK it shouldn't be hard for 
him to enqueue a benchmark.
What do you say @balazske, could you please have a differential measurement in 
release configuration on the test set demonstrating the performance gain of 
this patch compared to the baseline?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131707/new/

https://reviews.llvm.org/D131707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D131707: [analyzer][NFC] Cache the result of getLocationType in TypedValueRegion

2022-08-15 Thread Denys Petrov via Phabricator via cfe-commits
ASDenysPetrov added a comment.

In D131707#3723403 , @ASDenysPetrov 
wrote:

> @steakhal
> Thank you for the review.
> Could you please do testing for me, since I'm on Windows and have no prepared 
> testing environment as I know you do.
> I'll add an assertion.




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131707/new/

https://reviews.llvm.org/D131707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D131707: [analyzer][NFC] Cache the result of getLocationType in TypedValueRegion

2022-08-15 Thread Denys Petrov via Phabricator via cfe-commits
ASDenysPetrov updated this revision to Diff 452689.
ASDenysPetrov added a comment.

Add assertion accroding to the remark.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131707/new/

https://reviews.llvm.org/D131707

Files:
  clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h


Index: clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
===
--- clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
+++ clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
@@ -529,6 +529,8 @@
 
 /// TypedValueRegion - An abstract class representing regions having a typed 
value.
 class TypedValueRegion : public TypedRegion {
+  mutable QualType CachedLocationType;
+
   void anchor() override;
 
 protected:
@@ -540,12 +542,15 @@
   virtual QualType getValueType() const = 0;
 
   QualType getLocationType() const override {
-// FIXME: We can possibly optimize this later to cache this value.
-QualType T = getValueType();
-ASTContext  = getContext();
-if (T->getAs())
-  return ctx.getObjCObjectPointerType(T);
-return ctx.getPointerType(getValueType());
+if (CachedLocationType.isNull()) {
+  QualType T = getValueType();
+  ASTContext  = getContext();
+  CachedLocationType = T->isObjCObjectType() ? 
C.getObjCObjectPointerType(T)
+ : C.getPointerType(T);
+  assert(!CachedLocationType.isNull() &&
+ "Cached value supposed to be non-null.");
+}
+return CachedLocationType;
   }
 
   QualType getDesugaredValueType(ASTContext ) const {


Index: clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
===
--- clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
+++ clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
@@ -529,6 +529,8 @@
 
 /// TypedValueRegion - An abstract class representing regions having a typed value.
 class TypedValueRegion : public TypedRegion {
+  mutable QualType CachedLocationType;
+
   void anchor() override;
 
 protected:
@@ -540,12 +542,15 @@
   virtual QualType getValueType() const = 0;
 
   QualType getLocationType() const override {
-// FIXME: We can possibly optimize this later to cache this value.
-QualType T = getValueType();
-ASTContext  = getContext();
-if (T->getAs())
-  return ctx.getObjCObjectPointerType(T);
-return ctx.getPointerType(getValueType());
+if (CachedLocationType.isNull()) {
+  QualType T = getValueType();
+  ASTContext  = getContext();
+  CachedLocationType = T->isObjCObjectType() ? C.getObjCObjectPointerType(T)
+ : C.getPointerType(T);
+  assert(!CachedLocationType.isNull() &&
+ "Cached value supposed to be non-null.");
+}
+return CachedLocationType;
   }
 
   QualType getDesugaredValueType(ASTContext ) const {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D131707: [analyzer][NFC] Cache the result of getLocationType in TypedValueRegion

2022-08-15 Thread Denys Petrov via Phabricator via cfe-commits
ASDenysPetrov added a comment.

@steakhal
Thank you for the review.
Could you please do testing for me, since I'm on Windows and have no prepared 
testing environment as I know you are.
I'll add an assertion.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131707/new/

https://reviews.llvm.org/D131707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D131707: [analyzer][NFC] Cache the result of getLocationType in TypedValueRegion

2022-08-15 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Please measure the performance implications/gains of this patch to justify this 
change.
Preferably test it on multiple kinds of projects ranging from old-school C and 
even some modern C++ projects.
Let me know your findings and if you need some help setting it up.




Comment at: 
clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h:549-550
+  CachedLocationType = T->isObjCObjectType() ? 
C.getObjCObjectPointerType(T)
+ : C.getPointerType(T);
+}
+return CachedLocationType;

I would appreciate it if you assert that the resulting type cannot be `null`, 
which would imply that if `CachedLocationType.isNull()` means that we haven't 
cached it yet.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131707/new/

https://reviews.llvm.org/D131707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D131707: [analyzer]{NFC} Cache the result of getLocationType in TypedValueRegion

2022-08-11 Thread Denys Petrov via Phabricator via cfe-commits
ASDenysPetrov created this revision.
ASDenysPetrov added reviewers: steakhal, martong, NoQ, xazax.hun, isuckatcs.
ASDenysPetrov added a project: clang.
Herald added subscribers: manas, dkrupp, donat.nagy, Szelethus, 
mikhail.ramalho, a.sidorin, rnkovacs, szepet, baloghadamsoftware.
Herald added a project: All.
ASDenysPetrov requested review of this revision.
Herald added a subscriber: cfe-commits.

Fix FIXME. Produce `QualType` once in `getLocationType` at the first call. Then 
cache the result and return it for the next calls.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D131707

Files:
  clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h


Index: clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
===
--- clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
+++ clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
@@ -529,6 +529,8 @@
 
 /// TypedValueRegion - An abstract class representing regions having a typed 
value.
 class TypedValueRegion : public TypedRegion {
+  mutable QualType CachedLocationType;
+
   void anchor() override;
 
 protected:
@@ -540,12 +542,13 @@
   virtual QualType getValueType() const = 0;
 
   QualType getLocationType() const override {
-// FIXME: We can possibly optimize this later to cache this value.
-QualType T = getValueType();
-ASTContext  = getContext();
-if (T->getAs())
-  return ctx.getObjCObjectPointerType(T);
-return ctx.getPointerType(getValueType());
+if (CachedLocationType.isNull()) {
+  QualType T = getValueType();
+  ASTContext  = getContext();
+  CachedLocationType = T->isObjCObjectType() ? 
C.getObjCObjectPointerType(T)
+ : C.getPointerType(T);
+}
+return CachedLocationType;
   }
 
   QualType getDesugaredValueType(ASTContext ) const {


Index: clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
===
--- clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
+++ clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
@@ -529,6 +529,8 @@
 
 /// TypedValueRegion - An abstract class representing regions having a typed value.
 class TypedValueRegion : public TypedRegion {
+  mutable QualType CachedLocationType;
+
   void anchor() override;
 
 protected:
@@ -540,12 +542,13 @@
   virtual QualType getValueType() const = 0;
 
   QualType getLocationType() const override {
-// FIXME: We can possibly optimize this later to cache this value.
-QualType T = getValueType();
-ASTContext  = getContext();
-if (T->getAs())
-  return ctx.getObjCObjectPointerType(T);
-return ctx.getPointerType(getValueType());
+if (CachedLocationType.isNull()) {
+  QualType T = getValueType();
+  ASTContext  = getContext();
+  CachedLocationType = T->isObjCObjectType() ? C.getObjCObjectPointerType(T)
+ : C.getPointerType(T);
+}
+return CachedLocationType;
   }
 
   QualType getDesugaredValueType(ASTContext ) const {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits