[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-15 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a comment.

In D149733#4342705 , @kadircet wrote:

> In D149733#4342300 , @aaron.ballman 
> wrote:
>
>> But at the same time, it's become more obvious (at least to me) that clangd 
>> has features that don't work with all of the invariants in Clang and I don't 
>> know that we ever really stopped to figure out whether that's reasonable or 
>> not.
>
> Just to be clear, this behaviour we observe is not clangd specific. it's 
> triggered through code completion callbacks that's implemented through Sema, 
> which is used by clangd and libclang but are not part of clangd.
> You can basically trigger the same behaviour using `-Xclang 
> -code-completion-at=xxx`, if you trigger it while parsing the lambda you 
> receive an incomplete object `COMPLETION: a : [#auto#]a` (which I believe is 
> the part that breaks the invariants of clang AST, but I am happy to go with 
> the perspective that sema code completion doesn't get to rely on AST nodes 
> being complete because it can trigger in the middle of parsing, which I 
> believe implies all of the code completion users needs to deal with 
> consequences) and if you trigger it after parsing the lambda, you get the 
> proper object `COMPLETION: a : [#void#]a(<#Foo#>)[# const#]`.

code completions exist for clangd support, don't they? (We didn't have that 
functionality before we supported clangd, did we?)

> Inside clangd we don't do anything special with regards to feeding code to 
> clang. The only difference is we feed it incomplete code most of the time. 
> The expectation from our side is clang generating a ~okay shaped AST & 
> diagnostics rather than crashes. Any crashes that we trigger in our 
> functionality while traversing that AST, we mostly try to handle inside 
> clangd (when we seem to rely on some incidental contracts). But there are 
> also cases like this, which are not really AST handling on the clangd side 
> but some clang pieces triggering a crash and the only reason why they seem 
> related to clangd is we're just more likely to hit them as regular use-case 
> of clang doesn't involve being triggered at every keystroke :D
> I feel like this approach is quite minimally intrusive but if you disagree, I 
> am happy to follow up and change our approach going forward.

It's good to know that usually you handle this on the clangd side, that makes 
me more comfortable. I've been seeing a fair amount of changes to clang to 
support clangd's and clang-repl's needs for producing ~okay ASTs for malformed 
code, but I typically only review the clang side of those changes, so it's 
possible I'm not seeing how much goes on behind the scenes to avoid making 
changes to clang.

>> That's why I think we need a broader discussion. The problem is not 
>> ideological, it's one of maintainability of the primary product. For 
>> example, the community could decide "it is not Clang's job to be resilient 
>> to this sort of thing". Or we could decide "we need to be resilient to this 
>> but only if it doesn't introduce more than X% overhead". And so on. Each 
>> time we land another one of these whack-a-mole changes, we potentially make 
>> it harder to get to a more principled approach.
>
> Again, this change is only introducing this overhead to USRGeneration which 
> is only used by libclang and clangd, and both of these use cases are suppose 
> to not crash on incomplete code when trying to code-complete.

Today. Within clang. But there are in-tree uses of libclang (like clangd) that 
are impacted by this, and there's ever reason to believe that at some point, 
USRs might wind up hoisted into Clang proper. But your point is still valid 
that for the moment, this is pretty self-contained.

> Hence I don't think the maintenance and overhead arguments do actually apply 
> in this case (I know you're talking about the general case, but I believe we 
> should also be aware of the distinction going forward. As despite not 
> affecting rest of the callers neither code-complexity nor runtime 
> overhead-wise, we still held up this particular patch for ~2 weeks).

Just to make sure we're on the same page: I don't consider it a bug that it 
took ~2 weeks for us to determine if this patch was reasonable or not. Clang 
ships to millions of users; we SHOULD be quite concerned about ancillary tools 
pushing Clang into a worse shape. Those tools are secondary to doing what is 
right for Clang's primary use case, which is a C[++]/Objective-C[++] compiler.

>> (My personal feelings are that invariants about internal object state are 
>> going to be hard for us to change or introduce unwarranted overhead in at 
>> least some circumstances. In those circumstances, I think the onus is on 
>> clangd to determine how to work within those invariants and not on clang to 
>> change unless there isn't another reasonable option. Refactoring or adding 
>> smoke tests can introduce overhead 

[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-15 Thread Kadir Cetinkaya via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG0d1be98a67e2: [clang][USR] Prevent crashes on incomplete 
FunctionDecls (authored by kadircet).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149733/new/

https://reviews.llvm.org/D149733

Files:
  clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
  clang/lib/Index/USRGeneration.cpp


Index: clang/lib/Index/USRGeneration.cpp
===
--- clang/lib/Index/USRGeneration.cpp
+++ clang/lib/Index/USRGeneration.cpp
@@ -226,6 +226,11 @@
   if (ShouldGenerateLocation(D) && GenLoc(D, /*IncludeOffset=*/isLocal(D)))
 return;
 
+  if (D->getType().isNull()) {
+IgnoreResults = true;
+return;
+  }
+
   const unsigned StartSize = Buf.size();
   VisitDeclContext(D->getDeclContext());
   if (Buf.size() == StartSize)
Index: clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
===
--- clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
+++ clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
@@ -4002,6 +4002,19 @@
   EXPECT_EQ(Second.activeParameter, 1);
 }
 
+TEST(CompletionTest, DoNotCrash) {
+  llvm::StringLiteral Cases[] = {
+  R"cpp(
+template  struct Foo {};
+auto a = [x(3)](Foo<^>){};
+)cpp",
+  };
+  for (auto Case : Cases) {
+SCOPED_TRACE(Case);
+auto Completions = completions(Case);
+  }
+}
+
 } // namespace
 } // namespace clangd
 } // namespace clang


Index: clang/lib/Index/USRGeneration.cpp
===
--- clang/lib/Index/USRGeneration.cpp
+++ clang/lib/Index/USRGeneration.cpp
@@ -226,6 +226,11 @@
   if (ShouldGenerateLocation(D) && GenLoc(D, /*IncludeOffset=*/isLocal(D)))
 return;
 
+  if (D->getType().isNull()) {
+IgnoreResults = true;
+return;
+  }
+
   const unsigned StartSize = Buf.size();
   VisitDeclContext(D->getDeclContext());
   if (Buf.size() == StartSize)
Index: clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
===
--- clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
+++ clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
@@ -4002,6 +4002,19 @@
   EXPECT_EQ(Second.activeParameter, 1);
 }
 
+TEST(CompletionTest, DoNotCrash) {
+  llvm::StringLiteral Cases[] = {
+  R"cpp(
+template  struct Foo {};
+auto a = [x(3)](Foo<^>){};
+)cpp",
+  };
+  for (auto Case : Cases) {
+SCOPED_TRACE(Case);
+auto Completions = completions(Case);
+  }
+}
+
 } // namespace
 } // namespace clangd
 } // namespace clang
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-15 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet added a comment.

In D149733#4342300 , @aaron.ballman 
wrote:

> But at the same time, it's become more obvious (at least to me) that clangd 
> has features that don't work with all of the invariants in Clang and I don't 
> know that we ever really stopped to figure out whether that's reasonable or 
> not.

Just to be clear, this behaviour we observe is not clangd specific. it's 
triggered through code completion callbacks that's implemented through Sema, 
which is used by clangd and libclang but are not part of clangd.
You can basically trigger the same behaviour using `-Xclang 
-code-completion-at=xxx`, if you trigger it while parsing the lambda you 
receive an incomplete object `COMPLETION: a : [#auto#]a` (which I believe is 
the part that breaks the invariants of clang AST, but I am happy to go with the 
perspective that sema code completion doesn't get to rely on AST nodes being 
complete because it can trigger in the middle of parsing, which I believe 
implies all of the code completion users needs to deal with consequences) and 
if you trigger it after parsing the lambda, you get the proper object 
`COMPLETION: a : [#void#]a(<#Foo#>)[# const#]`.

Inside clangd we don't do anything special with regards to feeding code to 
clang. The only difference is we feed it incomplete code most of the time. The 
expectation from our side is clang generating a ~okay shaped AST & diagnostics 
rather than crashes. Any crashes that we trigger in our functionality while 
traversing that AST, we mostly try to handle inside clangd (when we seem to 
rely on some incidental contracts). But there are also cases like this, which 
are not really AST handling on the clangd side but some clang pieces triggering 
a crash and the only reason why they seem related to clangd is we're just more 
likely to hit them as regular use-case of clang doesn't involve being triggered 
at every keystroke :D
I feel like this approach is quite minimally intrusive but if you disagree, I 
am happy to follow up and change our approach going forward.

> That's why I think we need a broader discussion. The problem is not 
> ideological, it's one of maintainability of the primary product. For example, 
> the community could decide "it is not Clang's job to be resilient to this 
> sort of thing". Or we could decide "we need to be resilient to this but only 
> if it doesn't introduce more than X% overhead". And so on. Each time we land 
> another one of these whack-a-mole changes, we potentially make it harder to 
> get to a more principled approach.

Again, this change is only introducing this overhead to USRGeneration which is 
only used by libclang and clangd, and both of these use cases are suppose to 
not crash on incomplete code when trying to code-complete. Hence I don't think 
the maintenance and overhead arguments do actually apply in this case (I know 
you're talking about the general case, but I believe we should also be aware of 
the distinction going forward. As despite not affecting rest of the callers 
neither code-complexity nor runtime overhead-wise, we still held up this 
particular patch for ~2 weeks).

> (My personal feelings are that invariants about internal object state are 
> going to be hard for us to change or introduce unwarranted overhead in at 
> least some circumstances. In those circumstances, I think the onus is on 
> clangd to determine how to work within those invariants and not on clang to 
> change unless there isn't another reasonable option. Refactoring or adding 
> smoke tests can introduce overhead that typical compilation scenarios should 
> never have to pay the cost for, and we should avoid that as best we can. But 
> I also realize this adds burden to the clangd folks to have compiler 
> performance statistics for changes or refactoring that relate to invariants 
> which may or may not be reasonable.)

We're actually interested in overall performance of the compiler as well, hence 
I think we'd be fine with tracking implications of such changes whenever we 
feel like clang is the right layer to have them. But to do best of my 
knowledge, we don't really have a great way of testing out effects of such 
changes on compile times. Only useful tool that I know about is 
https://llvm-compile-time-tracker.com/, are there any other sources that 
explain how to run such benchmarks?

> The fact that this is happening 100s of times a day for you suggests we 
> should land the changes in this patch so there's less pressure when having 
> the broader discussion about where the division of labor is between clangd 
> and clang. So LGTM!

Thanks a lot!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149733/new/

https://reviews.llvm.org/D149733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-15 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman accepted this revision.
aaron.ballman added a comment.
This revision is now accepted and ready to land.

In D149733#4342095 , @kadircet wrote:

>> I think we probably should have a broader discussion before moving forward 
>> here. It's not that this isn't incremental progress fixing an issue, but 
>> it's more that this same justification works to add the workaround 200 more 
>> times without ever addressing the underlying architectural concerns.
>
> I can see the desire to fix such issues at a higher level, and not wanting to 
> play whack-a-mole in various places. But that's the reason why this patch 
> specifically increases robustness in a piece of code that already has some 
> logic to deal with invalid code. Rather than core clang pieces, which has 
> different assumptions about invariants of the AST (e.g. not adding the 
> null-check to every getType call in FunctionDecl methods).
> If we're not willing to land such fixes that only add relevant checks to 
> pieces of clang that's suppose to be more robust against invalid code, I 
> don't see how we can have any stable tooling in the short term. Most of the 
> feature work that introduces handling for new language constructs introduces 
> regressions on invalid code as it's not really a concern and never verified. 
> Hence we fix those afterwards as we notice them, fixing the implementation 
> whenever it's feasible or increasing robustness in rest of the pieces when 
> fixing the implementation is infeasible (or implementation is just right and 
> the application was relying on some assurances that were incidentally there 
> before). As we happen to hit some increasing resistance towards these 
> robustness improvement patches, it'd be nice to understand what should be the 
> way to fix them going forward. e.g. treat them as usual crashes, find the 
> offending patch and just revert it with the reproducer?

On the one hand, I agree with you about the fact that this code is already 
trying to be robust against invalid code and thus it's reasonable to add more 
checks for that. No argument from me about that! But at the same time, it's 
become more obvious (at least to me) that clangd has features that don't work 
with all of the invariants in Clang and I don't know that we ever really 
stopped to figure out whether that's reasonable or not. That's why I think we 
need a broader discussion. The problem is not ideological, it's one of 
maintainability of the primary product. For example, the community could decide 
"it is not Clang's job to be resilient to this sort of thing". Or we could 
decide "we need to be resilient to this but only if it doesn't introduce more 
than X% overhead". And so on. Each time we land another one of these 
whack-a-mole changes, we potentially make it harder to get to a more principled 
approach.

(My personal feelings are that invariants about internal object state are going 
to be hard for us to change or introduce unwarranted overhead in at least some 
circumstances. In those circumstances, I think the onus is on clangd to 
determine how to work within those invariants and not on clang to change unless 
there isn't another reasonable option. Refactoring or adding smoke tests can 
introduce overhead that typical compilation scenarios should never have to pay 
the cost for, and we should avoid that as best we can. But I also realize this 
adds burden to the clangd folks to have compiler performance statistics for 
changes or refactoring that relate to invariants which may or may not be 
reasonable.)

>> That said, is this issue blocking further work so you need to land this in 
>> order to make forward progress elsewhere?
>
> This is resulting in crashes in clangd whenever the users trigger code 
> completion in such code patterns (which is ~100s times a day in our 
> production setup, so not rare in practice). 
> So it isn't linked to any bigger task, but rather general robustness/QoL 
> improvements for invalid code handling in clang(d) which are quite important 
> especially on big projects. As a crash could cost minutes of warm-up times 
> because we need to rebuild all of the caches.

The fact that this is happening 100s of times a day for you suggests we should 
land the changes in this patch so there's less pressure when having the broader 
discussion about where the division of labor is between clangd and clang. So 
LGTM!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149733/new/

https://reviews.llvm.org/D149733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-15 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet added a comment.

> I think we probably should have a broader discussion before moving forward 
> here. It's not that this isn't incremental progress fixing an issue, but it's 
> more that this same justification works to add the workaround 200 more times 
> without ever addressing the underlying architectural concerns.

I can see the desire to fix such issues at a higher level, and not wanting to 
play whack-a-mole in various places. But that's the reason why this patch 
specifically increases robustness in a piece of code that already has some 
logic to deal with invalid code. Rather than core clang pieces, which has 
different assumptions about invariants of the AST (e.g. not adding the 
null-check to every getType call in FunctionDecl methods).
If we're not willing to land such fixes that only add relevant checks to pieces 
of clang that's suppose to be more robust against invalid code, I don't see how 
we can have any stable tooling in the short term. Most of the feature work that 
introduces handling for new language constructs introduces regressions on 
invalid code as it's not really a concern and never verified. Hence we fix 
those afterwards as we notice them, fixing the implementation whenever it's 
feasible or increasing robustness in rest of the pieces when fixing the 
implementation is infeasible (or implementation is just right and the 
application was relying on some assurances that were incidentally there 
before). As we happen to hit some increasing resistance towards these 
robustness improvement patches, it'd be nice to understand what should be the 
way to fix them going forward. e.g. treat them as usual crashes, find the 
offending patch and just revert it with the reproducer?

> That said, is this issue blocking further work so you need to land this in 
> order to make forward progress elsewhere?

This is resulting in crashes in clangd whenever the users trigger code 
completion in such code patterns (which is ~100s times a day in our production 
setup, so not rare in practice). 
So it isn't linked to any bigger task, but rather general robustness/QoL 
improvements for invalid code handling in clang(d) which are quite important 
especially on big projects. As a crash could cost minutes of warm-up times 
because we need to rebuild all of the caches.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149733/new/

https://reviews.llvm.org/D149733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-15 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a comment.

In D149733#4321610 , @kadircet wrote:

>> However, I'm not keen on us playing whack-a-mole with the kinds of checks 
>> from this review. For starters, that's going to have a long-tail that makes 
>> it hard to know if we've ever gotten to the bottom of the issue. But also, 
>> each one of these checks is basically useless for the primary way in which 
>> Clang is consumed (as a compiler), so this increases compile times for 
>> limited benefit to "most" users.
>
> I completely agree, that's the reason why I've stayed away from adding those 
> checks to various FunctionDecl helpers (isVariadic, params, etc.).
>
>> In this particular case, we may be fine as this is limited to libclang and 
>> so it shouldn't add overhead for the common path, but I suspect we're going 
>> to find cases that need fixing in more common areas of the compiler that 
>> could be more troublesome.
>
> Agreed, and I am also pretty sure this is not the only place that can be 
> affected from incomplete decls/types. But this is the only one showing up 
> quite frequently ever since changes to lambda parsing.
> I think there's some strategy decision to be made about clang's invariants:
>
> - whether to accept declarations/types can be inspected in the middle of 
> parsing as a new constraint, declare all the existing violations as bugs and 
> fix them as we go (without introducing new ones, which is quite hard) and 
> give people the means to construct ast nodes "safely".
> - claim that variants are WAI and it's on use cases that perform such 
> inspections to figure out how to deal with consequences (e.g. in 
> code-completion consumers).
>
> But in either case, I don't think this review is the right medium to make 
> that decision. Surely it contains a lot of useful pointers, and I am happy to 
> move them to a discourse thread (LMK if I should do it, or you'd like to kick 
> it off @aaron.ballman) to raise awareness, but in the end this review is just 
> trying to fix an issue by adding extra checks to only the applications that 
> can violate contracts of clang parsing. So unless we've specific concerns 
> about the issue being addressed in this patch, I'd like to move forward.

I think we probably should have a broader discussion before moving forward 
here. It's not that this isn't incremental progress fixing an issue, but it's 
more that this same justification works to add the workaround 200 more times 
without ever addressing the underlying architectural concerns.

That said, is this issue blocking further work so you need to land this in 
order to make forward progress elsewhere?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149733/new/

https://reviews.llvm.org/D149733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-15 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet added a comment.

gentle ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149733/new/

https://reviews.llvm.org/D149733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-05 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet added a comment.

> However, I'm not keen on us playing whack-a-mole with the kinds of checks 
> from this review. For starters, that's going to have a long-tail that makes 
> it hard to know if we've ever gotten to the bottom of the issue. But also, 
> each one of these checks is basically useless for the primary way in which 
> Clang is consumed (as a compiler), so this increases compile times for 
> limited benefit to "most" users.

I completely agree, that's the reason why I've stayed away from adding those 
checks to various FunctionDecl helpers (isVariadic, params, etc.).

> In this particular case, we may be fine as this is limited to libclang and so 
> it shouldn't add overhead for the common path, but I suspect we're going to 
> find cases that need fixing in more common areas of the compiler that could 
> be more troublesome.

Agreed, and I am also pretty sure this is not the only place that can be 
affected from incomplete decls/types. But this is the only one showing up quite 
frequently ever since changes to lambda parsing.
I think there's some strategy decision to be made about clang's invariants:

- whether to accept declarations/types can be inspected in the middle of 
parsing as a new constraint, declare all the existing violations as bugs and 
fix them as we go (without introducing new ones, which is quite hard) and give 
people the means to construct ast nodes "safely".
- claim that variants are WAI and it's on use cases that perform such 
inspections to figure out how to deal with consequences (e.g. in 
code-completion consumers).

But in either case, I don't think this review is the right medium to make that 
decision. Surely it contains a lot of useful pointers, and I am happy to move 
them to a discourse thread (LMK if I should do it, or you'd like to kick it off 
@aaron.ballman) to raise awareness, but in the end this review is just trying 
to fix an issue by adding extra checks to only the applications that can 
violate contracts of clang parsing. So unless we've specific concerns about the 
issue being addressed in this patch, I'd like to move forward.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149733/new/

https://reviews.llvm.org/D149733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-03 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added subscribers: v.g.vassilev, junaire, rsmith.
aaron.ballman added a comment.

Adding in some of the clang-repl folks because they are likely to run into this 
as well, and adding Richard in case he has more historical context.

What I understand of the historical context here is that we've assumed very 
early on that inspecting state of partially-constructed AST nodes only happens 
when debugging (through things like calls to `dump()`, etc). So it was assumed 
that having a partially constructed object which is updated as we continue 
parsing and semantic analysis is fine because the only way to break the 
invariant is from the debugger. However, over time, we've invalidated that 
assumption more and more (REPL, libclang, etc) and now we're paying the price.

I think that refactoring the way we construct AST nodes so that they're not in 
a partially-constructed state by the time we've called `Create()` is a 
nonstarter; I think we've got too much reliance built up on that pattern. For 
example, we do `CreateEmpty()` + building up state as a matter of course when 
deserializing the AST for PCH or modules.  However, I'm not keen on us playing 
whack-a-mole with the kinds of checks from this review. For starters, that's 
going to have a long-tail that makes it hard to know if we've ever gotten to 
the bottom of the issue. But also, each one of these checks is basically 
useless for the primary way in which Clang is consumed (as a compiler), so this 
increases compile times for limited benefit to "most" users. In this particular 
case, we may be fine as this is limited to libclang and so it shouldn't add 
overhead for the common path, but I suspect we're going to find cases that need 
fixing in more common areas of the compiler that could be more troublesome.

I wish I had a good answer for how to address this, but I don't have one off 
the top of my head. About the closest I can think of is to use a bit on the 
declaration to say "I am being worked on still" and add a `Finalize()` method 
to say "I am no longer being worked on" and perform sanity checks and a getter 
method to tell us if the AST node is still under construction or not. That 
gives us a generic way to quickly test "should we be inspecting this further" 
in circumstances where it matters and there is precedent with how we handle 
parsing declaration specifiers 
(https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Sema/DeclSpec.h#L844),
 but this isn't all that much better than ad hoc tests like checking it the 
type is null or not (so I'm not suggesting this is the best way to solve the 
problem, just spitballing ideas).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149733/new/

https://reviews.llvm.org/D149733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-03 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet added a comment.

In D149733#4315360 , @erichkeane 
wrote:

> I don't recall the case of a Null type being valid for functions (perhaps 
> Aaron does?  Or perhaps its an error condition?).  But otherwise, I would 
> expect `FunctionDecl` to have a type as soon as it escapes the function 
> responsible for 'making' it (that is, the one that calls the ::Create and is 
> filling in everything).

I guess that's where the issue comes from. In theory 
`Parser::ParseLambdaExpressionAfterIntroducer` calls both 
`Sema::ActOnLambdaExpressionAfterIntroducer` (which creates incomplete method 
decl) and `Sema::ActOnStartOfLambdaDefinition` (which populates type info), so 
once we exit the parse function the methoddecl should have a valid type.
But the way clang implements code completion breaks that guarantee, as parser 
can hand out those semi-complete declarations to code completion consumers if 
code completion point is reached while parsing the tokens in between.
More importantly I don't know if this "variant" holds in all the places that 
creates a function decl (even in this example we can actually take an early 
exit 

 but we at least mark the decl invalid before doing so).

> If that patch added a path that doesn't set the return type, perhaps we 
> should just fix that?

it's not just the return type, it's the full type for the function, see here 

 (line 919 of clang/lib/Sema/SemaLambda.cpp in case the link doesn't work).
due to above mentioned reasons, i don't think there's anything wrong with that 
patch in particular (either it's fine or we've a bunch more cases that are 
wrong and hard to fix due to the first reason you mentioned), it just separated 
introduction and finalisation of a decl. 
https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDeclCXX.cpp#L14469
 is another example (and we've got bunch more).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149733/new/

https://reviews.llvm.org/D149733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-03 Thread Erich Keane via Phabricator via cfe-commits
erichkeane added a comment.

Hmm... So I don't see us being able to change how we create `FunctionDecl`s, 
the way we are doing it with a 'build up to the final thing' is about all we 
can do.  However, it should never escape its 'creation' functions without its 
type set.

IF I were to design this from scratch, I probably would have designed a 
`FunctionDeclBuilder` type thing that modeled a `FunctionDecl`, but asserted 
when being `finalized` (to get the `FunctionDecl` from it) if it wasn't 
complete.  However, we don't really have anything that models that in this 
codebase, and I'm not sure we really want to add something like that...

I don't recall the case of a Null type being valid for functions (perhaps Aaron 
does?  Or perhaps its an error condition?).  But otherwise, I would expect 
`FunctionDecl` to have a type as soon as it escapes the function responsible 
for 'making' it (that is, the one that calls the ::Create and is filling in 
everything).  If that patch added a path that doesn't set the return type, 
perhaps we should just fix that?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149733/new/

https://reviews.llvm.org/D149733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-03 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet added a comment.

FWIW, 
https://github.com/llvm/llvm-project/blob/main/clang/lib/AST/Decl.cpp#L2987 is 
the assertion that suggests "null type is fine for functiondecls", but as 
pointed out pretty much all of the methods assume it's non-null && most of the 
users also don't check for validness before use.
so my 2cents, in the past this intermediate state of a functiondecl wasn't 
observable outside, but overtime that "constraint" got violated in various 
places.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149733/new/

https://reviews.llvm.org/D149733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-03 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov added a subscriber: erichkeane.
ilya-biryukov added a comment.

It's true that `FunctionType` having a null type does break a lot of even its 
own methods,
e.g. even something as simple as `isVariadic` will dereference a null pointer 
as it uses `getType()->getAs()`.

I am not entirely sure what the contract there should be either.

- Is it the client code of `FunctionDecl` itself that's responsible for 
checking the null type or methods of `FunctionDecl`?
- Would it be easier to avoid creating the functions altogether? (I remember 
seeing some legitimate use-cases for null types there, but I wonder if those 
could be modelled differently).

@aaron.ballman, @erichkeane what is Clang's stance on those? How should we 
proceed here?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149733/new/

https://reviews.llvm.org/D149733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149733: [clang][USR] Prevent crashes on incomplete FunctionDecls

2023-05-03 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet created this revision.
kadircet added reviewers: aaron.ballman, ilya-biryukov.
Herald added a subscriber: arphaman.
Herald added a project: All.
kadircet requested review of this revision.
Herald added projects: clang, clang-tools-extra.
Herald added a subscriber: cfe-commits.

FunctionDecls can be created with null types (D124351 
 added such a new
code path), to be filled in later. But parsing can stop before
completing the Decl (e.g. if code completion
point is reached).
Unfortunately most of the methods in FunctionDecl and its derived
classes assume a complete decl and don't perform null-checks.
Since we're not encountring crashes in the wild along other code paths
today introducing extra checks into quite a lot of places didn't feel
right (due to extra complexity && run time checks).
I believe another alternative would be to change Parser & Sema to never
create decls with invalid types, but I can't really see an easy way of
doing that, as most of the pieces are structured around filling that
information as parsing proceeds.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D149733

Files:
  clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
  clang/lib/Index/USRGeneration.cpp


Index: clang/lib/Index/USRGeneration.cpp
===
--- clang/lib/Index/USRGeneration.cpp
+++ clang/lib/Index/USRGeneration.cpp
@@ -226,6 +226,11 @@
   if (ShouldGenerateLocation(D) && GenLoc(D, /*IncludeOffset=*/isLocal(D)))
 return;
 
+  if (D->getType().isNull()) {
+IgnoreResults = true;
+return;
+  }
+
   const unsigned StartSize = Buf.size();
   VisitDeclContext(D->getDeclContext());
   if (Buf.size() == StartSize)
Index: clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
===
--- clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
+++ clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
@@ -4002,6 +4002,19 @@
   EXPECT_EQ(Second.activeParameter, 1);
 }
 
+TEST(CompletionTest, DoNotCrash) {
+  llvm::StringLiteral Cases[] = {
+  R"cpp(
+template  struct Foo {};
+auto a = [x(3)](Foo<^>){};
+)cpp",
+  };
+  for (auto Case : Cases) {
+SCOPED_TRACE(Case);
+auto Completions = completions(Case);
+  }
+}
+
 } // namespace
 } // namespace clangd
 } // namespace clang


Index: clang/lib/Index/USRGeneration.cpp
===
--- clang/lib/Index/USRGeneration.cpp
+++ clang/lib/Index/USRGeneration.cpp
@@ -226,6 +226,11 @@
   if (ShouldGenerateLocation(D) && GenLoc(D, /*IncludeOffset=*/isLocal(D)))
 return;
 
+  if (D->getType().isNull()) {
+IgnoreResults = true;
+return;
+  }
+
   const unsigned StartSize = Buf.size();
   VisitDeclContext(D->getDeclContext());
   if (Buf.size() == StartSize)
Index: clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
===
--- clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
+++ clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
@@ -4002,6 +4002,19 @@
   EXPECT_EQ(Second.activeParameter, 1);
 }
 
+TEST(CompletionTest, DoNotCrash) {
+  llvm::StringLiteral Cases[] = {
+  R"cpp(
+template  struct Foo {};
+auto a = [x(3)](Foo<^>){};
+)cpp",
+  };
+  for (auto Case : Cases) {
+SCOPED_TRACE(Case);
+auto Completions = completions(Case);
+  }
+}
+
 } // namespace
 } // namespace clangd
 } // namespace clang
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits