ilya-biryukov added inline comments.
================
Comment at: clang/include/clang/AST/ExprCXX.h:4824
+ const Expr *getArrayFiller() const {
+ return const_cast<CXXParenListInitExpr *>(this)->getArrayFiller();
+ }
----------------
NIT: maybe just use `return ArrayFiller`? The code is much simpler and since
functions are very close it's pretty much impossible that anyone would refactor
one of those without looking at the other.
================
Comment at: clang/lib/AST/ExprConstant.cpp:10000
ImplicitValueInitExpr VIE(Field->getType());
- const Expr *InitExpr = E->getNumInits() ? E->getInit(0) : &VIE;
+ const Expr *InitExpr = Args.size() ? Args[0] : &VIE;
----------------
NIT: `!Args.empty() ? Args[0] : &VIE`
`Args.size()` also works, but arguably captures the intent of the code a bit
less clearly.
================
Comment at: clang/lib/CodeGen/CGExprAgg.cpp:581
+ Expr *filler = nullptr;
+ if (auto *ILE = dyn_cast<InitListExpr>(ExprToVisit))
+ filler = ILE->getArrayFiller();
----------------
ayzhao wrote:
> ayzhao wrote:
> > ilya-biryukov wrote:
> > > ilya-biryukov wrote:
> > > > ayzhao wrote:
> > > > > ilya-biryukov wrote:
> > > > > > - Should we have a filler for `CXXParenInitListExpr` too?
> > > > > > It seems like an important optimization and could have large
> > > > > > effect on compile times if we don't
> > > > > >
> > > > > > - Same question for semantic and syntactic-form (similar to
> > > > > > `InitListExpr`): should we have it here?
> > > > > > I am not sure if it's semantically required (probably not?), but
> > > > > > that's definitely something that `clang-tidy` and other source
> > > > > > tools will rely on.
> > > > > >
> > > > > > We should probably share the code there, I suggest moving it to a
> > > > > > shared base class and using it where appropriate instead of the
> > > > > > derived classes.
> > > > > > Should we have a filler for CXXParenInitListExpr too? It seems like
> > > > > > an important optimization and could have large effect on compile
> > > > > > times if we don't
> > > > >
> > > > > This feels like premature optimization - presumably, wouldn't this
> > > > > only be an issue with extraordinarily large (say, O(1000)) arrays?
> > > > >
> > > > > > Same question for semantic and syntactic-form (similar to
> > > > > > InitListExpr): should we have it here? I am not sure if it's
> > > > > > semantically required (probably not?), but that's definitely
> > > > > > something that clang-tidy and other source tools will rely on
> > > > >
> > > > > IIRC this doesn't apply to paren list aggregate expressions, as the
> > > > > syntatic form would be the enclosing `ParenListExpr`.
> > > > > This feels like premature optimization - presumably, wouldn't this
> > > > > only be an issue with extraordinarily large (say, O(1000)) arrays?
> > > > Yes, this should only happen with large arrays. Normally I would agree,
> > > > but it's surprising that changing `{}` to `()` in the compiler would
> > > > lead to performance degradation.
> > > > In that sense, this premature optimization is already implemented, we
> > > > are rather degrading performance for a different syntax to do the same
> > > > thing.
> > > >
> > > > Although we could also land without it, but in that case could you open
> > > > a GH issue and add a FIXME to track the implementation of this
> > > > particular optimization?
> > > > This should increase the chances of users finding the root cause of the
> > > > issue if they happen to hit it.
> > > >
> > > > > IIRC this doesn't apply to paren list aggregate expressions, as the
> > > > > syntatic form would be the enclosing ParenListExpr.
> > > > Do we even have the enclosing `ParenListExpr`? If I dump the AST with
> > > > `clang -fsyntax-only -Xclang=-ast-dump ...` for the following code:
> > > > ```
> > > > struct pair { int a; int b = 2; };
> > > > int main() {
> > > > pair(1); pair p(1);
> > > > pair b{1}; pair{1};
> > > > }
> > > > ```
> > > >
> > > > I get
> > > > ```
> > > > `-FunctionDecl 0x557d79717e98 <line:2:1, line:5:1> line:2:5 main 'int
> > > > ()'
> > > > `-CompoundStmt 0x557d797369d0 <col:12, line:5:1>
> > > > |-CXXFunctionalCastExpr 0x557d79718528 <line:3:3, col:9>
> > > > 'pair':'pair' functional cast to pair <NoOp>
> > > > | `-CXXParenListInitExpr 0x557d79718500 <col:3> 'pair':'pair'
> > > > | |-IntegerLiteral 0x557d79718010 <col:8> 'int' 1
> > > > | `-IntegerLiteral 0x557d79717e18 <line:1:30> 'int' 2
> > > > |-DeclStmt 0x557d79718650 <line:3:12, col:21>
> > > > | `-VarDecl 0x557d79718568 <col:12, col:17> col:17 p 'pair':'pair'
> > > > parenlistinit
> > > > | `-CXXParenListInitExpr 0x557d79718610 <col:17> 'pair':'pair'
> > > > | |-IntegerLiteral 0x557d797185d0 <col:19> 'int' 1
> > > > | `-IntegerLiteral 0x557d79717e18 <line:1:30> 'int' 2
> > > > |-DeclStmt 0x557d797187d8 <line:4:3, col:12>
> > > > | `-VarDecl 0x557d79718680 <col:3, col:11> col:8 b 'pair':'pair'
> > > > listinit
> > > > | `-InitListExpr 0x557d79718750 <col:9, col:11> 'pair':'pair'
> > > > | |-IntegerLiteral 0x557d797186e8 <col:10> 'int' 1
> > > > | `-CXXDefaultInitExpr 0x557d797187a0 <col:11> 'int'
> > > > `-CXXFunctionalCastExpr 0x557d797369a8 <col:14, col:20>
> > > > 'pair':'pair' functional cast to pair <NoOp>
> > > > `-InitListExpr 0x557d79718868 <col:18, col:20> 'pair':'pair'
> > > > |-IntegerLiteral 0x557d79718800 <col:19> 'int' 1
> > > > `-CXXDefaultInitExpr 0x557d797188b8 <col:20> 'int'
> > > > ```
> > > > It feels like the `ParentListExpr` is replaced during semantic analysis
> > > > and there is no way to get it back. I also tried running `clang-query`
> > > > and trying to `match parenListExpr()` and go 0 results. Is it just
> > > > missing in the AST dump and I run `clang-query` incorrectly or do we
> > > > actually not have the syntactic form of this expression after all?
> > > Another important thing to address from the dump: notice how braced
> > > initialization creates `CXXDefaultInitExpr` and `CXXParenListInitExpr`
> > > copies the default argument expression directly. It's really important to
> > > use the former form, here's the example that currently breaks:
> > >
> > >
> > > ```
> > > #include <iostream>
> > >
> > > struct func_init { int some_int; const char* fn = __builtin_FUNCTION(); };
> > >
> > > int main() {
> > > func_init a(10);
> > > std::cout << "From paren init:" << a.fn << std::endl;
> > >
> > > func_init b{10};
> > > std::cout << "From braced init: " << b.fn << std::endl;
> > > }
> > > ```
> > >
> > > The program above is expected to report `main` for both cases, but
> > > instead we get:
> > > ```
> > > From paren init:
> > > From braced init: main
> > > ```
> > The following have now been implemented:
> >
> > * `CXXDefaultInitExpr` and `ImplicitValueInitExpr`, which includes adding a
> > test for `__builtin_FUNCTION()`
> > * Array fillers
> > * Semantic forms vs syntactic forms
> Instead of implementing separate syntactic and semantic forms, I think it
> might be simpler and cleaner to store an `unsigned NumUserSpecifiedExprs`.
>
> The reason for implementing it this way is that the purpose of having
> separate syntactic and semantic forms is to determine what the form in the
> code was originally as written and what the form is after semantic analysis.
> For `InitListExpr`, we need separate objects because there can be a
> significant difference in the two forms (e.g. designated initializers, etc.).
> However, `CXXParenListInitExpr` is much simpler - the first k expressions are
> user-specified, while the remaining are default/compiler-generated.
> Therefore, we can eliminate the complexity required for maintaining two forms
> (say, in `TreeTransform`) by just storing k.
>
> N.B. this also resolves a build failure with an earlier diff where the libc++
> bots were failing.
>
> There is also precedence for this kind of structure - function calls with
> default arguments use similar logic instead of creating two `Expr` objects.
That looks fine indeed, I agree that `CXXParenInitListExpr` is simpler.
================
Comment at: clang/lib/Sema/SemaInit.cpp:5281
+
+ // llvm::errs() << "zzzz VerifyOnly is " << VerifyOnly << '\n';
+
----------------
NIT: reminder to remove this and other debug prints
================
Comment at: clang/lib/Sema/SemaInit.cpp:5326
+ E->getExprLoc(), /*isDirectInit=*/false, E);
+ InitializationSequence Seq(S, SubEntity, SubKind, E);
+
----------------
NIT: maybe rename to `SubSeq` or something similar?
Having both `Seq` and `Sequence` in scope with the same type is error-prone and
hard to comprehend.
================
Comment at: clang/lib/Sema/SemaInit.cpp:5380
+ }
+ InitExprs.push_back(ER.get());
+ }
----------------
ayzhao wrote:
> ayzhao wrote:
> > So the libc++ test compile failures are due to this line.
> >
> > One example of a failing unit test is
> > [range.take.while/ctor.view.pass](https://github.com/llvm/llvm-project/blob/main/libcxx/test/std/ranges/range.adaptors/range.take.while/ctor.view.pass.cpp).
> > Clang calls this function twice in `TreeTransform.h` - once with
> > `VerifyOnly` set to `true`, once with it set to `false`.
> >
> > For some reason, when this function tries to value-initialize the member
> > `MoveOnly mo` in `View`, `Seq.Failed()` returns false after
> > `TryValueInitialization(...)`, but the resulting `ExprResult` is `nullptr`,
> > causing the segfault we see when we push `nullptr` to `InitExprs` and pass
> > `InitExprs` to the constructor of `CXXParenListInitExpr`. One way to be fix
> > this is to move the line `ExprResult ER = Seq.Perform(...)` out of the `if
> > (!VerifyOnly)` block and check for `ER.isInvalid()` instead of
> > `Seq.Failed()`, but that results in test failures due to excess diagnostic
> > messages in `Seq.Perform(...)`
> >
> > I'm still looking into this, but if anyone has any ideas, they would be
> > very welcome.
> >
> > To repro the buildbot failures, just build clang with this patch, and then
> > in a separate build directory, build the target `check-cxx` using the
> > previously built clang.
> I was able to get the above workaround to pass the test by clearing the
> diagnostics after calling `Seq.Perform(...)`.
>
> IMO, this should be OK for now, but I'm open to better ideas if anyone has
> any.
Clearing all the diagnostics is a nuclear options and definitely seems off
here. We should not call `Perform()` when `VerifyOnly` is `true` to avoid
producing the diagnostics in the first place.
It's fine for the call with `VerifyOnly = true` to return no errors and later
produce diagnostics with `VerifyOnly = false`, I believe this is what
`InitListChecker` is already doing.
I have been playing around with the old version of the code, but couldn't fix
it fully. I do have a small example that breaks, we should add it to the test
and it should also be easier to understand what's going on:
```
struct MoveOnly
{
MoveOnly(int data = 1);
MoveOnly(const MoveOnly&) = delete;
MoveOnly(MoveOnly&&) = default;
};
struct View {
int a;
MoveOnly mo;
};
void test() {
View{0};
View(0); // should work, but crashes and produces invalid diagnostics.
}
```
In general, my approach would be to try mimicing what `InitListChecker` is
doing as much as possible, trimming all the unnecessary complexity that
braced-init-lists entail.
Hope it's helpful.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D129531/new/
https://reviews.llvm.org/D129531
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits