rsmith added inline comments.

================
Comment at: clang/lib/CodeGen/CGDeclCXX.cpp:563
+  } else if (IsInstantiation ||
              getContext().GetGVALinkageForVariable(D) == GVA_DiscardableODR ||
              D->hasAttr<SelectAnyAttr>()) {
----------------
rnk wrote:
> @rsmith , if inline global variable initialization has ordering requirements, 
> we have a bug, because I believe this GVA_DiscardableODR codepath handles 
> them, and we come through here to give them separate initializers in 
> llvm.global_ctors. See the example with two separate global_ctors entries on 
> godbolt:
> https://gcc.godbolt.org/z/5d577snqb
> 
> As long as LLVM doesn't provide ordering guarantees about same priority 
> initializers in global_ctors, inline globals have the same problems as 
> template instantiations. IMO whatever solution we use to order inline globals 
> should be used for template instantiations.
> 
> Intuitively, that means LLVM should promise to run global_ctors in 
> left-to-right order, and if all TUs instantiate initializers in the same 
> order, everything should behave intuitively.
> 
> The question then becomes, why doesn't this work already?
> Intuitively, that means LLVM should promise to run global_ctors in 
> left-to-right order

I don't think that's sufficient, due to the way we use COMDATs to discard 
duplicate global initializers. Consider:

TU 1 defines inline variable A.
TU 2 defines inline variable A and then inline variable B.
The standard guarantees that A is initialized before B in this scenario. But if 
(somehow) the linker picks the definition of A from TU 2, but orders the 
initializers from TU 1 first, then the resulting global_ctors order will be B 
then A, which is not allowed.

Either we need a guarantee that linkers will use the same ordering between 
objects when picking COMDATs as when concatenating `.init_array`, or we need to 
stop using the COMDAT trick for them (and either make LLVM respect the 
`@llvm.global_ctors` order or coalesce all inline variable initializers into 
the same function we run non-inline initializers from or something). Getting 
that guarantee seems like the best path to me, since the other option will 
presumably mean we check the guard variable once on startup for each TU that 
defines the variable, and it's something I expect linkers already happen to 
guarantee in practice.

> IMO whatever solution we use to order inline globals should be used for 
> template instantiations.

That sounds like it would regress what globalopt is able to optimize, for no 
gain in conformance nor perhaps any real gain in initialization order 
guarantees. The inline variable case is different, both because we can 
guarantee an initialization order and because the standard requires us to do 
so. Should we add complexity to the compiler whose only purpose is to mask 
bugs, if that complexity doesn't actually define away the possibility of bad 
behavior?

If we can actually describe a rule that we provide for initialization order of 
instantiated variables, and we can easily implement that rule and be confident 
we won't want to substantially weaken it later, and we can thereby assure our 
users that we will satisfy that rule, then I think that could be interesting, 
but anything less than that doesn't seem worthwhile to me.

> The question then becomes, why doesn't this work already?

It looks like it mostly does.

One factor here is instantiation order. When we instantiate a variable, we add 
the variables that it references to our "to be instantiated" list, which is 
processed later:
```
template<int N> int Fib = Fib<N - 2> + Fib<N - 1>;
template<> int Fib<0> = 0;
template<> int Fib<1> = 1;
```
* instantiating `Fib<5>` will append `Fib<3>` and `Fib<4>` to the list
* then we visit `Fib<3>` and append `Fib<1>` and `Fib<2>` to the list
* then we visit `Fib<4>` and add no new entries

We pass declarations to the consumer when we're done with the instantiation 
step. *Sometimes* this includes instantiating variables referenced by that 
variable, and *sometimes* it doesn't. The difference is whether we're 
performing "recursive" instantiation or not.

When we're performing immediate instantiation of a variable (either because it 
was explicitly instantiated, or because we might need its value immediately 
because it might be usable in constant expressions), our instantiation step is 
non-recursive. We just add declarations to `Sema`'s "to be instantiated at end 
of TU" list. This is at least a little important semantically: we allow a 
matching specialization to be declared after the first use wherever possible. 
In that case, we'll pass a declaration to the consumer before we've 
instantiated the things it references.

When we're performing the end-of-TU instantiation of all referenced template 
specializations, we do that recursively, and that means that we will 
instantiate any referenced variables *before* we pass the referencing variable 
to the AST consumer.

You can see this happening here: https://godbolt.org/z/4sj1Y7W4G (look at the 
order in which we get the warnings: for `Fib<5>` then `x` then `Fib<3>`, then 
`Fib<2>`, then `Fib<4>`, and note that we initialize `Fib<5>` first, because 
it's the first thing added to the consumer. Then `Fib<2>`, `Fib<3>`, and 
`Fib<4>` get passed to the consumer *in that order*, because that is the order 
in which the instantiations of their definitions *finish*. But `Fib<5>`'s 
instantiation finishes first, because that's a non-recursive instantiation.

Without the explicit instantiation, we pass the variables to the AST consumer 
in the order `Fib<2>`, `Fib<3>`, `Fib<4>`, `Fib<5>`: 
https://godbolt.org/z/sax1dvh7z leading to an "intuitive" result. They end up 
dependency-ordered because that's the order in which their instantiations 
happen to finish. (Example with a static data member: 
https://godbolt.org/z/9fsbPvWTP)

Another factor (and the reason why my examples are working and @ychen's very 
similar examples are not) is `CodeGenModule`'s deferred emission of variables. 
If an instantiated variable has an initializer with no side effects, 
`CodeGenModule` won't emit it unless it emits a reference to it (there's no 
point emitting something that will just be discarded by the linker). And 
`CodeGenModule`'s process for emitting deferred declarations does all kinds of 
reordering. The way this works is:

`CodeGenModule` is handed the globals, sees they don't need to be emitted, and 
ignores them. Then:

* it emits a reference to `Fib<5>` and decides that `Fib<5>` needs to be 
emitted and adds it to a queue of deferred declarations to emit
* it emits the deferred declaration `Fib<5>`, which adds `Fib<3>` and `Fib<4>` 
to a new queue for things used by `Fib<5>`
* it emits the things used by `Fib<5>`: `Fib<3>` and `Fib<4>`
  * emitting `Fib<3>` adds `Fib<2>` to a new queue for things used by `Fib<3>`, 
so `Fib<2>` is emitted next
  * emitting `Fib<4>` adds nothing new to the queue

So when the initializers don't have side-effects, the variables are initialized 
in the order `Fib<5>`, `Fib<3>`, `Fib<2>`, `Fib<4>`.

So the "reversing" has at least two sources (beyond anything that globalopt or 
the linker might do):
* non-recursive instantiations in `Sema` will cause an instantiated variable to 
be passed to the consumer before the things it references; this can mostly be 
avoided by not using explicit instantiations
* instantiated variables with side-effect-free initializers have their 
initializers emitted on use rather than in instantiation order; this can be 
avoided by building with `-femit-all-decls` or by adding a dummy side-effect to 
the initializer

If we want to fix just the `CodeGen` side of this, I think the thing to do 
would be to follow the model that `CodeGen` uses for ordered initialization (it 
tracks a `DelayedCXXInitPosition` map giving the order in which the variables 
should be initialized, if their initializers are actually emitted). We could do 
the same thing for instantiated variables, allocating each one handed to 
CodeGen a slot which either gets filled in with that initializer, or doesn't 
get emitted if the variable is not emitted. But, as noted above, I'm not 
convinced it's worth it unless this leads to some actual user-facing behavior 
guarantee.


================
Comment at: clang/lib/CodeGen/CGDeclCXX.cpp:582
 
-    AddGlobalCtor(Fn, 65535, COMDATKey);
+    AddGlobalCtor(Fn, 65535, COMDATKey, IsInstantiation);
     if (COMDATKey && (getTriple().isOSBinFormatELF() ||
----------------
rsmith wrote:
> This is effectively initializing instantiated variables in reverse 
> instantiation order. That seems like it'll make things worse as much as it 
> makes things better. For example, given:
> 
> ```
> #include <iostream>
> template<typename T> int a = (std::cout << "hello, ", 0);
> template<typename T> int b = (std::cout << "world", 0);
> int main() {
>   (void)a<int>;
>   (void)b<int>;
> }
> ```
> 
> ... we currently print `"hello, world"`, but with this change we'll print 
> `"worldhello, "`.
> 
> If we want a sensible initialization order, I think we need a different 
> strategy, that will probably require `Sema` to be a lot more careful about 
> what order it instantiates variables in and what order it passes them to the 
> AST consumer: if an instantiation A triggers another instantiation B, we 
> should defer passing A to the consumer until B has been instantiated and 
> passed to the consumer. That's probably not too hard to implement, by adding 
> an entry to the pending instantiation list to say "now pass this to the 
> consumer" in the case where one instantiation triggers another. But I do 
> wonder whether that level of complexity is worthwhile, given that code 
> relying on this behavior is broken.
Please see my other (long) comment; `Sema` actually already does what I 
describe here, except in the cases where it does an eager, non-recursive 
instantiation.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126341/new/

https://reviews.llvm.org/D126341

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to