https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101061
Bug ID: 101061
Summary: tree-vrp misoptimization on skylake+ using union-based aliasing
Product: gcc
Version: 8.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: alexander.gr...@tu-dresden.de
Target Milestone: ---

I see an algorithm returning wrong values when it is compiled with `-ftree-vrp` (or -O2) and `-march=skylake`. This happens with 8.3.0 and 8.4.0 but not with 9.1 or any later version. However, I haven't found a particular fix for this and am unsure whether it is actually fixed or just no longer occurs due to unrelated instruction reordering, especially as it also disappears when using `-march=skylake -mtune=broadwell`.

The core code is a loop:

    for (Eigen::Index i = 0, j = 0; i < N; ++i) {
      auto it = uniq.emplace(Tin(i), j);
      idx_vec(i) = it.first->second;
      if (it.second) {
        ++j;
      }
    }

where `uniq` is Abseil's flat_hash_map over <string_view, int>, `Tin` is a string_view Tensor and `idx_vec` is an Eigen Tensor, but could as well be a std::vector<int> or even a raw pointer (both tested).

I also tested assigning `it.first->second` to 2 arrays: only the first one gets wrong values, and when I swap them, the other (now first) one is wrong instead. Wrong values include zeros (most often) as well as seemingly random values, including very large ones.

The exact same (templated) algorithm works for all other types used instead of string_view, e.g. int.

I (think I) traced it down to the flat_hash_map (i.e. an unordered_map with contiguous storage), which at its core uses the following union:

    template <class K, class V>
    union map_slot_type {
      map_slot_type() {}
      ~map_slot_type() = delete;
      using value_type = std::pair<const K, V>;
      using mutable_value_type = std::pair<K, V>;
      value_type value;
      mutable_value_type mutable_value;
      K key;
    };

Through several template layers, emplacing/constructing a map entry ends up calling:

    alloc->construct(&slot->mutable_value, std::forward<Args>(args)...)

The access (i.e. what the iterator at `it.first` returns when dereferenced) is done via:

    value_type& element(slot_type* slot) { return slot->value; }

In other words: construction is done via placement-new of the non-const pair member of the union, while access happens through the const pair member of the union. According to https://timsong-cpp.github.io/cppwp/n3337/class.mem#19 (9.2.19) this is allowed, because those types are layout-compatible. I'd suspect GCC 8 misses this (sometimes?). When the construction is done via `&slot->value` instead, everything works in this case.

I'll attach the preprocessed source, although it is large. All my attempts to reproduce this in a minimized example failed, as even minor changes made the problem disappear.
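
To make the two access patterns concrete, here is a minimal sketch of them outside of Abseil. It is not a reproducer (as noted, small variants did not trigger the problem for me) and it is not Abseil's actual code: the helper names `construct_mutable_read_value` and `construct_value_read_value` are made up for illustration, and the destructor is given an empty body instead of `= delete` so that a slot can live on the stack.

```cpp
#include <cassert>
#include <new>
#include <string_view>
#include <utility>

// Simplified stand-in for the slot union quoted in the report.
template <class K, class V>
union map_slot_type {
  map_slot_type() {}
  // Assumption: empty body instead of "= delete" so the sketch can use
  // automatic storage. K and V are trivially destructible below.
  ~map_slot_type() {}

  using value_type = std::pair<const K, V>;
  using mutable_value_type = std::pair<K, V>;

  value_type value;
  mutable_value_type mutable_value;
  K key;
};

using Slot = map_slot_type<std::string_view, int>;

// Pattern described in the report: placement-new into the non-const pair
// member, then read back through the const pair member.
inline int construct_mutable_read_value(std::string_view key, int v) {
  Slot slot;
  ::new (static_cast<void*>(&slot.mutable_value)) Slot::mutable_value_type(key, v);
  return slot.value.second;
}

// Variant that worked in the report: construct and read the same member.
inline int construct_value_read_value(std::string_view key, int v) {
  Slot slot;
  ::new (static_cast<void*>(&slot.value)) Slot::value_type(key, v);
  return slot.value.second;
}
```

The second helper never reads a union member other than the one it constructed, so it does not depend on how the compiler treats the aliasing between `value` and `mutable_value`. The first helper relies on exactly the cross-member read that I suspect GCC 8 mishandles, so I make no claim about what it returns on the affected versions.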