On 5/12/25 6:02 PM, Tomasz Kaminski wrote:
Thank you for all the work that you have done by doing the two
implementations and
extensive test cases. I wanted to respond to a few points that I think we
may want
to consider to be bugs in specification, and report them as bugs in
standard.
(Would you be interested in doing so?)
I do not understand why the following is the case:
- `layout_left::mapping<Extents> == layout_right::mapping<OExtents>`
is valid, if and only iff `Extents != OExtents`.
For the approach with bases, I was considering something less excessive
where we only
extract Rank0 and Rank1 bases, and not separation at each possible levels.
Yes, I'm interested in reporting them and I would like to continue the
discussion in a new thread:
https://gcc.gnu.org/pipermail/libstdc++/2025-May/061358.html
Regarding, which commits to publish, given that:
* your test showed no difference in the optimized binary
* various inconsistencies in the current specification (we may want to
address them)
* your preference towards separate flat implementation
I think you should move forward with each layout being totally separate.
I'll prepare and submit a new patch series with the flat
implementation. Please let me know if there's more that be checked.
Regards,
Tomasz
On Mon, May 12, 2025 at 5:17 PM Luc Grosheintz <luc.groshei...@gmail.com>
wrote:
On 5/9/25 8:16 AM, Tomasz Kaminski wrote:
The test I would perform would be :
std::layout_left::mapping<std::extents<int>> l0;
std::layout_right:mapping<std::extents<int>> r0;
// stride
bool all_unique()
{
return l0.is_unique();
return r0.is_unique();
}
And we should have only one is_unique symbol.
but with a lot more duplication. Then compile it with:
gcc -O2 -c many_symbols.cc
I finished both implementation and wrote the type of test file we
agreed on, it calls all or almost all ctors, operator(), is_exhaustive,
extents() and stride. I compiled the test file against both
implementations (which both pass all the tests).
The generated code in `.text` on `-g -O2` is exceedingly similar (often
identical) whether we use the flat implementation or the one using base
classes. Using `nm` I find no symbols (other than those for the global
variables and the functions to exercise the layout code), everything has
been inlined.
When looking at code compiled with `-O0` I see more symbols with the
implementation that uses base classes to reduce the number of symbols. I
looked at the symbols and can confirm that the method `extents` exists
exactly once for each specialization of `std::extents` (and not once per
specialization per layout); same for `operator()`.
However, when looking at ctors I find that (certain) inherited ctors are
given a symbol. As an example let's look at:
```
#include <mdspan>
constexpr size_t dyn = std::dynamic_extent;
using M1 = std::layout_left::mapping<std::extents<int, 5>>;
using M2 = std::layout_left::mapping<std::extents<char, 5>>;
M1 ctor_cycle_1d(M2 m2) { return M1(m2); }
```
This results in the following three (demangled) symbols:
std::layout_left::mapping<std::extents<int, 5ul> >
::_Rank1_mapping_base<std::extents<char, 5ul> >(
std::__mdspan::_Rank1_mapping_base<std::extents<char, 5ul> > const&)
std::__mdspan::_Mapping_base<std::extents<int, 5ul> >
::_Mapping_base(std::extents<int, 5ul> const&)
std::__mdspan::_Rank1_mapping_base<std::extents<int, 5ul> >
::_Rank1_mapping_base<std::extents<char, 5ul> >(
std::__mdspan::_Rank1_mapping_base<std::extents<char, 5ul> > const&)
(Naturally, there's more, e.g. for _ExtentsStorage and extents, etc.
Using `objdump -C -d` I can confirm that these "genuine" and there's a
little bit of code associated with each symbol.)
With the current implementations I don't see much difference in object
file size, and almost one when using -O2.
Total object file sizes:
naive | bases
-O2 9832 10240
-g -O2 250768 261920
-g -O0 600512 564400
Number of symbols (nm -C OBJ | wc -l):
naive | bases
-O2 46 46
-g -O2 46 46
-g -O0 816 907
I hope this explains what I'm seeing. Please let me know if anything is
unclear or if you suspect I'm doing something wrong.
--------------------------------------------------------------------
I would like to once more make the case for the flat implementation,
i.e. one class per layout kind, by summarizing some of the twists I've
encountered.
Generic oddities:
- `layout_left::mapping<Extents> == layout_right::mapping<OExtents>`
is valid, if and only iff `Extents != OExtents`.
- `layout_right(layout_stride)` is noexcept but
`layout_left(layout_stride)` isn't.
This inconsistency seems to be a bug in specification. I doubt this is
intended.
Traps:
- The ctor `mapping(extents_type)` can't be inherited, because if we
do, the template parameter `_Extents` isn't inferred.
- We must work around a compiler issue that prevents inheriting the
condition for explicitness of ctors.
- Reusing `_Rank0_mapping_base` for layout_stride requires hiding the
ctor `mapping(extents_type)`.
- For rank == 0, layout_stride is convertible from other
layout_stride, if and only if the extent_type's are convertible.
Whereas layout_stride is convertible to layout_{left,right}
unconditionally (at rank 0).
Here, I also believe that either we should consistently ignore or take into
consideration
the convertibility of index_types for the rank(). Currently it seems like
we are in between
these two.
I believe I've forgotten one; and I'm definitely very worried I've not
uncovered all of them yet.
Inconveniences:
- The ctor `mapping(layout_stride)` can't be shared between
layout_left and layout_right, if we want to respect the missing
`noexcept` for layout_left.
This seems like a bug in a standard.
- While we can extract a `_MappingBase` which deals with the `extents`
object we can't write the correct compatiblity check for
`layout_stride` and the two padded layouts at that level. (Meaning
we'll need an intermediate layer of ctors for checking these
conditions).
- The kind of the layout (left, right, stride) and the rank of the
layout appear as if they're orthogonal concepts. However, the two
concepts are intertwined. Making it hard to implement the notion of
Rank0 or Rank1 in a manner that's independent of the layout kind.
- The trait `std::is_convertible_v` doesn't actually check that
the code used to perform the conversion is valid; meaning one must
also make sure to instantiate the ctor (in all possible variations).
- I'm currently at over 1500 lines of code for testing "everything" in
all it's variations.
The architecture at scale looks like this:
- _Mapping_base: because we would like to extract the method
`extents`.
- _Rank0_mapping_base: for code shared between all three layouts: e.g.
ctors and required_span_size.
- _Rank0_left_right_base: for code not shared with stride: ctor from
layout_stride (because of differing explicitness conditions).
- _Rank1_mapping_base: for code shared between left and right: e.g.
layout changing ctors, trivial implementation of operator().
- _RankN_mapping_base: for code shared between left and right: e.g.
ctors that check their preconditions, required_span_size.
- _RankN_left_base: code for rank N and left only, e.g. non-layout
changing ctors, operator() and stride.
- _RankN_right_base: see _RankN_left_base.
- _Rank0_stride_base: because we need `stride()`.
- _Stride: which is a base which contains `_M_stride`. This we want
because it only depends on IndexType and `sizeof...(Indices)`.
- _RankN_stride_base: code for rank N and stride only, e.g. generic
implementation of operator() and required_span_size().
- _IsAlwaysStrided: for `is_always_strided` and `is_strided`, because
they don't depend on the extents.
- _IsAlwaysUnique: see above.
- _IsExhaustive<Always>: for `is_always_exhaustive` and (if Always)
`is_exhaustive`.
Plus the three not entirely empty classes for `layout_left`,
`layout_right` and `layout_stride`.
I'm happy to submit either version for review; they're both ready and
just need to be structured into individual commits.