On Mon, May 5, 2025 at 9:20 PM Luc Grosheintz <luc.groshei...@gmail.com> wrote:
> > > On 5/5/25 9:44 AM, Tomasz Kaminski wrote: > > On Sat, May 3, 2025 at 2:39 PM Luc Grosheintz <luc.groshei...@gmail.com> > > wrote: > > > >> > >> > >> On 4/30/25 7:13 AM, Tomasz Kaminski wrote: > >>> Hi, > >>> > >>> As we will be landing patches for extends, this will become a separate > >>> patch series. > >>> I would prefer, if you could commit per layout, and start with > >> layout_right > >>> (default) > >>> I try to provide prompt responses, so if that works better for you, you > >> can > >>> post a patch > >>> only with this layout first, as most of the comments will apply to all > of > >>> them. > >>> > >>> For the general design we have constructors that allow conversion > between > >>> rank-0 > >>> and rank-1 layouts left and right. This is done because they > essentially > >>> represents > >>> the same layout. I think we could benefit from that in code by having a > >>> base classes > >>> for rank0 and rank1 mapping: > >>> template<typename _Extents> > >>> _Rank0_mapping_base > >>> { > >>> static_assert(_Extents::rank() == 0); > >>> > >>> template<OtherExtents> > >>> // explicit, requires goes here > >>> _Rank0_mapping_base(_Rank0_mapping_base<OtherExtents>); > >>> > >>> // All members layout_type goes her > >>> }; > >>> > >>> template<typename _Extents> > >>> _Rank1_mapping_base > >>> { > >>> static_assert(_Extents::rank() == 1); > >>> // Static assert for product is much simpler here, as we need to > >> check one > >>> > >>> template<OtherExtents> > >>> // explicit, requires goes here > >>> _Rank1_mapping_base(_Rank1_mapping_base<OtherExtents>); > >>> > >>> // Call operator can also be simplified > >>> index_type operator()(index_type i) const // conversion happens at > >> user > >>> side > >>> > >>> // cosntructor from strided_layout of Rank1 goes here. > >>> > >>> // All members layout_type goes her > >>> }; > >>> Then we will specialize layout_left/right/stride to use > >> _Rank0_mapping_base > >>> as a base for rank() == 0 > >>> and layout_left/right to use _Rank1_mapping as base for rank()1; > >>> template<typename T, unsigned... Ids> > >>> struct extents {}; > >>> > >>> struct layout > >>> { > >>> template<typename Extends> > >>> struct mapping > >>> { > >>> // static assert that Extents mmyst be specialization of _Extents goes > >> here. > >>> } > >>> }; > >>> > >>> template<typename _IndexType> > >>> struct layout::mapping<extents<_IndexType>> > >>> : _Rank0_mapping_base<extents<_IndexType>> > >>> { > >>> using layout_type = layout_left; > >>> // Provides converting constructor. > >>> using _Rank0_mapping_base<extents<_IndexType>>::_Rank0_mapping_base; > >>> // This one is implicit; > >>> mapping(_Rank0_mapping_base<extents<_IndexType>> const&); > >>> }; > >>> > >>> template<typename _IndexType, unsigned _Ext> > >>> struct layout::mapping<extents<_IndexType, _Ext>> > >>> : _Rank1_mapping_base<extents<_IndexType>> > >>> > >>> { > >>> using layout_type = layout_left; > >>> // Provides converting constructor. > >>> using _Rank0_mapping_base<extents<_IndexType>>::_Rank0_mapping_base; > >>> // This one is implicit, allows construction from layout_right > >>> mapping(_Rank1_mapping_base<extents<_IndexType>> const&); > >>> }; > >>> }; > >>> > >>> template<typename _IndexType, unsigned... _Ext> > >>> requires sizeof..(_Ext) > = 2 > >>> struct layout::mapping<extents<_IndexType, _Ext>> > >>> > >>> The last one is a generic implementation that you can use in yours. > >>> Please also include a comment explaining that we are deviating from > >>> standard text here. > >>> > >> > >> Thank for reviewing and offering fast review cycles, I can't say I've > >> ever felt that they were anything but wonderfully fast and I appologise > >> for the delay (I've been away hiking for two days). > >> > >> The reason I implement all three is that I needed to see them all. > >> Otherwise, I can see and "feel" the impact of the duplication (or > >> efforts to reduce duplication). It's also to make sure I understand > >> precisely how the layouts are similar and different. The idea is that > >> you'd review one at a time; and by adding the others you can pick which > >> one and have a glance at the other if it's helpful during review. > >> > >> The review contains three topics. This email responds to the idea of > >> introducing a common base class. I believe I superficially understand > >> the request. However, it's not clear to me what we gain. > >> > >> The reorganization seems to stress the how rank 0 and rank 1 layouts are > >> similar; at the cost of making the uniformity of layout_left (regardless > >> of rank) and layout_right (regardless of rank) less obvious. Personally, > >> I quite like that we can express all layouts of one kind regardless of > >> their rank, without resorting to special cases via specialization. > >> > >> To me the standard reads like the layouts are three separate, > >> independent entities. However, because it would be too tedious to not > >> allow conversion between layouts of rank 0 and 1, a couple of ctors were > >> added. An example, of how the layouts are not consider related is that > >> we can't compare layout_left and layout_right mappings for equality. > >> > >> If I count, the current implementation has 3 copies: layout_left, > >> layout_right, layout_stride. The proposed changes add two (likely three) > >> base classes which, require three specializations of layout_right and > >> layout_left each. Therefore I end up with 6 copies of layout_left and > >> layout_right; and 2 (or 3) base classes. Or if we manage it use the base > >> classes for all layouts: 9 specialization, 2 (or 3) base classes. > >> Therefore, in terms of numbers the restructuring doesn't seem favourable > >> and I feel would require noticeably more repetition in the tests. > >> > > I think we can use a conditional base, to eliminate specializations, > > and have layout_left_base. > > > >> > >> While the rank zero and rank one versions are a lot simpler than the > >> generic copy, they aren't entirely empty either (we'll likely need to > >> use `using super::*` several times). Furthermore, I don't see any real > >> simplifications in the generic copy. Meaning we still have a copy that > >> has all the complexities and could (almost) handle the other two cases. > >> > > I think using a common bases, and allowing layout_left/layout_right to be > > constructed from the base classes, will automatically enable the > conversions > > that are currently expressed via constructor: > > layout_left <-> layout_right for ranks 0 and 1, > > layout_stride <-> layout_left/right for rank 0 (because they are rank0) > > > >> > >> Since layout_stride is a little different, I'm not sure we can reuse the > >> base classes for it. For example it allows conversion more freely, but > >> is only implicit for rank 0; whereas the other two allow implicit > >> conversion between each other for rank 0 and 1. > >> > > This is why I said rank0_mapping_base should be used for layout_left, > > layout_rigth and layout_stride > > (and later layout_left_padeed, and layout_right_padded). > > For rank1 this would be shared between layout_left, laout_right, > > layout_left_paded and layout_right_paded. > > > >> > >> Most importantly though, the reorganization makes it very hard to > >> compare the implementation with the text of the standard. Therefore, > >> making it more likely that a discrepancy goes unnoticed. > >> > >> One example of the subtleties involved in restructuring the code is the > >> following: layout_right, layout_left support CTAD via the ctor > >> `mapping(extents_type)`. However, when using specialization, that stops > >> working and we'd need to add deduction guides to make the implementation > >> conforming. I suspect a Godbolt can explain the situation better: > >> https://godbolt.org/z/EdbE7ME6P > > > > We could use simple specializations, and derive from different base > classes > > using conditional_t. This would require having > > > Thank you for looking deep into that idea, before we would go with totally separate classes, I would like to understands your points. > > I gave this another try; and now I'm stumbling again as show here: > https://godbolt.org/z/hfh9zE8ME This could be done as: template<size_t N, typename T> using mapping_middle_base = std::conditional_t<N == 0, rank0_base<T>, rankN_middle_base<T>>; template<size_t N> class mapping_middle : public mapping_middle_base<N, double> { using _Base = mapping_middle_base<N, double>; public: using _Base::_Base; mapping_middle(const _Base& other) {} }; https://godbolt.org/z/M7jxq6qTK > > > As shown, it could be solved using three ctors `mapping(_Rank0_base)`, > `mapping(_Rank1_base)` and `mapping(_RankN_base)` with the respective > `requires` clauses. However that feels like it defeats the purpose. > I was thinking about having a simple constructor that constructs from _Base. > > Frankly, the savings in terms of lines of code are not great. Partly, > because we often need `using extents_type = __super::extents`; or > the ctor `mapping(mapping_base<OExtents>&)`. > I see that I was unclear there, my point was not about the code size in terms of source file, but the emitted code into binary. What I mean with separate layouts, for E being every static extent and and dynamic_extent, and every member function `fun` we will have: layout_left<extends<E>>::fun(); layout_right<extends<E>>::fun(); layout_left_padded<extends<E>>::fun(); layout_right_padded<extends<E>>::fun(); emitted in binary, and I would like to see if we can reduce it to rank1_mapping_base<extends<E>>::fun(); That's why I am looking into base class direction. > The generic cases of `operator()`, `stride`, `required_span_size` tend > to supported the rank 0 and rank 1 without special cases. (Making it > harder to save lines of code by having special cases for rank 0, 1 and > N.) > Again, I was not clear, when referring to code size, and more thinking of binary size and number of symbols. > > The `stride` is a nice example of how it's all almost uniform, but not > quite. It's present for layout_stride and the padded ones, but not > layout_left and layout_right. For rank 1 it works for anything, just not > layout_stride. (Further reducing the savings in terms of lines of code.) > > >> > >> > >> In summary: personally I feel using base classes adds repetition, > >> complexity and makes it hard (for me impossible) to check that the > >> implementation is conforming. > >> > >> Naturally, I can be convinced to change the implementation. Maybe you > >> could explain a little what and why you don't like the admittedly very > >> brute-force implementation. > >> > > There are a few things I had in mind. For rank 1 checking the > precondition > > for the size of multidimensional index space is trivial, > > and we certainly could provide a __glibcxx_assert in such a case. This > can > > be done for the separate implementation, but I think > > we will end up with repeated if constexpr checks in all mappings. This is > > why in my mind, the reduced rank0 and rank1 cases > > felt a bit special. > > For example, mdspan<T, extents<N>> can be seen as replacement for > span<N>, > > which provides support for adjusting accessor > > and pointer type. > > > > My original request, however, was however driven by code size. The > > layout_left, layout_right, layout_left_padded, layout_right_padded > > with rank 0, have members with same semantics (operator(), stride, ...), > > and we could have only one definition for them, instead of four. > > > >> > >>> > >>> On Tue, Apr 29, 2025 at 2:56 PM Luc Grosheintz < > luc.groshei...@gmail.com > >>> > >>> wrote: > >>> > >>>> Implements the parts of layout_left that don't depend on any of the > >>>> other layouts. > >>>> > >>>> libstdc++/ChangeLog: > >>>> > >>>> * include/std/mdspan (layout_left): New class. > >>>> > >>>> Signed-off-by: Luc Grosheintz <luc.groshei...@gmail.com> > >>>> --- > >>>> libstdc++-v3/include/std/mdspan | 179 > ++++++++++++++++++++++++++++++++ > >>>> 1 file changed, 179 insertions(+) > >>>> > >>>> diff --git a/libstdc++-v3/include/std/mdspan > >>>> b/libstdc++-v3/include/std/mdspan > >>>> index 39ced1d6301..e05048a5b93 100644 > >>>> --- a/libstdc++-v3/include/std/mdspan > >>>> +++ b/libstdc++-v3/include/std/mdspan > >>>> @@ -286,6 +286,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >>>> > >>>> namespace __mdspan > >>>> { > >>>> + template<typename _Extents> > >>>> + constexpr typename _Extents::index_type > >>>> + __fwd_prod(const _Extents& __exts, size_t __r) noexcept > >>>> + { > >>>> + typename _Extents::index_type __fwd = 1; > >>>> + for(size_t __i = 0; __i < __r; ++__i) > >>>> + __fwd *= __exts.extent(__i); > >>>> + return __fwd; > >>>> + } > >>>> > >>> As we are inside the standard library implementation, we can do some > >> tricks > >>> here, > >>> and provide two functions: > >>> // Returns the std::span(_ExtentsStorage::_Ext).substr(f, l); > >>> // For extents forward to __static_exts > >>> span<typename Extends::index_type> __static_exts(size_t f, size_t l); > >>> // Returns the > >>> > >> > std::span(_ExtentsStorage::_M_dynamic_extents).substr(_S_dynamic_index[f], > >>> _S_dynamic_index[l); > >>> span<typename Extends::index_type> __dynamic_exts(Extents const& c); > >>> Then you can befriend this function both to extents and > _ExtentsStorage. > >>> Also add index_type members to _ExtentsStorage. > >>> > >>> Then instead of having fwd-prod and rev-prod I would have: > >>> template<typename _Extents> > >>> consteval size_t __static_ext_prod(size_t f, size_t l) > >>> { > >>> // multiply E != dynamic_ext from __static_exts > >>> } > >>> constexpr size __ext_prod(const _Extents& __exts, size_t f, size_t l) > >>> { > >>> // multiply __static_ext_prod<_Extents>(f, l) and each elements of > >>> __dynamic_exts(__exts, f, l); > >>> } > >>> > >>> Then fwd-prod(e, n) would be __ext_prod(e, 0, n), and rev_prod(e, n) > >> would > >>> be __ext_prod(e, __ext.rank() -n, n, __ext.rank()) > >>> > >>> > >>>> + > >>>> + template<typename _Extents> > >>>> + constexpr typename _Extents::index_type > >>>> + __rev_prod(const _Extents& __exts, size_t __r) noexcept > >>>> + { > >>>> + typename _Extents::index_type __rev = 1; > >>>> + for(size_t __i = __r + 1; __i < __exts.rank(); ++__i) > >>>> + __rev *= __exts.extent(__i); > >>>> + return __rev; > >>>> + } > >>>> + > >>>> template<typename _IndexType, size_t... _Counts> > >>>> auto __build_dextents_type(integer_sequence<size_t, > _Counts...>) > >>>> -> extents<_IndexType, ((void) _Counts, dynamic_extent)...>; > >>>> @@ -304,6 +324,165 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >>>> explicit extents(_Integrals...) -> > >>>> extents<size_t, __mdspan::__dynamic_extent<_Integrals>()...>; > >>>> > >>>> + struct layout_left > >>>> + { > >>>> + template<typename _Extents> > >>>> + class mapping; > >>>> + }; > >>>> + > >>>> + namespace __mdspan > >>>> + { > >>>> + template<typename _Tp> > >>>> + constexpr bool __is_extents = false; > >>>> + > >>>> + template<typename _IndexType, size_t... _Extents> > >>>> + constexpr bool __is_extents<extents<_IndexType, _Extents...>> = > >>>> true; > >>>> + > >>>> + template<size_t _Count> > >>>> + struct _LinearIndexLeft > >>>> + { > >>>> + template<typename _Extents, typename... _Indices> > >>>> + static constexpr typename _Extents::index_type > >>>> + _S_value(const _Extents& __exts, typename _Extents::index_type > >>>> __idx, > >>>> + _Indices... __indices) noexcept > >>>> + { > >>>> + return __idx + __exts.extent(_Count) > >>>> + * _LinearIndexLeft<_Count + 1>::_S_value(__exts, > >> __indices...); > >>>> + } > >>>> + > >>>> + template<typename _Extents> > >>>> + static constexpr typename _Extents::index_type > >>>> + _S_value(const _Extents&) noexcept > >>>> + { return 0; } > >>>> + }; > >>>> + > >>>> + template<typename _Extents, typename... _Indices> > >>>> + constexpr typename _Extents::index_type > >>>> + __linear_index_left(const _Extents& __exts, _Indices... > >> __indices) > >>>> + { > >>>> + return _LinearIndexLeft<0>::_S_value(__exts, __indices...); > >>>> + } > >>>> > >>> This can be eliminated by fold expressions, see below. > >>> > >>>> + > >>>> + template<typename _IndexType, typename _Tp, size_t _Nm> > >>>> + consteval bool > >>>> + __is_representable_product(array<_Tp, _Nm> __factors) > >>>> + { > >>>> + size_t __rest = numeric_limits<_IndexType>::max(); > >>>> + for(size_t __i = 0; __i < _Nm; ++__i) > >>>> + { > >>>> + if (__factors[__i] == 0) > >>>> + return true; > >>>> + __rest /= _IndexType(__factors[__i]); > >>>> + } > >>>> + return __rest > 0; > >>>> + } > >>>> > >>> I would replace that with > >>> template<IndexType> > >>> consteval size_t __div_reminder(span<const size_t, _Nm> __factors, > size_t > >>> __val) > >>> { > >>> size_t __rest = val; > >>> for(size_t __i = 0; __i < _Nm; ++__i) > >>> { > >>> if (__factors[__i] == dynamic_extent) > >>> continue; > >>> if (__factors[__i] != 0) > >>> return val; > >>> __rest /= _IndexType(__factors[__i]); > >>> if (__res == 0) > >>> return 0; > >>> } > >>> return __rest; > >>> } > >>> > >>> We can express the is presentable check as > >>> static constexpr __dyn_reminder = > >> __div_reminder(__static_exts<Extents>(0, > >>> rank()), std::numeric_limits<Index>::max()); > >>> static_assert(__dyn_reminder > 0); > >>> However, with __dyn_reminder value, the precondition > >>> https://eel.is/c++draft/mdspan.layout#left.cons-1, > >>> can be checked by doing equivalent of __div_remainder for __dyn_extents > >>> with __val being __dyn_reminder. > >>> > >>> > >>>> + > >>>> + template<typename _Extents> > >>>> + consteval array<typename _Extents::index_type, > _Extents::rank()> > >>>> + __static_extents_array() > >>>> + { > >>>> + array<typename _Extents::index_type, _Extents::rank()> __exts; > >>>> + for(size_t __i = 0; __i < _Extents::rank(); ++__i) > >>>> + __exts[__i] = _Extents::static_extent(__i); > >>>> + return __exts; > >>>> + } > >>>> > >>> > >>> Replaced by __static_exts accessor, as described above. > >>> > >>> > >>>> + > >>>> + template<typename _Extents, typename _IndexType> > >>>> + concept __representable_size = _Extents::rank_dynamic() != 0 > >>>> + || __is_representable_product<_IndexType>( > >>>> + __static_extents_array<_Extents>()); > >>>> + > >>>> + template<typename _Extents> > >>>> + concept __layout_extent = __representable_size< > >>>> + _Extents, typename _Extents::index_type>; > >>>> + } > >>>> > >>> + > >>>> + template<typename _Extents> > >>>> + class layout_left::mapping > >>>> + { > >>>> + static_assert(__mdspan::__layout_extent<_Extents>, > >>>> + "The size of extents_type is not representable as > index_type."); > >>>> + public: > >>>> + using extents_type = _Extents; > >>>> + using index_type = typename extents_type::index_type; > >>>> + using size_type = typename extents_type::size_type; > >>>> + using rank_type = typename extents_type::rank_type; > >>>> + using layout_type = layout_left; > >>>> + > >>>> + constexpr > >>>> + mapping() noexcept = default; > >>>> + > >>>> + constexpr > >>>> + mapping(const mapping&) noexcept = default; > >>>> + > >>>> + constexpr > >>>> + mapping(const extents_type& __extents) noexcept > >>>> + : _M_extents(__extents) > >>>> + { > >>>> > >>> > >>> > >>> > >>>> } > >>>> + > >>>> + template<typename _OExtents> > >>>> + requires (is_constructible_v<extents_type, _OExtents>) > >>>> + constexpr explicit(!is_convertible_v<_OExtents, extents_type>) > >>>> + mapping(const mapping<_OExtents>& __other) noexcept > >>>> + : _M_extents(__other.extents()) > >>>> + { > >>> > >>> Here we could do checks at compile time: > >>> if constexpr(_OExtents::rank_dynamic() == 0) > >>> static_assert( __div_remainder(...) > 0); > >>> } > >>> > >>> > >>>> } > >>>> + > >>>> + constexpr mapping& > >>>> + operator=(const mapping&) noexcept = default; > >>>> + > >>>> + constexpr const extents_type& > >>>> + extents() const noexcept { return _M_extents; } > >>>> + > >>>> + constexpr index_type > >>>> + required_span_size() const noexcept > >>>> + { return __mdspan::__fwd_prod(_M_extents, _M_extents.rank()); } > >>>> + > >>>> + template<__mdspan::__valid_index_type<index_type>... _Indices> > >>>> > >>> // Because we extracted rank0 and rank1 specializations > >>> > >>>> + requires (sizeof...(_Indices) + 1 == extents_type::rank()) > >>>> + constexpr index_type > >>>> + operator()(index_type __idx, _Indices... __indices) const > >> noexcept > >>>> + { > >>>> > >>> This could be implemented as, please synchronize the names. > >>> if constexpr (!is_same_v<_Indices, index_type> || ...) > >>> // Reduce the number of instantations. > >>> return operator()(index_type _idx0, > >>> static_cast<index_type>(std::move(__indices))....); > >>> else > >>> { > >>> // This can be used for layout stride, if you start with __res > = > >> 0; > >>> index_type __res = _idx0; > >>> index_type __mult = _M_extents.extent(0); > >>> auto __update = [&__res, &__mult, __pos = 1u](index_type __idx) > >>> mutable > >>> { > >>> __res += __idx * __mult; > >>> __mult *= _M_extents.extent(__pos); > >>> ++__pos; > >>> }; > >>> // Fold over invocation of lambda > >>> (__update(_Indices), ....); > >>> return __res; > >>> } > >>> > >>> This could be even simpler and written as (use for layout stride): > >>> size_t __pos = 0; > >>> return (index_type(0) + ... + __indices * stride(__pos++)); > >>> Here, I prefer to avoid multiplying multiple times. > >>> > >>> > >>> + return __mdspan::__linear_index_left( > >>>> + _M_extents, static_cast<index_type>(__indices)...); > >>>> + } > >>>> + > >>>> + static constexpr bool > >>>> + is_always_unique() noexcept { return true; } > >>>> + > >>>> + static constexpr bool > >>>> + is_always_exhaustive() noexcept { return true; } > >>>> + > >>>> + static constexpr bool > >>>> + is_always_strided() noexcept { return true; } > >>>> + > >>>> + static constexpr bool > >>>> + is_unique() noexcept { return true; } > >>>> + > >>>> + static constexpr bool > >>>> + is_exhaustive() noexcept { return true; } > >>>> + > >>>> + static constexpr bool > >>>> + is_strided() noexcept { return true; } > >>>> + > >>>> + constexpr index_type > >>>> + stride(rank_type __i) const noexcept > >>>> + requires (extents_type::rank() > 0) > >>>> + { > >>>> + __glibcxx_assert(__i < extents_type::rank()); > >>>> + return __mdspan::__fwd_prod(_M_extents, __i); > >>>> + } > >>>> + > >>>> + template<typename _OExtents> > >>>> + requires (extents_type::rank() == _OExtents::rank()) > >>>> + friend constexpr bool > >>>> + operator==(const mapping& __self, const mapping<_OExtents>& > >>>> __other) > >>>> + noexcept > >>>> + { return __self.extents() == __other.extents(); } > >>>> + > >>>> + private: > >>>> + [[no_unique_address]] extents_type _M_extents; > >>>> + }; > >>>> + > >>>> _GLIBCXX_END_NAMESPACE_VERSION > >>>> } > >>>> #endif > >>>> -- > >>>> 2.49.0 > >>>> > >>>> > >>> > >> > >> > > > >