On Fri, Jun 13, 2025 at 4:46 PM Patrick Palka <ppa...@redhat.com> wrote:

> On Fri, 13 Jun 2025, Tomasz Kaminski wrote:
>
> >
> >
> > On Fri, Jun 13, 2025 at 3:54 PM Patrick Palka <ppa...@redhat.com> wrote:
> >       On Fri, 13 Jun 2025, Tomasz Kaminski wrote:
> >
> >       >
> >       >
> >       > On Thu, Jun 12, 2025 at 9:05 PM Patrick Palka <ppa...@redhat.com>
> wrote:
> >       >       On Thu, 12 Jun 2025, Patrick Palka wrote:
> >       >
> >       >       > On Thu, 12 Jun 2025, Jonathan Wakely wrote:
> >       >       >
> >       >       > >
> >       >       > >
> >       >       > > On Thu, 12 Jun 2025, 16:56 Patrick Palka, <
> ppa...@redhat.com> wrote:
> >       >       > >       Tested on x86_64-pc-linux-gnu, does this look OK
> for trunk?
> >       >       > >
> >       >       > >       I'm sure if introducing a new overload is
> preferable to
> >       >       > >       introducing a constexpr if branch in the
> existing overload.
> >       >       > >
> >       >       > >
> >       >       > > I think a constexpr if branch in the existing
> functions is cheaper. We need to check the traits either way, but we can
> avoid doing overload resolution.
> >       >       >
> >       >       > Ack.  I opted to just define specialized functors for
> these helpers
> >       >       > instead of using lambdas.  That way we can easily
> optimize the case
> >       >       > where only one of the functions is stateless.
> >       >       >
> >       >       > Changes in v2:
> >       >       >   - Use a functor utilizing [[no_unique_address]]
> instead of a lambda,
> >       >       >     to also concisely handle the case where only one of
> the functino
> >       >       >     object is stateless.  In turn check is_copy_xible
> instead of
> >       >       >     is_default_xible.
> >       >       >
> >       >       > -- >8 --
> >       >       >
> >       >       > When creating a composite comparator/predicate that
> invokes a given
> >       >       > projection function, we don't need to capture a
> stateless function
> >       >       > object by reference, instead we can capture it by value
> and use
> >       >       > [[no_unique_address]] to elide its storage.  This makes
> using
> >       >       > __make_comp_proj zero-cost in the common case where the
> comparator or
> >       >       > projection (or both) are stateless.
> >
> > Update description to mention function/member pointers, where we are also
> > at zero cost compared to passing them by value.
> > Outside of this note looks good to me.
>
> Thanks for the reviews.  The updated commit message says:
>
> When creating a composite comparator/predicate that invokes a given
> projection function, we don't need to capture a scalar (such as a
> function pointer or member pointer) or empty object by reference,
> instead capture it by value and use [[no_unique_address]] to elide its
> storage in the empty case.  This makes using __make_comp_proj zero-cost
> in the common case where the comparator and projection are both
> empty/scalars.
>
Much better. Thanks!

>
> >       >       >
> >       >       > libstdc++-v3/ChangeLog:
> >       >       >
> >       >       >       * include/bits/ranges_algo.h
> (__detail::_Comp_proj): New.
> >       >       >       (__detail::__make_comp_proj): Use it instead.
> >       >       >       (__detail::_Pred_proj): New.
> >       >       >       (__detail::__make_pred_proj): Use it instead.
> >       >       > ---
> >       >       >  libstdc++-v3/include/bits/ranges_algo.h | 70
> +++++++++++++++++++------
> >       >       >  1 file changed, 54 insertions(+), 16 deletions(-)
> >       >       >
> >       >       > diff --git a/libstdc++-v3/include/bits/ranges_algo.h
> b/libstdc++-v3/include/bits/ranges_algo.h
> >       >       > index 9f94c6b1d57e..ce48ca047b69 100644
> >       >       > --- a/libstdc++-v3/include/bits/ranges_algo.h
> >       >       > +++ b/libstdc++-v3/include/bits/ranges_algo.h
> >       >       > @@ -49,27 +49,65 @@ namespace ranges
> >       >       >    namespace __detail
> >       >       >    {
> >       >       >      template<typename _Comp, typename _Proj>
> >       >       > -      constexpr auto
> >       >       > +      struct _Comp_proj
> >       >       > +      {
> >       >       > +     [[no_unique_address]]
> >       >       > +       __conditional_t<is_empty_v<_Comp> &&
> is_copy_constructible_v<_Comp>,
> >       >       > +                       _Comp, _Comp&> _M_comp;
> >       >
> >       >       Comparators, predicates and projections in ranges
> algorithms are already
> >       >       constrained to by copyable, so checking
> is_copy_constructible_v here
> >       >       is redundant.
> >       >
> >       >       -- >8 --
> >       >
> >       >       Subject: [PATCH] libstdc++: Optimize __make_comp_proj and
> __make_pred_proj
> >       >
> >       >       Changes in v3:
> >       >         - Remove redundant is_copy_constructible check when
> deciding
> >       >           whether to capture by reference or by value.
> >       >
> >       >       Changes in v2:
> >       >         - Use a functor utilizing [[no_unique_address]] instead
> of a lambda,
> >       >           to also concisely handle the case where only one of
> the functino
> >       >           object is stateless.  In turn check is_copy_xible
> instead of
> >       >           is_default_xible.
> >       >
> >       >       -- >8 --
> >       >
> >       >       When creating a composite comparator/predicate that
> invokes a given
> >       >       projection function, we don't need to capture a stateless
> function
> >       >       object by reference, instead we can capture it by value
> and use
> >       >       [[no_unique_address]] to elide its storage.  This makes
> using
> >       >       __make_comp_proj zero-cost in the common case where the
> comparator
> >       >       and projection are both stateless.
> >       >
> >       >       libstdc++-v3/ChangeLog:
> >       >
> >       >               * include/bits/ranges_algo.h
> (__detail::_Comp_proj): New.
> >       >               (__detail::__make_comp_proj): Use it instead.
> >       >               (__detail::_Pred_proj): New.
> >       >               (__detail::__make_pred_proj): Use it instead.
> >       >       ---
> >       >        libstdc++-v3/include/bits/ranges_algo.h | 64
> ++++++++++++++++++-------
> >       >        1 file changed, 48 insertions(+), 16 deletions(-)
> >       >
> >       >       diff --git a/libstdc++-v3/include/bits/ranges_algo.h
> b/libstdc++-v3/include/bits/ranges_algo.h
> >       >       index cc182d110b30..a7df4c9a13ca 100644
> >       >       --- a/libstdc++-v3/include/bits/ranges_algo.h
> >       >       +++ b/libstdc++-v3/include/bits/ranges_algo.h
> >       >       @@ -49,27 +49,59 @@ namespace ranges
> >       >          namespace __detail
> >       >          {
> >       >            template<typename _Comp, typename _Proj>
> >       >       -      constexpr auto
> >       >       +      struct _Comp_proj
> >       >       +      {
> >       >       +       [[no_unique_address]]
> >       >       +         __conditional_t<is_empty_v<_Comp>, _Comp,
> _Comp&> _M_comp;
> >       >       +       [[no_unique_address]]
> >       >       +         __conditional_t<is_empty_v<_Proj>, _Proj,
> _Proj&> _M_proj;
> >       >
> >       > I would also store function pointers and member pointers (for
> proj) by value,
> >       > instead of by reference. So instead of is_empty_v I would check:
> >       > is_empty_v<_Func> || is_pointer_v<_Func> ||
> is_member_pointer_v<_Func>
> >       > Or just:
> >       > __or_v<is_scalar<_Func>, is_empty<_Func>>.
> >
> >       Sounds good, like so?
> >
> >       -- >8 --
> >
> >       Subject: [PATCH] libstdc++: Optimize __make_comp/pred_proj for
> empty/scalar
> >        types
> >
> >       Changes in v4:
> >         - Also capture scalar (non-empty) functions by value.
> >
> >       Changes in v3:
> >         - Remove redundant is_copy_constructible check when deciding
> >           whether to capture by reference or by value.
> >
> >       Changes in v2:
> >         - Use a functor utilizing [[no_unique_address]] instead of a
> lambda,
> >           to also concisely handle the case where only one of the
> functino
> >           object is stateless.  In turn check is_copy_xible instead of
> >           is_default_xible.
> >
> >       -- >8 --
> >
> >       When creating a composite comparator/predicate that invokes a given
> >       projection function, we don't need to capture a scalar (such as a
> >       function pointer or member pointer) or empty object by reference,
> >       instead capture it by value and use [[no_unique_address]] to elide
> its
> >       storage in the empty case.  This makes using __make_comp_proj
> zero-cost
> >       in the common case where the comparator and projection are both
> >       empty/scalars.
> >
> >       libstdc++-v3/ChangeLog:
> >
> >               * include/bits/ranges_algo.h
> (__detail::__by_ref_or_value_fn): New.
> >               (__detail::_Comp_proj): New.
> >               (__detail::__make_comp_proj): Use it instead.
> >               (__detail::_Pred_proj): New.
> >               (__detail::__make_pred_proj): Use it instead.
> >       ---
> >        libstdc++-v3/include/bits/ranges_algo.h | 64
> ++++++++++++++++++-------
> >        1 file changed, 48 insertions(+), 16 deletions(-)
> >
> >       diff --git a/libstdc++-v3/include/bits/ranges_algo.h
> b/libstdc++-v3/include/bits/ranges_algo.h
> >       index cc182d110b30..7447fbd21f7e 100644
> >       --- a/libstdc++-v3/include/bits/ranges_algo.h
> >       +++ b/libstdc++-v3/include/bits/ranges_algo.h
> >       @@ -48,28 +48,60 @@ namespace ranges
> >        {
> >          namespace __detail
> >          {
> >       +    template<typename _Fp>
> >       +      using __by_ref_or_value_fn
> >       +       = __conditional_t<is_scalar_v<_Fp> || is_empty_v<_Fp>,
> _Fp, _Fp&>;
> >       +
> >            template<typename _Comp, typename _Proj>
> >       -      constexpr auto
> >       +      struct _Comp_proj
> >       +      {
> >       +       [[no_unique_address]] __by_ref_or_value_fn<_Comp> _M_comp;
> >       +       [[no_unique_address]] __by_ref_or_value_fn<_Proj> _M_proj;
> >       +
> >       +       constexpr
> >       +       _Comp_proj(_Comp& __comp, _Proj& __proj)
> >       +       : _M_comp(__comp), _M_proj(__proj)
> >       +       { }
> >       +
> >       +       template<typename _Tp, typename _Up>
> >       +       constexpr bool
> >       +       operator()(_Tp&& __x, _Up&& __y)
> >       +       {
> >       +         return std::__invoke(_M_comp,
> >       +                              std::__invoke(_M_proj,
> std::forward<_Tp>(__x)),
> >       +                              std::__invoke(_M_proj,
> std::forward<_Up>(__y)));
> >       +       }
> >       +      };
> >       +
> >       +    template<typename _Comp, typename _Proj>
> >       +      constexpr _Comp_proj<_Comp, _Proj>
> >              __make_comp_proj(_Comp& __comp, _Proj& __proj)
> >       +      { return {__comp, __proj}; }
> >       +
> >       +    template<typename _Pred, typename _Proj>
> >       +      struct _Pred_proj
> >              {
> >       -       return [&] (auto&& __lhs, auto&& __rhs) -> bool {
> >       -         using _TL = decltype(__lhs);
> >       -         using _TR = decltype(__rhs);
> >       -         return std::__invoke(__comp,
> >       -                              std::__invoke(__proj,
> std::forward<_TL>(__lhs)),
> >       -                              std::__invoke(__proj,
> std::forward<_TR>(__rhs)));
> >       -       };
> >       -      }
> >       +       [[no_unique_address]] __by_ref_or_value_fn<_Pred> _M_pred;
> >       +       [[no_unique_address]] __by_ref_or_value_fn<_Proj> _M_proj;
> >       +
> >       +       constexpr
> >       +       _Pred_proj(_Pred& __pred, _Proj& __proj)
> >       +       : _M_pred(__pred), _M_proj(__proj)
> >       +       { }
> >       +
> >       +       template<typename _Tp>
> >       +         constexpr bool
> >       +         operator()(_Tp&& __x)
> >       +         {
> >       +           return std::__invoke(_M_pred,
> >       +                                std::__invoke(_M_proj,
> std::forward<_Tp>(__x)));
> >       +         }
> >       +      };
> >
> >            template<typename _Pred, typename _Proj>
> >       -      constexpr auto
> >       +      constexpr _Pred_proj<_Pred, _Proj>
> >              __make_pred_proj(_Pred& __pred, _Proj& __proj)
> >       -      {
> >       -       return [&] <typename _Tp> (_Tp&& __arg) -> bool {
> >       -         return std::__invoke(__pred,
> >       -                              std::__invoke(__proj,
> std::forward<_Tp>(__arg)));
> >       -       };
> >       -      }
> >       +      { return {__pred, __proj}; }
> >          } // namespace __detail
> >
> >          struct __all_of_fn
> >       --
> >       2.50.0.rc2
> >
> >
> >

Reply via email to