Re: Relocation (= move+destroy)
On Fri, 9 Nov 2018, Jonathan Wakely wrote: Here's the fix for the regression this introduced. Thanks. I was going to handle it, but I was waiting for you to review the second relocation patch first to avoid having several patches in flight over the same piece of code. Anyway, having the regression fixed is good. -- Marc Glisse
Re: Relocation (= move+destroy)
On 25/10/18 13:36 +0100, Jonathan Wakely wrote: On 25/10/18 14:30 +0200, Marc Glisse wrote: On Tue, 23 Oct 2018, Jonathan Wakely wrote: + template +inline void +__relocate_a(_Tp* __dest, _Up* __orig, _Allocator& __alloc) I find it a little surprising that this overload for single objects using the memmove argument ordering (dest, source) but the range overload below uses the STL ordering (source_begin, source_end, dest). But I wouldn't be surprised if we're already doing that somewhere that I've forgotten about. WOuld it make sense to either rename this overload, or to use consistent argument ordering for the two __relocate_a overloads? The functions were not meant as overloads, it just happened that I arrived at the same name for both, but it would make perfect sense to give them different names. I started from __relocate(dest, source) for one element, and later added an allocator to it. The other one corresponds to __uninitialized_move_a, and naming it __uninitialized_relocate_a would be silly since "uninitialized" is included in the definition of relocate. Yes, my first thought was to add "uninitialized" and I rejected it for that same reason. I think I'd rather rename than change the order. Do you have suggestions? __relocate_range_a? I was thinking the single object one could be __relocate_1_a with the _1 being like copy_n, fill_n etc. but that's going to be confusing with __relocate_a_1 instead! It seems unfortunate to have to put "range" in the name when no other algos that work on ranges bother to say that in the name. Maybe the single object one could be __relocate_single_a? Or __do_relocate_a? __relocate_obj_a? None of them really makes me happy. I went with __relocate_object_a, but that's easy to change. I'm OK with that. If anybody thinks of a better name we can change it. We may want to specialize / overload __relocate_object_a for some specific types in the future, although we would more likely have __relocate_object_a call __relocate_object (no allocator) when it receives the default allocator, and specialize that one. And this should anyway be less common that specializing __is_trivially_relocatable. I just realized that __reallocate_a and construct don't take the allocator argument in the same position... I can switch if it helps. No, leaving it at the end is consistent with __uninitialized_copy_a et al, and it's closer in spirit to them. OK for trunk, thanks! Here's the fix for the regression this introduced. Tested x86_64-linux, committed to trunk. commit caf2adcb563d7dd62c3de7bb49f528753f958ba3 Author: Jonathan Wakely Date: Fri Nov 9 19:19:16 2018 + PR libstdc++/87787 fix UBsan error in std::vector PR libstdc++/87787 * include/bits/stl_uninitialized.h (__relocate_a_1): Do not call memmove when there's nothing to copy (and pointers could be null). diff --git a/libstdc++-v3/include/bits/stl_uninitialized.h b/libstdc++-v3/include/bits/stl_uninitialized.h index 94c7e151e29..8839bfdcc90 100644 --- a/libstdc++-v3/include/bits/stl_uninitialized.h +++ b/libstdc++-v3/include/bits/stl_uninitialized.h @@ -904,7 +904,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _Tp* __result, allocator<_Up>& __alloc) { ptrdiff_t __count = __last - __first; - __builtin_memmove(__result, __first, __count * sizeof(_Tp)); + if (__count > 0) + __builtin_memmove(__result, __first, __count * sizeof(_Tp)); return __result + __count; }
Re: Relocation (= move+destroy)
On 25/10/18 14:30 +0200, Marc Glisse wrote: On Tue, 23 Oct 2018, Jonathan Wakely wrote: + template +inline void +__relocate_a(_Tp* __dest, _Up* __orig, _Allocator& __alloc) I find it a little surprising that this overload for single objects using the memmove argument ordering (dest, source) but the range overload below uses the STL ordering (source_begin, source_end, dest). But I wouldn't be surprised if we're already doing that somewhere that I've forgotten about. WOuld it make sense to either rename this overload, or to use consistent argument ordering for the two __relocate_a overloads? The functions were not meant as overloads, it just happened that I arrived at the same name for both, but it would make perfect sense to give them different names. I started from __relocate(dest, source) for one element, and later added an allocator to it. The other one corresponds to __uninitialized_move_a, and naming it __uninitialized_relocate_a would be silly since "uninitialized" is included in the definition of relocate. Yes, my first thought was to add "uninitialized" and I rejected it for that same reason. I think I'd rather rename than change the order. Do you have suggestions? __relocate_range_a? I was thinking the single object one could be __relocate_1_a with the _1 being like copy_n, fill_n etc. but that's going to be confusing with __relocate_a_1 instead! It seems unfortunate to have to put "range" in the name when no other algos that work on ranges bother to say that in the name. Maybe the single object one could be __relocate_single_a? Or __do_relocate_a? __relocate_obj_a? None of them really makes me happy. I went with __relocate_object_a, but that's easy to change. I'm OK with that. If anybody thinks of a better name we can change it. We may want to specialize / overload __relocate_object_a for some specific types in the future, although we would more likely have __relocate_object_a call __relocate_object (no allocator) when it receives the default allocator, and specialize that one. And this should anyway be less common that specializing __is_trivially_relocatable. I just realized that __reallocate_a and construct don't take the allocator argument in the same position... I can switch if it helps. No, leaving it at the end is consistent with __uninitialized_copy_a et al, and it's closer in spirit to them. OK for trunk, thanks!
Re: Relocation (= move+destroy)
On Tue, 23 Oct 2018, Jonathan Wakely wrote: + template +inline void +__relocate_a(_Tp* __dest, _Up* __orig, _Allocator& __alloc) I find it a little surprising that this overload for single objects using the memmove argument ordering (dest, source) but the range overload below uses the STL ordering (source_begin, source_end, dest). But I wouldn't be surprised if we're already doing that somewhere that I've forgotten about. WOuld it make sense to either rename this overload, or to use consistent argument ordering for the two __relocate_a overloads? The functions were not meant as overloads, it just happened that I arrived at the same name for both, but it would make perfect sense to give them different names. I started from __relocate(dest, source) for one element, and later added an allocator to it. The other one corresponds to __uninitialized_move_a, and naming it __uninitialized_relocate_a would be silly since "uninitialized" is included in the definition of relocate. Yes, my first thought was to add "uninitialized" and I rejected it for that same reason. I think I'd rather rename than change the order. Do you have suggestions? __relocate_range_a? I was thinking the single object one could be __relocate_1_a with the _1 being like copy_n, fill_n etc. but that's going to be confusing with __relocate_a_1 instead! It seems unfortunate to have to put "range" in the name when no other algos that work on ranges bother to say that in the name. Maybe the single object one could be __relocate_single_a? Or __do_relocate_a? __relocate_obj_a? None of them really makes me happy. I went with __relocate_object_a, but that's easy to change. We may want to specialize / overload __relocate_object_a for some specific types in the future, although we would more likely have __relocate_object_a call __relocate_object (no allocator) when it receives the default allocator, and specialize that one. And this should anyway be less common that specializing __is_trivially_relocatable. I just realized that __reallocate_a and construct don't take the allocator argument in the same position... I can switch if it helps. Regtested on gcc112. 2018-10-25 Marc Glisse PR libstdc++/87106 * include/bits/alloc_traits.h (_S_construct, _S_destroy, construct, destroy): Add noexcept specification. * include/bits/allocator.h (construct, destroy): Likewise. * include/ext/alloc_traits.h (construct, destroy): Likewise. * include/ext/malloc_allocator.h (construct, destroy): Likewise. * include/ext/new_allocator.h (construct, destroy): Likewise. * include/bits/stl_uninitialized.h (__relocate_object_a, __relocate_a, __relocate_a_1): New functions. (__is_trivially_relocatable): New class. * include/bits/stl_vector.h (__use_relocate): New static member. * include/bits/vector.tcc (reserve, _M_realloc_insert, _M_default_append): Use __relocate_a. (reserve, _M_assign_aux, _M_realloc_insert, _M_fill_insert, _M_default_append, _M_range_insert): Move _GLIBCXX_ASAN_ANNOTATE_REINIT after _Destroy. * testsuite/23_containers/vector/modifiers/push_back/49836.cc: Replace CopyConsOnlyType with DelAnyAssign. -- Marc GlisseIndex: libstdc++-v3/include/bits/alloc_traits.h === --- libstdc++-v3/include/bits/alloc_traits.h (revision 265451) +++ libstdc++-v3/include/bits/alloc_traits.h (working copy) @@ -233,38 +233,43 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION using type = decltype(__test<_Alloc>(0)); }; template using __has_construct = typename __construct_helper<_Tp, _Args...>::type; template static _Require<__has_construct<_Tp, _Args...>> _S_construct(_Alloc& __a, _Tp* __p, _Args&&... __args) + noexcept(noexcept(__a.construct(__p, std::forward<_Args>(__args)...))) { __a.construct(__p, std::forward<_Args>(__args)...); } template static _Require<__and_<__not_<__has_construct<_Tp, _Args...>>, is_constructible<_Tp, _Args...>>> _S_construct(_Alloc&, _Tp* __p, _Args&&... __args) + noexcept(noexcept(::new((void*)__p) + _Tp(std::forward<_Args>(__args)...))) { ::new((void*)__p) _Tp(std::forward<_Args>(__args)...); } template static auto _S_destroy(_Alloc2& __a, _Tp* __p, int) + noexcept(noexcept(__a.destroy(__p))) -> decltype(__a.destroy(__p)) { __a.destroy(__p); } template static void _S_destroy(_Alloc2&, _Tp* __p, ...) + noexcept(noexcept(__p->~_Tp())) { __p->~_Tp(); } template static auto _S_max_size(_Alloc2& __a, int) -> decltype(__a.max_size()) { return __a.max_size(); } template static size_type @@ -333,33 +338,36 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * @param __p Pointer to memory of suitable size and alignment for Tp * @param __args Constructor arguments. * * Calls __a
Re: Relocation (= move+destroy)
On 23/10/18 23:17 +0200, Marc Glisse wrote: On Tue, 23 Oct 2018, Jonathan Wakely wrote: What depends on C++14 here? Just enable_if_t? Because we have __enable_if_t for use in C++11. Both GCC and Clang will allow constexpr-if and static_assert with no message in C++11. Probably it can be enabled in C++11 if you think that matters. I'll admit that I personally don't care at all about C++11, and the main motivation would be to enable a cleanup if we stop supporting C++03 (I am not very optimistic). Me neither. But even if enabling cleanup some day is unlikely, I think that for this code the only change you'd need is to replace enable_if_t with __enable_if_t and we get C++11 support for free, so we might as well do it. + template +inline void +__relocate_a(_Tp* __dest, _Up* __orig, _Allocator& __alloc) I find it a little surprising that this overload for single objects using the memmove argument ordering (dest, source) but the range overload below uses the STL ordering (source_begin, source_end, dest). But I wouldn't be surprised if we're already doing that somewhere that I've forgotten about. WOuld it make sense to either rename this overload, or to use consistent argument ordering for the two __relocate_a overloads? The functions were not meant as overloads, it just happened that I arrived at the same name for both, but it would make perfect sense to give them different names. I started from __relocate(dest, source) for one element, and later added an allocator to it. The other one corresponds to __uninitialized_move_a, and naming it __uninitialized_relocate_a would be silly since "uninitialized" is included in the definition of relocate. Yes, my first thought was to add "uninitialized" and I rejected it for that same reason. I think I'd rather rename than change the order. Do you have suggestions? __relocate_range_a? I was thinking the single object one could be __relocate_1_a with the _1 being like copy_n, fill_n etc. but that's going to be confusing with __relocate_a_1 instead! It seems unfortunate to have to put "range" in the name when no other algos that work on ranges bother to say that in the name. Maybe the single object one could be __relocate_single_a? Or __do_relocate_a? __relocate_obj_a? None of them really makes me happy. + noexcept(noexcept(__gnu_cxx::__alloc_traits<_Allocator>::construct(__alloc, Since this is C++14 (or maybe C++11) you could just use std::allocator_traits directly. __gnu_cxx::__alloc_traits is to provide equivalent functionality in C++98 code. Thanks, I was wondering what it was for. It's just so I could write code in containers that works the same for C++11 allocators and C++98 allocators, always using __alloc_traits. In C++98 it just forwards everything straight to the allocator, because C++98 allocators are required to define all the members themselves.
Re: Relocation (= move+destroy)
On Tue, 23 Oct 2018, Jonathan Wakely wrote: CCing gcc-patches It seems to have disappeared somehow during the discussion, sorry. The tricky stuff in all looks right, I only have some comments on the __relocate_a functions ... Index: libstdc++-v3/include/bits/stl_uninitialized.h === --- libstdc++-v3/include/bits/stl_uninitialized.h (revision 265289) +++ libstdc++-v3/include/bits/stl_uninitialized.h (working copy) @@ -872,14 +872,75 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION uninitialized_move_n(_InputIterator __first, _Size __count, _ForwardIterator __result) { auto __res = std::__uninitialized_copy_n_pair (_GLIBCXX_MAKE_MOVE_ITERATOR(__first), __count, __result); return {__res.first.base(), __res.second}; } #endif +#if __cplusplus >= 201402L What depends on C++14 here? Just enable_if_t? Because we have __enable_if_t for use in C++11. Both GCC and Clang will allow constexpr-if and static_assert with no message in C++11. Probably it can be enabled in C++11 if you think that matters. I'll admit that I personally don't care at all about C++11, and the main motivation would be to enable a cleanup if we stop supporting C++03 (I am not very optimistic). + template +inline void +__relocate_a(_Tp* __dest, _Up* __orig, _Allocator& __alloc) I find it a little surprising that this overload for single objects using the memmove argument ordering (dest, source) but the range overload below uses the STL ordering (source_begin, source_end, dest). But I wouldn't be surprised if we're already doing that somewhere that I've forgotten about. WOuld it make sense to either rename this overload, or to use consistent argument ordering for the two __relocate_a overloads? The functions were not meant as overloads, it just happened that I arrived at the same name for both, but it would make perfect sense to give them different names. I started from __relocate(dest, source) for one element, and later added an allocator to it. The other one corresponds to __uninitialized_move_a, and naming it __uninitialized_relocate_a would be silly since "uninitialized" is included in the definition of relocate. I think I'd rather rename than change the order. Do you have suggestions? __relocate_range_a? + noexcept(noexcept(__gnu_cxx::__alloc_traits<_Allocator>::construct(__alloc, Since this is C++14 (or maybe C++11) you could just use std::allocator_traits directly. __gnu_cxx::__alloc_traits is to provide equivalent functionality in C++98 code. Thanks, I was wondering what it was for. +__dest, std::move(*__orig))) +&& noexcept(__gnu_cxx::__alloc_traits<_Allocator>::destroy( + __alloc, std::__addressof(*__orig +{ + typedef __gnu_cxx::__alloc_traits<_Allocator> __traits; + __traits::construct(__alloc, __dest, std::move(*__orig)); + __traits::destroy(__alloc, std::__addressof(*__orig)); +} + + template +struct __is_trivially_relocatable +: is_trivial<_Tp> { }; It might be worth adding a comment that this type might be specialized in future, so that I don't forget and simplify it to an alias template later :-) Ok. -- Marc Glisse
Re: Relocation (= move+destroy)
CCing gcc-patches On 19/10/18 07:33 +0200, Marc Glisse wrote: On Thu, 18 Oct 2018, Marc Glisse wrote: Uh, why didn't I notice that the function __relocate is unused? I guess I'll resend the same patch without __relocate once retesting has finished :-( Sorry for all the corrections, I guess I didn't check my patch carefully enough before sending it the first time. 2018-10-19 Marc Glisse PR libstdc++/87106 * include/bits/alloc_traits.h (_S_construct, _S_destroy, construct, destroy): Add noexcept specification. * include/bits/allocator.h (construct, destroy): Likewise. * include/ext/alloc_traits.h (construct, destroy): Likewise. * include/ext/malloc_allocator.h (construct, destroy): Likewise. * include/ext/new_allocator.h (construct, destroy): Likewise. * include/bits/stl_uninitialized.h (__relocate_a, __relocate_a_1): New functions. (__is_trivially_relocatable): New class. * include/bits/stl_vector.h (__use_relocate): New static member. * include/bits/vector.tcc (reserve, _M_realloc_insert, _M_default_append): Use __relocate_a. (reserve, _M_assign_aux, _M_realloc_insert, _M_fill_insert, _M_default_append, _M_range_insert): Move _GLIBCXX_ASAN_ANNOTATE_REINIT after _Destroy. * testsuite/23_containers/vector/modifiers/push_back/49836.cc: Replace CopyConsOnlyType with DelAnyAssign. -- Marc Glisse The tricky stuff in all looks right, I only have some comments on the __relocate_a functions ... Index: libstdc++-v3/include/bits/stl_uninitialized.h === --- libstdc++-v3/include/bits/stl_uninitialized.h (revision 265289) +++ libstdc++-v3/include/bits/stl_uninitialized.h (working copy) @@ -872,14 +872,75 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION uninitialized_move_n(_InputIterator __first, _Size __count, _ForwardIterator __result) { auto __res = std::__uninitialized_copy_n_pair (_GLIBCXX_MAKE_MOVE_ITERATOR(__first), __count, __result); return {__res.first.base(), __res.second}; } #endif +#if __cplusplus >= 201402L What depends on C++14 here? Just enable_if_t? Because we have __enable_if_t for use in C++11. Both GCC and Clang will allow constexpr-if and static_assert with no message in C++11. + template +inline void +__relocate_a(_Tp* __dest, _Up* __orig, _Allocator& __alloc) I find it a little surprising that this overload for single objects using the memmove argument ordering (dest, source) but the range overload below uses the STL ordering (source_begin, source_end, dest). But I wouldn't be surprised if we're already doing that somewhere that I've forgotten about. WOuld it make sense to either rename this overload, or to use consistent argument ordering for the two __relocate_a overloads? +noexcept(noexcept(__gnu_cxx::__alloc_traits<_Allocator>::construct(__alloc, Since this is C++14 (or maybe C++11) you could just use std::allocator_traits directly. __gnu_cxx::__alloc_traits is to provide equivalent functionality in C++98 code. +__dest, std::move(*__orig))) +&& noexcept(__gnu_cxx::__alloc_traits<_Allocator>::destroy( + __alloc, std::__addressof(*__orig +{ + typedef __gnu_cxx::__alloc_traits<_Allocator> __traits; + __traits::construct(__alloc, __dest, std::move(*__orig)); + __traits::destroy(__alloc, std::__addressof(*__orig)); +} + + template +struct __is_trivially_relocatable +: is_trivial<_Tp> { }; It might be worth adding a comment that this type might be specialized in future, so that I don't forget and simplify it to an alias template later :-) + template +inline std::enable_if_t::value, _Tp*> +__relocate_a_1(_Tp* __first, _Tp* __last, + _Tp* __result, allocator<_Up>& __alloc) +{ + ptrdiff_t __count = __last - __first; + __builtin_memmove(__result, __first, __count * sizeof(_Tp)); + return __result + __count; +} + + template +inline _ForwardIterator +__relocate_a_1(_InputIterator __first, _InputIterator __last, + _ForwardIterator __result, _Allocator& __alloc) +{ + typedef typename iterator_traits<_InputIterator>::value_type + _ValueType; + typedef typename iterator_traits<_ForwardIterator>::value_type + _ValueType2; + static_assert(std::is_same<_ValueType, _ValueType2>::value); + static_assert(noexcept(std::__relocate_a(std::addressof(*__result), + std::addressof(*__first), + __alloc))); + _ForwardIterator __cur = __result; + for (; __first != __last; ++__first, (void)++__cur) + std::__relocate_a(std::__addressof(*__cur), + std::__addressof(*__first), __alloc); +
Re: Relocation (= move+destroy)
On Sat, 13 Oct 2018, Marc Glisse wrote: + template +struct __is_trivially_relocatable +: is_trivially_move_constructible<_Tp> { }; Oups, this part is wrong, sorry, it is supposed to be "is_trivial" instead of "is_trivially_move_constructible", to match what is done elsewhere in this file. Using is_trivially_move_constructible instead (like libc++) is a separate issue, and in the particular case of relocation would need to be combined with is_trivially_destructible. I'll retest with is_trivial, but I would be very surprised if that broke anything in the testsuite. -- Marc Glisse
Re: Relocation (= move+destroy)
On Sat, 13 Oct 2018, Jonathan Wakely wrote: On 13/10/18 11:07 +0200, Marc Glisse wrote: --- libstdc++-v3/include/bits/alloc_traits.h(revision 265131) +++ libstdc++-v3/include/bits/alloc_traits.h(working copy) @@ -233,38 +233,43 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION using type = decltype(__test<_Alloc>(0)); }; template using __has_construct = typename __construct_helper<_Tp, _Args...>::type; template static _Require<__has_construct<_Tp, _Args...>> _S_construct(_Alloc& __a, _Tp* __p, _Args&&... __args) + noexcept(noexcept(__a.construct(__p, std::forward<_Args>(__args)...))) You could use std::declval<_Args>()... instead of forwarding __args... which would be slightly shorter (and might keep some of the others on a single line). No need to change it unless you want to, either is OK. In order to reduce the risk of bugs, my preference is 1) noexcept(auto): I am amazed that no compiler has implemented it yet, even as a private extension spelled __noexcept_auto or whatever, since it reduces significantly the burden on standard library developers. As a recent example, PR 87538 would never have had a chance to exist. 2) an exact copy-paste of the body of the function. 3) something else only if needed, for instance because 2) misses the conversion to the result type. Of course, since there is more chance *you* will have to maintain that mess, whatever style you find easier to maintain is more important. -- Marc Glisse
Re: Relocation (= move+destroy)
On 13/10/18 11:07 +0200, Marc Glisse wrote: --- libstdc++-v3/include/bits/alloc_traits.h(revision 265131) +++ libstdc++-v3/include/bits/alloc_traits.h(working copy) @@ -233,38 +233,43 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION using type = decltype(__test<_Alloc>(0)); }; template using __has_construct = typename __construct_helper<_Tp, _Args...>::type; template static _Require<__has_construct<_Tp, _Args...>> _S_construct(_Alloc& __a, _Tp* __p, _Args&&... __args) + noexcept(noexcept(__a.construct(__p, std::forward<_Args>(__args)...))) You could use std::declval<_Args>()... instead of forwarding __args... which would be slightly shorter (and might keep some of the others on a single line). No need to change it unless you want to, either is OK. I'll finish reviewing the rest on Monday.
Re: Relocation (= move+destroy)
On Sun, 2 Sep 2018, Jonathan Wakely wrote: On 01/09/18 21:56 +0200, Marc Glisse wrote: On Sat, 1 Sep 2018, Marc Glisse wrote: this patch passed bootstrap+regtest on powerpc64le-unknown-linux-gnu. I realized afterwards that for a C++17-only feature, that's not testing much... So I changed it to apply in C++14 and fixed a minor issue. There is now a single regression: 23_containers/vector/modifiers/push_back/49836.cc The PR was about not using assignment for an operation that should only use construction, and that's fine. But we ended up with a stricter testcase using CopyConsOnlyType, where the type has a deleted move constructor which, as far as I understand the standard, makes it an invalid type for use in vector::push_back. Is that something we want to keep supporting, or may I break it? What is happening is that I think you can break it. I'll look back over the history of the test case, but I don't think supporting deleted moves is intended. Here is a version where I adapt the test. Bootstrap+testsuite on gcc112. 2018-10-15 Marc Glisse PR libstdc++/87106 * include/bits/alloc_traits.h (_S_construct, _S_destroy, construct, destroy): Add noexcept specification. * include/bits/allocator.h (construct, destroy): Likewise. * include/ext/alloc_traits.h (construct, destroy): Likewise. * include/ext/malloc_allocator.h (construct, destroy): Likewise. * include/ext/new_allocator.h (construct, destroy): Likewise. * include/bits/stl_uninitialized.h (__relocate, __relocate_a, __relocate_a_1): New functions. (__is_trivially_relocatable): New class. * include/bits/stl_vector.h (__use_relocate): New static member. * include/bits/vector.tcc (reserve, _M_realloc_insert, _M_default_append): Use __relocate_a. (reserve, _M_assign_aux, _M_realloc_insert, _M_fill_insert, _M_default_append, _M_range_insert): Move _GLIBCXX_ASAN_ANNOTATE_REINIT after _Destroy. * testsuite/23_containers/vector/modifiers/push_back/49836.cc: Replace CopyConsOnlyType with DelAnyAssign. -- Marc GlisseIndex: libstdc++-v3/include/bits/alloc_traits.h === --- libstdc++-v3/include/bits/alloc_traits.h (revision 265131) +++ libstdc++-v3/include/bits/alloc_traits.h (working copy) @@ -233,38 +233,43 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION using type = decltype(__test<_Alloc>(0)); }; template using __has_construct = typename __construct_helper<_Tp, _Args...>::type; template static _Require<__has_construct<_Tp, _Args...>> _S_construct(_Alloc& __a, _Tp* __p, _Args&&... __args) + noexcept(noexcept(__a.construct(__p, std::forward<_Args>(__args)...))) { __a.construct(__p, std::forward<_Args>(__args)...); } template static _Require<__and_<__not_<__has_construct<_Tp, _Args...>>, is_constructible<_Tp, _Args...>>> _S_construct(_Alloc&, _Tp* __p, _Args&&... __args) + noexcept(noexcept(::new((void*)__p) + _Tp(std::forward<_Args>(__args)...))) { ::new((void*)__p) _Tp(std::forward<_Args>(__args)...); } template static auto _S_destroy(_Alloc2& __a, _Tp* __p, int) + noexcept(noexcept(__a.destroy(__p))) -> decltype(__a.destroy(__p)) { __a.destroy(__p); } template static void _S_destroy(_Alloc2&, _Tp* __p, ...) + noexcept(noexcept(__p->~_Tp())) { __p->~_Tp(); } template static auto _S_max_size(_Alloc2& __a, int) -> decltype(__a.max_size()) { return __a.max_size(); } template static size_type @@ -333,33 +338,36 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * @param __p Pointer to memory of suitable size and alignment for Tp * @param __args Constructor arguments. * * Calls __a.construct(__p, std::forward(__args)...) * if that expression is well-formed, otherwise uses placement-new * to construct an object of type @a _Tp at location @a __p from the * arguments @a __args... */ template static auto construct(_Alloc& __a, _Tp* __p, _Args&&... __args) + noexcept(noexcept(_S_construct(__a, __p, + std::forward<_Args>(__args)...))) -> decltype(_S_construct(__a, __p, std::forward<_Args>(__args)...)) { _S_construct(__a, __p, std::forward<_Args>(__args)...); } /** * @brief Destroy an object of type @a _Tp * @param __a An allocator. * @param __p Pointer to the object to destroy * * Calls @c __a.destroy(__p) if that expression is well-formed, * otherwise calls @c __p->~_Tp() */ template static void destroy(_Alloc& __a, _Tp* __p) + noexcept(noexcept(_S_destroy(__a, __p, 0))) { _S_destroy(__a, __p, 0); } /** * @brief The maximum supported allocation size * @param __a An allocator. * @return @c __a.max_size() or @c numeric_limits::max(
Re: Relocation (= move+destroy)
On 01/09/18 17:00 +0200, Marc Glisse wrote: _GLIBCXX_ASAN_ANNOTATE_REINIT: I am not familiar with those annotations. It was convenient in one function to move this annotation after _Destroy, to reduce code duplication. For consistency, I did the same in the whole file. As far as I understand, the macro makes it ok to access memory between _M_finish and _M_end_of_storage, and at the end of the block marks again the region after the new _M_finish as protected. Since _Destroy should stop at _M_finish, moving the macro looks safe. But maybe the position of the macro was chosen to reduce needless checking in ASAN? I don't remember if I had a specific reason for doing the REINIT before _Destroy, but your change looks fine. I'll keep reviewing the rest, but my first impression is that this is a nice optimisation, thanks for working on it.
Re: Relocation (= move+destroy)
On 2 September 2018 at 23:03, Jonathan Wakely wrote: > On 01/09/18 21:56 +0200, Marc Glisse wrote: >> >> On Sat, 1 Sep 2018, Marc Glisse wrote: >> >>> this patch passed bootstrap+regtest on powerpc64le-unknown-linux-gnu. >> >> >> I realized afterwards that for a C++17-only feature, that's not testing >> much... So I changed it to apply in C++14 and fixed a minor issue. There is >> now a single regression: >> >> 23_containers/vector/modifiers/push_back/49836.cc >> >> The PR was about not using assignment for an operation that should only >> use construction, and that's fine. But we ended up with a stricter testcase >> using CopyConsOnlyType, where the type has a deleted move constructor which, >> as far as I understand the standard, makes it an invalid type for use in >> vector::push_back. Is that something we want to keep supporting, or may I >> break it? What is happening is that > > > I think you can break it. I'll look back over the history of the test > case, but I don't think supporting deleted moves is intended. We have supported those occasionally; I did so in std::any, but the standard has explicitly been moving towards a direction where deleted moves (if corresponding copies are not deleted) are not a supported thing, so I concur with the suggestion of breaking such tests being okay.
Re: Relocation (= move+destroy)
On 01/09/18 21:56 +0200, Marc Glisse wrote: On Sat, 1 Sep 2018, Marc Glisse wrote: this patch passed bootstrap+regtest on powerpc64le-unknown-linux-gnu. I realized afterwards that for a C++17-only feature, that's not testing much... So I changed it to apply in C++14 and fixed a minor issue. There is now a single regression: 23_containers/vector/modifiers/push_back/49836.cc The PR was about not using assignment for an operation that should only use construction, and that's fine. But we ended up with a stricter testcase using CopyConsOnlyType, where the type has a deleted move constructor which, as far as I understand the standard, makes it an invalid type for use in vector::push_back. Is that something we want to keep supporting, or may I break it? What is happening is that I think you can break it. I'll look back over the history of the test case, but I don't think supporting deleted moves is intended. the definition of __use_relocate asks if some expression involving a move of _Tp is noexcept, which causes a hard error. It would certainly be possible to work around that, but it would complicate the code and seems quite pointless to me. Agreed.
Re: Relocation (= move+destroy)
On Sat, 1 Sep 2018, Marc Glisse wrote: this patch passed bootstrap+regtest on powerpc64le-unknown-linux-gnu. I realized afterwards that for a C++17-only feature, that's not testing much... So I changed it to apply in C++14 and fixed a minor issue. There is now a single regression: 23_containers/vector/modifiers/push_back/49836.cc The PR was about not using assignment for an operation that should only use construction, and that's fine. But we ended up with a stricter testcase using CopyConsOnlyType, where the type has a deleted move constructor which, as far as I understand the standard, makes it an invalid type for use in vector::push_back. Is that something we want to keep supporting, or may I break it? What is happening is that the definition of __use_relocate asks if some expression involving a move of _Tp is noexcept, which causes a hard error. It would certainly be possible to work around that, but it would complicate the code and seems quite pointless to me. -- Marc GlisseIndex: include/bits/alloc_traits.h === --- include/bits/alloc_traits.h (revision 264027) +++ include/bits/alloc_traits.h (working copy) @@ -233,38 +233,43 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION using type = decltype(__test<_Alloc>(0)); }; template using __has_construct = typename __construct_helper<_Tp, _Args...>::type; template static _Require<__has_construct<_Tp, _Args...>> _S_construct(_Alloc& __a, _Tp* __p, _Args&&... __args) + noexcept(noexcept(__a.construct(__p, std::forward<_Args>(__args)...))) { __a.construct(__p, std::forward<_Args>(__args)...); } template static _Require<__and_<__not_<__has_construct<_Tp, _Args...>>, is_constructible<_Tp, _Args...>>> _S_construct(_Alloc&, _Tp* __p, _Args&&... __args) + noexcept(noexcept(::new((void*)__p) + _Tp(std::forward<_Args>(__args)...))) { ::new((void*)__p) _Tp(std::forward<_Args>(__args)...); } template static auto _S_destroy(_Alloc2& __a, _Tp* __p, int) + noexcept(noexcept(__a.destroy(__p))) -> decltype(__a.destroy(__p)) { __a.destroy(__p); } template static void _S_destroy(_Alloc2&, _Tp* __p, ...) + noexcept(noexcept(__p->~_Tp())) { __p->~_Tp(); } template static auto _S_max_size(_Alloc2& __a, int) -> decltype(__a.max_size()) { return __a.max_size(); } template static size_type @@ -333,33 +338,36 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * @param __p Pointer to memory of suitable size and alignment for Tp * @param __args Constructor arguments. * * Calls __a.construct(__p, std::forward(__args)...) * if that expression is well-formed, otherwise uses placement-new * to construct an object of type @a _Tp at location @a __p from the * arguments @a __args... */ template static auto construct(_Alloc& __a, _Tp* __p, _Args&&... __args) + noexcept(noexcept(_S_construct(__a, __p, + std::forward<_Args>(__args)...))) -> decltype(_S_construct(__a, __p, std::forward<_Args>(__args)...)) { _S_construct(__a, __p, std::forward<_Args>(__args)...); } /** * @brief Destroy an object of type @a _Tp * @param __a An allocator. * @param __p Pointer to the object to destroy * * Calls @c __a.destroy(__p) if that expression is well-formed, * otherwise calls @c __p->~_Tp() */ template static void destroy(_Alloc& __a, _Tp* __p) + noexcept(noexcept(_S_destroy(__a, __p, 0))) { _S_destroy(__a, __p, 0); } /** * @brief The maximum supported allocation size * @param __a An allocator. * @return @c __a.max_size() or @c numeric_limits::max() * * Returns @c __a.max_size() if that expression is well-formed, * otherwise returns @c numeric_limits::max() */ @@ -465,32 +473,34 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * @brief Construct an object of type @a _Up * @param __a An allocator. * @param __p Pointer to memory of suitable size and alignment for Tp * @param __args Constructor arguments. * * Calls __a.construct(__p, std::forward(__args)...) */ template static void construct(allocator_type& __a, _Up* __p, _Args&&... __args) + noexcept(noexcept(__a.construct(__p, std::forward<_Args>(__args)...))) { __a.construct(__p, std::forward<_Args>(__args)...); } /** * @brief Destroy an object of type @a _Up * @param __a An allocator. * @param __p Pointer to the object to destroy * * Calls @c __a.destroy(__p). */ template static void destroy(allocator_type& __a, _Up* __p) + noexcept(noexcept(__a.destroy(__p))) { __a.destroy(__p); } /** * @brief The maximum supported allocation size * @param __a An allocato
Re: Relocation (= move+destroy)
I forgot to attach the "diff -w" version of the patch, which may be a bit more readable. -- Marc GlisseIndex: include/bits/alloc_traits.h === --- include/bits/alloc_traits.h (revision 264027) +++ include/bits/alloc_traits.h (working copy) @@ -233,38 +233,43 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION using type = decltype(__test<_Alloc>(0)); }; template using __has_construct = typename __construct_helper<_Tp, _Args...>::type; template static _Require<__has_construct<_Tp, _Args...>> _S_construct(_Alloc& __a, _Tp* __p, _Args&&... __args) + noexcept(noexcept(__a.construct(__p, std::forward<_Args>(__args)...))) { __a.construct(__p, std::forward<_Args>(__args)...); } template static _Require<__and_<__not_<__has_construct<_Tp, _Args...>>, is_constructible<_Tp, _Args...>>> _S_construct(_Alloc&, _Tp* __p, _Args&&... __args) + noexcept(noexcept(::new((void*)__p) + _Tp(std::forward<_Args>(__args)...))) { ::new((void*)__p) _Tp(std::forward<_Args>(__args)...); } template static auto _S_destroy(_Alloc2& __a, _Tp* __p, int) + noexcept(noexcept(__a.destroy(__p))) -> decltype(__a.destroy(__p)) { __a.destroy(__p); } template static void _S_destroy(_Alloc2&, _Tp* __p, ...) + noexcept(noexcept(__p->~_Tp())) { __p->~_Tp(); } template static auto _S_max_size(_Alloc2& __a, int) -> decltype(__a.max_size()) { return __a.max_size(); } template static size_type @@ -333,33 +338,36 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * @param __p Pointer to memory of suitable size and alignment for Tp * @param __args Constructor arguments. * * Calls __a.construct(__p, std::forward(__args)...) * if that expression is well-formed, otherwise uses placement-new * to construct an object of type @a _Tp at location @a __p from the * arguments @a __args... */ template static auto construct(_Alloc& __a, _Tp* __p, _Args&&... __args) + noexcept(noexcept(_S_construct(__a, __p, + std::forward<_Args>(__args)...))) -> decltype(_S_construct(__a, __p, std::forward<_Args>(__args)...)) { _S_construct(__a, __p, std::forward<_Args>(__args)...); } /** * @brief Destroy an object of type @a _Tp * @param __a An allocator. * @param __p Pointer to the object to destroy * * Calls @c __a.destroy(__p) if that expression is well-formed, * otherwise calls @c __p->~_Tp() */ template static void destroy(_Alloc& __a, _Tp* __p) + noexcept(noexcept(_S_destroy(__a, __p, 0))) { _S_destroy(__a, __p, 0); } /** * @brief The maximum supported allocation size * @param __a An allocator. * @return @c __a.max_size() or @c numeric_limits::max() * * Returns @c __a.max_size() if that expression is well-formed, * otherwise returns @c numeric_limits::max() */ @@ -465,32 +473,34 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * @brief Construct an object of type @a _Up * @param __a An allocator. * @param __p Pointer to memory of suitable size and alignment for Tp * @param __args Constructor arguments. * * Calls __a.construct(__p, std::forward(__args)...) */ template static void construct(allocator_type& __a, _Up* __p, _Args&&... __args) + noexcept(noexcept(__a.construct(__p, std::forward<_Args>(__args)...))) { __a.construct(__p, std::forward<_Args>(__args)...); } /** * @brief Destroy an object of type @a _Up * @param __a An allocator. * @param __p Pointer to the object to destroy * * Calls @c __a.destroy(__p). */ template static void destroy(allocator_type& __a, _Up* __p) + noexcept(noexcept(__a.destroy(__p))) { __a.destroy(__p); } /** * @brief The maximum supported allocation size * @param __a An allocator. * @return @c __a.max_size() */ static size_type max_size(const allocator_type& __a) noexcept { return __a.max_size(); } Index: include/bits/allocator.h === --- include/bits/allocator.h (revision 264027) +++ include/bits/allocator.h (working copy) @@ -81,25 +81,29 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #if __cplusplus >= 201103L // _GLIBCXX_RESOLVE_LIB_DEFECTS // 2103. std::allocator propagate_on_container_move_assignment typedef true_type propagate_on_container_move_assignment; typedef true_type is_always_equal; template void construct(_Up* __p, _Args&&... __args) + noexcept(noexcept(::new((void *)__p) + _Up(std::forward<_Args>(__args)...))) { ::new((void *)__p) _Up(std::forward<_Args>(__args)...); } template void - destroy(_Up* __p)
Relocation (= move+destroy)
Hello, this patch passed bootstrap+regtest on powerpc64le-unknown-linux-gnu. The main idea is manually performing loop fusion when we see 2 consecutive loops where the first moves data from A to B, and the second destroys the same elements in A. This is beneficial because there is one loop fewer (as usual with loop fusion), but also because the move constructor and the destructor can combine relatively well for several types. I performed some simple tests on std::vector, and for a loop that just emplace_back many empty strings, I see a performance gain close to 20%. With not-small strings, I am still seeing a noticable gain (maybe 10%? The noise makes it hard to be precise). I had to add a special case for trivial types, using memmove, to avoid perf regressions, since relocation takes precedence over the old path that is specialized to call memmove. _GLIBCXX_ASAN_ANNOTATE_REINIT: I am not familiar with those annotations. It was convenient in one function to move this annotation after _Destroy, to reduce code duplication. For consistency, I did the same in the whole file. As far as I understand, the macro makes it ok to access memory between _M_finish and _M_end_of_storage, and at the end of the block marks again the region after the new _M_finish as protected. Since _Destroy should stop at _M_finish, moving the macro looks safe. But maybe the position of the macro was chosen to reduce needless checking in ASAN? It might be possible to introduce some helpers that do relocate if noexcept and copy otherwise, but it is much less convenient than for move_if_noexcept, because they don't want to execute the destructors at the same point, so we might also want a destroy_if_no_relocate to go with it... The exact form of the relocate functions is whatever I had when things started working, it is probably not that important as long as we don't document them. I had a _n version taking a size, but I ended up not using it, so I removed it. The change is limited to C++17+, because I felt like using if constexpr. Actually, I think g++ accepts if constexpr in C++11 with a warning (ok in a system header), I don't remember if I ended up using any other recent features... Possible future stuff: * use relocation in more places: there should be 1 or 2 places left in vector, deque may also be a good candidate, I didn't look elsewhere. * specialize relocation for some types (maybe deque?) where it can be noexcept, possibly even trivial, whereas the move constructor cannot. If we do that, we may want to specialize for pair/tuple/array as well, in case one of the members is specialized. 2018-09-01 Marc Glisse PR libstdc++/87106 * include/bits/alloc_traits.h (_S_construct, _S_destroy, construct, destroy): Add noexcept specification. * include/bits/allocator.h (construct, destroy): Likewise. * include/ext/alloc_traits.h (construct, destroy): Likewise. * include/ext/malloc_allocator.h (construct, destroy): Likewise. * include/ext/new_allocator.h (construct, destroy): Likewise. * include/bits/stl_uninitialized.h (__relocate, __relocate_a, __relocate_a_1): New functions. (__is_trivially_relocatable): New class. * include/bits/stl_vector.h (__use_relocate): New static member. * include/bits/vector.tcc (reserve, _M_realloc_insert, _M_default_append): Use __relocate_a. (reserve, _M_assign_aux, _M_realloc_insert, _M_fill_insert, _M_default_append, _M_range_insert): Move _GLIBCXX_ASAN_ANNOTATE_REINIT after _Destroy. -- Marc GlisseIndex: include/bits/alloc_traits.h === --- include/bits/alloc_traits.h (revision 264027) +++ include/bits/alloc_traits.h (working copy) @@ -233,38 +233,43 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION using type = decltype(__test<_Alloc>(0)); }; template using __has_construct = typename __construct_helper<_Tp, _Args...>::type; template static _Require<__has_construct<_Tp, _Args...>> _S_construct(_Alloc& __a, _Tp* __p, _Args&&... __args) + noexcept(noexcept(__a.construct(__p, std::forward<_Args>(__args)...))) { __a.construct(__p, std::forward<_Args>(__args)...); } template static _Require<__and_<__not_<__has_construct<_Tp, _Args...>>, is_constructible<_Tp, _Args...>>> _S_construct(_Alloc&, _Tp* __p, _Args&&... __args) + noexcept(noexcept(::new((void*)__p) + _Tp(std::forward<_Args>(__args)...))) { ::new((void*)__p) _Tp(std::forward<_Args>(__args)...); } template static auto _S_destroy(_Alloc2& __a, _Tp* __p, int) + noexcept(noexcept(__a.destroy(__p))) -> decltype(__a.destroy(__p)) { __a.destroy(__p); } template static void _S_destroy(_Alloc2&, _Tp* __p, ...) + noexcept(noexcept(__p->~_Tp())) { __p->~_Tp(); } template static auto _S_max_size(_Alloc2& __a, int) -> decltyp