On Mon, Oct 13, 2025 at 12:22 PM Jonathan Wakely <[email protected]> wrote:

>
>
> On Mon, 13 Oct 2025 at 10:08, Tomasz Kaminski <[email protected]> wrote:
>
>>
>>
>> On Sat, Oct 11, 2025 at 3:33 PM Jonathan Wakely <[email protected]>
>> wrote:
>>
>>> We need to ensure that the memory allocated for a path::_List::_Impl is
>>> at least 4-byte aligned, so that we can use the two least significant
>>> bits to store a _Type value. Use __STDCPP_DEFAULT_NEW_ALIGNMENT__ to
>>> detect whether we need to use aligned new/delete to get memory with
>>> unused low bits.
>>>
>>> Allocation and deallocation of path::_List::_Impl objects is refactored
>>> into a new _Impl::create function so that the memory allocation is done
>>> in one place and can ensure the required alignment.
>>>
>> If I understand correctly we do not have a target
>> where  __STDCPP_DEFAULT_NEW_ALIGNMENT__ is
>> smaller than 4. If so, wouldn't static_assert be better instead of
>> effectively dead code?
>>
>
>
> I think it's dead code in all targets supported by upstream GCC. It's
> possible there are out-of-tree ports for other microprocessors
> where __STDCPP_DEFAULT_NEW_ALIGNMENT__ < 4, and if we use a static_assert
> they would need to do additional work in their port. But we can add a
> comment to the static_assert, and maybe deal with that problem later if it
> ever happens (YAGNI).
>
As an alternative, I would be fine if we only had one code path, and always
used an aligned version of the new.
And we just add if (align <= __STDCPP_DEFAULT_NEW_ALIGNMENT__) return
::operator new(sz) in the align_val_t
operator new implementation.

>
>
>
>>
>>> libstdc++-v3/ChangeLog:
>>>
>>>         PR libstdc++/122255
>>>         * src/c++17/fs_path.cc (path::_List::_Impl::copy): Use create
>>>         function.
>>>         (path::_List::_Impl::required_alignment): New static data
>>>         member.
>>>         (path::_List::_Impl::use_aligned_new): New static data member.
>>>         (path::_List::_Impl::create): New static member function.
>>>         (path::_List::_Impl_deleter::operator()): Use aligned delete if
>>>         needed.
>>> ---
>>>
>>> Tested x86_64-linux.
>>>
>>
>>
>>>
>>>  libstdc++-v3/src/c++17/fs_path.cc | 80 ++++++++++++++++++++++---------
>>>  1 file changed, 58 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/libstdc++-v3/src/c++17/fs_path.cc
>>> b/libstdc++-v3/src/c++17/fs_path.cc
>>> index 215afa08ad25..aa47cfa2a655 100644
>>> --- a/libstdc++-v3/src/c++17/fs_path.cc
>>> +++ b/libstdc++-v3/src/c++17/fs_path.cc
>>> @@ -34,6 +34,7 @@
>>>  #include <filesystem>
>>>  #include <algorithm>
>>>  #include <array>
>>> +#include <new>
>>>  #include <bits/stl_uninitialized.h>
>>>  #include <ext/numeric_traits.h> // __gnu_cxx::__int_traits
>>>
>>> @@ -207,6 +208,8 @@ struct path::_List::_Impl
>>>
>>>    _Impl(int cap) : _M_size(0), _M_capacity(cap) { }
>>>
>>> +  // Align the first member like the value_type so that we can store
>>> one or
>>> +  // more objects of that type immediately after the memory occupied by
>>> *this.
>>>    alignas(value_type) int _M_size;
>>>    int _M_capacity;
>>>
>>> @@ -246,8 +249,7 @@ struct path::_List::_Impl
>>>    unique_ptr<_Impl, _Impl_deleter> copy() const
>>>    {
>>>      const auto n = size();
>>> -    void* p = ::operator new(sizeof(_Impl) + n * sizeof(value_type));
>>> -    unique_ptr<_Impl, _Impl_deleter> newptr(::new (p) _Impl{n});
>>> +    auto newptr = create(n, false); // Skip overflow checks, n is a
>>> good size.
>>>      std::uninitialized_copy_n(begin(), n, newptr->begin());
>>>      newptr->_M_size = n;
>>>      return newptr;
>>> @@ -259,16 +261,67 @@ struct path::_List::_Impl
>>>      constexpr uintptr_t mask = ~(uintptr_t)0x3;
>>>      return reinterpret_cast<_Impl*>(reinterpret_cast<uintptr_t>(p) &
>>> mask);
>>>    }
>>> +
>>> +  // Require memory aligned to 4 bytes so we can store _Type in the low
>>> bits.
>>> +  static constexpr size_t required_alignment
>>> +    = std::max(size_t(4), alignof(value_type));
>>>
>> This max seem superfluous here,  we use this variable only if default new
>> alignment
>> is lower than 4,
>>
>
> It occurred to me that it's theoretically possible that
> alignof(_Impl::value_type) might also be higher than guaranteed by operator
> new. We need the memory to be suitable for storing _Impl *and* to have two
> free bits for tagging.
>
> If we just did `new _Impl` then the compiler would do the same
> alignof(_Impl) check and use aligned-new if needed, so this is replicating
> that (we can't just do `new _Impl` because we need the additional memory
> following it for the components).
>
> Again, it's probably redundant for all targets supported in-tree by GCC
> ... but then that means PR122255 is not a bug at all and this entire change
> is unnecessary. The refactoring to add _Impl::create still seems useful
> though, so I can adjust the patch to do that refactoring, and just add a
> static assert to ensure we don't silently under-align anything.
>
>
>
>> +
>>> +  // Only use aligned new if needed, because it might be less efficient.
>>> +  static constexpr bool use_aligned_new
>>> +    = __STDCPP_DEFAULT_NEW_ALIGNMENT__ < required_alignment;
>>> +
>>> +  // Create a new _Impl with capacity for n components.
>>> +  static unique_ptr<_Impl, _Impl_deleter>
>>> +  create(int n, bool check_overflow = true)
>>> +  {
>>> +    if (check_overflow)
>>> +      {
>>> +       using __gnu_cxx::__int_traits;
>>> +       // Nobody should need paths with this many components.
>>> +       if (n >= __int_traits<int>::__max / 4)
>>> +         std::__throw_bad_alloc();
>>> +
>>> +       if constexpr (__int_traits<int>::__max >=
>>> __int_traits<size_t>::__max)
>>> +         {
>>> +           // Check that the calculation below won't overflow.
>>> +           size_t bytes;
>>> +           if (__builtin_mul_overflow(n, sizeof(value_type), &bytes)
>>> +                 || __builtin_add_overflow(sizeof(_Impl), bytes,
>>> &bytes))
>>> +             std::__throw_bad_alloc();
>>> +         }
>>> +       // Otherwise, it can't overflow, even for 20-bit size_t on
>>> msp430.
>>> +      }
>>> +
>>> +    void* p;
>>> +    if constexpr (use_aligned_new)
>>> +      p = ::operator new(sizeof(_Impl) + n * sizeof(value_type),
>>> +                        align_val_t{required_alignment});
>>> +    else
>>> +      p = ::operator new(sizeof(_Impl) + n * sizeof(value_type));
>>> +
>>> +    return std::unique_ptr<_Impl, _Impl_deleter>(::new(p) _Impl{n});
>>> +  }
>>>  };
>>>
>>> -void path::_List::_Impl_deleter::operator()(_Impl* p) const noexcept
>>> +// Destroy and deallocate an _Impl.
>>> +void
>>> +path::_List::_Impl_deleter::operator()(_Impl* p) const noexcept
>>>  {
>>>    p = _Impl::notype(p);
>>>    if (p)
>>>      {
>>>        __glibcxx_assert(p->_M_size <= p->_M_capacity);
>>>        p->clear();
>>> -      ::operator delete(p, sizeof(*p) + p->_M_capacity *
>>> sizeof(value_type));
>>> +
>>> +      const auto n = p->_M_capacity;
>>> +      // Destructor is trivial, so this isn't really necessary:
>>> +      p->~_Impl();
>>> +
>>> +      if constexpr (_Impl::use_aligned_new)
>>> +       ::operator delete(p, sizeof(_Impl) + n *
>>> sizeof(_Impl::value_type),
>>> +                         align_val_t{_Impl::required_alignment});
>>> +      else
>>> +       ::operator delete(p, sizeof(_Impl) + n *
>>> sizeof(_Impl::value_type));
>>>      }
>>>  }
>>>
>>> @@ -455,24 +508,7 @@ path::_List::reserve(int newcap, bool exact = false)
>>>             newcap = nextcap;
>>>         }
>>>
>>> -      using __gnu_cxx::__int_traits;
>>> -      // Nobody should need paths with this many components.
>>> -      if (newcap >= __int_traits<int>::__max / 4)
>>> -       std::__throw_bad_alloc();
>>> -
>>> -      size_t bytes;
>>> -      if constexpr (__int_traits<int>::__max >=
>>> __int_traits<size_t>::__max)
>>> -       {
>>> -         size_t components;
>>> -         if (__builtin_mul_overflow(newcap, sizeof(value_type),
>>> &components)
>>> -               || __builtin_add_overflow(sizeof(_Impl), components,
>>> &bytes))
>>> -           std::__throw_bad_alloc();
>>> -       }
>>> -      else // This won't overflow, even for 20-bit size_t on msp430.
>>> -       bytes = sizeof(_Impl) + newcap * sizeof(value_type);
>>> -
>>> -      void* p = ::operator new(bytes);
>>> -      std::unique_ptr<_Impl, _Impl_deleter> newptr(::new(p)
>>> _Impl{newcap});
>>> +      auto newptr = _Impl::create(newcap);
>>>        const int cursize = curptr ? curptr->size() : 0;
>>>        if (cursize)
>>>         {
>>> --
>>> 2.51.0
>>>
>>>

Reply via email to