On Mon, 13 Oct 2025 at 10:08, Tomasz Kaminski <[email protected]> wrote:

>
>
> On Sat, Oct 11, 2025 at 3:33 PM Jonathan Wakely <[email protected]>
> wrote:
>
>> We need to ensure that the memory allocated for a path::_List::_Impl is
>> at least 4-byte aligned, so that we can use the two least significant
>> bits to store a _Type value. Use __STDCPP_DEFAULT_NEW_ALIGNMENT__ to
>> detect whether we need to use aligned new/delete to get memory with
>> unused low bits.
>>
>> Allocation and deallocation of path::_List::_Impl objects is refactored
>> into a new _Impl::create function so that the memory allocation is done
>> in one place and can ensure the required alignment.
>>
> If I understand correctly we do not have a target
> where  __STDCPP_DEFAULT_NEW_ALIGNMENT__ is
> smaller than 4. If so, wouldn't static_assert be better instead of
> effectively dead code?
>


I think it's dead code in all targets supported by upstream GCC. It's
possible there are out-of-tree ports for other microprocessors
where __STDCPP_DEFAULT_NEW_ALIGNMENT__ < 4, and if we use a static_assert
they would need to do additional work in their port. But we can add a
comment to the static_assert, and maybe deal with that problem later if it
ever happens (YAGNI).



>
>> libstdc++-v3/ChangeLog:
>>
>>         PR libstdc++/122255
>>         * src/c++17/fs_path.cc (path::_List::_Impl::copy): Use create
>>         function.
>>         (path::_List::_Impl::required_alignment): New static data
>>         member.
>>         (path::_List::_Impl::use_aligned_new): New static data member.
>>         (path::_List::_Impl::create): New static member function.
>>         (path::_List::_Impl_deleter::operator()): Use aligned delete if
>>         needed.
>> ---
>>
>> Tested x86_64-linux.
>>
>
>
>>
>>  libstdc++-v3/src/c++17/fs_path.cc | 80 ++++++++++++++++++++++---------
>>  1 file changed, 58 insertions(+), 22 deletions(-)
>>
>> diff --git a/libstdc++-v3/src/c++17/fs_path.cc
>> b/libstdc++-v3/src/c++17/fs_path.cc
>> index 215afa08ad25..aa47cfa2a655 100644
>> --- a/libstdc++-v3/src/c++17/fs_path.cc
>> +++ b/libstdc++-v3/src/c++17/fs_path.cc
>> @@ -34,6 +34,7 @@
>>  #include <filesystem>
>>  #include <algorithm>
>>  #include <array>
>> +#include <new>
>>  #include <bits/stl_uninitialized.h>
>>  #include <ext/numeric_traits.h> // __gnu_cxx::__int_traits
>>
>> @@ -207,6 +208,8 @@ struct path::_List::_Impl
>>
>>    _Impl(int cap) : _M_size(0), _M_capacity(cap) { }
>>
>> +  // Align the first member like the value_type so that we can store one
>> or
>> +  // more objects of that type immediately after the memory occupied by
>> *this.
>>    alignas(value_type) int _M_size;
>>    int _M_capacity;
>>
>> @@ -246,8 +249,7 @@ struct path::_List::_Impl
>>    unique_ptr<_Impl, _Impl_deleter> copy() const
>>    {
>>      const auto n = size();
>> -    void* p = ::operator new(sizeof(_Impl) + n * sizeof(value_type));
>> -    unique_ptr<_Impl, _Impl_deleter> newptr(::new (p) _Impl{n});
>> +    auto newptr = create(n, false); // Skip overflow checks, n is a good
>> size.
>>      std::uninitialized_copy_n(begin(), n, newptr->begin());
>>      newptr->_M_size = n;
>>      return newptr;
>> @@ -259,16 +261,67 @@ struct path::_List::_Impl
>>      constexpr uintptr_t mask = ~(uintptr_t)0x3;
>>      return reinterpret_cast<_Impl*>(reinterpret_cast<uintptr_t>(p) &
>> mask);
>>    }
>> +
>> +  // Require memory aligned to 4 bytes so we can store _Type in the low
>> bits.
>> +  static constexpr size_t required_alignment
>> +    = std::max(size_t(4), alignof(value_type));
>>
> This max seem superfluous here,  we use this variable only if default new
> alignment
> is lower than 4,
>

It occurred to me that it's theoretically possible that
alignof(_Impl::value_type) might also be higher than guaranteed by operator
new. We need the memory to be suitable for storing _Impl *and* to have two
free bits for tagging.

If we just did `new _Impl` then the compiler would do the same
alignof(_Impl) check and use aligned-new if needed, so this is replicating
that (we can't just do `new _Impl` because we need the additional memory
following it for the components).

Again, it's probably redundant for all targets supported in-tree by GCC ...
but then that means PR122255 is not a bug at all and this entire change is
unnecessary. The refactoring to add _Impl::create still seems useful
though, so I can adjust the patch to do that refactoring, and just add a
static assert to ensure we don't silently under-align anything.



> +
>> +  // Only use aligned new if needed, because it might be less efficient.
>> +  static constexpr bool use_aligned_new
>> +    = __STDCPP_DEFAULT_NEW_ALIGNMENT__ < required_alignment;
>> +
>> +  // Create a new _Impl with capacity for n components.
>> +  static unique_ptr<_Impl, _Impl_deleter>
>> +  create(int n, bool check_overflow = true)
>> +  {
>> +    if (check_overflow)
>> +      {
>> +       using __gnu_cxx::__int_traits;
>> +       // Nobody should need paths with this many components.
>> +       if (n >= __int_traits<int>::__max / 4)
>> +         std::__throw_bad_alloc();
>> +
>> +       if constexpr (__int_traits<int>::__max >=
>> __int_traits<size_t>::__max)
>> +         {
>> +           // Check that the calculation below won't overflow.
>> +           size_t bytes;
>> +           if (__builtin_mul_overflow(n, sizeof(value_type), &bytes)
>> +                 || __builtin_add_overflow(sizeof(_Impl), bytes, &bytes))
>> +             std::__throw_bad_alloc();
>> +         }
>> +       // Otherwise, it can't overflow, even for 20-bit size_t on msp430.
>> +      }
>> +
>> +    void* p;
>> +    if constexpr (use_aligned_new)
>> +      p = ::operator new(sizeof(_Impl) + n * sizeof(value_type),
>> +                        align_val_t{required_alignment});
>> +    else
>> +      p = ::operator new(sizeof(_Impl) + n * sizeof(value_type));
>> +
>> +    return std::unique_ptr<_Impl, _Impl_deleter>(::new(p) _Impl{n});
>> +  }
>>  };
>>
>> -void path::_List::_Impl_deleter::operator()(_Impl* p) const noexcept
>> +// Destroy and deallocate an _Impl.
>> +void
>> +path::_List::_Impl_deleter::operator()(_Impl* p) const noexcept
>>  {
>>    p = _Impl::notype(p);
>>    if (p)
>>      {
>>        __glibcxx_assert(p->_M_size <= p->_M_capacity);
>>        p->clear();
>> -      ::operator delete(p, sizeof(*p) + p->_M_capacity *
>> sizeof(value_type));
>> +
>> +      const auto n = p->_M_capacity;
>> +      // Destructor is trivial, so this isn't really necessary:
>> +      p->~_Impl();
>> +
>> +      if constexpr (_Impl::use_aligned_new)
>> +       ::operator delete(p, sizeof(_Impl) + n *
>> sizeof(_Impl::value_type),
>> +                         align_val_t{_Impl::required_alignment});
>> +      else
>> +       ::operator delete(p, sizeof(_Impl) + n *
>> sizeof(_Impl::value_type));
>>      }
>>  }
>>
>> @@ -455,24 +508,7 @@ path::_List::reserve(int newcap, bool exact = false)
>>             newcap = nextcap;
>>         }
>>
>> -      using __gnu_cxx::__int_traits;
>> -      // Nobody should need paths with this many components.
>> -      if (newcap >= __int_traits<int>::__max / 4)
>> -       std::__throw_bad_alloc();
>> -
>> -      size_t bytes;
>> -      if constexpr (__int_traits<int>::__max >=
>> __int_traits<size_t>::__max)
>> -       {
>> -         size_t components;
>> -         if (__builtin_mul_overflow(newcap, sizeof(value_type),
>> &components)
>> -               || __builtin_add_overflow(sizeof(_Impl), components,
>> &bytes))
>> -           std::__throw_bad_alloc();
>> -       }
>> -      else // This won't overflow, even for 20-bit size_t on msp430.
>> -       bytes = sizeof(_Impl) + newcap * sizeof(value_type);
>> -
>> -      void* p = ::operator new(bytes);
>> -      std::unique_ptr<_Impl, _Impl_deleter> newptr(::new(p)
>> _Impl{newcap});
>> +      auto newptr = _Impl::create(newcap);
>>        const int cursize = curptr ? curptr->size() : 0;
>>        if (cursize)
>>         {
>> --
>> 2.51.0
>>
>>

Reply via email to