[Bug tree-optimization/109353] FAIL: 23_containers/vector/bool/allocator/copy.cc (test for excess errors)

2023-10-04 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109353

Jonathan Wakely  changed:

   What|Removed |Added

 CC||seurer at gcc dot gnu.org

--- Comment #6 from Jonathan Wakely  ---
*** Bug 111686 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/109353] FAIL: 23_containers/vector/bool/allocator/copy.cc (test for excess errors)

2023-03-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109353

--- Comment #5 from Richard Biener  ---
(In reply to Jonathan Wakely from comment #4)
> This doesn't help:
> 
> --- a/libstdc++-v3/include/bits/vector.tcc
> +++ b/libstdc++-v3/include/bits/vector.tcc
> @@ -936,15 +936,25 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>   *__position = __x;
>   ++this->_M_impl._M_finish;
> }
> +  else if (!this->_M_impl._M_start._M_p)

yes, we can't CSE this load from the = nullptr assignment so we'd still
diagnose the else case ...

[Bug tree-optimization/109353] FAIL: 23_containers/vector/bool/allocator/copy.cc (test for excess errors)

2023-03-31 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109353

--- Comment #4 from Jonathan Wakely  ---
This doesn't help:

--- a/libstdc++-v3/include/bits/vector.tcc
+++ b/libstdc++-v3/include/bits/vector.tcc
@@ -936,15 +936,25 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  *__position = __x;
  ++this->_M_impl._M_finish;
}
+  else if (!this->_M_impl._M_start._M_p)
+   {
+ _Bit_pointer __q = this->_M_allocate(1);
+ iterator __i(std::__addressof(*__q), 0);
+ this->_M_impl._M_end_of_storage = __q + 1;
+ this->_M_impl._M_start = __i;
+ *__i = __x;
+ this->_M_impl._M_finish = ++__i;
+   }
   else
{
  const size_type __len =
_M_check_len(size_type(1), "vector::_M_insert_aux");
+ const iterator __begin = begin(), __end = end();
  _Bit_pointer __q = this->_M_allocate(__len);
  iterator __start(std::__addressof(*__q), 0);
- iterator __i = _M_copy_aligned(begin(), __position, __start);
+ iterator __i = _M_copy_aligned(__begin, __position, __start);
  *__i++ = __x;
- iterator __finish = std::copy(__position, end(), __i);
+ iterator __finish = std::copy(__position, __end, __i);
  this->_M_deallocate();
  this->_M_impl._M_end_of_storage = __q + _S_nword(__len);
  this->_M_impl._M_start = __start;

[Bug tree-optimization/109353] FAIL: 23_containers/vector/bool/allocator/copy.cc (test for excess errors)

2023-03-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109353

--- Comment #3 from Richard Biener  ---
So if it's possible to refactor things as to key the copy size dispatching on
the destination size (that seems to be known here) rather than the source
that might help (unless we then diagnose overread from the source in other
cases ...)

[Bug tree-optimization/109353] FAIL: 23_containers/vector/bool/allocator/copy.cc (test for excess errors)

2023-03-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109353

--- Comment #2 from Richard Biener  ---
So it seems to be

_M_insert_aux(iterator __position, bool __x)
{
  if (this->_M_impl._M_finish._M_p != this->_M_impl._M_end_addr())
{
  std::copy_backward(__position, this->_M_impl._M_finish,
 this->_M_impl._M_finish + 1);
  *__position = __x;
  ++this->_M_impl._M_finish;
}
  else
{
  const size_type __len =
_M_check_len(size_type(1), "vector::_M_insert_aux");
  _Bit_pointer __q = this->_M_allocate(__len);
  iterator __start(std::__addressof(*__q), 0);
  iterator __i = _M_copy_aligned(begin(), __position, __start); // <--
  *__i++ = __x;

where the compiler fails to optimize the case of no existing allocation.
Since that's from the push_back () itself no caching of the null start
in libstdc++ is possible.

Disabling early threading avoids the diagnostic because we then inline
very differently, not exposing most of the above.

[Bug tree-optimization/109353] FAIL: 23_containers/vector/bool/allocator/copy.cc (test for excess errors)

2023-03-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109353

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Keywords||missed-optimization
 CC||jwakely.gcc at gmail dot com
   Last reconfirmed||2023-03-31
 Status|UNCONFIRMED |NEW

--- Comment #1 from Richard Biener  ---
 [local count: 86938296]:
D.70713 ={v} {CLOBBER(eol)};
_74 = v1.D.60930._M_impl.D.60395._M_start.D.16824._M_p;
_638 = (long int) _74;
_261 = -_638;
_383 = (long unsigned int) _261;
if (_638 < -8)
  goto ; [90.00%]
else
  goto ; [10.00%]

 [local count: 78244465]:
__builtin_memmove (_216, _74, _383);

so _383 is > 8, the reported access size looks good here.  We determine
the destination size as 8 which is also good since

_216 = operator new (8);

so the diagnostic seems legit.  The alternative code taken isn't very much
better:

 [local count: 8693830]:
if (_74 == -8B)
  goto ; [34.00%]
else
  goto ; [66.00%]

 [local count: 2955901]:
_266 = MEM[(long unsigned int *)-8B];
*_216 = _266;

 [local count: 86938296]:
_268 = _216 + _383;
_324 = MEM[(_Bit_type *)_268];
_327 = _324 & 18446744073709551614;

so definitely bb13 looks to be better unreachable as well.

I have difficulties in tracing libstdc++ back to the ultimate caller but
the above control flow is from stl_algobase.h:435 which is

  template
struct __copy_move<_IsMove, true, random_access_iterator_tag>
{
  template
_GLIBCXX20_CONSTEXPR
static _Up* 
__copy_m(_Tp* __first, _Tp* __last, _Up* __result) 
{
  const ptrdiff_t _Num = __last - __first;
  if (__builtin_expect(_Num > 1, true)) 
__builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
  else if (_Num == 1) 
std::__copy_move<_IsMove, false, random_access_iterator_tag>::
  __assign_one(__result, __first);
  return __result + _Num; 
}
};

but the assign-one seems to be odd.  So I suppose this all happens before

   v1.push_back(T());

and thus _Num == 0.

v1.D.60930._M_impl.D.60395._M_start.D.16824._M_p is probably NULL at this
point but 'v1' escapes to _M_deallocate so the intervening
operator new calls and an __atomic_load_1 / __cxa_guard_acquire prevent
CSE of this value.

It might be possible (again) to CSE this manually in libstdc++ to help
code generation.