https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110945

            Bug ID: 110945
           Summary: std::basic_string::assign dramatically slower than
                    other means of copying memory
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: janschultke at googlemail dot com
  Target Milestone: ---

See https://quick-bench.com/q/bqGjfyd180oOlJhiY_XnURMNKG8

Using the copy constructor performs best, and ends up using std::memcpy
internally. Even .resize() followed by std::copy is much faster than
.assign(), because the std::copy loop is subject to more partial loop
unrolling.
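
For readers who cannot open the quick-bench link, the three approaches being
compared look roughly like the sketch below. This is not the benchmark code
itself; the function names and the pointer-pair interface are hypothetical and
only illustrate the shape of each variant.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Variant 1: copy constructor (reported as the fastest).
std::string via_copy_ctor(const std::string& from) {
    return std::string(from);
}

// Variant 2: resize() to the final length, then std::copy into the string.
std::string via_resize_copy(const char* first, const char* last) {
    std::string s;
    s.resize(static_cast<std::size_t>(last - first));
    std::copy(first, last, s.begin());
    return s;
}

// Variant 3: the iterator-pair overload of assign() (reported as the slowest).
std::string via_assign(const char* first, const char* last) {
    std::string s;
    s.assign(first, last);
    return s;
}

// Sanity check: all three variants must produce the same contents.
void check_variants_agree(const std::string& from) {
    const char* first = from.data();
    const char* last = first + from.size();
    assert(via_copy_ctor(from) == from);
    assert(via_resize_copy(first, last) == from);
    assert(via_assign(first, last) == from);
}
```

All three produce the same string; only the generated copy loop differs, so
any timing comparison would have to wrap each variant in a benchmark harness.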

basic_string::assign:
https://github.com/gcc-mirror/gcc/blob/25c4b1620ebc10fceabd86a34fdbbaf8037e7e82/libstdc%2B%2B-v3/include/bits/basic_string.h#L1713C28-L1713C28

this calls the four-iterator form of .replace():
https://github.com/gcc-mirror/gcc/blob/25c4b1620ebc10fceabd86a34fdbbaf8037e7e82/libstdc%2B%2B-v3/include/bits/basic_string.h#L2378

this calls the following form of _M_replace_dispatch() (I think):
https://github.com/gcc-mirror/gcc/blob/25c4b1620ebc10fceabd86a34fdbbaf8037e7e82/libstdc%2B%2B-v3/include/bits/basic_string.tcc#L430

this calls _M_replace():
https://github.com/gcc-mirror/gcc/blob/25c4b1620ebc10fceabd86a34fdbbaf8037e7e82/libstdc%2B%2B-v3/include/bits/basic_string.tcc#L507

in this case, it should call _S_move():
https://github.com/gcc-mirror/gcc/blob/25c4b1620ebc10fceabd86a34fdbbaf8037e7e82/libstdc%2B%2B-v3/include/bits/basic_string.h#L431

this calls char_traits::move():
https://github.com/gcc-mirror/gcc/blob/25c4b1620ebc10fceabd86a34fdbbaf8037e7e82/libstdc%2B%2B-v3/include/bits/char_traits.h#L223

and that calls __builtin_memmove()

However, I must have followed this chain of calls incorrectly, because I do not
see a call to memmove in the output assembly, and most of the time is spent
here:

>        nopl   (%rax)
>        movdqa 0x42d8a0(%rdx),%xmm0
> 63.27% movups %xmm0,(%rax,%rdx,1)
> 36.69% add    $0x10,%rdx
> 0.03%  cmp    $0x100000,%rdx
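
Read as C++, the loop above amounts to roughly the following: one 16-byte load
and one 16-byte store per iteration, no unrolling, until 0x100000 bytes have
been copied. This is only a reading of the disassembly, not libstdc++ code;
the function name and signature are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical rendering of the hot loop: copy n bytes in 16-byte steps,
// one chunk per iteration. n must be a multiple of 16 (0x100000 in the
// disassembly).
void copy_16_bytes_per_step(char* dst, const char* src, std::size_t n) {
    assert(n % 16 == 0);
    for (std::size_t off = 0; off != n; off += 16) {
        std::memcpy(dst + off, src + off, 16);  // one 16-byte load and store
    }
}
```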
