https://bugs.llvm.org/show_bug.cgi?id=43096

            Bug ID: 43096
           Summary: std::basic_string loses the bottom bit of capacity for
                    regular ABI little-endian and alternate ABI big-endian
                    strings
           Product: libc++
           Version: unspecified
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: All Bugs
          Assignee: unassignedclangb...@nondot.org
          Reporter: richard-l...@metafoo.co.uk
                CC: llvm-bugs@lists.llvm.org, mclow.li...@gmail.com

libc++'s std::basic_string reuses the low-order bit of its capacity field as an
'is long' marker when in regular ABI little-endian or alternate ABI big-endian
mode.

When sizeof(_CharT) <= 8, this is OK: every allocation is a multiple of 16
bytes, so the element count stored in the capacity field is always even and
no data is lost. But when sizeof(_CharT) is 16, things go wrong:


#include <string>
#include <iostream>

struct big_character {
    char data[16];
};

int main() {
    for (int n = 0; n != 20; ++n) {
        std::basic_string<big_character> s;
        s.reserve(n);
        std::cout << "asked for " << n << " got " << s.capacity() << "\n";
    }
}


produces:


asked for 0 got 1
asked for 1 got 1
asked for 2 got 3
asked for 3 got 3
asked for 4 got 3
asked for 5 got 5
asked for 6 got 5
asked for 7 got 7
asked for 8 got 7
asked for 9 got 9
asked for 10 got 9
asked for 11 got 11
asked for 12 got 11
asked for 13 got 13
asked for 14 got 13
asked for 15 got 15
asked for 16 got 15
asked for 17 got 17
asked for 18 got 17
asked for 19 got 19


Note that for every even n greater than 2, reserve(n) leaves capacity() at
n - 1 instead of at least n: the allocation holds n + 1 elements, an odd
count, and its low bit is lost. Similarly:

    std::basic_string<big_character> s(6, big_character{});
    std::cout << "size is " << s.size() << ", capacity is " << s.capacity()
              << "\n";

prints:

size is 6, capacity is 5


If we want to avoid breaking ABI, it looks like we need to ensure that we
always allocate an even number of elements (including the terminator) in the
affected cases.
