https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102531
Bug ID: 102531
Summary: std::hash does not work correctly on Big Endian
platforms
Product: gcc
Version: 8.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: miladfarca at gmail dot com
Target Milestone: ---
std::hash does not work correctly on big-endian (BE) machines for the following key types:
- std::hash<std::bitset<N>>
- std::hash<std::vector<bool>>
Both break the 5th requirement listed at the link below, that unequal keys
should collide only with very small probability. Instead, a large number of
inputs end up with the same hash:
https://en.cppreference.com/w/cpp/utility/hash
### std::bitset
```
#include <iostream>
#include <functional>
#include <bitset>
int main(){
    // Two distinct 2-bit values that should hash differently.
    std::bitset<2> a(0b01);
    std::bitset<2> b(0b10);
    std::size_t h1 = std::hash<std::bitset<2>>{}(a);
    std::size_t h2 = std::hash<std::bitset<2>>{}(b);
    std::cout << h1 << std::endl;
    std::cout << h2 << std::endl;
    return 0;
}
```
Both printed values are the same on BE machines.
The input length passed to the hash is calculated incorrectly here:
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/std/bitset#L1575
`bitset` stores its bits in memory as zero-extended, size_t-sized words. The
hash reads only a shorter length from the start of that storage, and on BE
those leading bytes are the most significant ones, so they can all be 0
because the bytes are not reversed.
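To illustrate the layout issue, here is a minimal sketch that emulates how a word sits in memory under each byte order, so it behaves the same on any host. The helper names `byte_length` and `leading_bytes` are hypothetical, and the 64-bit word size is an assumption; this is not the libstdc++ code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes needed to hold n_bits bits, rounded up.
constexpr std::size_t byte_length(std::size_t n_bits) {
    return (n_bits + 7) / 8;
}

// Hypothetical helper: the first n bytes of `word` as they would sit in
// memory on a big-endian (big_endian = true) or little-endian machine.
std::vector<unsigned char> leading_bytes(std::uint64_t word, std::size_t n,
                                         bool big_endian) {
    assert(n <= 8);
    std::vector<unsigned char> out;
    for (std::size_t i = 0; i < n; ++i) {
        // Big-endian stores the most significant byte at the lowest address.
        const std::size_t shift = big_endian ? 8 * (7 - i) : 8 * i;
        out.push_back(static_cast<unsigned char>(word >> shift));
    }
    return out;
}
```

For a 2-bit value, `byte_length(2)` is 1. In the big-endian layout the single leading byte is 0 for both the word 1 and the word 2, while in the little-endian layout it is 1 and 2 respectively, so only the little-endian bytes distinguish the two inputs.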
### std::vector<bool>
```
#include <iostream>
#include <vector>
#include <functional>
int main(){
    // Any nonzero value converts to true, so a = {true, true}
    // and b = {true, true, true}.
    std::vector<bool> a = {static_cast<bool>(1), static_cast<bool>(2)};
    std::vector<bool> b = {static_cast<bool>(1), static_cast<bool>(2),
                           static_cast<bool>(3)};
    std::size_t h1 = std::hash<std::vector<bool>>{}(a);
    std::size_t h2 = std::hash<std::vector<bool>>{}(b);
    std::cout << h1 << std::endl;
    std::cout << h2 << std::endl;
    return 0;
}
```
This is the same issue as with bitset: vector<bool> also writes its bits into
memory as size_t-sized values.
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/vector.tcc#L987
The length calculated at the above link can be smaller than the stored input,
so on BE only 0 bytes may be read, because the bytes are not reversed.
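The collision can be sketched without a BE machine by emulating the memory layout and hashing only the truncated byte length. Everything here is an assumption for illustration: `pack_bits`, `fnv1a` and `truncated_hash` are hypothetical names, element i is assumed to live at bit i of a single 64-bit word, and 64-bit FNV-1a stands in for whatever byte hash libstdc++ actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack up to 64 bools into one word, element i at bit i (assumed layout).
std::uint64_t pack_bits(const std::vector<bool>& v) {
    assert(v.size() <= 64);
    std::uint64_t word = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i]) word |= std::uint64_t{1} << i;
    return word;
}

// Stand-in byte hash (64-bit FNV-1a), not the function libstdc++ uses.
std::uint64_t fnv1a(const unsigned char* p, std::size_t n) {
    std::uint64_t h = 14695981039346656037ULL;
    for (std::size_t i = 0; i < n; ++i) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

// Hash only the first ceil(bits / 8) bytes of the word, laid out as on a
// big-endian (big_endian = true) or little-endian machine.
std::uint64_t truncated_hash(const std::vector<bool>& v, bool big_endian) {
    const std::uint64_t word = pack_bits(v);
    unsigned char bytes[8];
    for (int i = 0; i < 8; ++i) {
        const int shift = big_endian ? 8 * (7 - i) : 8 * i;
        bytes[i] = static_cast<unsigned char>(word >> shift);
    }
    const std::size_t len = (v.size() + 7) / 8;
    return fnv1a(bytes, len);
}
```

Under these assumptions, `{true, true}` and `{true, true, true}` both hash a single leading 0 byte in the big-endian layout and therefore collide, while in the little-endian layout the leading bytes are 3 and 7 and the hashes differ.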