On Fri, Aug 15, 2025 at 12:47:35AM +1200, [email protected] wrote:
> On Wed, 13 Aug 2025 at 23:06, Colin Watson <[email protected]> wrote:
> ...
>> I'm downgrading this for the moment as I can't currently find evidence
>> that it's a baseline violation. I've tried this in various ancient qemu
>> CPU models ("-cpu Conroe", "-cpu qemu64", "-cpu core2duo"), and it seems
>> fine there. I'm prepared to believe that I've missed something, but
>> figuring it out seems like a bit of a fishing expedition.
>
> Hi Colin! It's a baseline violation. Your analysis of the build files
> was helpful but ultimately I just had to check the dmesg log for the
> segfault and look up the offset in the shared library:
>
> traps: mtxrun[62011] trap invalid opcode ip:7fe6e4f64988 sp:7ffe42301c80 error:0 in libmimalloc.so.3.0[c988,7fe6e4f5e000+15000]
>
> c988: f3 48 0f b8 c2 popcnt rax,rdx

OK. I think this is coming from supposedly CPUID-guarded code:
$ git grep -i popcnt
include/mimalloc/bits.h:extern bool _mi_cpu_has_popcnt;
include/mimalloc/bits.h: if mi_unlikely(!_mi_cpu_has_popcnt) { return _mi_popcount_generic(x); }
include/mimalloc/bits.h: __asm ("popcnt\t%1,%0" : "=r"(r) : "r"(x) : "cc");
include/mimalloc/bits.h: if mi_unlikely(!_mi_cpu_has_popcnt) { return _mi_popcount_generic(x); }
include/mimalloc/bits.h: return (size_t)mi_msc_builtinz(__popcnt)(x);
include/mimalloc/bits.h: return (size_t)mi_msc_builtinz(__popcnt)(x);
src/init.c:mi_decl_cache_align bool _mi_cpu_has_popcnt = false;
src/init.c: _mi_cpu_has_popcnt = ((cpu_info[2] & (1 << 23)) != 0); // bit 23 of ECX : see <https://en.wikipedia.org/wiki/CPUID#EAX=1:_Processor_Info_and_Feature_Bits>
src/init.c: _mi_cpu_has_popcnt = true;
Here's the relevant code (for GCC/amd64):
#if !defined(__BMI1__)
if mi_unlikely(!_mi_cpu_has_popcnt) { return _mi_popcount_generic(x); }
#endif
size_t r;
__asm ("popcnt\t%1,%0" : "=r"(r) : "r"(x) : "cc");
return r;
And:
static void mi_detect_cpu_features(void) {
  // FSRM for fast short rep movsb/stosb support (AMD Zen3+ (~2020) or Intel Ice Lake+ (~2017))
  // EMRS for fast enhanced rep movsb/stosb support
  uint32_t cpu_info[4];
  if (mi_cpuid(cpu_info, 7)) {
    _mi_cpu_has_fsrm = ((cpu_info[3] & (1 << 4)) != 0); // bit 4 of EDX : see <https://en.wikipedia.org/wiki/CPUID#EAX=7,_ECX=0:_Extended_Features>
    _mi_cpu_has_erms = ((cpu_info[1] & (1 << 9)) != 0); // bit 9 of EBX : see <https://en.wikipedia.org/wiki/CPUID#EAX=7,_ECX=0:_Extended_Features>
  }
  if (mi_cpuid(cpu_info, 1)) {
    _mi_cpu_has_popcnt = ((cpu_info[2] & (1 << 23)) != 0); // bit 23 of ECX : see <https://en.wikipedia.org/wiki/CPUID#EAX=1:_Processor_Info_and_Feature_Bits>
  }
}
In principle this sort of thing should be OK. But maybe __BMI1__ is
defined and so the generic fallback isn't present, or maybe the CPUID
check is incorrect for your CPU? I'll check the former when I have a
little more time, but if you could check the latter that would be
helpful.
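
For checking the latter, something along these lines might do. It is only a
rough sketch and is not taken from mimalloc: the helper names and the
POPCNT_DEMO_MAIN switch are made up for illustration, and it assumes GCC or
Clang (for <cpuid.h> and __get_cpuid). It reads CPUID leaf 1 and tests ECX
bit 23, the same bit that src/init.c looks at, and it also prints which of a
few feature macros are defined in this translation unit. The macro report only
reflects the flags this one file is compiled with, so it says nothing about the
library build unless the same flags are used.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> /* GCC/Clang: __get_cpuid() */
#endif

/* Hypothetical helper: is bit `bit` set in a CPUID output register? */
static bool cpuid_bit_set(uint32_t reg, unsigned bit) {
  return ((reg >> bit) & 1u) != 0;
}

/* True if CPUID leaf 1 reports POPCNT (ECX bit 23); false on non-x86
   targets or if leaf 1 is not available. */
static bool cpu_reports_popcnt(void) {
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return false;
  return cpuid_bit_set(ecx, 23);
#else
  return false;
#endif
}

/* Print whether some feature macros are defined for this file. */
static void print_compile_time_macros(FILE *out) {
#if defined(__POPCNT__)
  fprintf(out, "__POPCNT__ defined\n");
#else
  fprintf(out, "__POPCNT__ not defined\n");
#endif
#if defined(__BMI1__)
  fprintf(out, "__BMI1__ defined\n");
#else
  fprintf(out, "__BMI1__ not defined\n");
#endif
#if defined(__BMI2__)
  fprintf(out, "__BMI2__ defined\n");
#else
  fprintf(out, "__BMI2__ not defined\n");
#endif
#if defined(__AVX2__)
  fprintf(out, "__AVX2__ defined\n");
#else
  fprintf(out, "__AVX2__ not defined\n");
#endif
}

#ifdef POPCNT_DEMO_MAIN
int main(void) {
  printf("CPUID leaf 1 reports POPCNT: %s\n",
         cpu_reports_popcnt() ? "yes" : "no");
  print_compile_time_macros(stdout);
  return 0;
}
#endif
```

Building it with "cc -DPOPCNT_DEMO_MAIN" and running it on the affected
machine should show whether the runtime check would be expected to pass there.
If it prints "no" for POPCNT and the trap still happens, that would point
towards the fallback guard having been compiled out rather than the CPUID
check being wrong.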
(Note that I don't know this library well. I'm just trying to figure
this out since it's been blocking some other things I work on.)
--
Colin Watson (he/him) [[email protected]]