https://bugs.kde.org/show_bug.cgi?id=520753

            Bug ID: 520753
           Summary: Advertise LZCNT via CPUID for x86 (32-bit) clients
    Classification: Developer tools
           Product: valgrind
Version First Reported In: 3.27 GIT
          Platform: Other
                OS: Linux
            Status: REPORTED
          Severity: normal
          Priority: NOR
         Component: vex
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

LZCNT is not advertised via CPUID when the client is an x86 (32-bit) binary:

$ ./vg-in-place cpuid32 -1 |& fgrep LZCNT
      LZCNT advanced bit manipulation        = false
$ cpuid32 -1 |& fgrep LZCNT
      LZCNT advanced bit manipulation        = true
$

Per the Intel manual, LZCNT was introduced in "Intel® Xeon® processor E3/E5/E7
v3 product families, 4th Generation Intel® Core™ processor family".  In
Valgrind's guest_amd64_helpers.c, LZCNT is advertised by
amd64g_dirtyhelper_CPUID_avx2() in ECX, bit 5, of leaf 0x80000001:

3652       case 0x80000001:
3653          SET_ABCD(0x00000000, 0x00000000, 0x00000021, 0x2c100800);
-------------------------------------------------------^--------------- <<< here
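
To make the bit position concrete, here is a minimal sketch of how a client
could test that ECX value.  The function and macro names are hypothetical and
are not taken from Valgrind; only the leaf number, the bit position and the
0x00000021 value come from the excerpt above.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID leaf 0x80000001 reports LZCNT support in ECX bit 5.
   The macro name is hypothetical, chosen for this illustration only. */
#define CPUID_EXT1_ECX_LZCNT_BIT 5u

/* Return 1 if the given bit is set in a CPUID result register, else 0. */
int cpuid_bit(uint32_t reg, unsigned bit)
{
    assert(bit < 32);
    return (int)((reg >> bit) & 1u);
}

/* Return 1 if an ECX value from leaf 0x80000001 advertises LZCNT. */
int ecx_advertises_lzcnt(uint32_t ecx)
{
    return cpuid_bit(ecx, CPUID_EXT1_ECX_LZCNT_BIT);
}
```

With the amd64 helper's ECX value, ecx_advertises_lzcnt(0x00000021) yields 1;
with bit 5 cleared (0x00000001) it yields 0.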

For 32-bit clients, Valgrind has guest_x86_helpers.c.  The dirtyhelper_CPUID*
function is selected around guest_x86_toIR.c:15516, with
x86g_dirtyhelper_CPUID_sse3() being the "most modern" dirtyhelper.  For
reference, the timeline leading up to the introduction of LZCNT:

SSE (1999) → Pentium III
SSE2 (2000) → Pentium 4
SSE3 (2004) → Pentium 4 Prescott
SSSE3 (2006) → Core 2 ← this is what sse3 helper emulates
SSE4.1 (2006) → Core 2 (Penryn)
SSE4.2 (2008) → Core i7 (Nehalem) ← POPCNT introduced here
AVX (2011) → Sandy Bridge
BMI1/BMI2 (2013) → Haswell ← LZCNT introduced here (on Intel CPUs)

In Valgrind, LZCNT is not emulated in software (as POPCNT is), but it also
isn't run directly on the hardware.  Instead, gen_LZCNT() treats it as
Iop_ClzNat32, which host_x86_isel.c:1321 translates to
X86Instr_Bsfr32(False, ...), which in turn is emitted as the BSR instruction
(emit_X86Instr() in VEX/priv/host_x86_defs.c:2706).  That's a "baseline x86"
instruction that isn't even advertised via CPUID (it was probably introduced
with the i386 in 1985).
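
The relationship between LZCNT and BSR can be sketched in portable C.  This is
only an illustration of the arithmetic, not VEX's actual code, and the function
names are hypothetical.  It assumes the documented instruction semantics: BSR
returns the index of the highest set bit and leaves its destination undefined
for a zero source, while LZCNT returns the operand size (32) for a zero source.

```c
#include <stdint.h>

/* Model of BSR: index of the highest set bit.
   Precondition: x != 0 (the real instruction's result is undefined for 0). */
unsigned bsr32_model(uint32_t x)
{
    unsigned idx = 0;
    while (x >>= 1)
        idx++;
    return idx;
}

/* Model of LZCNT built on top of the BSR model.
   A zero input has to be handled separately and yields 32. */
unsigned lzcnt32_model(uint32_t x)
{
    if (x == 0)
        return 32;
    return 31u - bsr32_model(x);
}
```

For example, lzcnt32_model(1) is 31 and lzcnt32_model(0x80000000) is 0.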

That said, although Valgrind's LZCNT could technically be advertised by
x86g_dirtyhelper_CPUID_sse0() (because BSR is so old), LZCNT was introduced in
Haswell, so a good place for it is x86g_dirtyhelper_CPUID_sse3().  Another
option is to create a new dirtyhelper_CPUID function for it.  The attached
patch uses the former approach.
