Add an explicit `volatile` qualifier for intrinsic functions that write to TEB; do not rely on the implicit volatile semantics, just because they have no output.

* If an asm statement reads memory that is not passed with "m", it shall
  clobber "memory".
* If an asm statement writes memory that is not passed with "m", it shall
  both be `volatile` and clobber "memory".

The conclusion is basing on the following experiment: 
https://gcc.godbolt.org/z/chMa3GMda




--
Best regards,
LIU Hao

From 90b41ca9c7478beef365502d05acfbb8afb3f16f Mon Sep 17 00:00:00 2001
From: LIU Hao <[email protected]>
Date: Fri, 8 Nov 2024 23:02:15 +0800
Subject: [PATCH] headers/intrin-impl: Fix segment accessors

First, these intrins read from and write to thread-local memory. The TEB
contains a pointer to itself in the DS segment, known as the `Self` field of
`struct _NT_TIB`, which means the TEB is semantically in the same address
space as other objects, so these asm statements must clobber "memory". If an
asm statement writes to memory that is not passed with "m" constraints, then
the compiler shall be noted that it has unknown side effects, by adding a
`volatile` qualifier.

Second, for Intel syntax, this commit removes superfluous prefixes in front of
segment register names.

Third, previously `Offset` was cast to a pointer and dereferenced, and then
passed to inline assembly as a memory operand using the `m` constraint. This
was a pure hack. GCC assumes that a memory operand should belong in the DS
segment, so it appeared to reference unknown memory, and caused warnings like

  intrin-impl.h:849:1: warning: array subscript 0 is outside array bounds of
  'long long unsigned int[0]' [-Warray-bounds=]
  849 | __buildreadseg(__readgsqword, unsigned __int64, "gs", "q")
      | ^~~~~~~~~~~~~~

This commit passes the address by register instead. For Intel syntax, there is
no way to print the `DWORD PTR` thing, so unfortunately the value also has to
be passed by register. It's suboptimal, but should be safe.

For x86-64, the use of a 32-bit address requires an address size override
prefix. However, it's deliberate, as zero-extending a 32-bit register (like
`mov edi, edi`) would require two additional bytes.

Signed-off-by: LIU Hao <[email protected]>
---
 mingw-w64-headers/include/psdk_inc/intrin-impl.h | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/mingw-w64-headers/include/psdk_inc/intrin-impl.h b/mingw-w64-headers/include/psdk_inc/intrin-impl.h
index a62822a39..872246a44 100644
--- a/mingw-w64-headers/include/psdk_inc/intrin-impl.h
+++ b/mingw-w64-headers/include/psdk_inc/intrin-impl.h
@@ -230,9 +230,10 @@ Parameters: (FunctionName, DataType, Segment)
 
 #define __buildreadseg(x, y, z, a) y x(unsigned __LONG32 Offset) { \
     y ret; \
-    __asm__ ("mov{" a " %%" z ":%[offset], %[ret] | %[ret], %%" z ":%[offset]}" \
+    __asm__ ("mov{" a " %%" z ":(%[offset]), %[ret] | %[ret], " z ":[%[offset]] }" \
         : [ret] "=r" (ret) \
-        : [offset] "m" ((*(y *) (size_t) Offset))); \
+        : [offset] "r" (Offset) \
+        : "memory"); \
     return ret; \
 }
 
@@ -247,9 +248,10 @@ Parameters: (FunctionName, DataType, Segment)
    */
 
 #define __buildwriteseg(x, y, z, a) void x(unsigned __LONG32 Offset, y Data) { \
-    __asm__ ("mov{" a " %[Data], %%" z ":%[offset] | %%" z ":%[offset], %[Data]}" \
-        : [offset] "=m" ((*(y *) (size_t) Offset)) \
-        : [Data] "ri" (Data)); \
+    __asm__ volatile ("mov{" a " %[Data], %%" z ":(%[offset]) | " z ":[%[offset]], %[Data] }" \
+        : \
+        : [offset] "r" (Offset), [Data] "r" (Data) \
+        : "memory"); \
 }
 
 /* This macro is used by _BitScanForward, _BitScanForward64, _BitScanReverse _BitScanReverse64
-- 
2.43.0

Attachment: OpenPGP_signature.asc
Description: OpenPGP digital signature

_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public

Reply via email to