On Thu, 30 Nov 2023 at 14:20, Martin Storsjö <[email protected]> wrote:
>
> On Wed, 29 Nov 2023, Martin Storsjö wrote:
>
> >> and related code in Chromium [2].
> >> The first doesn't mention ARM, but both use const_seg on 64-bits
> >> arches and data_seg on 32 bits.
> >
> > I wouldn't really take that as an authority for this matter here - I would
> > expect this to be a similar historical mistake; someone has used _WIN64 to
> > distinguish x86 and x64 at a time when no other architectures were relevant
> > in Windows code.
> >
> >>
> >> [1]:
> >> https://lallouslab.net/2017/05/30/using-cc-tls-callbacks-in-visual-studio-with-your-32-or-64bits-programs/
> >> [2]:
> >> https://chromium.googlesource.com/chromium/src/+/lkgr/base/win/dllmain.cc#64
> >>
> >> I'll try to write a sensible repro case and build it on all arches. I
> >> don't have ARM Windows machines, I hope that looking at the
> >> cross-compiled executable sections will be enough.
> >
> > Sure, hopefully inspecting that is enough. Otherwise I can help out running
> > any test binaries you want to inspect (you can probably mail me off-list for
> > that).
>
> I went ahead and tested this myself - and as expected, _WIN64 is indeed
> the wrong condition.
>
> ARM and ARM64 all use the same section type as x64, it's only x86 which is
> the odd one out. So like most other similar cases, you'll need to ifdef
> for e.g. _M_IX86 and use the other common codepath for the rest. It might
> be good to add a comment that explicitly clarifies which one each of the 4
> architectures needs (to indicate that the ifdef isn't a mistake but an
> intentional choice).

Thanks for noticing this and prompting me to triple-check the code.
It turns out we do have an ARM Windows laptop at my company :-)

The literature online uses `data_seg` on x86 and `const_seg` on x64.
In my tests, `const_seg` worked on all four arches (x86, x64, arm,
arm64). We can use `#pragma section` and `__declspec(allocate("xxx"))`
instead. The test code now looks like the example below (without
comments).

Note that `#pragma comment (linker, "/INCLUDE:_foo_tls_callback")` is
necessary; otherwise the callback symbol is removed under Link-Time
Optimization (LTO).

I think I may have found a bug in MSVC: with LTO and optimization
enabled (-O1 or -O2), the symbol is always discarded. I'll open a bug
report and see what happens.

Does the example below look reasonable?

Many thanks,
-- Antonin

#include <windows.h>
#include <stdio.h>

#define STR(x) #x

static DWORD callback_reason = MAXDWORD;

#if defined (_M_IX86)
#pragma comment (linker, "/INCLUDE:__tls_used")
#pragma comment (linker, "/INCLUDE:__foo_tls_callback")
#elif defined (_WIN64) || defined (_M_ARM)
#pragma comment (linker, "/INCLUDE:_tls_used")
#pragma comment (linker, "/INCLUDE:_foo_tls_callback")
#endif

static void NTAPI
foo_tls_callback(PVOID DllHandle, DWORD dwReason, PVOID Reserved)
{
    UNREFERENCED_PARAMETER(DllHandle);
    UNREFERENCED_PARAMETER(Reserved);
    callback_reason = dwReason;
}

#ifdef _MSC_VER
#pragma section(".CRT$XLF", long, read, shared)
#endif
const PIMAGE_TLS_CALLBACK
#if defined __GNUC__
__attribute__((__section__(".CRT$XLF")))
#elif defined _MSC_VER
__declspec(allocate(".CRT$XLF"))
#endif
_foo_tls_callback = foo_tls_callback;

void foo_init(void) {
    switch (callback_reason) {
    case MAXDWORD:
        printf("foo_tls_callback didn't fire.\n");
        ExitProcess(1);
        break;
    case DLL_PROCESS_ATTACH: printf(STR(DLL_PROCESS_ATTACH) "\n"); break;
    case DLL_PROCESS_DETACH: printf(STR(DLL_PROCESS_DETACH) "\n"); break;
    case DLL_THREAD_ATTACH: printf(STR(DLL_THREAD_ATTACH) "\n"); break;
    case DLL_THREAD_DETACH: printf(STR(DLL_THREAD_DETACH) "\n"); break;
    }
}
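
As a side note, the two /INCLUDE branches in the test code differ only
in the extra leading underscore that 32-bit x86 adds to C symbol names.
The rule can be sketched in portable C as below. This is only an
illustration: `target_arch` and `linker_symbol` are hypothetical names
made up for this mail, not part of any Windows or mingw-w64 API, and
the sketch assumes plain C (__cdecl) symbols.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical enum for illustration: the four Windows targets
 * discussed in this thread. */
enum target_arch { ARCH_X86, ARCH_X64, ARCH_ARM, ARCH_ARM64 };

/* Hypothetical helper: writes into `out` the name the linker sees for
 * the C symbol `c_name`. 32-bit x86 prepends one underscore to C
 * symbols; x64, ARM and ARM64 leave the name unchanged. */
static const char *
linker_symbol(enum target_arch arch, const char *c_name,
              char *out, size_t out_size)
{
    assert(c_name != NULL && out != NULL && out_size > 0);
    snprintf(out, out_size, "%s%s", arch == ARCH_X86 ? "_" : "", c_name);
    return out;
}
```

For instance, linker_symbol(ARCH_X86, "_tls_used", buf, sizeof buf)
gives "__tls_used", which is the name used in the _M_IX86 branch, while
the other three architectures keep "_tls_used" as is.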
_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public