On Fri, 14 Jan 2022, Pali Rohár wrote:

On Thursday 13 January 2022 23:22:10 Martin Storsjö wrote:
On Wed, 12 Jan 2022, Pali Rohár wrote:

Original MSVC 6.0 msvcrt.dll library does not provide _vscprintf()
function. MinGW-w64 __ms_snprintf() and __ms_vsnprintf() implementations
call _vscprintf() function. So include fallback _vscprintf() implementation
into MinGW-w64 crt code to allow applications to use snprintf() with
original MSVC 6.0 msvcrt.dll library.
---
mingw-w64-crt/Makefile.am              |  1 +
mingw-w64-crt/lib-common/msvcrt.def.in |  4 ++
mingw-w64-crt/misc/_vscprintf.c        | 72 ++++++++++++++++++++++++++
3 files changed, 77 insertions(+)
create mode 100644 mingw-w64-crt/misc/_vscprintf.c

diff --git a/mingw-w64-crt/Makefile.am b/mingw-w64-crt/Makefile.am
index e6b644ec1d44..1be9250c1e53 100644
--- a/mingw-w64-crt/Makefile.am
+++ b/mingw-w64-crt/Makefile.am
@@ -282,6 +282,7 @@ src_msvcrt32=\
  misc/_create_locale.c \
  misc/_free_locale.c \
  misc/_get_current_locale.c \
+  misc/_vscprintf.c \
  misc/lc_locale_func.c \
  misc/wassert.c

diff --git a/mingw-w64-crt/lib-common/msvcrt.def.in b/mingw-w64-crt/lib-common/msvcrt.def.in
index 0ea9d388fbe4..17bc7e9a076e 100644
--- a/mingw-w64-crt/lib-common/msvcrt.def.in
+++ b/mingw-w64-crt/lib-common/msvcrt.def.in
@@ -1111,7 +1111,11 @@ _vprintf_l
_vprintf_p
_vprintf_p_l
_vprintf_s_l
+#ifdef DEF_I386
+; _vscprintf Replaced by emu
+#else
_vscprintf
+#endif

We've got neater macros that avoid needing to spend many lines on this; you
can make it F_NON_I386(_vscprintf) instead of spelling out the ifdef. See
def-include/func.def.in for the full set of macros you can use for
conditional availability.

Ok!

+int (__cdecl *__MINGW_IMP_SYMBOL(_vscprintf))(const char * __restrict__, va_list) = init_vscprintf;

This provides __imp___vscprintf only, but not __vscprintf, while
lib32_libmingwex_a-vsnprintf.o has an undefined reference to __vscprintf.

Linking still works, but this forces the linker to do an autoimport of the
symbol, which is quite suboptimal. So it would be best if this object file
would provide the prefix-less regular function too, which just calls what
the __imp_ prefixed symbol points at.

// Martin

Do you mean adding the following line to the source unit?

int __cdecl _vscprintf(const char * __restrict__ format, va_list arglist) { return __MINGW_IMP_SYMBOL(_vscprintf)(format, arglist); }

Or is there any macro for this purpose?

Yes, I think that should work. No, I don't think we have a macro for that.
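To make the shape of this concrete, here is a minimal sketch in portable C of a pointer that starts out aimed at an initializer, plus a plain function that forwards through it. All names (imp_count_chars, init_count, fallback_count, count_chars, count_formatted, check_count) are hypothetical and only model the roles of the __imp_ prefixed pointer and the prefix-less wrapper; this is not the code from the patch. The fallback assumes a C99-conforming vsnprintf(), and any probing for a native export is left out.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical names throughout; this models the pattern, not the patch. */
typedef int (*count_fn)(const char *format, va_list args);

static int init_count(const char *format, va_list args);

/* Plays the role of the __imp_ prefixed symbol: a pointer that callers
   load and call through. It starts out pointing at the initializer. */
count_fn imp_count_chars = init_count;

/* Fallback: compute the output length without writing anything.
   Assumes a C99-conforming vsnprintf(). */
static int fallback_count(const char *format, va_list args)
{
    return vsnprintf(NULL, 0, format, args);
}

/* The first call lands here: choose an implementation, store it in the
   pointer, and forward the call. Later calls skip this function. */
static int init_count(const char *format, va_list args)
{
    imp_count_chars = fallback_count;
    return imp_count_chars(format, args);
}

/* Plays the role of the prefix-less symbol: an ordinary function that
   calls whatever the pointer currently refers to. */
int count_chars(const char *format, va_list args)
{
    return imp_count_chars(format, args);
}

/* Variadic helper for trying it out. */
int count_formatted(const char *format, ...)
{
    va_list args;
    int n;

    va_start(args, format);
    n = count_chars(format, args);
    va_end(args);
    return n;
}

/* "42-ab" is five characters long. */
void check_count(void)
{
    assert(count_formatted("%d-%s", 42, "ab") == 5);
}
```

With both symbols defined in the same object file, a caller that references the plain function name resolves to an ordinary function, while a caller that goes through the pointer still works.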

I still have not fully grasped how autoimport works, what the exact
difference between __imp_ and non-__imp_ symbols is, and what is suboptimal
about __imp_ symbols...

A quick example:

$ cat example.c
extern int regularGlobal;
extern int __declspec(dllimport) importedGlobal;
int regularFunction(void);
int __declspec(dllimport) importedFunction(void);

int getRegularGlobal(void) { return regularGlobal; }
int getImportedGlobal(void) { return importedGlobal; }

int callFunctions(void) {
  int ret = regularFunction();
  ret += importedFunction();
  return ret;
}
$ i686-w64-mingw32-gcc -S -o - example.c -O2
[... irrelevant details omitted ...]
_getRegularGlobal:
        movl    _regularGlobal, %eax
        ret
_getImportedGlobal:
        movl    __imp__importedGlobal, %eax
        movl    (%eax), %eax
        ret
_callFunctions:
[...]
        call    _regularFunction
[...]
        call    *__imp__importedFunction

When you interact with a symbol that is declared dllimport, all accesses go via a symbol named __imp_<normalsymbolname>. This is the Import Address Table (IAT) entry, which is updated by the Windows loader when the exe/dll is loaded, pointing at where the imported symbol really is in another dll.

So when reading regularGlobal, the mov instruction just ends up encoding a hardcoded address (i386) or offset (x86_64) to regularGlobal. When reading importedGlobal, the mov instruction keeps the address/offset of __imp_importedGlobal (which is the IAT entry in the same exe/dll). It then reads the address of the real importedGlobal (which resides in a different dll, and we don't know the address of it until we're loaded), and loads the value from it with a separate instruction.

For function calls, it either just calls directly ("call _regularFunction") or reads the address from __imp__importedFunction and calls that ("call *__imp__importedFunction").
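The same distinction can be modelled in plain C, with explicit pointers standing in for the IAT entries. This is only a sketch: the names (imp_global, imp_function, fake_loader and so on) are made up for illustration, and fake_loader() stands in for the work the Windows loader does.

```c
#include <assert.h>

/* Stand-ins for things that live in "another DLL". */
static int other_dll_global = 7;
static int other_dll_function(void) { return 35; }

/* Stand-ins for IAT entries (the __imp_ symbols). In a real image the
   Windows loader fills these in; here fake_loader() does it. */
static int *imp_global;
static int (*imp_function)(void);

void fake_loader(void)
{
    imp_global = &other_dll_global;
    imp_function = other_dll_function;
}

/* Symbols in the same module, reached directly. */
static int regular_global = 1;
static int regular_function(void) { return 2; }

/* One load from a known location. */
int get_regular_global(void) { return regular_global; }

/* Load the slot, then load through the address found there. */
int get_imported_global(void) { return *imp_global; }

int call_functions(void)
{
    int ret = regular_function();   /* direct call */
    ret += imp_function();          /* indirect call through the slot */
    return ret;
}
```

The extra dereference in get_imported_global() and the call through imp_function correspond to the second movl and the "call *" form in the assembler output above.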

Now the issue comes if regularFunction actually turns out to be imported from a different dll. Then while the real symbol would be __imp__regularFunction, the import library also contains a thunk, _regularFunction, which consists of one single instruction, "jmp *__imp__regularFunction".

So if you call a function that turns out to be dllimported, but the caller didn't see a dllimport attribute, the function call gets one extra indirection by jumping via the thunk function instead, but it's not a big issue.

This mechanism works fine for function calls, where you can bounce via a thunk without noticing the difference. But for a variable access like the one in _getRegularGlobal above, you can't fix it by adding an extra thunk. For these cases, the mingw linker "autoimport" trick comes into play. As the "movl _regularGlobal, %eax" instruction encodes an absolute address/offset to the symbol, we can't make it indirect. If it turns out that regularGlobal actually is in a different DLL, the _regularGlobal symbol is undefined, but the __imp__regularGlobal symbol is defined, in the import library. The mingw linkers (both GNU ld.bfd and lld) detect this and add the address of this location to a list of runtime pseudo relocations.

When your exe/dll is loaded, after the Windows loader has finished setting things up, the mingw runtime iterates over the list of runtime pseudo relocations, makes the relevant sections read-write and fixes the addresses where necessary.
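As a rough model of that startup pass, the sketch below patches a stored address so that it stops referring to the IAT slot and refers to the real variable instead, keeping any offset the code had added. Everything here is an assumption made for illustration: the record layout (struct fixup), the names, and the fake_link()/fake_loader() steps are invented and do not reproduce the real on-disk format. The patched location is an ordinary writable variable, so the page protection changes are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A variable that lives in "another DLL". */
static int other_dll_array[3] = { 10, 20, 30 };

/* IAT slot (the __imp_ symbol); the loader stores the real address here. */
static int *imp_array;

/* Stands in for the address operand baked into an instruction. At link
   time the linker can only aim it at the IAT slot, plus the byte offset
   the code asked for. */
static uintptr_t patch_site;

/* Hypothetical fixup record: which slot, and which location to patch. */
struct fixup {
    int **slot;
    uintptr_t *site;
};

static struct fixup fixups[1];

/* Link step: the code wanted "array + 2 elements". */
void fake_link(void)
{
    patch_site = (uintptr_t)&imp_array + 2 * sizeof(int);
    fixups[0].slot = &imp_array;
    fixups[0].site = &patch_site;
}

/* Load step: the loader fills in the IAT slot. */
void fake_loader(void)
{
    imp_array = other_dll_array;
}

/* Startup pass: rebase each site from the slot's own address to the
   address stored in the slot, preserving the offset. */
void apply_fixups(void)
{
    size_t i;

    for (i = 0; i < sizeof(fixups) / sizeof(fixups[0]); i++) {
        uintptr_t slot_addr = (uintptr_t)fixups[i].slot;
        uintptr_t real_addr = (uintptr_t)*fixups[i].slot;

        *fixups[i].site = *fixups[i].site - slot_addr + real_addr;
    }
}

/* What the patched instruction would now read. */
int read_through_site(void)
{
    return *(int *)patch_site;
}
```

After fake_link(), fake_loader() and apply_fixups() have run in that order, read_through_site() reads the third element of the array in the "other DLL".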

(There are more details and more trickery involved that make it work properly in 64 bit mode too, I'll omit those here to keep it as simple as possible - it's already complicated enough as is.)

This is what makes cross-DLL variable access without dllimport work in mingw, but not in MSVC. But it comes at the price of being a little ugly (temporarily making read-only parts of your executable read-write and fixing up addresses in them). So normally, if possible, it's better to avoid this - in this case, by making sure the function has a version without the __imp_ prefix too. (You could also notice the difference if linking with "-Wl,--disable-auto-import -Wl,--disable-runtime-pseudo-reloc".)

// Martin

_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
