On Thursday 27 February 2025 21:45:51 Lasse Collin wrote:
> On 2025-02-25 Pali Rohár wrote:
> > On Tuesday 25 February 2025 20:11:12 Lasse Collin wrote:
> > > MSVCRT's _stat() fails with "directory/", but in POSIX a trailing
> > > slash should work for directories (and only for directories).
> > > mingw-w64-crt/stdio/_stat.c provides stat() that, with some
> > > exceptions, removes *one* trailing / or \ and then calls _stat().
> > > This hack is often enough, but as a side effect it then makes
> > > "file/" succeed while it shouldn't. And "directory//" still fails
> > > while it shouldn't.
> > >
> > > On UCRT, stat() simply calls _stat(). UCRT's _stat() has eerily
> > > similar behavior with trailing slashes as mingw-w64's replacement
> > > stat() has in MSVCRT builds. It's as if UCRT was made bug-for-bug
> > > compatible with mingw-w64 instead of thinking what is the correct
> > > but more complex thing to do.
> > >
> > > MSVCRT's _stat() doesn't work on \\.\ or \\?\ paths. UCRT's does,
> > > but the buggy trailing slash removal means that only the last line
> > > below succeeds with UCRT's _stat():
> > >
> > > \\?\GLOBALROOT\Device\Harddisk0\Partition3
> > > \\?\GLOBALROOT\Device\Harddisk0\Partition3\
> > > \\?\GLOBALROOT\Device\Harddisk0\Partition3\\
> > >
> > > MSVCRT's _stat() doesn't follow reparse points. UCRT's _stat() does
> > > (on symlink or junction loops it fails with ENOENT, not ELOOP).
> >
> > MS _stat() function can do whatever it wants. It is not part of C or
> > POSIX.
>
> I agree.
>
> > But mingw-w64's stat() function should be consistent and returns
> > correct information. What you have figured out is another mess. It is
> > not nice.
>
> I apologize what I said about UCRT's _stat. It's fine. :-) It doesn't
> emulate any mingw-w64 bugs, and making stat == _stat is OK with UCRT.
> It's mingw-w64 that is broken.
>
> I had tested on MSYS2 in MINGW32 and UCRT64 environments. In the
> MINGW32 environment, _USE_32BIT_TIME_T is defined by default while in
> UCRT64 it's not defined.
>
> Earlier I had built a stat/_stat testing program with optimizations
> disabled. Building with optimizations enabled (-O2), the UCRT results
> changed for both _stat and stat. Now trailing slashes work the way they
> should.
>
> <sys/stat.h> provides __CRT_INLINE ("extern inline
> __attribute__((__gnu_inline__))") wrapper for these functions:
>
> _fstat64i32
> _stat64i32
> fstat (two versions depending on _USE_32BIT_TIME_T)
>
> There are also two versions of stat but they have been commented out:
>
> /* Disable it for making sure trailing slash issue is fixed. */
>
> The above inline functions are provided only when optimizations are
> enabled. Even in optimized builds, compiler may decide to not use the
> inline functions and instead may emit calls to the external functions.
> It's essential that the inline implementation in the header and the
> implementation in libmingwex have the same behavior. They are the same
> for _fstat64i32.
>
> _stat64i32 has a bug: The inline function doesn't remove a trailing
> slash but mingw-w64-crt/stdio/_stat64i32.c does. Thus the behavior of
> the function depends on compiler optimizations.
>
> In MSYS2's MINGW32 environment, stat is defined to _stat32, so calling
> _stat jumps into MSVCRT without any libmingwex code. A call to stat
> goes to libmingwex's wrapper that removes one trailing slash. In this
> case everything works the way I think it's intended to work.
>
> In the UCRT64 environment, _stat is defined to _stat64i32 in
> <_mingw_stat64.h>. UCRT builds of libmingwex include
> mingw-w64-crt/stdio/_stat64i32.c. This is wrong because UCRT's
> functions handle trailing slashes (MSVCRT's don't). This explains why I
> thought that UCRT had imitated mingw-w64's slash removal. Compiling
> with optimizations enabled made the compiler use the __CRT_INLINE
> version from <sys/stat.h>, skipping the slash removal version from
> libmingwex.
>
> For UCRT builds, <sys/stat.h> uses __mingw_ovr to make stat an alias for
> _stat. Thus stat and _stat behave the same.
>
> fstat __CRT_INLINE usage I didn't investigate much. At glance it looks
> suspicious because the definition of "struct stat" depends on
> time_t but there's only one non-inline fallback function. c0f06309823c
> ("crt: Provide fstat function symbol via alias") removed _fstat.c with a
> time_t related FIXME but left the inline functions in <sys/stat.h>. As
> the FIXME said, the old method was known to be incorrect.
>
> The same FIXME is in _stat.c and _wstat.c still, but due to the
> __mingw_ovr in <sys/stat.h> in UCRT builds, I suspect that the code
> from _stat.c and _wstat.c is never called in UCRT builds. Thus the
> time_t FIXME in these two files only matters to MSVCRT builds. (I don't
> know if Autoconf detection of stat and wstat still needs a symbol in
> libmingwex because AC_CHECK_FUNC doesn't look at headers.)
>
> --
> Lasse Collin

I did not want to open the stat discussion as well, but since you have
opened it, here are some important details that need to be taken into
account, otherwise another breakage can happen.

For every underscored stat variant (stat, fstat, wstat) there are 4
exported function symbols:

  _func32    - struct stat argument has both timestamps and filesize as 32-bit
  _func64    - struct stat argument has both timestamps and filesize as 64-bit
  _func64i32 - struct stat argument has timestamps as 64-bit and filesize as 32-bit
  _func32i64 - struct stat argument has timestamps as 32-bit and filesize as 64-bit

And then there are exported symbol aliases:

  for 32-bit systems:
    _func    = _func32
    _funci64 = _func32i64

  for 64-bit systems:
    _func    = _func64i32
    _funci64 = _func64

So there are 6 exported function symbols which you can use without any
header file; you just need to provide the correct function declaration.
Every libmsvcr*.a import library for every platform in mingw-w64
provides these 6 symbols (some libmsvcr*.a import libraries do not
provide all of them if the corresponding msvcr*.dll does not have the
function required for emulation).

Then there are the header files, which do a #define mess. Based on the
_USE_32BIT_TIME_T macro value, the header file provides:

  #define _func
  #define _funci64

each of which expands to one of the _func??? variants. And it does not
have to be the one which matches the _func and _funci64 symbol alias.
What is interesting is that _func never expands to _func64, and hence
even on 64-bit systems _func is not fully 64-bit.

But let us leave this mess as it is; all of those are underscored
functions which MS somehow defined, and it is a well-known API/ABI.

Now the interesting part is: how should the non-underscored POSIX stat()
and fstat() behave?

I think that the header file for POSIX stat() and fstat() should follow
the application's definition of _USE_32BIT_TIME_T, or the widely used
"-D_TIME_BITS=64" or "-D_FILE_OFFSET_BITS=64" compile flags.

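To make the symbol variants and aliases listed above easier to check, here is a minimal sketch in C that encodes them as a lookup table. All sketch_ names are hypothetical and exist only for this illustration; nothing like this is part of mingw-w64 or of the MS headers.

```c
#include <stddef.h>
#include <string.h>

/* One row per exported variant; the suffix is what gets appended to
   _stat, _fstat or _wstat. The sketch_ names are made up. */
struct sketch_variant {
    const char *suffix;
    int time_bits;   /* width of the timestamps in struct stat */
    int size_bits;   /* width of the filesize in struct stat */
};

static const struct sketch_variant sketch_variants[] = {
    { "32",    32, 32 },
    { "64",    64, 64 },
    { "64i32", 64, 32 },
    { "32i64", 32, 64 },
};

/* Suffix that the plain alias (_func) or the i64 alias (_funci64)
   resolves to, following the alias list from this mail. */
static const char *sketch_alias_suffix(int is_64bit_system, int is_i64_alias)
{
    if (is_64bit_system)
        return is_i64_alias ? "64" : "64i32";
    return is_i64_alias ? "32i64" : "32";
}

/* Look up a variant by suffix; returns NULL for an unknown suffix. */
static const struct sketch_variant *sketch_find(const char *suffix)
{
    size_t i;

    for (i = 0; i < sizeof(sketch_variants) / sizeof(sketch_variants[0]); i++) {
        if (strcmp(sketch_variants[i].suffix, suffix) == 0)
            return &sketch_variants[i];
    }
    return NULL;
}
```

For example, sketch_find(sketch_alias_suffix(1, 0)) returns the "64i32" row, which shows directly that the plain _func alias on 64-bit systems still carries a 32-bit filesize.
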
As msvcrt.dll's _stat() function has somewhat strange behavior with
trailing slashes, a mingw-w64 wrapper needs to be defined for POSIX
stat() purposes. But as there are 4 combinations of "-D_TIME_BITS=64"
and "-D_FILE_OFFSET_BITS=64", 4 variants of the POSIX stat function
need to be defined. Maybe call them similarly: stat32, stat64,
stat64i32 and stat32i64.

I think that providing any inline function for any stat* variant would
just cause a bigger mess. So I would propose to define 4 mingw-w64
wrappers (one for each variant) and in the header file just add a
proper #define stat based on the macro configuration.

Similarly, an fstat macro would need to be defined, but it should just
expand to the correct _fstat variant, as there is no need for a wrapper.
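
As a rough illustration of the path handling that each of the four wrappers could share, here is a minimal sketch in C. It is not the existing mingw-w64 code: every sketch_ name is hypothetical, the underlying _stat variant is replaced by a callback, and sketch_fake_stat is a stand-in that fails on any trailing slash, as described for MSVCRT's _stat() in the quoted message. Drive roots such as "C:\" and \\?\ device paths need extra rules that are deliberately left out.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Both separators are accepted in Windows paths. */
static int sketch_is_sep(char c)
{
    return c == '/' || c == '\\';
}

/* Length of path after dropping ALL trailing separators. The first
   character is never dropped, so "/" stays intact. */
static size_t sketch_trimmed_len(const char *path)
{
    size_t len = strlen(path);

    while (len > 1 && sketch_is_sep(path[len - 1]))
        len--;
    return len;
}

/* Stand-in for one underlying _stat variant (hypothetical signature):
   returns 0 on success and reports through *is_dir whether the path
   names a directory. */
typedef int (*sketch_stat_fn)(const char *path, int *is_dir);

/* Wrapper logic: "dir//" succeeds, "file/" fails with ENOTDIR. */
static int sketch_stat_wrapper(const char *path, sketch_stat_fn underlying,
                               int *is_dir)
{
    size_t full = strlen(path);
    size_t len = sketch_trimmed_len(path);
    char *copy;
    int ret;

    if (len == full)            /* no trailing separator, pass through */
        return underlying(path, is_dir);

    copy = malloc(len + 1);
    if (copy == NULL) {
        errno = ENOMEM;
        return -1;
    }
    memcpy(copy, path, len);
    copy[len] = '\0';
    ret = underlying(copy, is_dir);
    free(copy);

    /* A trailing separator is only valid for directories. */
    if (ret == 0 && !*is_dir) {
        errno = ENOTDIR;
        return -1;
    }
    return ret;
}

/* Fake backend used only to exercise the wrapper: it knows one
   directory "dir" and one file "file", and nothing else. */
static int sketch_fake_stat(const char *path, int *is_dir)
{
    if (strcmp(path, "dir") == 0) {
        *is_dir = 1;
        return 0;
    }
    if (strcmp(path, "file") == 0) {
        *is_dir = 0;
        return 0;
    }
    errno = ENOENT;
    return -1;
}
```

With a callback like this, the four wrappers would differ only in which underlying function and which struct layout they use, while the trailing-slash rules stay in one place.
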