On Thursday 27 February 2025 21:45:51 Lasse Collin wrote:
> On 2025-02-25 Pali Rohár wrote:
> > On Tuesday 25 February 2025 20:11:12 Lasse Collin wrote:
> > > MSVCRT's _stat() fails with "directory/", but in POSIX a trailing
> > > slash should work for directories (and only for directories).
> > > mingw-w64-crt/stdio/_stat.c provides stat() that, with some
> > > exceptions, removes *one* trailing / or \ and then calls _stat().
> > > This hack is often enough, but as a side effect it then makes
> > > "file/" succeed while it shouldn't. And "directory//" still fails
> > > while it shouldn't.
> > > 
> > > On UCRT, stat() simply calls _stat(). UCRT's _stat() has eerily
> > > similar behavior with trailing slashes as mingw-w64's replacement
> > > stat() has in MSVCRT builds. It's as if UCRT was made bug-for-bug
> > > compatible with mingw-w64 instead of thinking what is the correct
> > > but more complex thing to do.
> > > 
> > > MSVCRT's _stat() doesn't work on \\.\ or \\?\ paths. UCRT's does,
> > > but the buggy trailing slash removal means that only the last line
> > > below succeeds with UCRT's _stat():
> > > 
> > >     \\?\GLOBALROOT\Device\Harddisk0\Partition3
> > >     \\?\GLOBALROOT\Device\Harddisk0\Partition3\
> > >     \\?\GLOBALROOT\Device\Harddisk0\Partition3\\
> > > 
> > > MSVCRT's _stat() doesn't follow reparse points. UCRT's _stat() does
> > > (on symlink or junction loops it fails with ENOENT, not ELOOP).  
> > 
> > MS _stat() function can do whatever it wants. It is not part of C or
> > POSIX.
> 
> I agree.
> 
> > But mingw-w64's stat() function should be consistent and returns
> > correct information. What you have figured out is another mess. It is
> > not nice.
> 
> I apologize what I said about UCRT's _stat. It's fine. :-) It doesn't
> emulate any mingw-w64 bugs, and making stat == _stat is OK with UCRT.
> It's mingw-w64 that is broken.
> 
> I had tested on MSYS2 in MINGW32 and UCRT64 environments. In the
> MINGW32 environment, _USE_32BIT_TIME_T is defined by default while in
> UCRT64 it's not defined.
> 
> Earlier I had built a stat/_stat testing program with optimizations
> disabled. Building with optimizations enabled (-O2), the UCRT results
> changed for both _stat and stat. Now trailing slashes work the way they
> should.
> 
> <sys/stat.h> provides __CRT_INLINE ("extern inline
> __attribute__((__gnu_inline__))") wrapper for these functions:
> 
>     _fstat64i32
>     _stat64i32
>     fstat  (two versions depending on _USE_32BIT_TIME_T)
> 
> There are also two versions of stat but they have been commented out:
> 
>     /* Disable it for making sure trailing slash issue is fixed.  */
> 
> The above inline functions are provided only when optimizations are
> enabled. Even in optimized builds, compiler may decide to not use the
> inline functions and instead may emit calls to the external functions.
> It's essential that the inline implementation in the header and the
> implementation in libmingwex have the same behavior. They are the same
> for _fstat64i32.
> 
> _stat64i32 has a bug: The inline function doesn't remove a trailing
> slash but mingw-w64-crt/stdio/_stat64i32.c does. Thus the behavior of
> the function depends on compiler optimizations.
> 
> In MSYS2's MINGW32 environment, stat is defined to _stat32, so calling
> _stat jumps into MSVCRT without any libmingwex code. A call to stat
> goes to libmingwex's wrapper that removes one trailing slash. In this
> case everything works the way I think it's intended to work.
> 
> In the UCRT64 environment, _stat is defined to _stat64i32 in
> <_mingw_stat64.h>. UCRT builds of libmingwex include
> mingw-w64-crt/stdio/_stat64i32.c. This is wrong because UCRT's
> functions handle trailing slashes (MSVCRT's don't). This explains why I
> thought that UCRT had imitated mingw-w64's slash removal. Compiling
> with optimizations enabled made the compiler use the __CRT_INLINE
> version from <sys/stat.h>, skipping the slash removal version from
> libmingwex.
> 
> For UCRT builds, <sys/stat.h> uses __mingw_ovr to make stat an alias for
> _stat. Thus stat and _stat behave the same.
> 
> fstat __CRT_INLINE usage I didn't investigate much. At glance it looks
> suspicious because the definition of "struct stat" depends on
> time_t but there's only one non-inline fallback function. c0f06309823c
> ("crt: Provide fstat function symbol via alias") removed _fstat.c with a
> time_t related FIXME but left the inline functions in <sys/stat.h>. As
> the FIXME said, the old method was known to be incorrect.
> 
> The same FIXME is in _stat.c and _wstat.c still, but due to the
> __mingw_ovr in <sys/stat.h> in UCRT builds, I suspect that the code
> from _stat.c and _wstat.c is never called in UCRT builds. Thus the
> time_t FIXME in these two files only matters to MSVCRT builds. (I don't
> know if Autoconf detection of stat and wstat still needs a symbol in
> libmingwex because AC_CHECK_FUNC doesn't look at headers.)
> 
> -- 
> Lasse Collin

I did not want to open this discussion about stat as well, but since
you have opened it, I will write down some important details which need
to be taken into account, otherwise another breakage can happen.

For every underscored stat variant (_stat, _fstat, _wstat) there are 4
exported function symbols:

_func32    - struct stat argument has both timestamps and filesize as 32-bit
_func64    - struct stat argument has both timestamps and filesize as 64-bit
_func64i32 - struct stat argument has timestamps as 64-bit and filesize as 32-bit
_func32i64 - struct stat argument has timestamps as 32-bit and filesize as 64-bit
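
To make the difference between the four variants concrete, here is a
minimal sketch of the four struct shapes. The struct and field names
(sketch_stat32, sk_size, ...) are hypothetical and the non-varying
fields are only modelled loosely on struct _stat; the real definitions
live in the CRT headers. The only point is that the timestamp and
filesize widths vary independently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the four struct shapes. Only the widths of
   the timestamp and size fields differ between them. The other fields
   are an assumption, not a copy of the real headers. */
#define SKETCH_STAT_STRUCT(name, time_type, size_type) \
    struct name {                                      \
        uint32_t  sk_dev;                              \
        uint16_t  sk_ino;                              \
        uint16_t  sk_mode;                             \
        int16_t   sk_nlink;                            \
        int16_t   sk_uid;                              \
        int16_t   sk_gid;                              \
        uint32_t  sk_rdev;                             \
        size_type sk_size;                             \
        time_type sk_atime;                            \
        time_type sk_mtime;                            \
        time_type sk_ctime;                            \
    }

SKETCH_STAT_STRUCT(sketch_stat32,    int32_t, int32_t);
SKETCH_STAT_STRUCT(sketch_stat64,    int64_t, int64_t);
SKETCH_STAT_STRUCT(sketch_stat64i32, int64_t, int32_t);
SKETCH_STAT_STRUCT(sketch_stat32i64, int32_t, int64_t);

/* Width in bits of one field of one of the structs above. */
#define SKETCH_FIELD_BITS(name, field) \
    ((int)(sizeof(((struct name *)0)->field) * 8))

/* Compile-time check of the naming rule: first number is the timestamp
   width, the number after "i" is the filesize width. */
static_assert(sizeof(((struct sketch_stat64i32 *)0)->sk_atime) == 8,
              "64i32 has 64-bit timestamps");
static_assert(sizeof(((struct sketch_stat64i32 *)0)->sk_size) == 4,
              "64i32 has a 32-bit filesize");
```

Passing a pointer to one of these shapes to a function that expects
another one is exactly the kind of silent ABI mismatch that the aliases
and #defines described below can produce.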

And then there are exported symbol aliases:

for 32-bit systems:
_func = _func32
_funci64 = _func32i64

for 64-bit systems:
_func = _func64i32
_funci64 = _func64
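
The alias rules above can be written down as a small lookup. This is
only an illustrative sketch: the enum and the function name
sketch_resolve_alias are hypothetical, and the real aliases are set up
in the import libraries, not at run time.

```c
#include <assert.h>

/* The four exported variants, using the placeholder names from above. */
enum sketch_symbol {
    SKETCH_FUNC32,
    SKETCH_FUNC64,
    SKETCH_FUNC64I32,
    SKETCH_FUNC32I64
};

/* Resolves the _func alias (want_i64 == 0) or the _funci64 alias
   (want_i64 != 0) for a 32-bit or a 64-bit target. */
enum sketch_symbol sketch_resolve_alias(int pointer_bits, int want_i64)
{
    assert(pointer_bits == 32 || pointer_bits == 64);

    if (pointer_bits == 32)
        return want_i64 ? SKETCH_FUNC32I64 : SKETCH_FUNC32;

    return want_i64 ? SKETCH_FUNC64 : SKETCH_FUNC64I32;
}
```

For example, sketch_resolve_alias(64, 0) yields SKETCH_FUNC64I32, which
shows that the plain _func alias carries a 32-bit filesize even on a
64-bit system.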

So there are 6 exported function symbols which you can use without any
header file; you just need to provide the correct function declaration.
Every libmsvcr*.a import library for every platform in mingw-w64
provides these 6 symbols (some libmsvcr*.a import libraries do not
provide all of them if the corresponding msvcr*.dll does not have the
functions required for emulation).


Then there are the header files, which add a #define mess on top.

Depending on the _USE_32BIT_TIME_T macro, the header files provide:

#define _func
#define _funci64

which expand to one of the _func??? variants. And it does not have to
be the one which matches the _func and _funci64 symbol aliases.

What is interesting is that _func never expands to _func64, and hence
even on 64-bit systems _func is not fully 64-bit.


But let us leave this mess as it is; all of those are underscored
functions which MS has defined somehow, and it is a well-known API/ABI.

Now the interesting part is: how should the non-underscored POSIX
stat() and fstat() behave?

I think that the header file for POSIX stat() and fstat() should follow
the application's definition of _USE_32BIT_TIME_T or the widely used
"-D_TIME_BITS=64" and "-D_FILE_OFFSET_BITS=64" compile flags.

As msvcrt.dll's _stat() function has rather strange trailing slash
behavior, a mingw-w64 wrapper needs to be defined for POSIX stat()
purposes.
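
As an illustration of what such a wrapper has to do with the path
before calling _stat(), here is a minimal sketch. It is not the code
from mingw-w64-crt/stdio/_stat.c; the function name
sketch_strip_trailing_seps is hypothetical. Unlike the current hack it
removes all trailing separators, not just one, and it reports whether
anything was removed. It deliberately ignores \\?\ and \\.\ paths,
which would need separate handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int sketch_is_sep(char c)
{
    return c == '/' || c == '\\';
}

/* Copies path into out with all trailing separators removed, but keeps
   a lone root such as "/" or "C:\" intact.
   Returns 1 if at least one separator was removed, 0 if none was
   removed, and -1 if out is too small. */
int sketch_strip_trailing_seps(const char *path, char *out, size_t out_size)
{
    size_t len, keep, root = 0;

    assert(path != NULL && out != NULL);
    len = strlen(path);
    keep = len;

    /* Number of leading characters that must never be stripped. */
    if (len >= 3 && path[1] == ':' && sketch_is_sep(path[2]))
        root = 3;               /* drive root such as "C:\" */
    else if (len >= 1 && sketch_is_sep(path[0]))
        root = 1;               /* "/" or "\" */

    while (keep > root && sketch_is_sep(path[keep - 1]))
        keep--;

    if (keep + 1 > out_size)
        return -1;

    memcpy(out, path, keep);
    out[keep] = '\0';
    return keep != len;
}
```

When the function returns 1, the wrapper would call _stat() on the
stripped path and then fail with ENOTDIR unless the result is a
directory. That way "directory/" and "directory//" succeed, while
"file/" fails as POSIX requires.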

But as there are 4 combinations of "-D_TIME_BITS=64" and
"-D_FILE_OFFSET_BITS=64", 4 variants of the POSIX stat function need to
be defined. Maybe call them similarly: stat32, stat64, stat64i32 and
stat32i64.

I think that providing any inline function for any stat* variant would
just cause a bigger mess. So I would propose to define 4 mingw-w64
wrappers (one for each variant) and in the header file just add a
proper #define stat based on the macro configuration.
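
The selection that the header would have to make can be sketched as
follows. The enum and the function sketch_stat_variant are
hypothetical; a real header would do this with #if and #define stat
rather than at run time. The naming follows the proposal above: the
first number is the timestamp width, the number after "i" is the
filesize width.

```c
#include <assert.h>

/* The four proposed POSIX wrappers. */
enum sketch_stat_variant_id {
    SKETCH_STAT32,      /* 32-bit timestamps, 32-bit filesize */
    SKETCH_STAT64,      /* 64-bit timestamps, 64-bit filesize */
    SKETCH_STAT64I32,   /* 64-bit timestamps, 32-bit filesize */
    SKETCH_STAT32I64    /* 32-bit timestamps, 64-bit filesize */
};

/* Names that "#define stat ..." would expand to, indexed by the enum. */
static const char *const sketch_stat_names[] = {
    "stat32", "stat64", "stat64i32", "stat32i64"
};

/* Picks the wrapper for the effective time_t width and off_t width. */
enum sketch_stat_variant_id sketch_stat_variant(int time_bits,
                                                int file_offset_bits)
{
    assert(time_bits == 32 || time_bits == 64);
    assert(file_offset_bits == 32 || file_offset_bits == 64);

    if (time_bits == 64)
        return file_offset_bits == 64 ? SKETCH_STAT64 : SKETCH_STAT64I32;

    return file_offset_bits == 64 ? SKETCH_STAT32I64 : SKETCH_STAT32;
}
```

For example, an application built with only "-D_FILE_OFFSET_BITS=64"
on a target with 32-bit time_t would get
sketch_stat_variant(32, 64), that is stat32i64.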

Similarly, an fstat macro would need to be defined, but it should just
expand to the correct _fstat variant, as there is no need for a wrapper.


_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
