On Wednesday 12 March 2025 16:33:42 Lasse Collin wrote:
> On 2025-03-09 Pali Rohár wrote:
> > On Friday 07 March 2025 20:31:09 Lasse Collin wrote:
> > > The commit 45bf2cca702b from 2009 added memset calls to zero the
> > > stat buffer on failure. That seems unncessary, but the commit
> > > message says it fixed something with binutils/bfd. I see you
> > > removed the memset, so I hope it isn't needed anymore.  
> > 
> > I have tested various CRT DLL import libraries and various _stat*
> > functions from them, and all of them do not touch stat buffer on
> > error.
> > 
> > Hence I removed those memset calls, to have mingw-w64 emulation
> > functions to behave in the same way as the native crtdll/msvcrt/ucrt
> > underscore _stat* functions.
> > 
> > For me it looks strange if the _fstat64i32() emulation for libmsvcrt.a
> > zero fill the stat buffer and the native _fstat64i32() from
> > libmsvcr80.a or libucrt.a, or native _fstat() from libmsvcrt.a does
> > not do it.
> > 
> > Now I'm looking at 45bf2cca702b and I do not think it fixed something
> > for UCRT or msvcr80+, as those changes are only for pre-msvcr80 for
> > which is needed mingw-w64 emulation (which the commit changed).
> > 
> > What could it fix is probably non-underscored stat function? But that
> > it also suspicious. Because now I have tested that Linux stat()
> > function from glibc does not touch struct stat buffer too.
> > 
> > So for me it looks like that the change in 45bf2cca702b is very
> > suspicious as it change behavior in a way which is different from
> > _both_ Linux glibc and also MS crtdll/msvcrt/ucrt.
> 
> Thanks for testing! I agree that the change in the commit is weird. If
> it really fixed something, it merely worked around a bug somewhere
> outside mingw-w64. It's should be fine to remove the memset calls.
> 
> > > > Ok. I'm sending 10 patches in the attachment.  
> > > 
> > > Thanks! Makefile.am changes didn't apply to master, but the rest did
> > > cleanly. I think I can test the .c/.h file changes easily still.  
> > 
> > Maybe you need 3 other changes which I have sent to list? Anyway, I
> > these changes on top of 99518b7fd9d0 ("Add missing comparison
> > operators for Microsoft::WRL::ComPtr") with those other 3 small
> > changes.
> 
> Thanks! Your earlier patches were merged to master and then these stat
> patches applied cleanly.
> 
> I don't have knowledge to review the .def files, but I have looked at
> the C files. In many ways the code is better now. :-) I did some very
> basic testing with MSVCRT including the tweaks I made. Even though
> things worked for me, I'm worried about a few details.
> 
> (1)
> Does the __MINGW_ASM_CALL usage work with control flow integrity (CFI)
> on Windows? Consider this:
> 
>     // <sys/stat.h>
>     int __cdecl stat(const char *_Filename, struct stat *_Stat)
>         __MINGW_ASM_CALL(stat64i32);
> 
>     // stat64i32.c
>     int __cdecl stat64i32(const char *_Filename,
>                           struct _stat64i32 *_Stat)
> 
> stat.h has "struct stat *" and stat64i32.c has "struct _stat64i32 *".
> On Linux and "clang -fsanitize=function", this kind of argument type
> mismatch doesn't work if the function is called indirectly using a
> function pointer. It might not be common to take the address of stat,
> but it should still work. (I didn't test how this works on Windows,
> sorry.)

__MINGW_ASM_CALL is just a function symbol redirection. I used
__MINGW_ASM_CALL specially for making aliases of those functions as they
have same ABI.

Basically whatever function you call (on 64-bit) the result should be
exactly same and I think that also pointers for those functions should
be same.

Is clang really complaining?

We had already start to use __MINGW_ASM_CALL for symbol redirection
instead of conditional #ifdef or inline functions, because of C++23
exports and C++ inline mechanism (which differs from C). It is needed
that emitted symbol in every object file has same meaning. And so,
the __MINGW_ASM_CALL is doing it.

> If needed, this alone should be easy to fix. stat*.c files could define
> _TIME_BITS and _FILE_OFFSET_BITS to get the correct "struct stat"
> variant. The pointer could be cast to "struct _stat64i32 *" when calling
> _stat64i32. But the C standard and compilers might not like that:
> 
> (2)
> Casting a pointer to/from two structs breaks strict aliasing rules even
> if the structs have identical members. Because the casts are done at
> translation unit boundaries, it might work in practice as long as -flto
> isn't used. Casting likely is OK also if calling a CRT function (not a
> libmingwex wrapper) because -flto shouldn't be able to see what the CRT
> DLLs do.

If needed, we can compile particular mingw-w64 source file with
-fno-strict-aliasing parameter. Or we can define pointer type in
mingw-w64 source code with attribute which is saying that aliasing is
allowed. I do not know how is that attribute called, but I know that gcc
has such (and I hope that clang would implement it too).

> Otherwise I suspect that extra code is needed. For example, stat32.c
> could use memcpy to convert struct _stat32 to struct stat like
> mingw-w64-crt/stdio/_stat.c does in the current mingw-w64 code. These
> wrappers could set _TIME_BITS and _FILE_OFFSET_BITS to get the correct
> copy of struct stat.

I think that it is stupid to use memcpy for cases which are 1:1 ABI
binary compatible.

Also those aliases in def files are doing same thing. Just marking
symbols as are ABI binary compatible.

> Strict aliasing issues often feel silly (a simple thing becomes more
> complex and slower), but it seems to have been a concern in the past.
> For example, see the commits ae36a4bbc0e6 and b38f5085b758.
> 
> (3)
> fstat, stat, and wstat aren't declared in <sys/stat.h> when
> __CRT__NO_INLINE is defined, thus those declarations are missing if
> compiling without optimizations. I changed those to _CRTBLD, but this
> broke ftw.c (not ftw64.c). <ftw.h> uses "struct stat" and "struct
> stat64". I almost fixed the ftw issue, but I didn't finish because
> figuring out the stat redirects should be finished first (*if* they need
> to be changed).

I know about this issue and that is why I defined __CRT__NO_INLINE. As
this problem is already there, it could be fixed later.

> If both (1) and (2) are a concern, there might be a need for five ftw
> variants: four due to time_t/off_t combinations, and the fifth due to
> ftw64 using struct stat64. ftw/ftw64 take a function pointer argument,
> and that function takes a pointer to a struct stat/stat64 as an
> argument. The typename may need to match for CFI.
> 
> Maybe using macros to redirect stat, wstat, and fstat could be simpler
> after all.

This will break C++. We already had to change some code from macros to
__MINGW_ASM_CALL because header files generated incorrect code when
compiled by C++ compiler.

> Then the four ftw variants would be built with distinct
> struct stat32/64/32i64/64i32 types, and a fifth variant for
> ftw64-with-stat64 wouldn't be needed.
> 
> Using macros means that four statxxx structs (instead of two) would be
> needed in <sys/stat.h>. They would be identical to the four in
> <_mingw_stat64.h> except that the leading underscore would be omitted.
> 
> The current <sys/stat.h> in mingw-w64 uses macros to redirect stat and
> fstat for _FILE_OFFSET_BITS == 64, so such macros should be OK in terms
> of application compatibility.
> 
> (4)
> Always using 64-bit time in struct stat64 doesn't sound correct because
> it should still use time_t. But it has been this way for a very long
> time and matches _stat64, thus I agree it's best to not change it. In
> 2025, few POSIX apps should be using the LFS functions with the "64"
> suffix anyway.

This is MS ABI, so I would follow it. stat64 in MS ABI uses 64-bit
time_t, so I would rather not change it.

> (5)
> This isn't about your patches, but I looked for other broken uses of
> __CRT__NO_INLINE. The macro comes from 4a0f06032b17 (2009).
> 
> <sys/utime.h> uses the macro to avoid declaring the inline functions
> that redirect _utime/_futime/_wutime to 32/64-bit versions. See the
> commits cad9b5a68bdc and e217baf22f0a (both from 2009). Thus,
> _USE_32BIT_TIME_T is ignored if an application is built with
> optimizations disabled. Seems that a wrong function might be called in
> that case. I didn't try to fix this now.
> 
> I attached my edits. They apply on top of your patches. I included even
> the S_IFBLK change because if API/ABI is touched anyway, it's good to
> do all changes at once, but I understand that it's out of scope of
> these stat changes. Feel free to adapt/edit/squash as you prefer, and
> skip those that you don't like. Remember that my _CRTBLD change broke
> ftw.c, but let's figure out the stat function redirects first.
> 
> Unfortunately I don't have time to look at this further this week, but I
> should have time some day next week.
> 
> -- 
> Lasse Collin

I will include your first change directly into my patches as it is
basically fixup of one of my change.


_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public

Reply via email to