On 2025-01-14 Pali Rohár wrote:
> Well, warnings are warnings. They are always being added by new
> compiler versions, so I would not be afraid of adding also in new
> mingw-w64 version. And security "warning" for me sounds like a good
> idea.

OK, I agree. :-)

Remember that I'm not a MinGW-w64 developer. Don't put too much weight
on my opinions.

> > I have attached a draft patch (header bits are missing) and a demo
> > program. It has the above features so it's possible to think if the
> > extra code is worth it.  
> 
> I have already WIP something better, to handle all other parts, like
> not leaking wide global variables, properly initialize of narrow
> variables and fixing also direct usage of __getmainargs function by
> applications. All these parts are not handled in your draft. I need
> more time to finish it and do more tests.

Great! :-)

> > With GB18030, U+FFFD consumes four bytes. So with that charset, the
> > maximum possible byte count is larger.  
> 
> It is possible to change local process ucrt encoding to GB18030?

I suspect that it's not. UTF-8 might be the only locale code page that
isn't single-byte or double-byte. So reserving space for UTF-8 should
be enough. Even if some locale code page supports longer encodings
someday, the name has to be very long to hit the limit and result in
ENAMETOOLONG.

With UTF-8, even the current 255 bytes should almost never be a
problem. Increasing NAME_MAX ensures that apps can list names in
an unusual case too (but they still cannot list unpaired surrogates). On
the other hand, the larger NAME_MAX may cause new problems if an app
assumes that a filename always fits in MAX_PATH (260) bytes. The dirent
API is from POSIX, so one would hope that apps ported from POSIX handle
it well. I don't know if that is too optimistic.
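
To make the byte counts above concrete, here is a minimal sketch that
computes how many UTF-8 bytes a UTF-16 filename needs. It assumes the
usual Windows limit of 255 UTF-16 code units per name and counts an
unpaired surrogate as U+FFFD (three bytes in UTF-8). The demo_* names
are hypothetical and are not part of MinGW-w64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a filename has at most 255 UTF-16 code units. */
#define DEMO_MAX_UNITS 255

/*
 * Count the UTF-8 bytes needed for a UTF-16 string of len code units.
 * An unpaired surrogate is counted as U+FFFD, which is 3 bytes.
 */
static size_t demo_utf8_len(const uint16_t *s, size_t len)
{
    size_t bytes = 0;

    assert(s != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        uint16_t u = s[i];

        if (u < 0x80) {
            bytes += 1;             /* ASCII */
        } else if (u < 0x800) {
            bytes += 2;             /* U+0080..U+07FF */
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < len
                   && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            bytes += 4;             /* valid pair: 2 units -> 4 bytes */
            i++;                    /* skip the low surrogate */
        } else {
            bytes += 3;             /* rest of BMP, or lone surrogate */
        }
    }

    return bytes;
}

/* Worst case: every code unit is a 3-byte BMP character. */
static size_t demo_worst_case_utf8(void)
{
    return (size_t)DEMO_MAX_UNITS * 3;
}
```

Under these assumptions the worst case is 255 * 3 = 765 bytes, because
a surrogate pair uses two code units for four bytes and so costs less
per unit than a three-byte BMP character.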

> > > Maybe we would need type versioning? Like it was with time_t or
> > > fpos_t which based on the compile time macro expands either to
> > > old (32-bit) or new (64-bit type).  
> > 
> > If a third party header has
> > 
> >     #include <dirent.h>
> >     int foo(DIR *d);
> > 
> > it's not possible to know which version of the symbols were used
> > when the library was compiled. To do versioning with only header
> > macros, all participants have to co-operate. Ideally one doesn't
> > use this kind of data types in API/ABI at all.  
> 
> Yes, that is truth. But same thing is already being done for time_t
> types in both visual studio and mingw-w64 header files. There are some
> defaults and via #define you change the behavior.
> 
> Also same was used for a long time by UNIX LFS which changed in this
> way fpos_t and off_t types (plus redefined open, lseek and other
> functions).

Those are good points, thanks. I still fear it could be messy.

I see that sizeof(DIR) depends on _USE_32BIT_TIME_T because DIR
contains _finddata_t or _wfinddata_t. Luckily no one is supposed to
access that structure directly.
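
The size dependency can be illustrated with a small sketch. The structs
below are simplified, hypothetical stand-ins (demo_* names), not the
real MinGW-w64 definitions. They only show how switching the time
fields from 32 to 64 bits changes the size of the enclosing directory
structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for find data built with a 32-bit time_t. */
struct demo_finddata32 {
    unsigned attrib;
    int32_t  time_create, time_access, time_write;
    uint32_t size;
    char     name[260];
};

/* Same fields, but with a 64-bit time_t. */
struct demo_finddata64 {
    unsigned attrib;
    int64_t  time_create, time_access, time_write;
    uint32_t size;
    char     name[260];
};

/* Hypothetical DIR-like structs that embed the find data by value. */
struct demo_dir32 {
    struct demo_finddata32 dd_dta;
    intptr_t dd_handle;
};

struct demo_dir64 {
    struct demo_finddata64 dd_dta;
    intptr_t dd_handle;
};

/* The embedded time fields make the two variants differ in size. */
_Static_assert(sizeof(struct demo_dir64) > sizeof(struct demo_dir32),
               "64-bit time fields enlarge the directory struct");

/* Return 1 if the two variants have different sizes, 0 otherwise. */
static int demo_dir_size_differs(void)
{
    return sizeof(struct demo_dir64) != sizeof(struct demo_dir32);
}
```

Code that only passes the pointer around is not affected by the size
difference. Code that allocates or copies the structure itself would
be.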

I will send a few dirent patches. I played around with flags that
re-enable best-fit mapping or disallow filenames over 255 bytes (if
someone needs those for compatibility reasons). Those won't be in the
first version.

  - Having more than one flag might make the API too fancy in the same
    sense as I commented about command line handling features. The
    main problem isn't the few lines of extra code in dirent.c, it's
    that few would use the extra features and that the risk of
    incorrect use increases.

  - A global variable works for _dowildcard but it's problematic for
    dirent because a library might want to set it too. A DLL and
    application would have their own flags which would work, but if a
    library is built statically then the same variable could be defined
    twice and cause a linker error. Or if the flag variable is only
    defined in the static library, it would affect the unsuspecting
    application too.

  - <dirent.h> could have _opendir_lossy(const char *). Then one could
    have something like:

        #ifdef _DIRENT_LOSSY
        #   define opendir _opendir_lossy
        #endif

    Apps could then #define _DIRENT_LOSSY and the code would be *source
    compatible* with both old and new MinGW-w64. If an application
    itself does "#define opendir _opendir_lossy" then the code would
    only compile with new MinGW-w64.

  - It's easy to add a d_lossy flag to struct dirent to mark which names
    weren't properly converted. But again, it could be too fancy.
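
The macro redirection idea can be tried out in a portable sketch. All
demo_* names and DEMO_DIRENT_LOSSY below are hypothetical stand-ins for
opendir, _opendir_lossy and _DIRENT_LOSSY. The stub functions only
record which variant was called and do not touch the file system.

```c
#include <assert.h>
#include <stddef.h>

/* Application side: opt in before the "header" part is seen. */
#define DEMO_DIRENT_LOSSY

/* "Library" side: record which entry point was used. */
static int demo_last_mode = -1;     /* 0 = strict, 1 = lossy */

static int demo_opendir(const char *path)
{
    assert(path != NULL);
    demo_last_mode = 0;
    return 0;
}

static int demo_opendir_lossy(const char *path)
{
    assert(path != NULL);
    demo_last_mode = 1;
    return 0;
}

/* "Header" side: redirect the plain name only if the app asked. */
#ifdef DEMO_DIRENT_LOSSY
#   define demo_opendir demo_opendir_lossy
#endif

/* Application code keeps calling the plain name. */
static int demo_list(const char *path)
{
    return demo_opendir(path);
}
```

With DEMO_DIRENT_LOSSY defined, demo_list() ends up in the lossy
variant. Without it, the same source calls the strict variant. A header
that doesn't know the macro would ignore it, which is what keeps the
application source compatible with both old and new headers.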

-- 
Lasse Collin


_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
