On 24/02/2021 at 11:58, Bart via lazarus wrote:

Hello,

In my code Unicode compatibility is not 100% when using the
"CaseInsensitive" mode, as it lowercases both the mask and the string
to perform the test, which is wrong by definition.

Currently Masks unit does the same.

Yes, but for example, in my case I cannot successfully match the mask "ä*" against the string "Ä*", because "Ä" is not lowercased to "ä" (Windows 7).
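
To illustrate the point, here is a minimal sketch in Python (used only for illustration, this is not the Masks unit). It assumes a lowercase routine that only handles A-Z, which is one way the behaviour I see could arise; ascii_lower and the file name are hypothetical. It also shows why comparing lowercased strings is wrong in general, while Unicode case folding is not.

```python
def ascii_lower(s: str) -> str:
    """Lowercase only A-Z and leave every non-ASCII character untouched.

    Hypothetical stand-in for a lowercase routine that is not Unicode aware.
    """
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


mask = "\u00e4*"            # "ä*"
name = "\u00c4pfel.txt"     # "Äpfel.txt", a made-up file name

# The mask is a literal prefix followed by "*", so a prefix test is enough here.
prefix = mask[:-1]

# ASCII-only lowercasing leaves "Ä" as it is, so the prefix test fails.
ascii_match = ascii_lower(name).startswith(ascii_lower(prefix))

# Unicode case folding maps "Ä" to "ä", so the prefix test succeeds.
folded_match = name.casefold().startswith(prefix.casefold())

# Lowercasing is not a reliable caseless comparison even when it is Unicode
# aware: "ß" has no single-character uppercase, so lower() keeps the strings
# different while casefold() makes them equal.
lower_equal = "Stra\u00dfe".lower() == "STRASSE".lower()
fold_equal = "Stra\u00dfe".casefold() == "STRASSE".casefold()

print(ascii_match, folded_match, lower_equal, fold_equal)
```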

Sometimes I wish we would migrate to using UnicodeString by default.
It would make life a bit easier.
(And yes I know you would have to deal with composed characters
(grapheme defined by more than 1 16-bit word)).

That's a can of worms! UTF-8 forces you to write "correct code" (or at least to try) for any character >127. With UnicodeString you get the false appearance that everything magically works, until everything cracks when a string with surrogate pairs comes into play :-) and then ALL your text handling must be revised, and most of it completely rewritten.
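
A small sketch of what I mean, again in Python just for illustration (the helper names are mine). It counts UTF-8 bytes and UTF-16 code units for a character inside the BMP and for one outside it; the second needs a surrogate pair, so "one 16-bit word per character" silently stops being true.

```python
def utf8_bytes(s: str) -> int:
    """Number of bytes the string occupies in UTF-8."""
    return len(s.encode("utf-8"))


def utf16_units(s: str) -> int:
    """Number of 16-bit code units the string occupies in UTF-16."""
    return len(s.encode("utf-16-le")) // 2


bmp_char = "\u00e4"         # "ä", inside the Basic Multilingual Plane
astral_char = "\U0001F600"  # an emoji, outside the BMP

# "ä" already needs 2 bytes in UTF-8, so UTF-8 code hits the multi-unit
# case with very ordinary text.
print(utf8_bytes(bmp_char), utf16_units(bmp_char))

# The emoji needs 4 bytes in UTF-8 and 2 code units (a surrogate pair) in
# UTF-16, so indexing by 16-bit word no longer means indexing by character.
print(utf8_bytes(astral_char), utf16_units(astral_char))

# The first UTF-16 code unit of the pair lies in the high-surrogate range.
high = int.from_bytes(astral_char.encode("utf-16-le")[:2], "little")
is_high_surrogate = 0xD800 <= high <= 0xDBFF
```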

There are no tests for MatchesWindowsMask() yet.
I tested that extensively on my machine with all scenarios I could think of.
But others most likely can think of scenarios I did not test.
It was based on current behaviour of Windows NT platform (Win7 at the
time to be precise).
Who defines which are right and which are wrong ?
Well, I did ;-)
(Nobody else bothered at the time, and nobody complained either.)

And most likely nobody will, as almost everything matches the behaviour a user expects, like the typical "*.txt", but there are some unsupported cases, such as:

Filename:='test.txt'
Mask:='test??.txt?'
Match must be true
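
A rough sketch of why this should match, written in Python for illustration and not taken from my unit or from Masks. The assumption is that a run of "?" at the end of the name part or at the end of the mask matches between 0 and N characters other than "."; windows_mask_to_regex and matches are hypothetical helper names.

```python
import re


def windows_mask_to_regex(mask: str) -> str:
    """Translate a mask to a regex, treating trailing '?' runs as optional.

    Assumption: a run of '?' directly before a '.' or at the very end of the
    mask matches 0..N characters other than '.'; elsewhere each '?' matches
    exactly one character.
    """
    out = []
    i = 0
    while i < len(mask):
        ch = mask[i]
        if ch == "?":
            j = i
            while j < len(mask) and mask[j] == "?":
                j += 1
            run = j - i
            if j == len(mask) or mask[j] == ".":
                out.append("[^.]{0,%d}" % run)   # optional characters
            else:
                out.append(".{%d}" % run)        # exactly `run` characters
            i = j
        elif ch == "*":
            out.append(".*")
            i += 1
        else:
            out.append(re.escape(ch))
            i += 1
    return "".join(out)


def matches(mask: str, filename: str) -> bool:
    """Case-sensitive full match of filename against the translated mask."""
    pattern = windows_mask_to_regex(mask)
    return re.fullmatch(pattern, filename, re.DOTALL) is not None


print(matches("test??.txt?", "test.txt"))     # the case from this mail
print(matches("test??.txt?", "test123.txt"))  # three extra chars, only two allowed
```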

This is the documentation from my code about Windows matching; quirks can be enabled or disabled for compatibility:

----------------8<----------------------------8<---------------------
A Windows mask works differently from a regular mask: it has many quirks and corner cases inherited from CP/M, then adapted to DOS (8.3) filenames and adapted again for long file names.

        Anyth?ng.abc    = "?" matches exactly 1 char
        Anyth*ng.abc    = "*" matches 0 or more chars

        ------- Quirks -------

        --eWindowsQuirk_AnyExtension
          Anything*.*     = ".*" is removed.

        --eWindowsQuirk_FilenameEnd
          Anything??.abc  = "?" matches 1 or 0 chars (except '.')
                         (Not the same as "Anything*.abc", but the same
                          as regex "Anything.{0,2}\.abc")
                          Internally converted to "Anything[??].abc"

        --eWindowsQuirk_Extension3More
          Anything.abc    = Matches "Anything.abc" but also
                           "Anything.abc*" (3 char extension)
          Anything.ab     = Matches "Anything.ab" and never
                           "anything.abcd"

        --eWindowsQuirk_EmptyIsAny
          ""              = Empty string matches anything "*"

        --eWindowsQuirk_AllByExtension (Not in use anymore)
          .abc            = Runs as "*.abc"

        --eWindowsQuirk_NoExtension
          Anything*.      = Matches "Anything*" without extension

----------------8<----------------------------8<---------------------
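
Several of the quirks above can be seen as a rewrite of the mask before ordinary matching. The following Python sketch is only my illustration of that idea under simplifying assumptions, not the actual implementation: apply_quirks and matches are hypothetical names, all quirks are always enabled, eWindowsQuirk_FilenameEnd and eWindowsQuirk_AllByExtension are left out, and fnmatchcase is used as the "regular" matcher (it treats "[" as a set, so masks containing "[" are not covered).

```python
from fnmatch import fnmatchcase
from typing import Tuple


def apply_quirks(mask: str) -> Tuple[str, bool]:
    """Return (rewritten mask, filename_must_have_no_extension)."""
    if mask == "":                      # EmptyIsAny: "" behaves like "*"
        return "*", False
    if mask.endswith("*.*"):            # AnyExtension: trailing ".*" removed
        return mask[:-2], False
    if mask.endswith("."):              # NoExtension: only names without "."
        return mask[:-1], True
    _, dot, ext = mask.rpartition(".")
    if dot and len(ext) == 3 and not any(c in "*?" for c in ext):
        return mask + "*", False        # Extension3More: ".abc" also ".abc*"
    return mask, False


def matches(mask: str, filename: str) -> bool:
    """Case-sensitive match after applying the quirk rewrites."""
    rewritten, no_extension = apply_quirks(mask)
    if no_extension and "." in filename:
        return False
    return fnmatchcase(filename, rewritten)


print(apply_quirks("Anything*.*"))               # mask with ".*" removed
print(matches("Anything.abc", "Anything.abcd"))  # 3 char extension quirk
print(matches("Anything.ab", "Anything.abcd"))   # 2 char extension, no quirk
```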

--
_______________________________________________
lazarus mailing list
lazarus@lists.lazarus-ide.org
https://lists.lazarus-ide.org/listinfo/lazarus
