On 24/02/2021 at 11:58, Bart via lazarus wrote:
Hello,
In my code there is not 100% Unicode compatibility when using the
"CaseInsensitive" mode, as it lowercases both the mask and the string
to perform the test, which is wrong by definition.
Currently the Masks unit does the same.
Yes, but for example, in my case I cannot successfully match the mask
"ä*" against the string "Ä*" because "Ä" is not lowercased to "ä"
(Windows 7).
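
To make the failure concrete, here is a minimal sketch in Python (not Pascal, only to keep it short and runnable). It assumes the lowercasing step only touches ASCII letters in UTF-8 data, which is one plausible way to get this result; the thread does not say what the real cause on Windows 7 is. The helper ascii_lower_utf8 is hypothetical.

```python
def ascii_lower_utf8(data: bytes) -> bytes:
    """Lowercase only the ASCII letters A..Z; bytes >= 0x80 are left untouched.

    Assumption: this mimics an ASCII-only lowercase applied to UTF-8 text.
    """
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in data)


mask = "\u00e4*"   # "ä*"
name = "\u00c4*"   # "Ä*"

# Byte-level comparison: "Ä" (C3 84) never becomes "ä" (C3 A4).
byte_level_equal = (
    ascii_lower_utf8(name.encode("utf-8")) == ascii_lower_utf8(mask.encode("utf-8"))
)

# Unicode-aware comparison: case folding maps "Ä" to "ä".
unicode_equal = name.casefold() == mask.casefold()
```

With these inputs the byte-level comparison fails while the Unicode-aware one succeeds, which is the behaviour described above.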
Sometimes I wish we would migrate to using UnicodeString by default.
It would make life a bit easier.
(And yes I know you would have to deal with composed characters
(grapheme defined by more than 1 16-bit word)).
That's a can of worms! UTF8 forces you to write "correct code" (or at
least to try) for any character >127. With UnicodeString you get the
false impression that everything magically works, until everything
breaks when a string with surrogate pairs comes into play :-) and ALL
your text handling must be rewritten, most of it completely.
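
A small illustration of the surrogate pair problem, again sketched in Python rather than Pascal. It only counts code units; the helper names utf16_units and utf8_bytes are hypothetical and not part of any Lazarus unit.

```python
def utf16_units(s: str) -> int:
    """Number of 16-bit code units needed to store s as UTF-16."""
    return len(s.encode("utf-16-le")) // 2


def utf8_bytes(s: str) -> int:
    """Number of bytes needed to store s as UTF-8."""
    return len(s.encode("utf-8"))


# "a", "ä" (U+00E4) and an emoji outside the BMP (U+1F600).
text = "a\u00e4\U0001F600"

code_points = len(text)          # 3 code points
units_16 = utf16_units(text)     # 4: the emoji needs a surrogate pair
units_8 = utf8_bytes(text)       # 7: 1 + 2 + 4 bytes

# The first UTF-16 code unit of the emoji is a high surrogate (D800..DBFF).
first_unit = int.from_bytes("\U0001F600".encode("utf-16-le")[:2], "little")
is_high_surrogate = 0xD800 <= first_unit <= 0xDBFF
```

Code that indexes a UTF-16 string by 16-bit word works for "a" and "ä" and only goes wrong on the third character, which is why the problem tends to stay hidden.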
There are no tests for MatchesWindowsMask() yet.
I tested that extensively on my machine with all scenarios I could think of.
But others most likely can think of scenarios I did not test.
It was based on the then-current behaviour of the Windows NT platform
(Win7 at the time, to be precise).
Who defines which are right and which are wrong ?
Well, I did ;-)
(Nobody else bothered at the time, and nobody complained either.)
And most likely nobody will, as almost everything matches the behaviour
a user expects, like the typical "*.txt", but there are some
unsupported cases like:
Filename:='test.txt'
Mask:='test??.txt?'
Match must be true
This is the documentation from my code about Windows matching. Quirks
can be enabled or disabled for compatibility:
----------------8<----------------------------8<---------------------
Windows mask works in a different mode than regular mask, it has too
many quirks and corner cases inherited from CP/M, then adapted to DOS
(8.3) filenames and adapted again for long file names.
Anyth?ng.abc = "?" matches exactly 1 char
Anyth*ng.abc = "*" matches 0 or more chars
------- Quirks -------
--eWindowsQuirk_AnyExtension
    Anything*.*     = ".*" is removed.
--eWindowsQuirk_FilenameEnd
    Anything??.abc  = "?" matches 1 or 0 chars (except '.')
                      (Not the same as "Anything*.abc", but the same
                      as regex "Anything.{0,2}\.abc")
                      Internally converted to "Anything[??].abc"
--eWindowsQuirk_Extension3More
    Anything.abc    = Matches "Anything.abc" but also
                      "Anything.abc*" (3 char extension)
    Anything.ab     = Matches "Anything.ab" and never
                      "anything.abcd"
--eWindowsQuirk_EmptyIsAny
    ""              = Empty string matches anything "*"
--eWindowsQuirk_AllByExtension (Not in use anymore)
    .abc            = Runs as "*.abc"
--eWindowsQuirk_NoExtension
    Anything*.      = Matches "Anything*" without extension
----------------8<----------------------------8<---------------------
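
The FilenameEnd quirk above can be sketched as a mask-to-regex translation. This is a rough Python illustration, not the actual Pascal implementation. It assumes that a run of "?" placed right before a "." or at the very end of the mask matches 0 to N characters other than ".", and that every other "?" matches exactly one character. The other quirks and case insensitivity are left out, and the names mask_to_regex and matches are hypothetical.

```python
import re


def mask_to_regex(mask: str) -> str:
    """Translate a mask to a regex, applying only the FilenameEnd quirk."""
    out = []
    i = 0
    while i < len(mask):
        ch = mask[i]
        if ch == "*":
            out.append(".*")                 # 0 or more of any char
            i += 1
        elif ch == "?":
            j = i
            while j < len(mask) and mask[j] == "?":
                j += 1                       # measure the run of "?"
            run = j - i
            if j == len(mask) or mask[j] == ".":
                # Quirk (assumed reading): 0..run chars, none of them "."
                out.append("[^.]{0,%d}" % run)
            else:
                out.append("." * run)        # exactly one char per "?"
            i = j
        else:
            out.append(re.escape(ch))        # literal character
            i += 1
    return "".join(out)


def matches(mask: str, filename: str) -> bool:
    """True if the whole filename matches the translated mask."""
    return re.fullmatch(mask_to_regex(mask), filename) is not None
```

Under these assumptions matches("test??.txt?", "test.txt") is true, as the case mentioned earlier requires, while matches("Anything??.abc", "Anything123.abc") is false because three characters exceed the two "?".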
_______________________________________________
lazarus mailing list
lazarus@lists.lazarus-ide.org
https://lists.lazarus-ide.org/listinfo/lazarus