On Fri, Feb 26, 2021 at 10:24 PM Bart via lazarus <
lazarus@lists.lazarus-ide.org> wrote:
> On Fri, Feb 26, 2021 at 7:15 PM Bart wrote:
> My backup and some related programs still compile, but instantly raise
> an exception when they start to perform their main task.
> Thank you very much.
>
> The normal way of doing this is:
> Deprecate the function in question, but do NOT kill its functionality.
> Add a useful deprecated message.
> Remove the function in the next major release (deprecate in 2.1, and
> so 2.2, only remove in 2.3, so it'll be gone in 2.4).
> Simply removing functionality like you have done now will alienate
> users from Lazarus, since apparently "we" cannot be trusted.
>
> Juha: you seem to be obsessed with speeding up string handling code.
> This is not really a problem as long as you are not deaf to arguments
> against your changes.
> You introduce new bugs, remove old features, all for the sake of speed.
> All that when, in my perception, this code is mostly used in
> conjunction with file IO, which is orders of magnitude slower than
> even sloppy string handling.
>
True, it created more conflicts than I anticipated.
I reverted the new TMask in r64675. It must be worked on later in trunk.
The 2.2 fork will happen in a few weeks.
Sorry for the hassle.
José and others, you can see my adaptation of your code in Lazarus trunk
just before the revert, e.g. r64674.
I also attach the unit here.
I fixed the case-insensitive Unicode match by simply replacing LowerCase()
with UTF8LowerCase(). It is a well-optimized function.
First I planned to use UTF8CompareLatinTextFast() but it did not fit here.
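
For illustration only, here is a minimal sketch of the difference between the two calls. The program name and the sample string are made up for this example and are not taken from the unit. It assumes LazUtils is on the unit search path and the source file is saved as UTF-8.

```pascal
program LowerCaseDemo; // hypothetical demo program, not part of LazUtils
{$mode objfpc}{$H+}
uses
  SysUtils, LazUTF8;
var
  S: string;
begin
  S := 'ÄBC';                // sample text, stored as UTF-8 bytes
  WriteLn(LowerCase(S));     // SysUtils.LowerCase folds only ASCII A..Z
  WriteLn(UTF8LowerCase(S)); // LazUTF8 version also folds non-ASCII letters
end.
```

What the console actually displays depends on its code page, so compare the strings in code rather than by eye when in doubt.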
There is a unit test project in components/lazutils/test/.
The code passes all tests there!
Unicode is fully supported also in mask ranges.
Let's continue the integration later.
Regards,
Juha
{
*
This file is part of LazUtils.
See the file COPYING.modifiedLGPL.txt, included in this distribution,
for details about the license.
*
Match text using wildcards and sets.
Current version is from José Mejuto. When porting to LazUtils,
functions from LazUTF8 were used for full Unicode support.
}
unit Masks;
{$mode objfpc}{$H+}
// RANGES_AUTOREVERSE
// Reverse ranges if needed, so range "[z-a]" is interpreted as "[a-z]"
{$DEFINE RANGES_AUTOREVERSE}
{$DEFINE USE_INLINE}
interface
uses
Classes, SysUtils, Contnrs, LazUtilsStrConsts, LazUTF8;
type
{ EMaskError }
EMaskError=class(EConvertError)
public
type
TMaskExceptionCode=(eMaskException_InternalError,
eMaskException_InvalidCharMask,
eMaskException_MissingClose,
eMaskException_IncompleteMask,
eMaskException_InvalidEscapeChar,
eMaskException_InvalidUTF8Sequence
);
protected
cCode: TMaskExceptionCode;
public
constructor Create(const msg: string; aCode: TMaskExceptionCode);
constructor CreateFmt(const msg: string; args: array of const; aCode: TMaskExceptionCode);
property Code: TMaskExceptionCode read cCode;
end;
{ TMaskBase }
TMaskBase = class
private
procedure SetMaskEscapeChar(AValue: Char);
protected
type
// Literal = It must match
// Range = Match any char in the range
// Negate = Negate match in a group
// AnyChar = It matches any char, but one must match
// AnyCharOrNone = Matches one char or none (only in a group)
// AnyCharToNext = Matches any number of chars; on failure, retry from the
// next position until the mask or the matched string is exhausted
// OptionalChar = Optional char
// CharsGroupBegin = Begin optional chars or ranges "["
// CharsGroupEnd = End optional chars or ranges "]"
TMaskOpCode = (
Literal=0,
Range=1,
Negate=2,
AnyChar=3,
AnyCharOrNone=4,
AnyCharToNext=5,
OptionalChar=6,
CharsGroupBegin=10,
CharsGroupEnd=11
);
TMaskOpcodesEnum=(eMaskOpcodeAnyChar,
eMaskOpcodeAnyCharOrNone,
eMaskOpcodeAnyText,
eMaskOpcodeRange,
eMaskOpcodeOptionalChar,
eMaskOpcodeNegateGroup,
eMaskOpcodeEscapeChar);
TMaskOpcodesSet=set of TMaskOpcodesEnum;
TMaskFailCause = (
Success = 0,
MatchStringExhausted = 1,
MaskExhausted = 2,
MaskNotMatch = 3,
UnexpectedEnd = 4
);
(*
Windows mask works in a different mode than regular mask, it has too many
quirks and corner cases inherited from CP/M, then adapted to DOS (8.3) file
names and adapted again for long file names.
Anyth?ng.abc= "?" matches exactly 1