Iterating over a string is for the purpose of doing something with each
individual character, whether it is an 'A' or an 'A' with a ^ (circumflex) on
top of it. When I said the number of bytes in a character varies, I did not
mean the number of bytes in a Char; I meant the total
John,
I think you are confusing Canonical Normalized versions of the same Unicode
string (in the example s1 is canonical, s2 is normalized) and the effect of
local codepage conversion.
The Windows-1252 codepage (Western European, based on ISO 8859-1) supports
characters like ö (code 246, which is outside the 7-bit ASCII range) and é
Yep, and for the record I think this is a big problem with the way Embarcadero
implemented Unicode.
By
John, the problem is that in Unicode a single character is meaningless unless
you have performed some pre-processing to GIVE that term some meaning. There
are some standard forms for such processing, called normalisations.
The problem is that a single character to your eyes, e.g. an accented a,
As people know, I always recommend IconWorkshop for icons and think it
is the ants pants. Tried Gimp (no thanks), IcoFx is ok for a free
product.
Anyway, it is on special at the moment, almost half off. It is a
lifetime license as well.
It is an offer via SWREG so you can't just go to the
Hi John
You can find out whether a Unicode string is inside the BMP by
converting it to UTF-32 and checking that the new string is twice the
length in bytes of the original (UTF-16) string, i.e. that it has the
same number of code units.
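
An equivalent check can be made without converting at all, by looking for UTF-16 surrogate code units: a string that contains none lies entirely inside the BMP. The following is only a minimal sketch, assuming a Unicode-enabled Delphi (2009 or later); IsBmpOnly is a hypothetical helper name, not an RTL function.

```delphi
program BmpCheckDemo;

{$APPTYPE CONSOLE}

// Hypothetical helper: True when S contains no UTF-16 surrogate code units,
// i.e. every character in S lies inside the Basic Multilingual Plane.
function IsBmpOnly(const S: UnicodeString): Boolean;
var
  C: WideChar;
begin
  Result := True;
  for C in S do
    // Surrogates occupy the range U+D800..U+DFFF.
    if (Ord(C) >= $D800) and (Ord(C) <= $DFFF) then
    begin
      Result := False;
      Exit;
    end;
end;

begin
  Writeln(IsBmpOnly('A' + #$00F6));   // 'A' and o-umlaut, both in the BMP
  Writeln(IsBmpOnly(#$D834#$DD1E));   // U+1D11E encoded as a surrogate pair
end.
```

Scanning for surrogates avoids allocating a second copy of the string, which may matter for large inputs.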
A user could specifically choose to enter that character in either form -
this is unlikely, yes. Or, two users
I read in one of the references that UTF-32 was a more common standard on
Unix systems - which means I guess they have chosen the simplest format at
the trade-off of using more space?
I think Linux/Windows/MacOS use UTF-16 more commonly...
Anyway, for the time being, as long as the data in
strings is Unicode but is still Latin 8859 (i.e. ASCII
characters), I can iterate over a string one character
at a time without worrying too much, using Length.
Yep. But you are building an app that now supports Unicode.
If your users are
It's a shame UTF-8 wasn't made the standard in Delphi. It's commonly used in
audio file tags, for example, which I have to deal with.
My software needs to search for songs with specific artists or titles, and it
sounds like I'm going to have problems where the information is visually the
same
You should be fine - you just have to ensure you normalise the strings.
You're going to have to convert from UTF-8 to UTF-16 to bring them in to your
Delphi app anyway, for processing, so you may as well normalise them in the
process.
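
As an illustration of that normalisation step, here is a minimal sketch that composes two visually identical strings to Normalization Form C before comparing them. It assumes Windows Vista or later and a Unicode-enabled Delphi; NormalizeString is the Windows API exported by Normaliz.dll, declared locally here rather than relying on a particular RTL unit, and ToNFC is a hypothetical helper name.

```delphi
program NormalizeDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  // Value of NormalizationC in the Windows NORM_FORM enumeration.
  NormalizationC = 1;

// Local declaration of the Windows API (an assumption of this sketch is
// that the RTL in use does not already provide one).
function NormalizeString(NormForm: Integer; lpSrcString: PWideChar;
  cwSrcLength: Integer; lpDstString: PWideChar;
  cwDstLength: Integer): Integer; stdcall; external 'Normaliz.dll';

// Hypothetical helper: returns S composed to Normalization Form C.
function ToNFC(const S: UnicodeString): UnicodeString;
var
  Len: Integer;
begin
  Result := '';
  if S = '' then
    Exit;
  // A first call with a nil buffer asks for an estimated buffer size.
  Len := NormalizeString(NormalizationC, PWideChar(S), Length(S), nil, 0);
  if Len <= 0 then
    RaiseLastOSError;
  SetLength(Result, Len);
  // The second call converts and returns the actual length written.
  Len := NormalizeString(NormalizationC, PWideChar(S), Length(S),
    PWideChar(Result), Len);
  if Len <= 0 then
    RaiseLastOSError;
  SetLength(Result, Len);
end;

var
  Precomposed, Decomposed: UnicodeString;
begin
  Precomposed := #$00E9;        // e-acute as a single code point
  Decomposed := 'e' + #$0301;   // 'e' followed by a combining acute accent
  Writeln(Precomposed = Decomposed);                // raw comparison
  Writeln(ToNFC(Precomposed) = ToNFC(Decomposed));  // comparison after NFC
end.
```

The raw comparison should report the strings as different and the normalised comparison as equal. For searching artist or title tags, the same helper would be applied to both the stored value and the search term.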
UTF-16 was chosen in Delphi because it is also the native
Hi Jolyon
I spotted that they fixed that a while ago -- I remember having to fix
the issue myself many years ago so was quite pleased to see that it
was now taken care of in TInterfacedObject as a matter of course. For
some reason I never noticed the omission of the same facility in the
Actually, this would be better
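function TamObject._Release: Integer;
begin
  Result := InterlockedDecrement(FCount);
  // Test the value returned by the decrement, not the field, which another
  // thread may have changed in the meantime.
  if (Result = 0) then
  begin
    // Add a reference count in case an interface is acquired and
    // released during destruction.
    InterlockedIncrement(FCount);
    self.Destroy;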
Yep - I remember that my fix was to set the destructing state indicator in
a BeforeDestruction() override. This was then tested in _Release() to
render it a NO-OP during execution of the destructor chain (incomplete,
obviously, just to give the idea):
Procedure BeforeDestruction;
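
For readers who want the shape of that guard spelled out, here is a minimal sketch, not the poster's actual framework code. It assumes a desktop Delphi compiler; TGuardedObject, FDestroying and DestroyCount are hypothetical names introduced only for illustration.

```delphi
program ReleaseGuardDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical reference-counted base class with a destruction guard.
  TGuardedObject = class(TObject, IInterface)
  private
    FRefCount: Integer;
    FDestroying: Boolean;
  protected
    function QueryInterface(const IID: TGUID; out Obj): HResult; stdcall;
    function _AddRef: Integer; stdcall;
    function _Release: Integer; stdcall;
  public
    procedure BeforeDestruction; override;
    destructor Destroy; override;
  end;

var
  DestroyCount: Integer = 0;  // counts how often the destructor body runs

function TGuardedObject.QueryInterface(const IID: TGUID; out Obj): HResult;
begin
  if GetInterface(IID, Obj) then
    Result := S_OK
  else
    Result := E_NOINTERFACE;
end;

function TGuardedObject._AddRef: Integer;
begin
  Result := InterlockedIncrement(FRefCount);
end;

function TGuardedObject._Release: Integer;
begin
  Result := InterlockedDecrement(FRefCount);
  // Skip destruction while the destructor chain is already running.
  if (Result = 0) and not FDestroying then
    Destroy;
end;

procedure TGuardedObject.BeforeDestruction;
begin
  FDestroying := True;  // set before any destructor code executes
  inherited;
end;

destructor TGuardedObject.Destroy;
var
  Intf: IInterface;
begin
  // Acquiring and dropping a reference here would re-enter Destroy
  // if _Release did not check FDestroying.
  Intf := Self;
  Intf := nil;
  Inc(DestroyCount);
  inherited;
end;

var
  Ref: IInterface;
begin
  Ref := TGuardedObject.Create;
  Ref := nil;              // count reaches zero, object is destroyed once
  Writeln(DestroyCount);
end.
```

With the guard in place the counter should end at 1. The alternative shown earlier in the thread, bumping the count inside _Release before calling Destroy, aims at the same re-entrancy problem without a separate flag.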
No, the State and csDestroying elements were part of my framework, not
the mechanism that is part of TComponent (though of course there were
obvious parallels in some cases - note, however, that TComponent uses
ComponentState, not just State, and ControlState is introduced by the
controls part of the