Graeme Geldenhuys wrote:
On 2013-07-15 17:43, Hans-Peter Diettrich wrote:
Another workaround: use the appropriate codepage for storing strings in
the database, so that all characters are single bytes.
You should know by now that not all characters can be represented in a
single byte.
I know that, the question is whether the user and DB understand that, too.
Also, Unicode was developed to overcome the many code-page
issues and standardise text storage. So I think it would be silly using
anything other than one of the Unicode encodings (I would opt for UTF-8
or UTF-16) in this day and age.
That depends on the requested DB/SQL operations. Sizing, searching and
sorting of strings are fastest with an SBCS in a specific encoding;
Unicode requires much more code and computing power. Even then I wonder
how strings in different languages will be sorted together; most
probably a "raw" sort (by codepoints) is the only solution.
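
To illustrate what a "raw" sort by codepoints does to mixed-language
strings, here is a minimal Python sketch. The helper rough_sort_key is
hypothetical and only approximates a language-aware order by stripping
combining marks and folding case; it is not a real collation and says
nothing about how Firebird or any other server actually sorts.

```python
import unicodedata

def rough_sort_key(text: str) -> str:
    """Illustrative key only: strip combining marks, then case-fold."""
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()

# "\u00c4pfel" is "Äpfel" with a precomposed A-umlaut.
words = ["Zebra", "apple", "\u00c4pfel", "zebra"]

# Plain codepoint comparison: uppercase ASCII < lowercase ASCII < U+00C4.
raw_order = sorted(words)

# Rough approximation of a dictionary-like order (assumption, see above).
rough_order = sorted(words, key=rough_sort_key)

print(raw_order)
print(rough_order)
```

The raw order puts "Zebra" before "apple" and the umlaut word last,
which is rarely what a user expects from an alphabetical list.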
The missing link is fixing SqlDB or Zeos etc, as it seems that the
database servers themselves already support UTF-8 and UTF-16 text
fields. [at least Firebird does]
My solution (work-around) for now is to not specify a charset for
Firebird and always store text as UTF-8. My fields are defined as
follows [size in bytes]: <desired size in characters> * 1.5
My business objects are coded to notify the user interface to only allow
<desired size in characters> text input. 6 years on, and I haven't had a
single client complain that text was truncated [maybe a little bit of
luck has something to do with it too]. I guess I must also add that our
products are mainly geared towards English, Afrikaans and Portuguese. So
large quantities of multi-byte characters are at a minimum.
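
The sizing rule above can be sketched in Python as follows. The names
utf8_len and fits_field are hypothetical, and the 1.5 factor is simply
the one quoted in the message; this only checks whether a string fits
such a byte budget and is not code from any of the products mentioned.

```python
def utf8_len(text: str) -> int:
    """Number of bytes the text occupies when stored as UTF-8."""
    return len(text.encode("utf-8"))

def fits_field(text: str, max_chars: int, factor: float = 1.5) -> bool:
    """True if text respects both the character limit enforced by the UI
    and the byte budget of a field sized as max_chars * factor."""
    byte_budget = int(max_chars * factor)
    return len(text) <= max_chars and utf8_len(text) <= byte_budget

# Mostly-Latin text stays well inside the budget.
print(utf8_len("S\u00e3o Paulo"), fits_field("S\u00e3o Paulo", 10))

# Ten Greek letters need two bytes each, so they overflow a 15-byte field.
greek = "\u03b1\u03b2\u03b3\u03b4\u03b5\u03b6\u03b7\u03b8\u03b9\u03ba"
print(utf8_len(greek), fits_field(greek, 10))
```

The factor holds only while most characters are single-byte in UTF-8;
text in scripts that need two or more bytes per character exceeds it.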
You see that Unicode introduces new problems. Even in UTF-32 the element
count does not always equal the character count. Your calculation
obviously is based on Latin characters, while Unicode supports many more
character sets or codepages. In your case I'd prefer an SBCS supporting
all your languages, which is certainly feasible, even if it may not be a
registered ISO/ANSI codepage.
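
A small Python sketch of the UTF-32 point: a letter followed by a
combining accent is two UTF-32 elements but one character to the user.
The helpers utf32_units and rough_char_count are hypothetical, and
counting non-combining codepoints is only an approximation of what a
user perceives as a character.

```python
import unicodedata

def utf32_units(text: str) -> int:
    """Number of 32-bit elements in the UTF-32 encoding of text."""
    return len(text.encode("utf-32-le")) // 4

def rough_char_count(text: str) -> int:
    """Approximate user-perceived characters: skip combining marks."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)

decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)  # single U+00E9

print(utf32_units(decomposed), rough_char_count(decomposed))
print(utf32_units(composed), rough_char_count(composed))
```

Normalising to NFC helps in this case, but not every base-plus-accent
sequence has a precomposed form, so the two counts can still differ.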
DoDi
--
_______________________________________________
Lazarus mailing list
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus