       Product: Database Access
          Type: new
         Title: UTF-8 encoding for dBase databases
     Posted by: [EMAIL PROTECTED]
      Affected: -
Effective from: CWS dba22ui


*Flags*
-------
API/ BASIC [ ]
Configuration [ ]
File format change [ ]
Help/ Guide [ ]
Performance test [ ]
Translation [ ]
UI relevant [ ]


*Description*
-------------
The character set restriction previously imposed on dBase databases
has been relaxed. Previously, it was not possible to use an encoding
in which different characters are encoded to different byte counts.
For instance, in UTF-8 encoding, a single Unicode character may be
encoded as one, two, or more bytes.
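
As a small illustration of such a variable-width encoding, the Python
sketch below counts the UTF-8 bytes needed for a few sample characters.
It is not part of the OOo code base; the helper name utf8_byte_count and
the sample characters are chosen here purely for the example.

```python
# Illustrative only: how many bytes do sample characters need in UTF-8?

def utf8_byte_count(text: str) -> int:
    """Return the number of bytes needed to encode text as UTF-8."""
    return len(text.encode("utf-8"))

# Latin capital A, e with acute accent, euro sign, an emoji
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X} needs {utf8_byte_count(ch)} byte(s)")
```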

With CWS dba22ui, all such encodings, and UTF-8 in particular, are
allowed, subject to the restrictions described at
http://dba.openoffice.org/specifications/character_sets.html, which
still apply.

When using such an encoding, be aware that the "field length" of dBase
table columns takes on a somewhat fuzzy meaning. According to the dBase
file format, the field length denotes the number of bytes reserved for
data in this field. In the OOo user interface, people tend to assume
that the field length means the maximum number of characters that can
be written into the field. This no longer holds for the UTF-8
encoding.

As a consequence, when the user enters a string that has fewer
characters than the field length specifies, but is encoded to more
bytes than the field length, an error message explaining the
situation is shown.
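
The situation described above can be sketched in Python as follows.
This is a hypothetical illustration, not the actual OOo implementation:
the function name fits_dbase_field, the field length of 10, and the
sample strings are all assumptions made for the example.

```python
# Hypothetical sketch: does a string fit into a dBase field whose
# length is measured in bytes rather than characters?

def fits_dbase_field(value: str, field_length: int,
                     encoding: str = "utf-8") -> bool:
    """Return True if value, once encoded, fits in field_length bytes."""
    return len(value.encode(encoding)) <= field_length

field_length = 10                 # assumed field length, in bytes

ascii_value = "abcdefgh"          # 8 characters, 8 bytes in UTF-8
umlaut_value = "\u00e4" * 8       # 8 characters, 16 bytes in UTF-8

for value in (ascii_value, umlaut_value):
    if fits_dbase_field(value, field_length):
        print(f"{len(value)} characters fit into {field_length} bytes")
    else:
        print(f"{len(value)} characters need "
              f"{len(value.encode('utf-8'))} bytes, "
              f"but the field holds only {field_length}")
```

Both sample strings are shorter than 10 characters, yet only the first
one fits, because each "a with diaeresis" occupies two bytes in UTF-8.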

