[The background to this is that Lincoln and I have been working on
Unicode support for DBD::Oracle. (Actually Lincoln's done most of
the heavy lifting; I've mostly been setting goals and directions
at the DBI API level and scratching at edge cases. Like this one.)]
On Thu, Apr 29, 2004 at
Dear Tim,
CESU-8 defines an encoding scheme for Unicode identical to UTF-8
except for its representation of supplementary characters. In CESU-8,
supplementary characters are represented as six-byte sequences
resulting from the transformation of each UTF-16 surrogate code
unit into an eight-bit
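
To make the six-byte representation concrete, here is a minimal Python sketch of the transformation described above. The helper name cesu8_encode is hypothetical (Python has no built-in CESU-8 codec), and the sketch assumes the input contains no lone surrogates. It splits each supplementary character into its UTF-16 surrogate pair and encodes each surrogate as a three-byte sequence, leaving everything in the BMP as ordinary UTF-8.

```python
def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8 (hypothetical helper, for illustration only)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            # BMP characters are encoded exactly as in UTF-8.
            out += ch.encode("utf-8")
        else:
            # Supplementary character: split into a UTF-16 surrogate pair.
            offset = cp - 0x10000
            high = 0xD800 | (offset >> 10)
            low = 0xDC00 | (offset & 0x3FF)
            # Encode each surrogate code unit as its own 3-byte sequence.
            # "surrogatepass" lets Python emit bytes for a surrogate.
            for unit in (high, low):
                out += chr(unit).encode("utf-8", "surrogatepass")
    return bytes(out)


ch = "\U00010400"  # a supplementary character (outside the BMP)
print(ch.encode("utf-8").hex(" "))  # f0 90 90 80        (UTF-8, 4 bytes)
print(cesu8_encode(ch).hex(" "))    # ed a0 81 ed b0 80  (CESU-8, 6 bytes)
```

For text made up only of BMP characters the two encodings produce identical bytes, so the difference only shows up once supplementary characters are involved.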
On Fri, Apr 30, 2004 at 03:49:13PM +0300, Jarkko Hietaniemi wrote:
Okay. Thanks.
Basically I need to document that Oracle's AL32UTF8 should be used
as the client charset in preference to the older UTF8, because
UTF8 doesn't do the "best" (?) thing with surrogate pairs.
because what Oracle