Re: AL32UTF8

2004-04-30 Thread Tim Bunce
[The background to this is that Lincoln and I have been working on Unicode support for DBD::Oracle. (Actually, Lincoln has done most of the heavy lifting; I've mostly been setting goals and direction at the DBI API level and scratching at edge cases, like this one.)] On Thu, Apr 29, 2004 at

Re: AL32UTF8

2004-04-30 Thread Martin Hosken
Dear Tim, CESU-8 defines an encoding scheme for Unicode identical to UTF-8 except for its representation of supplementary characters. In CESU-8, supplementary characters are represented as six-byte sequences resulting from the transformation of each UTF-16 surrogate code unit into an eight-bit
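
As a rough illustration of the CESU-8 definition quoted above, the following Python sketch encodes a string the way that definition describes: characters in the Basic Multilingual Plane are encoded exactly as in UTF-8, while each supplementary character is first split into its UTF-16 surrogate pair and each surrogate is then written as its own three-byte sequence. The function name `cesu8_encode` is hypothetical and only for illustration; it is not part of DBI, DBD::Oracle, or any Oracle client library.

```python
def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8 (illustrative sketch, not a library API)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            # BMP character: same bytes as UTF-8. "surrogatepass" lets a
            # lone surrogate already present in the input pass through.
            out += ch.encode("utf-8", "surrogatepass")
        else:
            # Supplementary character: split into a UTF-16 surrogate pair.
            v = cp - 0x10000
            high = 0xD800 | (v >> 10)    # leading (high) surrogate
            low = 0xDC00 | (v & 0x3FF)   # trailing (low) surrogate
            # Each surrogate code unit becomes its own 3-byte sequence.
            for unit in (high, low):
                out += chr(unit).encode("utf-8", "surrogatepass")
    return bytes(out)


# U+10400 is a supplementary character.
sample = "\U00010400"
print(sample.encode("utf-8").hex(" "))   # f0 90 90 80 (4 bytes, UTF-8)
print(cesu8_encode(sample).hex(" "))     # ed a0 81 ed b0 80 (6 bytes, CESU-8)
```

For text containing only BMP characters the two encodings produce identical bytes, so the difference only appears once supplementary characters (and therefore surrogate pairs) are involved.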

Re: AL32UTF8

2004-04-30 Thread Tim Bunce
On Fri, Apr 30, 2004 at 03:49:13PM +0300, Jarkko Hietaniemi wrote: Okay. Thanks. Basically I need to document that Oracle AL32UTF8 should be used as the client charset in preference to the older UTF8 because UTF8 doesn't do the best? thing with surrogate pairs. because what Oracle