Re: AL32UTF8

2004-05-02 Thread Jarkko Hietaniemi
> So the key question is... can we just do SvUTF8_on(sv) on either of these kinds of Oracle UTF8 encodings? Seems like the answer is yes, from what Jarkko says, because they are both valid UTF8. We just need to document the issue.

No, Oracle's UTF8 is very much not valid UTF-8. Valid UTF-8 ...
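The disagreement is easy to demonstrate from Perl: a strict UTF-8 decoder rejects the surrogate byte sequences that Oracle's old UTF8 produces. A minimal sketch using the Encode module (the byte string is hand-built for U+10400; the example character is mine, not from the thread):

    use Encode qw(decode);

    # Hand-built CESU-8 bytes for U+10400: the UTF-16 surrogate pair
    # D801 DC00, each code unit encoded as a three-byte sequence.
    my $cesu8 = "\xED\xA0\x81\xED\xB0\x80";

    # Strict UTF-8 decoding croaks on the surrogate sequences, so just
    # flagging these bytes as UTF-8 would produce an invalid string.
    my $ok = eval {
        decode('UTF-8', $cesu8, Encode::FB_CROAK | Encode::LEAVE_SRC);
        1;
    };
    print $ok ? "valid UTF-8\n" : "not valid UTF-8\n";    # not valid UTF-8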

Re: AL32UTF8

2004-05-02 Thread Lincoln A. Baxter
On Sat, 2004-05-01 at 00:37, Lincoln A. Baxter wrote:
> On Fri, 2004-04-30 at 08:03, Tim Bunce wrote:
> > On Thu, Apr 29, 2004 at 10:42:18PM -0400, Lincoln A. Baxter wrote:
> > > On Thu, 2004-04-29 at 11:16, Tim Bunce wrote:
> > > > Am I right in thinking that perl's internal utf8 representation ...

Re: AL32UTF8

2004-05-02 Thread Tim Bunce
On Sat, May 01, 2004 at 05:35:58PM -0400, Lincoln A. Baxter wrote:
> Hello Owen,
>
> On Sat, 2004-05-01 at 16:46, Owen Taylor wrote:
> > On Fri, 2004-04-30 at 08:03, Tim Bunce wrote:
> > > You can use UTF8 and AL32UTF8 by setting NLS_LANG for OCI client
> > > applications. If you do not need ...
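NLS_LANG is read when the OCI client environment is initialized, so it has to be set before the first connection is made. A minimal sketch of how that might look from DBI (the DSN and credentials are placeholders, not from the thread):

    # Choose the client charset before DBD::Oracle initializes OCI.
    BEGIN { $ENV{NLS_LANG} = 'AMERICAN_AMERICA.AL32UTF8' }

    use DBI;

    # Placeholder DSN and credentials.
    my $dbh = DBI->connect('dbi:Oracle:orcl', 'scott', 'tiger',
                           { RaiseError => 1 });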

Re: AL32UTF8

2004-05-01 Thread Jungshik Shin
Tim Bunce wrote:
> On Fri, Apr 30, 2004 at 10:58:19PM +0700, Martin Hosken wrote:
> > IIRC AL32UTF8 was introduced at the behest of Oracle (a voting member
> > of Unicode) because they were storing higher plane codes using the
> > surrogate pair technique of UTF-16 mapped into UTF-8 (i.e. resulting in 2 ...

Re: AL32UTF8

2004-04-30 Thread Tim Bunce
[The background to this is that Lincoln and I have been working on Unicode support for DBD::Oracle. (Actually Lincoln's done most of the heavy lifting; I've mostly been setting goals and directions at the DBI API level and scratching at edge cases. Like this one.)]

On Thu, Apr 29, 2004 at ...

Re: AL32UTF8

2004-04-30 Thread Martin Hosken
Dear Tim,

> CESU-8 defines an encoding scheme for Unicode identical to UTF-8 except for its representation of supplementary characters. In CESU-8, supplementary characters are represented as six-byte sequences resulting from the transformation of each UTF-16 surrogate code unit into an eight-bit form similar to the UTF-8 transformation, but without first converting to a single code point.
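To make the six-byte point concrete, here is one supplementary character in both forms, with the bytes written out by hand (U+10400 is an arbitrary example of mine, not taken from the thread):

    # U+10400 corresponds to the UTF-16 surrogate pair D801 DC00.

    # True UTF-8 (what AL32UTF8 uses) encodes the code point directly:
    my $utf8  = "\xF0\x90\x90\x80";            # one 4-byte sequence

    # CESU-8 (what Oracle's old "UTF8" uses) encodes each surrogate
    # code unit separately:
    my $cesu8 = "\xED\xA0\x81\xED\xB0\x80";    # two 3-byte sequences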

Re: AL32UTF8

2004-04-30 Thread Tim Bunce
On Fri, Apr 30, 2004 at 03:49:13PM +0300, Jarkko Hietaniemi wrote:
> Okay.

Thanks. Basically I need to document that Oracle AL32UTF8 should be used as the client charset in preference to the older UTF8, because UTF8 doesn't do the best(?) thing with surrogate pairs, because what Oracle ...

Re: AL32UTF8

2004-04-29 Thread Jarkko Hietaniemi
Tim Bunce wrote:
> Am I right in thinking that perl's internal utf8 representation represents surrogates as a single (4 byte) code point and not as two separate code points?

Mmmh. Right and wrong... as a single code point, yes, since the real UTF-8 doesn't do surrogates, which are only a UTF-16 ...
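Jarkko's point can be checked from Perl itself: a supplementary character is a single code point, and encoding it gives the direct four-byte form rather than a surrogate pair. A small sketch (the character chosen is my example, not from the thread):

    use Encode qw(encode);

    # U+10400 is an arbitrary supplementary-plane example character.
    my $char = chr(0x10400);
    printf "code points: %d\n", length $char;         # 1, not 2

    # The encoded form is the direct four-byte sequence, not a
    # six-byte surrogate pair.
    printf "bytes: %vX\n", encode('UTF-8', $char);    # F0.90.90.80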