RE: Transcoding patch

2001-10-10 Thread Dan Sugalski
At 01:36 PM 10/10/2001 +0200, Henrik Tougaard wrote: >From: Dan Sugalski [mailto:[EMAIL PROTECTED]] > >... > > strnative's the native encoding, right? It shouldn't be US-ASCII by > > default, particularly, at least not for everyone. (Does > > anyone handy have > > an 8-bit set that's not US ASCII

Re: Transcoding patch

2001-10-10 Thread Dan Sugalski
At 01:36 PM 10/10/2001 +0200, Bart Lateur wrote: >On Tue, 09 Oct 2001 21:12:00 -0400, Dan Sugalski wrote: > > >Does anyone handy have > >an 8-bit set that's not US ASCII as their default character set? > >EBCDIC? Or any ASCII variant with a different set of high-bit characters. If we could get,

RE: Transcoding patch

2001-10-10 Thread Henrik Tougaard
From: Dan Sugalski [mailto:[EMAIL PROTECTED]] >... > strnative's the native encoding, right? It shouldn't be US-ASCII by > default, particularly, at least not for everyone. (Does > anyone handy have > an 8-bit set that's not US ASCII as their default character > set? I use ISO-8859-1 - its no

Re: Transcoding patch

2001-10-10 Thread Bart Lateur
On Tue, 09 Oct 2001 21:12:00 -0400, Dan Sugalski wrote: >Does anyone handy have >an 8-bit set that's not US ASCII as their default character set? EBCDIC? Not me. -- Bart.

Re: Transcoding patch

2001-10-09 Thread Dan Sugalski
At 12:58 AM 10/10/2001 +0100, Simon Cozens wrote: >On Tue, Oct 09, 2001 at 10:37:22AM -0400, Dan Sugalski wrote: > > On the other hand, I'd really, *really* rather not have Unicode > > constants in anything other than UTF-32 > >That's a bizarre decision; I'm sure you mean UCS-4 by that. Nope, I m

Re: Transcoding patch

2001-10-09 Thread Simon Cozens
On Tue, Oct 09, 2001 at 10:37:22AM -0400, Dan Sugalski wrote: > On the other hand, I'd really, *really* rather not have Unicode > constants in anything other than UTF-32 That's a bizarre decision; I'm sure you mean UCS-4 by that. I don't think UTF-32 can address outside of the BMP, but I can't q

RE: Transcoding patch

2001-10-09 Thread Dan Sugalski
At 10:29 PM 10/9/2001 +0100, Tom Hughes wrote: >I haven't added the A prefix because I'm still not clear what >encoding those are supposed to map to. I can understand the >following mappings: > > N => enc_native > U => enc_utf32 > >but what is A supposed to map to exactly? or is the assembler >

RE: Transcoding patch

2001-10-09 Thread Tom Hughes
In message <[EMAIL PROTECTED]> Dan Sugalski <[EMAIL PROTECTED]> wrote: > utf8 and utf16 are both variable length encodings for space reasons. > There's not much reason to space-compact something then expand the heck out > of it. On the other hand, I'd really, *really* rather not have Un
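
To illustrate the "variable length" point quoted above, here is a minimal C sketch of encoding a single code point as UTF-8. The function name `utf8_encode` and its signature are hypothetical and are not taken from the patch under discussion; the sketch only shows that the encoded size depends on the code point value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: encode one code point as
 * UTF-8 into out, which must have room for at least 4 bytes.
 * Returns the number of bytes written (1 to 4), or 0 for surrogate
 * values and for values above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);

    if (cp < 0x80) {                       /* 7 bits: one byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 11 bits: two bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)      /* surrogates are not encodable */
        return 0;
    if (cp < 0x10000) {                    /* 16 bits: three bytes */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 21 bits: four bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* outside the Unicode range */
}
```

An ASCII letter takes one byte, while a code point beyond U+FFFF takes four, which is the space trade-off the message refers to.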

RE: Transcoding patch

2001-10-09 Thread Dan Sugalski
At 03:03 PM 10/9/2001 -0500, Gibbs Tanton - tgibbs wrote: > > At 07:03 PM 10/8/2001 -0500, Gibbs Tanton - tgibbs wrote: > > >This looks good. > > > > > >Also, WRT the utf8_t, utf16_t, and utf32_t can we not just use utf32_t and > > >then mask off the lower 8 or 16 bits? We can still have utf8_t

RE: Transcoding patch

2001-10-09 Thread Gibbs Tanton - tgibbs
> At 07:03 PM 10/8/2001 -0500, Gibbs Tanton - tgibbs wrote: > >This looks good. > > > >Also, WRT the utf8_t, utf16_t, and utf32_t can we not just use utf32_t and > >then mask off the lower 8 or 16 bits? We can still have utf8_t be defined > >as char to allow sizeof to work right and we can do siz

RE: Transcoding patch

2001-10-09 Thread Tom Hughes
In message <[EMAIL PROTECTED]> Dan Sugalski <[EMAIL PROTECTED]> wrote: > At 07:03 PM 10/8/2001 -0500, Gibbs Tanton - tgibbs wrote: > >This looks good. > > > >Also, WRT the utf8_t, utf16_t, and utf32_t can we not just use utf32_t and > >then mask off the lower 8 or 16 bits? We can still

RE: Transcoding patch

2001-10-09 Thread Dan Sugalski
At 07:03 PM 10/8/2001 -0500, Gibbs Tanton - tgibbs wrote: >This looks good. > >Also, WRT the utf8_t, utf16_t, and utf32_t can we not just use utf32_t and >then mask off the lower 8 or 16 bits? We can still have utf8_t be defined >as char to allow sizeof to work right and we can do sizeof(utf8_t)*

RE: Transcoding patch

2001-10-08 Thread Gibbs Tanton - tgibbs
Thanks! Applied. -----Original Message----- From: Tom Hughes To: [EMAIL PROTECTED] Sent: 10/8/2001 6:51 PM Subject: RE: Transcoding patch In message <[EMAIL PROTECTED]> Gibbs Tanton - tgibbs <[EMAIL PROTECTED]> wrote: > This is good, unless someone has objections I

RE: Transcoding patch

2001-10-08 Thread Gibbs Tanton - tgibbs
From: Tom Hughes To: [EMAIL PROTECTED] Sent: 10/8/2001 6:51 PM Subject: RE: Transcoding patch In message <[EMAIL PROTECTED]> Gibbs Tanton - tgibbs <[EMAIL PROTECTED]> wrote: > This is good, unless someone has objections I'll commit this. However, we > also need

RE: Transcoding patch

2001-10-08 Thread Tom Hughes
In message <[EMAIL PROTECTED]> Gibbs Tanton - tgibbs <[EMAIL PROTECTED]> wrote: > This is good, unless someone has objections I'll commit this. However, we > also need the ability to do unicode in the assembler (I'll do this later > today if no one beats me to it), and we need some way

Re: Transcoding patch

2001-10-08 Thread Tom Hughes
In message <[EMAIL PROTECTED]> Gibbs Tanton <[EMAIL PROTECTED]> wrote: > > - The utf8_t, utf16_t and utf32_t types will need to be determined > >by configure as they will currently break on some machines. Plus > >machines without native 8, 16 and 32 bit types will be a problem. >

RE: Transcoding patch

2001-10-08 Thread Gibbs Tanton - tgibbs
> Absolutely. A few other issues that I remembered last night are: > > - The current code assumes that the string data will be two >byte aligned for UTF-16 and four byte aligned for UTF-32 which >is probably reasonable but maybe not. Yeah, I think we can handle that in the constant secti
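
The alignment assumption mentioned above can be sidestepped when reading string data. The following is a minimal C sketch, assuming the data is stored in host byte order; the names `read_utf16_unit` and `read_utf32_unit` are hypothetical and do not come from the patch. Copying through `memcpy` avoids dereferencing a misaligned pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fetch the index-th 16-bit code unit from a
 * byte buffer that carries no alignment guarantee. The value is
 * interpreted in host byte order. */
uint16_t read_utf16_unit(const unsigned char *buf, size_t index)
{
    uint16_t unit;
    assert(buf != NULL);
    memcpy(&unit, buf + index * sizeof unit, sizeof unit);
    return unit;
}

/* Hypothetical helper: same idea for 32-bit code units. */
uint32_t read_utf32_unit(const unsigned char *buf, size_t index)
{
    uint32_t unit;
    assert(buf != NULL);
    memcpy(&unit, buf + index * sizeof unit, sizeof unit);
    return unit;
}
```

Whether the extra copy is worthwhile depends on whether the constant section can guarantee alignment, which is the approach suggested in the reply above.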

Re: Transcoding patch

2001-10-08 Thread Tom Hughes
In message <[EMAIL PROTECTED]> Gibbs Tanton <[EMAIL PROTECTED]> wrote: > I've applied this patch. I just did an update and noticed the new files had appeared about two seconds before your mail arrived ;-) > I realize that we have a ways to go before we can fully support unicode, but > I

RE: Transcoding patch

2001-10-08 Thread Gibbs Tanton - tgibbs
The assembler should be able to get what encoding to use from a file. Thanks! Tanton -----Original Message----- From: Tom Hughes To: [EMAIL PROTECTED] Sent: 10/7/2001 10:23 AM Subject: Transcoding patch The attached patch is a first stab at implementing string transcoding and the unicode stri

Re: Transcoding patch

2001-10-07 Thread Simon Cozens
On Sun, Oct 07, 2001 at 11:08:56AM -0500, Gibbs Tanton - tgibbs wrote: > I guess the question with native strings is will it always be ASCII or will > it be Shift-JIS etc...? Can I just say: locales. -- Ah the joys of festival + Gutenberg project. I can now have Moby Dick read to me by Steph

RE: Transcoding patch

2001-10-07 Thread Tom Hughes
In message <[EMAIL PROTECTED]> Gibbs Tanton - tgibbs <[EMAIL PROTECTED]> wrote: > This is good, unless someone has objections I'll commit this. However, we > also need the ability to do unicode in the assembler (I'll do this later > today if no one beats me to it), and we need some way

RE: Transcoding patch

2001-10-07 Thread Gibbs Tanton - tgibbs
Sent: 10/7/2001 10:23 AM Subject: Transcoding patch The attached patch is a first stab at implementing string transcoding and the unicode string types. The transcoder will currently only map one UTF type to another - there is no attempt to implement mapping to or from native strings as I wasn'

Transcoding patch

2001-10-07 Thread Tom Hughes
The attached patch is a first stab at implementing string transcoding and the unicode string types. The transcoder will currently only map one UTF type to another - there is no attempt to implement mapping to or from native strings as I wasn't sure what the plan was for that. Presumably we will h
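
As an illustration of what mapping one UTF type to another involves, here is a minimal C sketch that transcodes UTF-32 to UTF-16. It is not the code from the attached patch; the function name `utf32_to_utf16`, its signature, and its error convention are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: transcode n UTF-32 code points from src into
 * UTF-16 code units in dst, which has room for cap units.
 * Returns the number of 16-bit units written, or (size_t)-1 if the
 * input contains an invalid code point or dst is too small. */
size_t utf32_to_utf16(const uint32_t *src, size_t n,
                      uint16_t *dst, size_t cap)
{
    size_t out = 0;
    size_t i;

    assert(src != NULL && dst != NULL);

    for (i = 0; i < n; i++) {
        uint32_t cp = src[i];

        /* Reject surrogate values and anything above U+10FFFF. */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;

        if (cp < 0x10000) {
            /* Basic Multilingual Plane: one code unit. */
            if (out + 1 > cap)
                return (size_t)-1;
            dst[out++] = (uint16_t)cp;
        } else {
            /* Supplementary planes: a surrogate pair. */
            if (out + 2 > cap)
                return (size_t)-1;
            cp -= 0x10000;
            dst[out++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[out++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    return out;
}
```

Mapping to or from a native encoding would need a per-encoding table or a locale-aware conversion, which this sketch deliberately leaves out.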