Re: Source data for perl encodings

Nick Ing-Simmons Mon, 08 Jan 2001 01:11:12 -0800
Keld jørn simonsen <[EMAIL PROTECTED]> writes:
>On Sun, Jan 07, 2001 at 10:46:02AM +0000, [EMAIL PROTECTED] wrote:
>> Keld,
>> 
>> As you may be aware, we are adding support for UTF-8 encoded Unicode
>> to perl5. This is finally coming together. So now we need a mechanism
>> to translate other encodings into and out of Unicode.
>
>I was not aware of that. Could you give me a pointer to the spec?

The spec is a little sketchy, but the main documentation we have
to date can be found at:

http://www.perldoc.com/perl5.7/pod/perlunicode.html

>Do you mean unicode or do you mean ISO 10646?

I am not an expert on the differences. Perl characters are now "logically"
values of up to at least 32 bits, held internally as UTF-8 encoded strings.
The language-visible properties (case, alpha-ness, digit-ness, ...)
are derived from the tables at ftp.unicode.org, version 3.0.1.
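
To make that model concrete, here is a minimal sketch in Python (not Perl,
and not how perl implements it internally): a character is a logical integer
code point, it is stored as a variable-length UTF-8 byte sequence, and its
properties come from the Unicode character database. The helper name
describe() is hypothetical, and Python's bundled database is a much later
Unicode version than 3.0.1.

```python
import unicodedata


def describe(ch):
    """Return the code point, UTF-8 bytes, and a few Unicode properties of one character."""
    return {
        "code_point": ord(ch),                  # the logical integer value
        "utf8": ch.encode("utf-8"),             # the stored byte sequence
        "is_alpha": ch.isalpha(),               # alpha-ness
        "is_digit": ch.isdigit(),               # digit-ness
        "upper": ch.upper(),                    # case mapping
        "category": unicodedata.category(ch),   # general category from the database
    }


# U+00F8 LATIN SMALL LETTER O WITH STROKE: one character, two bytes in UTF-8.
info = describe("\u00f8")
```

The point of the sketch is only that the character count and the byte count
differ, while the properties are looked up per code point.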

>> 
>> The tables there seem to be suitable for my/our purposes.
>> So I have a few questions:
>> 
>> 0. Is use/redistribution of these tables in OpenSource projects
>>    permitted?
>
>Yes, they are

Excellent.

>
>> 1. Is the format formally defined anywhere?
>>    It seems straightforward enough.
>
>The format is defined in the POSIX-2 standard ISO/IEC 9945-2:1993.
>(Aka IEEE 1003.2).
>
>> 2. Are the data actively maintained?
>
>Yes, by me, and submissions I get. I am a little slow at times, tho.
>
>> 3. Are the cultreg and i18n charmaps "identical"?
>
>No, the i18n ones are more up to date. But the cultreg ones are official ISO.
>They are closely synchronized, however.

There is also (I discovered via another web page) a WG15 tree.
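
For reference, a rough sketch of reading the POSIX charmap format discussed
above into a byte-sequence-to-character table. It is written in Python for
illustration only and rests on several assumptions: symbolic names are of the
<Uxxxx> form, encodings use the hexadecimal escape only (decimal and octal
escapes, ranges and WIDTH sections are ignored), and the sample data and the
name parse_charmap() are made up for the example.

```python
import re

# Hypothetical two-entry charmap in the style of the i18n tables.
SAMPLE = """\
<code_set_name> ISO_8859-1
<comment_char> %
<escape_char> /
% a comment line
CHARMAP
<U0041>     /x41         LATIN CAPITAL LETTER A
<U00F8>     /xf8         LATIN SMALL LETTER O WITH STROKE
END CHARMAP
"""


def parse_charmap(text):
    """Map encoded byte sequences to Unicode characters for <Uxxxx> entries."""
    comment, escape = "#", "\\"  # POSIX defaults until the header overrides them
    table = {}
    in_map = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(comment):
            continue
        if not in_map:
            # Header section: pick up the declarations we care about.
            if line.startswith("<comment_char>"):
                comment = line.split()[1]
            elif line.startswith("<escape_char>"):
                escape = line.split()[1]
            elif line == "CHARMAP":
                in_map = True
            continue
        if line == "END CHARMAP":
            break
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue
        symbol, encoding = fields[0], fields[1]
        m = re.fullmatch(r"<U([0-9A-Fa-f]{4,8})>", symbol)
        if not m:
            continue  # assumption: skip names that are not <Uxxxx>
        hex_bytes = re.findall(re.escape(escape) + r"x([0-9A-Fa-f]{2})", encoding)
        if not hex_bytes:
            continue  # assumption: only hexadecimal escapes are handled
        table[bytes(int(h, 16) for h in hex_bytes)] = chr(int(m.group(1), 16))
    return table


table = parse_charmap(SAMPLE)
```

Inverting the resulting table gives the mapping for the other direction, out
of Unicode and into the legacy encoding.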

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.