Gisle Aas <[EMAIL PROTECTED]> writes:
>Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>
>> Please take a look at the (very rough) first draft of Encode, an extension
>> for character encoding conversions for Perl 5:
>> 
>>      http://www.iki.fi/jhi/Encode.tgz
>> 
>> Download, plop it into the Perl 5.7 source directory, unpack,
>> re-Configure, rebuild.  (Or, if you have a Perl 5.7 in your path,
>> cd to ext/Encode, perl Makefile.PL, make).
>
>I did not really understand the interface.  It seems like you expose
>too much of the fact that perl (currently) uses utf8 internally.
>
>I would like to see these convert perl strings to bytes:
>
>  to_utf7
>  to_utf8
>
>
>And these convert a sequence of bytes to perl strings:
>
>  from_utf8
>  from_utf8_strict    # croak on out-of-range UTF8, over-long sequences, etc.
>  from_utf16_be
>  from_utf32_be
>
>You seem to want to define these functions the opposite way.  Perhaps
>the names are just too confusing.

I can see why either way round makes a kind of sense.
I think the 'from_' names are more confusing than the 'to_' names.

The snag with either is that the "other" side is implicit.

My stab at names would be:

     utf8bytes_to_chars()

     chars_to_utf8bytes()

With variants for the other utf* if necessary.  

The important thing to remember is that internally perl has a sequence of
characters, and that in perl-5.6+ characters can be bigger than 8 bits.

The fact that we internally represent the characters with UTF8 encodings
should be irrelevant to the API. The only place it should matter is that 
some of the "give me the string like this" functions will not 
have to _do_ very much.
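
As a rough sketch of what that pair of names could look like, and nothing
more than that: the two function names are the hypothetical ones suggested
above, not an existing API, and the bodies assume the core utf8::encode
and utf8::decode functions found in later perls (5.8 and up). The caller
only ever sees "bytes" and "characters"; the internal representation never
shows through.

```perl
use strict;
use warnings;

# Hypothetical names from the proposal above; not an existing API.
# Assumes the core utf8:: functions (perl 5.8+).

# Turn a string of UTF-8 encoded bytes into a string of characters.
sub utf8bytes_to_chars {
    my ($bytes) = @_;
    my $chars = $bytes;              # copy: utf8::decode works in place
    utf8::decode($chars)
        or die "input is not well-formed UTF-8\n";
    return $chars;
}

# Turn a string of characters into a string of UTF-8 encoded bytes.
sub chars_to_utf8bytes {
    my ($chars) = @_;
    my $bytes = $chars;              # copy: utf8::encode works in place
    utf8::encode($bytes);
    return $bytes;
}

my $chars = "caf\x{e9} \x{263A}";    # 6 characters, one of them above 0xFF
my $bytes = chars_to_utf8bytes($chars);

# Prints "6 characters, 9 bytes"
printf "%d characters, %d bytes\n", length($chars), length($bytes);
print "round trip ok\n" if utf8bytes_to_chars($bytes) eq $chars;
```

Both directions are named in full, so neither side is implicit, and a
variant for another encoding would only change the "utf8bytes" half of
the name.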


-- 
Nick Ing-Simmons
