Re: Explaining this behavior (was Re: good name for characters matching [^\0-\377]?)

Juerd Waalboer Mon, 22 Oct 2007 09:59:05 -0700

E R skribis 2007-10-22 11:47 (-0500):
> I think I'm trying to make a slightly different point: part of what
> Encode::encode MUST do is to create a Perl string with a particular
> internal representation. For example, in:
>   $a = Encode::encode(...);
>   chop($b = $a."\x{101}");
> we have $a eq $b, but $r->print($b) will probably not give you the
> output you want.
> I find the implications of this interesting. In particular, the
> conventional internal representation (the one Perl uses when the
> string has never seen any character ordinals > 255) cannot be left out
> of any presentation of Perl strings since it is required for
> communication with modules such as mod_perl, etc. The utf8
> representation, on the other hand, can be left out as programmers
> should not care how Perl internally represents the string when there
> are characters matching [^\0-\377].


This is exactly true. I was merely pointing out the same thing from a
different perspective.
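
A minimal sketch of the case you describe, in case it helps others
following the thread. The variable names are mine ($bytes and $copy
stand in for your $a and $b), and the 'UTF-8' encoding and the sample
character are assumptions chosen only for illustration:

```perl
use strict;
use warnings;
use Encode ();

# Octets as returned by Encode::encode: the internal UTF8 flag is off.
my $bytes = Encode::encode('UTF-8', "\x{e9}");

# Appending a character above 255 upgrades the string internally;
# chop removes that character again but leaves the flag on.
my $copy = $bytes . "\x{101}";
chop $copy;

my $equal       = $bytes eq $copy;          # same characters
my $bytes_flag  = utf8::is_utf8($bytes);    # internal representation of $bytes
my $copy_flag   = utf8::is_utf8($copy);     # internal representation of $copy

printf "equal: %s\n",        $equal      ? "yes" : "no";
printf "bytes is utf8: %s\n", $bytes_flag ? "yes" : "no";
printf "copy is utf8: %s\n",  $copy_flag  ? "yes" : "no";

# utf8::downgrade converts the string back to the byte representation
# in place; it dies if a character above 255 is still present.
utf8::downgrade($copy);
my $after_flag = utf8::is_utf8($copy);
printf "copy is utf8 after downgrade: %s\n", $after_flag ? "yes" : "no";
```

Whether a given consumer such as $r->print looks at the internal
buffer depends on that module; the downgrade step is only one way to
hand it the representation it expects.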

Note, by the way, that even strings that have never contained a
character above 255 may be utf8 internally. This should never happen
with binary operations, but it may happen with text operations.
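
One text operation where this can be observed is decoding. The sketch
below is only an illustration under my own assumptions (the sample
word and the use of Encode::decode are not from the earlier messages):
both strings hold the same four characters, all in the range 0..255,
yet only the decoded one is utf8 internally.

```perl
use strict;
use warnings;
use Encode ();

# Four characters, all in 0..255, written as a plain literal.
my $literal = "caf\xe9";

# The same four characters, obtained by decoding UTF-8 octets.
my $decoded = Encode::decode('UTF-8', "caf\xc3\xa9");

my $same_text    = $literal eq $decoded;
my $same_length  = length($literal) == length($decoded);
my $literal_utf8 = utf8::is_utf8($literal);
my $decoded_utf8 = utf8::is_utf8($decoded);

printf "same text: %s\n",       $same_text    ? "yes" : "no";
printf "same length: %s\n",     $same_length  ? "yes" : "no";
printf "literal is utf8: %s\n", $literal_utf8 ? "yes" : "no";
printf "decoded is utf8: %s\n", $decoded_utf8 ? "yes" : "no";
```

As far as text semantics go the two strings are interchangeable; the
difference only matters to code that reads the internal buffer.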
-- 
Met vriendelijke groet,  Kind regards,  Korajn salutojn,

  Juerd Waalboer:  Perl hacker  <[EMAIL PROTECTED]>  <http://juerd.nl/sig>
  Convolution:     ICT solutions and consultancy <[EMAIL PROTECTED]>
