Re: Explaining this behavior (was Re: good name for characters matching [^\0-\377]?)

2007-10-22 Thread Juerd Waalboer
E R skribis 2007-10-22 11:47 (-0500):
> I think I'm trying to make a slightly different point: part of what
> Encode::encode MUST do is to create a Perl string with a particular
> internal representation. For example, in:
>   $a = Encode::encode(...);
>   chop($b = $a."\x{101}");
> we have $a eq $b
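The quoted example can be tried directly. What follows is only a minimal sketch: the encoding name ISO-8859-1 and the sample text are placeholders chosen here for the arguments the post elides, and the variables are renamed to $bytes and $copy so the script runs under strict. It shows two strings that compare equal while differing in Perl's internal upgraded (UTF8) flag, as reported by utf8::is_utf8.

```perl
use strict;
use warnings;
use Encode ();

# Assumption: ISO-8859-1 and "caf\x{e9}" stand in for the arguments
# elided in the quoted post; any encoding and text would do.
my $bytes = Encode::encode('ISO-8859-1', "caf\x{e9}");

# Appending a character above 255 forces the upgraded internal form.
# chop removes that character again but does not downgrade the string.
my $copy = $bytes . "\x{101}";
chop $copy;

my $equal          = $bytes eq $copy        ? 1 : 0;
my $bytes_upgraded = utf8::is_utf8($bytes)  ? 1 : 0;
my $copy_upgraded  = utf8::is_utf8($copy)   ? 1 : 0;

printf "equal=%d bytes_upgraded=%d copy_upgraded=%d\n",
    $equal, $bytes_upgraded, $copy_upgraded;
```

The comparison works on characters, so it cannot see the difference; only code that looks at the internal buffer can.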

Re: Explaining this behavior (was Re: good name for characters matching [^\0-\377]?)

2007-10-22 Thread E R
On 10/22/07, Juerd Waalboer wrote:
> There's an alternative way of viewing this: there are two types of
> strings: binary and text. If you encode text, you get binary.
I think I'm trying to make a slightly different point: part of what Encode::encode MUST do is to create a Per

Re: Explaining this behavior (was Re: good name for characters matching [^\0-\377]?)

2007-10-22 Thread Juerd Waalboer
E R skribis 2007-10-22 7:01 (-0500):
> So this raises another interesting point... not only must
> Encode::encode et al. perform the proper encoding (as in translations
> to character ordinals), but they also must return a Perl string whose
> internal representation is, shall we say, the "conventi

Re: Explaining this behavior (was Re: good name for characters matching [^\0-\377]?)

2007-10-22 Thread E R
On 10/19/07, Juerd Waalboer wrote:
> E R skribis 2007-10-19 17:14 (-0500):
> > So it seems that in light of this one should always use Encode::encode with
> > these modules to ensure the data is represented the way you want it.
> Encode::encode, Encode::encode_utf8, or utf8::e

Re: Explaining this behavior (was Re: good name for characters matching [^\0-\377]?)

2007-10-19 Thread Juerd Waalboer
E R skribis 2007-10-19 17:14 (-0500):
> The problem I need to understand now is the following:
>   # using mod_perl 1.28
>   # note: binmode(STDOUT, ":utf8") has no effect
>   $r->print($x); # emits 1 octet
>   $r->print($y); # emits 2 octets
> I get similar behavior when storing $y into an Oracle
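The 1-octet versus 2-octet difference is what one would expect if the output layer writes Perl's internal buffer as-is. The sketch below does not involve mod_perl or Oracle. It assumes, since the post does not show how $x and $y were built, that both hold the same single character (U+00E9 here, chosen for illustration) and differ only in internal form. It then shows that encoding explicitly gives the same octet count whichever form the input had.

```perl
use strict;
use warnings;
use Encode ();
use bytes ();   # loaded only for bytes::length; the pragma is not enabled

# Assumption: $x and $y are the same one-character text, with $y in
# the upgraded internal form. The post does not show their construction.
my $x = "\xe9";
my $y = $x;
utf8::upgrade($y);

my $same_text  = $x eq $y ? 1 : 0;
my $internal_x = bytes::length($x);   # size of the internal buffer
my $internal_y = bytes::length($y);

# Explicit encoding depends on the characters, not on the internal form.
my $latin1_octets = length(Encode::encode('ISO-8859-1', $y));
my $utf8_octets   = length(Encode::encode_utf8($x));

printf "same_text=%d internal_x=%d internal_y=%d latin1=%d utf8=%d\n",
    $same_text, $internal_x, $internal_y, $latin1_octets, $utf8_octets;
```

Encoding before handing the string to an interface that writes raw buffers makes the emitted octets a deliberate choice rather than a side effect of the string's history.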

Explaining this behavior (was Re: good name for characters matching [^\0-\377]?)

2007-10-19 Thread E R
On 10/18/07, Juerd Waalboer wrote:
> E R skribis 2007-10-18 16:21 (-0500): ...
> To be honest, I'm not sure you know enough about Perl's string model to
> be giving a presentation about Unicode in Perl. You just learnt very
> important aspects, and from the things you write, I'd

Re: good name for characters matching [^\0-\377]?

2007-10-18 Thread Juerd Waalboer
E R skribis 2007-10-18 16:21 (-0500):
> I should have added that in my presentation I am attempting to present
> Perl strings from a character set agnostic perspective.
That is silly, because Perl itself is not at all character set agnostic. It has unicode strings and it has binary strings, but t

Re: good name for characters matching [^\0-\377]?

2007-10-18 Thread Juerd Waalboer
John Delacour skribis 2007-10-18 20:24 (+0100):
> >They are "characters outside the latin-1 range".
> Latin-1 has nothing to do with it.
Blocks of characters have names in Unicode. One of those names is "Latin-1 Supplement". It has a lot to do with it. However, I was mistaken: "latin-1" in Unico

Re: good name for characters matching [^\0-\377]?

2007-10-18 Thread Thomas L. Shinnick
At 02:24 PM 10/18/2007, John Delacour wrote:
> Juerd Waalboer wrote:
> > E R skribis 2007-10-18 9:50 (-0500):
> > > I'm preparing a presentation about Perl and Unicode support, and I'd like to give a name for characters with ordinals above 255. Is there a good name for that class?
> > They are "characters ou

Re: good name for characters matching [^\0-\377]?

2007-10-18 Thread E R
I should have added that in my presentation I am attempting to present Perl strings from a character set agnostic perspective. So, even though there is a strong bias for Perl to treat character ordinals > 255 as Unicode code-points, I don't want people to automatically think Unicode when encounteri

Re: good name for characters matching [^\0-\377]?

2007-10-18 Thread John Delacour
Juerd Waalboer wrote:
> E R skribis 2007-10-18 9:50 (-0500):
> > I'm preparing a presentation about Perl and Unicode support, and I'd like to give a name for characters with ordinals above 255. Is there a good name for that class?
> They are "characters outside the latin-1 range".
Latin-1 has nothi

Re: good name for characters matching [^\0-\377]?

2007-10-18 Thread Juerd Waalboer
Georg Bauhaus skribis 2007-10-18 17:01 (+0200):
> Isn't it about time to find a good name for crippled character sets
> with ordinals below 256 only?
These are "single byte encodings". I prefer to add the word "legacy" too.
--
Met vriendelijke groet, Kind regards, Korajn salutojn, Juerd Waal

Re: good name for characters matching [^\0-\377]?

2007-10-18 Thread Juerd Waalboer
E R skribis 2007-10-18 9:50 (-0500):
> I'm preparing a presentation about Perl and Unicode support, and I'd
> like to give a name for characters with ordinals above 255. Is there a
> good name for that class?
They are "characters outside the latin-1 range".
> How about "extended characters"???

Re: good name for characters matching [^\0-\377]?

2007-10-18 Thread Martin Hosken
Dear Georg,
> Isn't it about time to find a good name for crippled character sets
> with ordinals below 256 only? Otherwise Unicode characters will
> continue to be considered the special case...
Legacy encodings. Nicely derogatory and generally accepted.
Yours, Martin

Re: good name for characters matching [^\0-\377]?

2007-10-18 Thread Georg Bauhaus
On Thu, 2007-10-18 at 09:50 -0500, E R wrote:
> I'm preparing a presentation about Perl and Unicode support, and I'd
> like to give a name for characters with ordinals above 255. Is there a
> good name for that class?
Isn't it about time to find a good name for crippled character sets with ordin

good name for characters matching [^\0-\377]?

2007-10-18 Thread E R
I'm preparing a presentation about Perl and Unicode support, and I'd like to give a name for characters with ordinals above 255. Is there a good name for that class? How about "extended characters"??? ER
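For reference, the class in the subject line can be tested directly. This is a small sketch only: has_wide_char is a helper name invented here, and the sample strings are arbitrary. It checks whether a string contains any character with an ordinal above 255, and shows that the answer depends on the characters, not on how Perl happens to store them.

```perl
use strict;
use warnings;

# has_wide_char is a hypothetical helper, not part of any module.
# It returns 1 if the string has a character with ordinal above 255.
sub has_wide_char {
    my ($string) = @_;
    return $string =~ /[^\0-\377]/ ? 1 : 0;
}

# All ordinals are at most 255 (U+00E9 is 233).
my $latin1_only = has_wide_char("caf\x{e9}");

# U+0101 has ordinal 257, so the class matches.
my $with_wide = has_wide_char("caf\x{e9}\x{101}");

# Same characters as the first string, but in the upgraded internal form.
my $upgraded = "caf\x{e9}";
utf8::upgrade($upgraded);
my $upgraded_only = has_wide_char($upgraded);

printf "latin1_only=%d with_wide=%d upgraded_only=%d\n",
    $latin1_only, $with_wide, $upgraded_only;
```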