[CVS ci] Strings. Finally.

2005-02-28 Thread Leopold Toetsch
After Dan's string patch got merged to head (thanks to Will Coleda for sending me a diff), I've put in some more string stuff with these new opcodes:
* charset, charsetname, find_charset
* is_whitespace, is_digit, is_wordchar, is_punctuation, is_newline
* find_whitespace, find_digit,

Re: Strings. Finally.

2004-06-16 Thread Damien Neil
On Jun 14, 2004, at 1:54 PM, Dan Sugalski wrote: Parrot provides code points for all graphemes, even for those character sets/encodings which don't inherently do so. Most sets that have variable-length encodings use an escape sequence scheme--the value of the first byte in a character determines

Re: Strings. Finally.

2004-06-15 Thread Leopold Toetsch
Dan Sugalski wrote: Synthesized code points === Parrot provides code points for all graphemes, even for those character sets/encodings which don't inherently do so. Most sets that have variable-length encodings use an escape sequence scheme--the value

Re: Strings. Finally.

2004-06-15 Thread Dan Sugalski
At 8:41 PM -0700 6/14/04, Brent 'Dax' Royal-Gordon wrote: Sorry to reply to this, but I feel that this is a request for clarifications, not for a change. :^) Dan Sugalski wrote: Synthesized code points === ... becomes two integers, 0x0041 and 0x82A9. (Though it could

Re: Strings. Finally.

2004-06-15 Thread Dan Sugalski
At 4:04 PM +0200 6/15/04, Leopold Toetsch wrote: Dan Sugalski wrote: Synthesized code points === Parrot provides code points for all graphemes, even for those character sets/encodings which don't inherently do so. Most sets that have variable-length

Re: Strings. Finally.

2004-06-15 Thread Dan Sugalski
At 4:33 PM -0700 6/15/04, Damien Neil wrote: On Jun 14, 2004, at 1:54 PM, Dan Sugalski wrote: Parrot provides code points for all graphemes, even for those character sets/encodings which don't inherently do so. Most sets that have variable-length encodings use an escape sequence scheme--the value

Strings. Finally.

2004-06-14 Thread Dan Sugalski
The official, 1.0, final version, modulo a more correct name for 'grapheme', or spelling/grammar errors. Do please note that whatever objection you may have to this has at least three people who disagree differently, and one or more (who aren't me) who agree with what you disagree with. Also

Re: Strings. Finally.

2004-06-14 Thread Brent 'Dax' Royal-Gordon
Sorry to reply to this, but I feel that this is a request for clarifications, not for a change. :^) Dan Sugalski wrote: Synthesized code points === ... becomes two integers, 0x0041 and 0x82A9. (Though it could represent them as 16-bit integers, since no character