After Dan's string patch got merged to head (thanks to Will Coleda for
sending me a diff), I've put in some more string stuff with these new
opcodes:
* charset, charsetname, find_charset
* is_whitespace, is_digit, is_wordchar, is_punctuation, is_newline
* find_whitespace, find_digit,
On Jun 14, 2004, at 1:54 PM, Dan Sugalski wrote:
Parrot provides code points for all graphemes, even for those
character sets/encodings which don't inherently do so. Most sets that
have variable-length encodings use an escape sequence scheme--the
value of the first byte in a character determines
Dan Sugalski wrote:
Synthesized code points
===
Parrot provides code points for all graphemes, even for those
character sets/encodings which don't inherently do so. Most sets that
have variable-length encodings use an escape sequence scheme--the
value
At 8:41 PM -0700 6/14/04, Brent 'Dax' Royal-Gordon wrote:
Sorry to reply to this, but I feel that this is a request for
clarifications, not for a change. :^)
Dan Sugalski wrote:
Synthesized code points
===
...
becomes two integers, 0x0041 and 0x82A9. (Though it could
At 4:04 PM +0200 6/15/04, Leopold Toetsch wrote:
Dan Sugalski wrote:
Synthesized code points
===
Parrot provides code points for all graphemes, even for those
character sets/encodings which don't inherently do so. Most sets that
have variable-length
At 4:33 PM -0700 6/15/04, Damien Neil wrote:
On Jun 14, 2004, at 1:54 PM, Dan Sugalski wrote:
Parrot provides code points for all graphemes, even for those
character sets/encodings which don't inherently do so. Most sets that
have variable-length encodings use an escape sequence scheme--the
value
The official, 1.0, final version, modulo a more correct name for
'grapheme', or spelling/grammar errors.
Do please note that whatever objection you may have to this has at
least three people who disagree differently, and one or more (who
aren't me) who agree with what you disagree with. Also
Sorry to reply to this, but I feel that this is a request for
clarifications, not for a change. :^)
Dan Sugalski wrote:
Synthesized code points
===
...
becomes two integers, 0x0041 and 0x82A9. (Though it could
represent them as 16-bit integers, since no character