\G problems in 5.10

2009-06-04 Thread Martin Hosken
Dear All, Is this me or is it a problem in 5.10? Code that previously worked for me in 5.8 has stopped working in 5.10. The best way to show this (not that I have 5.8 now) is that: perl -e 'use utf8; $t=abc; pos($t) = 1; print scalar $t =~ m/a\Gb/gcs;' prints 1 and: perl -e 'use utf8;

Re: loading necessary module to Interpreter

2008-09-11 Thread Martin Hosken
Dear Gemma, I have downloaded ActiveState for Windows, and cannot get scripts to run other than the perl -v and perl -h commands. The hello program was saved to the desktop in a folder called perlscripts. The commands typed in the console window are as follows. cd \desktop cd \perlscripts

Re: good name for characters matching [^\0-\377]?

2007-10-18 Thread Martin Hosken
Dear Georg, Isn't it about time to find a good name for crippled character sets with ordinals below 256 only? Otherwise Unicode characters will continue to be considered the special case... Legacy encodings. Nicely derogatory and generally accepted. Yours, Martin

Re: CGI::Util unescape() after escape() loses utf8 flag

2005-09-28 Thread Martin Hosken
. Perhaps it might be time to add an optional argument to escape, that would allow for creating the u form? It would be simple enough to do, I'd expect. Isn't this what the first match is doing? And should that be /u[0-9a-fA-F]{4,6}/ to allow for multi-lingual plane stuff? Martin

Re: AL32UTF8

2004-04-30 Thread Martin Hosken
Dear Tim, CESU-8 defines an encoding scheme for Unicode identical to UTF-8 except for its representation of supplementary characters. In CESU-8, supplementary characters are represented as six-byte sequences resulting from the transformation of each UTF-16 surrogate code unit into an eight-bit