an integer ordinal value.
What happens is the following:
73 6f a0 65 69 6e a0 4b c3-a4 73 65   (UTF8 flag on)
      l1          l1    u8
This is wrong. It is a bug.
--
Met vriendelijke groet, // Kind regards, // Korajn salutojn,
Juerd Waalboer ju...@tnx.nl
TNX
with no Unicode
significance.
The documentation I referred to is outdated. Sorry for that.
Indeed, this documentation is wrong. The current documentation, as of Perl
version 5.8.9 (December 2008), no longer has this paragraph.
--
Met vriendelijke groet, Kind regards, Korajn salutojn,
Juerd Waalboer
Andreas J. Koenig wrote 2009-05-25 8:30 (+0200):
On Sun, 24 May 2009 10:09:25 +0200, Juerd Waalboer
ju...@convolution.nl said:
Although it's safe on output, it's better to get used to using
:encoding(utf8) instead of :utf8. Using :utf8 on input can cause
stability and security
1;
have no idea.
--
Met vriendelijke groet, Kind regards, Korajn salutojn,
Juerd Waalboer: Perl hacker [EMAIL PROTECTED] http://juerd.nl/sig
Convolution: ICT solutions and consultancy [EMAIL PROTECTED]
1;
have to differ on this :-)
Yes, although my opinion on this is not strong. undef or replacement
character - both are good options. One argument in favor of the
replacement character would be backwards compatibility.
--
Met vriendelijke groet, Kind regards, Korajn salutojn,
Juerd Waalboer: Perl
/?node_id=644786
For input, both get the correct characters, assuming the input
bytestream was indeed correct.
Yes, but if the bytestream is incorrect, you may have a security issue
if you used :utf8 instead of :encoding.
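A minimal sketch of the difference, assuming the Encode module and an in-memory filehandle standing in for a real file. The variable names are made up for illustration, and the strict 'UTF-8' encoding name is used here in place of the 'utf8' spelling from the message above. Malformed input is rejected by a validating decode instead of being passed through:

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

my $good = "K\xc3\xa4se";   # valid UTF-8 bytes for "Käse"
my $bad  = "K\xc3se";       # 0xC3 is not followed by a continuation byte

# Validating decode: dies on malformed input (FB_CROAK) and leaves
# the source string untouched (LEAVE_SRC).
my $text   = decode('UTF-8', $good, FB_CROAK | LEAVE_SRC);
my $bad_ok = eval { decode('UTF-8', $bad, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;

# The same validation as an I/O layer; the scalar reference is an
# in-memory "file" used only to keep the sketch self-contained.
open my $fh, '<:encoding(UTF-8)', \$good or die "open: $!";
my $line = <$fh>;
close $fh;

printf "decoded length: %d\n", length $text;
printf "malformed input accepted: %s\n", $bad_ok ? "yes" : "no";
```

With a real file, the three-argument open would take a path instead of the scalar reference.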
--
Met vriendelijke groet, Kind regards, Korajn salutojn,
Juerd Waalboer
to a float, whenever that is needed.)
--
Met vriendelijke groet, Kind regards, Korajn salutojn,
Juerd Waalboer: Perl hacker [EMAIL PROTECTED] http://juerd.nl/sig
Convolution: ICT solutions and consultancy [EMAIL PROTECTED]
that is neither complete nor accurate, but it provides more information
than most documentation does. Unfortunately I lack tuits to send bug
reports and make patches.
--
Met vriendelijke groet, Kind regards, Korajn salutojn,
Juerd Waalboer: Perl hacker [EMAIL PROTECTED] http://juerd.nl/sig
Georg Bauhaus wrote 2007-10-18 17:01 (+0200):
Isn't it about time to find a good name for crippled character sets
with ordinals below 256 only?
These are single-byte encodings. I prefer to add the word "legacy"
too.
--
Met vriendelijke groet, Kind regards, Korajn salutojn,
Juerd Waalboer
proof) to work around this problem by
using the Unicode::Semantics module's up() function, or the built-in
utf8::upgrade().
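As a rough illustration of the built-in route (the variable names here are only for the example), utf8::upgrade() changes the internal representation of a string in place, while the characters themselves stay the same:

```perl
use strict;
use warnings;

my $s    = "K\xe4se";   # 4 characters; U+00E4 stored as one byte internally
my $copy = $s;          # untouched copy for comparison

my $flag_before = utf8::is_utf8($s) ? 1 : 0;
utf8::upgrade($s);      # in place: switch to the internal UTF-8 representation
my $flag_after  = utf8::is_utf8($s) ? 1 : 0;

# The string still holds the same characters and has the same length.
my $same = ($s eq $copy) ? 1 : 0;

print "flag before: $flag_before, flag after: $flag_after\n";
print "same characters: $same, length: ", length($s), "\n";
```

Only the storage changes, so code that compares or measures the string sees no difference; what changes is which semantics Perl applies in operations that are sensitive to the internal flag.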
--
Met vriendelijke groet, Kind regards, Korajn salutojn,
Juerd Waalboer: Perl hacker [EMAIL PROTECTED] http://juerd.nl/sig
Convolution: ICT solutions
E R wrote 2007-10-17 15:56 (-0500):
for (my $i = 0; $i < length($x); $i++) {
    $new .= chr(ord(substr($x, $i, 1)));
}
utf8::downgrade();
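A short sketch of how that call might be used in place of the loop, assuming $x contains only characters below 256; the second string and the variable names are illustrative. utf8::downgrade() works in place, and a true second argument makes it return false instead of dying when the string cannot be downgraded:

```perl
use strict;
use warnings;

my $x = "caf\xe9";
utf8::upgrade($x);                        # simulate a string that arrived upgraded
my $was_upgraded = utf8::is_utf8($x) ? 1 : 0;

utf8::downgrade($x);                      # in place; dies if a character is > 255
my $is_upgraded  = utf8::is_utf8($x) ? 1 : 0;

# A character above 255 cannot be downgraded. With a true second
# argument the call returns false instead of dying.
my $wide    = "\x{263A}";
my $wide_ok = utf8::downgrade($wide, 1) ? 1 : 0;

print "upgraded before: $was_upgraded, after: $is_upgraded\n";
print "wide string downgraded: $wide_ok\n";
```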
--
Met vriendelijke groet, Kind regards, Korajn salutojn,
Juerd Waalboer: Perl hacker [EMAIL PROTECTED] http://juerd.nl/sig
Convolution
of the bytestring is seen as a single
ISO-8859-1 character, so a multi-byte UTF8 sequence will *not* be
interpreted as a single character.
Perhaps helpful: http://tnx.nl/perlunitut,perlunifaq
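To make the point above concrete, here is a small sketch that assumes the Encode module; the sample string is an arbitrary choice. Until it is decoded, a byte string is counted byte by byte, so the two-byte UTF-8 sequence for "ä" is seen as two characters:

```perl
use strict;
use warnings;
use Encode qw(decode);

my $bytes = "K\xc3\xa4se";                 # UTF-8 encoded bytes for "Käse"

my $n_bytes = length $bytes;               # 5: each byte counts as one character
my $second  = ord substr($bytes, 1, 1);    # 0xC3, seen as U+00C3

my $text    = decode('UTF-8', $bytes);     # now a text string
my $n_chars = length $text;                # 4: "ä" is a single character

printf "undecoded: %d characters, second is U+%04X\n", $n_bytes, $second;
printf "decoded:   %d characters, second is U+%04X\n",
    $n_chars, ord substr($text, 1, 1);
```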
--
Met vriendelijke groet, Kind regards, Korajn salutojn,
Juerd Waalboer: Perl hacker [EMAIL PROTECTED