I was wondering if there are any other good examples of when Perl will
silently upgrade (as in utf8::upgrade) a string. For instance, Perl will
do this when you concatenate a non-utf8 string with a utf8 string:
$a = "Hello"; # utf8 flag not set
$b = chr(1024);
$a .= $b; # $a now has its
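
A minimal sketch of the concatenation case above, using utf8::is_utf8 to look at the internal flag before and after. The variable names $x and $y are mine (chosen to avoid the sort variables $a and $b); the behaviour shown is what I would expect from a reasonably modern Perl.

```perl
use strict;
use warnings;

my $x = "Hello";      # plain ASCII literal, utf8 flag not set
my $y = chr(1024);    # ordinal above 255, so this string has the utf8 flag set

my $before = utf8::is_utf8($x) ? 1 : 0;
$x .= $y;             # concatenating with a utf8 string upgrades $x
my $after  = utf8::is_utf8($x) ? 1 : 0;

# Only the flags and the character count are printed, to avoid
# "Wide character" warnings on STDOUT.
print "before=$before after=$after length=", length($x), "\n";
```

The character count is unaffected by the upgrade; only the internal representation changes.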
On 10/19/07, Juerd Waalboer [EMAIL PROTECTED] wrote:
E R skribis 2007-10-19 17:14 (-0500):
So it seems that in light of this one should always use Encode::encode with
these modules to ensure the data is represented the way you want it.
Encode::encode, Encode::encode_utf8, or utf8::encode
On 10/22/07, Juerd Waalboer [EMAIL PROTECTED] wrote:
There's an alternative way of viewing this: there are two types of
strings: binary and text. If you encode text, you get binary.
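
To make the text/binary view concrete, here is a small sketch (my own example string, not from the thread) that encodes a four-character text string into UTF-8 octets and decodes it back.

```perl
use strict;
use warnings;
use Encode ();

my $text   = "caf" . chr(0xE9);                # text: 4 characters
my $binary = Encode::encode('UTF-8', $text);   # binary: 5 octets

my $round  = Encode::decode('UTF-8', $binary); # back to text

print "text length:   ", length($text), "\n";
print "binary length: ", length($binary), "\n";
print "round trip ok: ", ($round eq $text ? "yes" : "no"), "\n";
```

The encoded result is a string of octets, so its length counts bytes rather than characters.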
I think I'm trying to make a slightly different point: part of what
Encode::encode MUST do is to create a Perl
On 10/17/07, Juerd Waalboer [EMAIL PROTECTED] wrote:
utf8::downgrade();
Thanks!
I should have added that in my presentation I am attempting to present
Perl strings from a character set agnostic perspective. So, even
though there is a strong bias for Perl to treat character ordinals > 255
as Unicode code points, I don't want people to automatically think
Unicode when
Is this regex:
$has_wide = ($str =~ m/[^\0-\377]/);
the same as this function?
sub has_wide {
    my $str = shift;
    for my $i (0..length($str)-1) {
        # true as soon as any character has an ordinal above 255
        return 1 if (ord(substr($str, $i, 1)) >= 256);
    }
    0;
}
but this doesn't seem to work:
$has_wide = ($str =~ m/[\x{100}-]/);
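
One way to check the first question empirically is to run the negated-class regex and the loop side by side on a few strings. This is only a sketch: the helper names has_wide_loop and has_wide_regex and the sample strings are mine, and agreement on a handful of samples is evidence, not proof.

```perl
use strict;
use warnings;

# Loop version: scan character by character for an ordinal >= 256.
sub has_wide_loop {
    my $str = shift;
    for my $i (0 .. length($str) - 1) {
        return 1 if ord(substr($str, $i, 1)) >= 256;
    }
    return 0;
}

# Regex version: any character outside the range 0..255 (octal \0 to \377).
sub has_wide_regex {
    my $str = shift;
    return $str =~ m/[^\0-\377]/ ? 1 : 0;
}

my @samples = ("", "plain", "caf" . chr(0xE9), "x" . chr(0x100), chr(1024) . "y");
my @results = map { [ has_wide_loop($_), has_wide_regex($_) ] } @samples;

for my $i (0 .. $#samples) {
    printf "sample %d: loop=%d regex=%d\n", $i, @{ $results[$i] };
}
```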
Hello,
I need an efficient way to do this:
my $buf;
sub append {
    my $x = shift;
    my $new;
    # rebuild $x one character at a time
    for (my $i = 0; $i < length($x); $i++) {
        $new .= chr(ord(substr($x, $i, 1)));
    }
    $buf .= $new;
}
In practice, $buf will not have its utf8 flag set, and $x may have it
set, but will not contain
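
A possible shortcut, sketched under the assumption that $x never holds an ordinal above 255: downgrade a copy of the argument with utf8::downgrade and append that, instead of rebuilding the string one character at a time. The test string is my own.

```perl
use strict;
use warnings;

my $buf = '';

sub append {
    my $x = shift;          # a copy, so the caller's string is left untouched
    utf8::downgrade($x);    # dies if $x holds any ordinal above 255
    $buf .= $x;             # both sides are non-utf8, so $buf stays non-utf8
}

my $flagged = "caf" . chr(0xE9);
utf8::upgrade($flagged);    # same characters, but the utf8 flag is now set

append("abc");
append($flagged);

print "length=", length($buf),
      " flag=", (utf8::is_utf8($buf) ? 1 : 0), "\n";
```

If wide characters can turn up after all, utf8::downgrade($x, 1) returns false instead of dying, which leaves room for a fallback.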
Just a couple of questions:
1. What is the result of Encode::encode("iso-8859-1", $x) if $x is not
a utf8 string (i.e. Encode::is_utf8($x) returns false)?
2. What is the result of $string = decode("iso-8859-1", $octets) if
$octets is a utf8 string?
Thanks!
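
Both questions can be probed with a short experiment: take the same characters once with the utf8 flag off and once with it on, and compare what encode and decode return. This is a sketch with a sample string of my own; as far as I can tell the results depend on the characters in the string, not on the internal flag.

```perl
use strict;
use warnings;
use Encode ();

my $down = "caf" . chr(0xE9);   # utf8 flag off
my $up   = $down;
utf8::upgrade($up);             # same characters, utf8 flag on

# Question 1: encode a string whose flag is off (and, for comparison, on).
my $enc_down = Encode::encode('iso-8859-1', $down);
my $enc_up   = Encode::encode('iso-8859-1', $up);

# Question 2: decode "octets" whose flag is on (and, for comparison, off).
my $dec_down = Encode::decode('iso-8859-1', $down);
my $dec_up   = Encode::decode('iso-8859-1', $up);

print "encode results equal: ", ($enc_down eq $enc_up ? "yes" : "no"), "\n";
print "decode results equal: ", ($dec_down eq $dec_up ? "yes" : "no"), "\n";
```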
the overhead to constantly look up the
encoder sub for every fragment of HTML I need to escape.
Thanks...
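
If the concern is the per-call lookup of the encoding, one option is to resolve it once with Encode::find_encoding and reuse the returned object. A minimal sketch, assuming iso-8859-1 output; encode_fragment is a hypothetical helper name, not something from the original code.

```perl
use strict;
use warnings;
use Encode ();

# Look the encoding up once, outside the hot path.
my $encoder = Encode::find_encoding('iso-8859-1')
    or die "encoding not available";

# Hypothetical helper: encode one fragment with the cached encoding object.
sub encode_fragment {
    my $text = shift;
    return $encoder->encode($text);
}

my @octets = map { encode_fragment($_) } ("caf" . chr(0xE9), "<b>bold</b>");

print "fragments encoded: ", scalar(@octets), "\n";
```

Whether this saves anything measurable would need benchmarking against the actual workload.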
On 10/15/07, Juerd Waalboer [EMAIL PROTECTED] wrote:
E R skribis 2007-10-15 16:25 (-0500):
1. What is the result of Encode::encode("iso-8859-1", $x) if $x is not
a utf8 string (i.e. Encode