Re: HTML::Entities and unicode

2013-01-08 Thread Victor Efimov
Hm, it seems my previous comment was wrong. $ perl -e 'use Devel::Peek; use HTML::Entities; $str = "&nbsp;"; HTML::Entities::decode_entities( $str); print Dump($str)' SV = PV(0xc7fb78) at 0xca35b0 REFCNT = 1 FLAGS = (POK,pPOK) PV = 0xc9db30 "\240"\0 CUR = 1 LEN = 8 (bytes string, ISO-8859-1, co

Re: HTML::Entities and unicode

2013-01-08 Thread Vangelis Katsikaros
Hi again, I checked Entities.pm, lines 223 and 230: the entities hash is populated by chr(). In the chr() 128-255 range something doesn't seem to work well. For the uuml entity (U+00FC): === perl -e 'use Devel::Peek; $t = chr(252); Dump($t)' SV =

Re: HTML::Entities and unicode

2013-01-08 Thread Victor Efimov
So, sometimes it returns a correct UTF-8 character string: perl -e 'use open qw/:std :utf8/; use Encode; use Devel::Peek; use HTML::Entities; $str = "€ "; HTML::Entities::decode_entities( $str ); print Dump($str)' SV = PV(0xd67b78) at 0xd95220 REFCNT = 1 FLAGS = (POK,pPOK,UTF8) PV = 0xd85b60 "\

Re: HTML::Entities and unicode

2013-01-08 Thread Vangelis Katsikaros
Hi Victor :) Yes, this is definitely needed if I want to "see" the character in my console properly. However, I am looking at the bytes too. Indeed, Devel::Peek is a much better way to see things properly, thanks! $

Re: HTML::Entities and unicode

2013-01-08 Thread Victor Efimov
Hi, Vangelis =) try perl -e 'use open qw/:std :utf8/; use Encode; use Data::Dumper; use HTML::Entities; $str = " "; HTML::Entities::decode_entities( $str ); print Dumper($str)' perl -e 'use Encode; use Devel::Peek; use HTML::Entities; $str = " "; HTML::Entities::decode_entities( $str ); print Dum