Hi all!

I'm using LWP to fetch pages and HTML::TreeBuilder to extract the information I need. Here is the basic scheme:

    use Encode qw(decode);
    use LWP::UserAgent;
    use HTML::TreeBuilder;

    my $ua   = LWP::UserAgent->new;
    my $r    = $ua->get($url);
    my $html = decode('web_page encoding', $r->content);
    # at this point I have UTF-8 content in $html
    my $tree = HTML::TreeBuilder->new_from_content($html);
    $tree->dump;

In the output I see lots of HTML-encoded entities where I expect text such as "Ссылка н" (for example &#x421; in place of "С"). All of them are valid hex codes of Unicode characters.

What do I need to do to get everything working as expected? I mean that I want to see the characters as they are, not as HTML-encoded entities.

Thanks.

PS: The scheme described above works well for most web pages, but with some of them I get this problem.

-- 
If you think of MS-DOS as mono, and Windows as stereo, then Linux is Dolby Digital and all the music is free...
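PPS: To make clear what I mean by "as is", here is a small stand-alone sketch that uses neither LWP nor HTML::TreeBuilder. The helper decode_numeric_entities is a name I made up for illustration, not part of any module (as far as I know, decode_entities from HTML::Entities is the proper tool for this job). It only turns hex and decimal character references into the characters themselves, which is the kind of text I would like to get out of the tree.

```perl
use strict;
use warnings;

# Print wide characters as UTF-8 instead of getting "Wide character" warnings.
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper, for illustration only: replace numeric character
# references (&#x421; or &#1057;) with the characters they stand for.
# Named entities such as &amp; are deliberately left untouched.
sub decode_numeric_entities {
    my ($text) = @_;
    # One pass over the string, so already decoded text is never decoded twice.
    $text =~ s/&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));/chr(defined $1 ? hex($1) : $2)/ge;
    return $text;
}

# The kind of string I see in the dump output.
my $encoded = '&#x421;&#x441;&#x44B;&#x43B;&#x43A;&#x430; &#x43D;';
my $decoded = decode_numeric_entities($encoded);

print "$decoded\n";
```

Run in a UTF-8 terminal, this should print the Cyrillic text itself rather than the entity codes.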