On Fri, Aug 14, 2009 at 5:35 PM, Shawn H. Corey<shawnhco...@gmail.com> wrote:
> Roman Makurin wrote:
>>
>> dump result is html encoded entities:
>>
>> <h4> @0.1.5.1
>> <a class="a01" href="hidden_url" rel="bookmark"
>> title="Ссылка ">@0.1.5.1.0
>>
>> all html entities are valid unicode code points of symbols. But why
>> HTML::TreeBuilder convert symbols to entities ?
>
> Because some browsers do not understand Unicode. Or they didn't.
>
>>
>> If I just do
>> print $content, $/;
>> everything is ok, all symbols are symbols not html encoded entities.
>
> Yes, this output is to your screen, not to a browser, so it's encoding in
> way that would make it readable.
>

I have used this approach with many UTF-8 encoded pages, and the problem
arises only with this page. Why does HTML::TreeBuilder produce
human-readable output in some cases and not in others? I really don't
understand it :(

-- 
If you think of MS-DOS as mono, and Windows as stereo,
then Linux is Dolby Digital and all the music is free...