>>>>> "Pedro" == Pedro Proença <[EMAIL PROTECTED]> writes:

Pedro> Hi all,
Pedro> When I pass the following string to HTML::Parser::parse()

Pedro> "String containing entities to be replaced, for instance &uarr2;a";

Pedro> this is what I get in my text handler:

Pedro> "String containing entities to be replaced, for instance"

Pedro> I am using Perl 5.6.0 on Mandrake Linux 8.0 (kernel 2.4.3-20mdk) and
Pedro> the latest HTML::Parser version (3.25).
Pedro> Is this a known problem?  Is there any workaround for it?

    $ perl
    use HTML::Parser;
    my @a;
    my $p = HTML::Parser->new( handlers => { text => [\@a, "text" ] });
    $p->parse("String containing entities to be replaced, for instance &uarr2;a");
    $p->eof;

    print map "[$_->[0]]", @a;
    ^D
    [String containing entities to be replaced, for instance][ &uarr2;a]
    $ 

Looks fine to me.  Try that example.  Notice that the text handler is
called twice, so the text arrives in two pieces.  That's expected unless
you also call $p->unbroken_text(1) before parsing.
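
For comparison, here is a minimal sketch of the same test with unbroken_text
turned on.  The names $input and @chunks are just illustrative, and I'm
assuming an HTML::Parser 3.x install; with the option set, the text should
reach the handler as one chunk instead of two.

```perl
use strict;
use warnings;
use HTML::Parser;

# Same test string as in the transcript above.
my $input = "String containing entities to be replaced, for instance &uarr2;a";

# Collect each text event as an array ref pushed onto @chunks.
my @chunks;
my $p = HTML::Parser->new( handlers => { text => [\@chunks, "text"] } );

# Ask the parser to buffer adjacent text so it is reported in one piece.
$p->unbroken_text(1);

$p->parse($input);
$p->eof;    # flushes any text still buffered

print map("[$_->[0]]", @chunks), "\n";
```

Note that the "text" argspec hands back the source text as-is, entities and
all.  If you want entities decoded, ask for "dtext" instead.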

print "Just another Perl hacker,";
-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[EMAIL PROTECTED]> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
