Is it possible to use HTML::TokeParser to return the raw HTML between two start tags (from one <A> to the next <A>, not from <A> to </A>), as opposed to just the text? My source file contains several blocks of HTML, each introduced by a named anchor, and I'm trying to extract them section by section while preserving the formatting.


Code:

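use strict;
use warnings;
use HTML::TokeParser;

# The "|| die" must sit outside the call; inside it, the string is always true.
my $p = HTML::TokeParser->new("file.txt") || die "Can't open file: $!";
while (my $t = $p->get_tag("a")) {
    my $name = $t->[1]{name};
    next unless $name && $name =~ /^anchor/;   # anchor1, anchor2, ...
    print "$name : " . $p->get_text("a") . "\n";
}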

Example HTML source:

<A NAME='anchor1'></A><p>Some text and HTML formatting</p><BR>
<A NAME='anchor2'></A><p>Some text and HTML formatting</p><BR>
...
<A NAME='anchor10'></A><p>Some text and HTML formatting</p><BR>

The above code returns the "Some text and HTML formatting" portions nicely, albeit only as text. Is there an easy way to do this with HTML::Parser so that the desired portion is returned with its HTML markup included?
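
For reference, here is a minimal sketch of one direction that might work. It assumes that walking the document with get_token (rather than get_tag/get_text) is acceptable, since each token carries its original source text. The inline sample string, the %section hash, and the /^anchor/ match are illustrative assumptions and not part of my real script.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Sample input copied from the example above (two blocks only).
my $html = <<'HTML';
<A NAME='anchor1'></A><p>Some text and HTML formatting</p><BR>
<A NAME='anchor2'></A><p>Some text and HTML formatting</p><BR>
HTML

# A reference to a string is parsed as the document itself.
my $p = HTML::TokeParser->new(\$html) || die "Can't parse document";

my %section;   # hypothetical store: anchor name => raw HTML following it
my $current;   # name of the anchor whose section is being collected

while (my $t = $p->get_token) {
    my $type = $t->[0];

    # A named <a> start tag opens a new section.
    if ($type eq 'S' && $t->[1] eq 'a'
        && defined $t->[2]{name} && $t->[2]{name} =~ /^anchor/) {
        $current = $t->[2]{name};
        $section{$current} = '';
        next;
    }
    next unless defined $current;

    # The original source text is the second element for text tokens
    # and the last element for every other token type.
    $section{$current} .= $type eq 'T' ? $t->[1] : $t->[-1];
}

print "$_ : $section{$_}" for sort keys %section;
```

Each collected section here begins with the anchor's own closing </A>, which would need to be skipped or stripped if only the content after the anchor is wanted. The plain sort is lexical, so with ten anchors anchor10 would print before anchor2.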



