Hi. I had to parse a rather large number of web pages and I needed to do exactly the same thing. What I did was to set:
$/ = "<tag>";

This sets the input record separator (what Perl treats as the end of a line when reading) to whatever is between the quotes. Each read then returns everything up to and including the next occurrence of that string, so in your case you move from one <A> tag to the next <A>. Just be sure that you switch $/ back to a newline when you finish parsing the file. It can play havoc with the rest of your program if you forget.

-Daniel

--- Maqo <[EMAIL PROTECTED]> wrote:
> Is it possible to use HTML::TokeParser to return the raw HTML between
> two start tags (from <A> to <A>, not <A> to </A>), as opposed to just
> the text? My source file contains several blocks of code--containing
> anchor links for each--that I'm trying to extract by section while
> maintaining formatting.
>
> Code:
>
> my $p = HTML::TokeParser->new("file.txt" || die "Can't open file.");
> while (my $t = $p->get_tag("a")) {
>     my $name = $t->[1]{name};
>     next unless $name && ($name eq "anchor");
>     print "$name : " . $p->get_text("a");
>
> Example HTML source:
>
> <A NAME='anchor1'></A><p>Some text and HTML formatting</p><BR>
> <A NAME='anchor2'></A><p>Some text and HTML formatting</p><BR>
> ...
> <A NAME='anchor10'></A><p>Some text and HTML formatting</p><BR>
>
> The above code returns the "Some text and HTML formatting" portions
> nicely, albeit only as text. Is there an easy way to do this using
> HTML::Parser to return the desired portion, with HTML markup included?
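
For reference, here is a minimal sketch of the $/ approach described in the reply, using the sample HTML from the quoted message. The names $html, $sep and @sections are made up for illustration, and the separator '<A ' is an assumption about how the anchor tags start in the real file. Note that $/ is matched literally and case-sensitively, so a file with lowercase <a ...> tags would need a different separator. Using local inside a block is one way to make sure $/ goes back to its old value without having to remember to reset it by hand.

```perl
use strict;
use warnings;

# Sample input modelled on the HTML in the quoted message.
my $html = <<'END_HTML';
<A NAME='anchor1'></A><p>Some text and HTML formatting</p><BR>
<A NAME='anchor2'></A><p>Some text and HTML formatting</p><BR>
END_HTML

# Read from an in-memory filehandle so the sketch is self-contained;
# a real script would open the actual file here instead.
open(my $fh, '<', \$html) or die "Can't open in-memory file: $!";

my $sep = '<A ';    # assumed separator: the literal start of each anchor tag
my @sections;
{
    # local restores the previous value of $/ when this block exits,
    # so the rest of the program still reads line by line.
    local $/ = $sep;
    while (my $chunk = <$fh>) {
        chomp $chunk;                    # chomp strips the current $/ from the end
        next unless length $chunk;       # skip the empty record before the first tag
        push @sections, $sep . $chunk;   # put the separator back at the front
    }
}
close $fh;

# Each element now holds one anchor section with its markup intact.
print "$_---\n" for @sections;
```

Each element of @sections starts at one <A tag and runs up to (but not including) the next one, so the HTML formatting is kept. This is plain string splitting rather than real HTML parsing, so it will also split on '<A ' if that string happens to appear inside a comment or an attribute value.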