Thanks for your info, Rob. But look at what you wrote: "I'm pretty sure", "I understand it to mean", "I believe"... My comments and solutions are based on my _actually_writing_code_ to try to do the things you muse about, and _it_did_not_work_. Don't take this the wrong way, Rob; I just want to make things clear for other people reading this who might run into the same problem and/or be inclined to try it out.
- - Martin

> -----Original Message-----
> From: Rob Dixon [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, September 23, 2003 5:27 PM
> To: [EMAIL PROTECTED]
> Subject: Re: html tree question. clumsy ?
>
> Martin Thurn wrote:
> > I ran into similar problems for my module WWW::Search.
> > No, out-of-the-box you can not re-use an HTML::TreeBuilder object to parse
> > a new file.
>
> I'm pretty sure you're wrong about that. HTML::TreeBuilder subclasses
> HTML::Parser, which provides the 'new', 'parse', 'parse_file', and 'eof'
> methods. The documentation says:
>
>     After $p->eof has been called, the parse() and parse_file() methods
>     can be invoked to feed new documents with the parser object.
>
> which is poor English, but I understand it to mean that, once the 'eof'
> method has been called, any further calls to 'parse' or 'parse_file'
> will create a new HTML tree from scratch.
>
> > BUT you can use the following code as a "reset". I.e. call
> > parse, muck with the tree, do the following four lines, and call parse
> > again. This does the same as new() but without changing the store_comments,
> > store_pis settings, etc:
> >
> >   $self->{'_head'} = $self->insert_element('head',1);
> >   $self->{'_pos'} = undef; # pull it back up
> >   $self->{'_body'} = $self->insert_element('body',1);
> >   $self->{'_pos'} = undef; # pull it back up again
>
> HTML::Parser will itself insert any implicit <html>, <head> and <body>
> tags when further input is parsed.
>
> > The reason you can't re-use your HTML::Element is because it's a
> > reference, and when the tree gets deleted, your Element gets deleted right
> > along with it.
>
> Once you have called $tree->delete the object no longer exists, but I
> believe $tree->delete_content or $tree->eof will allow you to reuse
> the same object for parsing a new document.
>
> Rob