Thanks for the info, Rob.  But look at what you wrote: "I'm pretty sure",
"I understand it to mean", "I believe"...
  My comments and solutions are based on my _actually_writing_code_ to try
to do the things you muse about, and _it_did_not_work_.  
  Don't take this the wrong way, Rob; I just want to make things clear for
other people reading this who might run into the same problem and/or be
inclined to try it out.
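
  For anyone who is inclined to try it, here is a minimal sketch of the
route that sidesteps the reuse question altogether: build a fresh
HTML::TreeBuilder for every document and re-apply whatever settings you
need each time.  The helper name title_of and the choice of store_comments
as the setting to carry over are illustrative assumptions only; this is
not code from WWW::Search.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: parse one HTML string into a brand-new tree and
# return the text of its <title> element (undef if there is none).
sub title_of {
    my ($html) = @_;

    my $tree = HTML::TreeBuilder->new;
    $tree->store_comments(1);    # settings must be re-applied on each new object

    $tree->parse($html);
    $tree->eof;                  # required after parse() before using the tree

    my $title_elem = $tree->look_down(_tag => 'title');
    my $title = $title_elem ? $title_elem->as_text : undef;

    $tree->delete;               # release the tree once we have what we need
    return $title;
}

# Two documents, two separate trees.
for my $html (
    '<html><head><title>First</title></head><body><p>one</p></body></html>',
    '<html><head><title>Second</title></head><body><p>two</p></body></html>',
) {
    my $title = title_of($html);
    print defined $title ? $title : '(no title)', "\n";
}
```

  The cost is that every setting has to be repeated per document, which is
exactly what the "reset" lines quoted below are meant to avoid.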

 - - Martin

> -----Original Message-----
> From: Rob Dixon [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, September 23, 2003 5:27 PM
> To: [EMAIL PROTECTED]
> Subject: Re: html tree question. clumsy ?
> 
> 
> Martin Thurn wrote:
> >   I ran into similar problems for my module WWW::Search.
> >   No, out-of-the-box you can not re-use an HTML::TreeBuilder object
> >   to parse a new file.
> 
> I'm pretty sure you're wrong about that. HTML::TreeBuilder subclasses
> HTML::Parser, which provides the 'new', 'parse', 'parse_file', and 'eof'
> methods. The documentation says:
> 
>         After $p->eof has been called, the parse() and parse_file() methods
>         can be invoked to feed new documents with the parser object.
> 
> which is poor English, but I understand it to mean that, once the 'eof'
> method has been called, any further calls to 'parse' or 'parse_file'
> will create a new HTML tree from scratch.
> 
> > BUT you can use the following code as a "reset". I.e. call
> > parse, muck with the tree, do the following four lines, and call parse
> > again.  This does the same as new() but without changing the
> > store_comments, store_pis settings, etc:
> >
> >   $self->{'_head'} = $self->insert_element('head',1);
> >   $self->{'_pos'} = undef;  # pull it back up
> >   $self->{'_body'} = $self->insert_element('body',1);
> >   $self->{'_pos'} = undef;  # pull it back up again
> 
> HTML::Parser will itself insert any implicit <html>, <head> and <body>
> tags when further input is parsed.
> 
> >   The reason you can't re-use your HTML::Element is because it's a
> > reference, and when the tree gets deleted, your Element gets deleted
> > right along with it.
> 
> Once you have called $tree->delete the object no longer exists, but I
> believe $tree->delete_content or $tree->eof will allow you to reuse
> the same object for parsing a new document.
> 
> Rob
> 
> 
