I've managed to sort this out. In case anyone is interested, I replaced the recursive algorithm with an iterative one, and that solved my problem. The working code is as follows:
#Gets the *next* text token in tree
#Nb. Using POE
sub traverse_tree {
    my @pile = @{$_[HEAP]{PILE}};   #retrieve pile - dereference array
                                    #pile is a queue of elements awaiting
                                    #processing
    my $stack = $_[HEAP]{SEEN};     #seen queue - elements already done
    my $output = '';

    #Quit on End of Tree OR when a text token is found
    while ( @pile and $output eq '' ) {
        #Text
        if ( !ref $pile[0] ) {
            $output .= shift @pile;
        }
        #An HTML element not previously encountered
        elsif ( !$stack->EXISTS($pile[0]) ) {
            #Remove this element from pile
            my $temp_obj = shift @pile;
            #Add children to pile
            unshift ( @pile, $temp_obj->content_list );
            #Place this element onto seen stack
            $stack->Push( $temp_obj=>1 );
        }
        #A previously encountered element - search for next element
        else {
            #remove the offending item
            my $temp_obj = shift @pile;
            #search depth first for 'next' unseen element
            until ( @pile ) {
                #get parent
                $temp_obj = $temp_obj->parent();
                #stop at the root, which has no parent
                last unless defined $temp_obj;
                #add to head of pile if not previously seen
                unshift @pile, $temp_obj unless $stack->EXISTS($temp_obj);
            }
        }
    }
    #Preserve pile
    $_[HEAP]{PILE} = \@pile;
    #Return first text
    return $output;
}
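
For anyone who wants to try the idea without POE or HTML::TreeBuilder, here is a minimal sketch of the same iterative pre-order walk. It is not the code I actually run: the tree is a hypothetical stand-in made of plain hash refs (a 'content' array per element, plain strings for text), and make_text_iterator is a name I made up for illustration. With a real HTML::Element you would presumably call $node->content_list instead of reading the 'content' key.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an HTML::Element tree: an element is a hash ref
# with a 'content' array; a text node is a plain string.
my $tree = {
    tag     => 'body',
    content => [
        { tag => 'h1', content => ['Title'] },
        {   tag     => 'p',
            content => [ 'Some ', { tag => 'b', content => ['bold'] }, ' text' ],
        },
    ],
};

# Returns an iterator. Each call yields the next text node in pre-order,
# or undef once the tree is exhausted. The pile survives between calls
# because the closure holds on to it.
sub make_text_iterator {
    my ($root) = @_;
    my @pile = ($root);    # nodes still to visit, leftmost first
    return sub {
        while (@pile) {
            my $node = shift @pile;          # always remove the head first
            return $node unless ref $node;   # text node: stop here
            # element: its children are visited before its siblings
            unshift @pile, @{ $node->{content} || [] };
        }
        return;    # end of tree
    };
}

my $next_text = make_text_iterator($tree);
my @tokens;
while ( defined( my $text = $next_text->() ) ) {
    push @tokens, $text;
}
print join( '|', @tokens ), "\n";    # prints: Title|Some |bold| text
```

Because each element is shifted off the pile before its children are unshifted, the pile only ever holds unvisited nodes, so this sketch needs neither a seen list nor a climb back up through the parents.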
James Brown wrote:
Dear All,
I have some HTML stored in a tree structure (courtesy of HTML::TreeBuilder) and now need to traverse this tree in pre-order.
Basically, I need to get the text-only content of the tree, starting from a specified node, but stop the traversal as soon as the first text is found (resuming from that node on the next call).
The $node->as_text() method won't help me because it doesn't stop when it reaches the *first* text content - it simply dumps all the text below the specified node.
I have attempted some code, but unfortunately, it gets stuck in an infinite loop. It can traverse a single branch, but it won't navigate up the tree to try the next branch to the right:
#should return the next text-only content in the tree
sub get_next_text {
    my $tree = shift;
    my ( @pile ) = ( $tree );   # HTML::TreeBuilder/Element obj array
                                # Where we start our traversal
    my $text = '';

    while ( @pile and !$exit ) {
        #######################Debug#####################
        foreach ( @pile ) {
            print $_, '=>', $_->tag(), "\n" if ref($_);
            print $_, "\n" if !ref($_);
        }
        print "\n**********\n";
        sleep 5;
        #################################################

        if ( !ref($pile[0]) ) {
            # Text only - store
            $text .= shift @pile;
            $exit = 1;   # all done
        }
        else {
            # Children - enqueue them
            unshift @pile, @{$pile[0]->{'_content'}};
        }
    };
    print "\n$text\n";
    return $pile[0];
}
Don't feel obliged to help me, but if you get a spare minute, I'd definitely appreciate it ;-)
Thanks,
James.
_______________________________________________ ActivePerl mailing list [EMAIL PROTECTED] To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
