From: "Beginner" <[EMAIL PROTECTED]>
> Hi,
>
> I have to do some sanity checks on a large xml file of addresses (snip
> below). I have been using XML::LibXML and seem to have started ok but
> I am struggling to navigate around a record.
>
> In the sample data below you'll see some addresses with "DO NOT..."
> in. I can locate them easily enough but I am struggling to navigate
> back up the DOM to access the code so I can record the code with
> faulty addresses.

A bit late, and again using a different module:

    use XML::Rules;

    # $xml is assumed to hold the XML document from the original post.

    # find the tags and print <code>
    my $parser_find = XML::Rules->new(
        rules => [
            _default => '',
            # append each <line>'s text to the parent's content
            line => sub { $_[1]->{_content} . "\n\t" },
            'code,lines' => 'content',
            address => sub {
                if ($_[1]->{lines} =~ /\s+NOT\s+/) {
                    print $_[1]->{code} . "\n";
                }
                return;
            }
        ],
    );
    $parser_find->parse($xml);

    # filter the <address> tags
    my $parser_remove = XML::Rules->new(
        rules => [
            _default => 'raw',
            line => sub {
                my ($tag, $attrs, $context, $parents) = @_;
                if ($attrs->{_content} =~ /\s+NOT\s+/) {
                    # skip the <lines> and set the attribute
                    # directly in <address>
                    $parents->[-2]{_remove} = 1;
                }
                return [$tag => $attrs];
            },
            address => sub {
                # keep the <address> unless one of its lines flagged it
                return $_[0] => $_[1] unless ($_[1]->{_remove});
                return;
            }
        ],
        style => 'filter',
    );

    my $result;
    open my $FH, '>', \$result
        or die "Cannot open in-memory filehandle: $!";
    $parser_remove->filter($xml, $FH);
    close $FH;
    print $result;
    __END__

The advantage is that this doesn't keep the whole XML document in
memory. It processes each piece as it is read and parsed, which may
make a big difference with huge files.

Jenda
===== [EMAIL PROTECTED] === http://Jenda.Krynicky.cz =====
When it comes to wine, women and song, wizards are allowed
to get drunk and croon as much as they like.
	-- Terry Pratchett in Sourcery