Just a small message to say that my module XML::Reader is on CPAN: http://search.cpan.org/~keichner/XML-Reader-0.37/lib/XML/Reader.pm
It is most useful for reading XML sequentially: it uses constant memory, even with huge XML files. Feedback is most welcome.

Here is an example from the documentation:

    use XML::Reader;

    # $line2 holds the XML input shown further down
    my $rdr = XML::Reader->new(\$line2, {filter => 5},
      { root => 'customer', branch => ['/@name', '/street', '/city'] },
      { root => 'p',        branch => '*' },
    );

    my $out0 = '';
    my $out1 = '';

    while ($rdr->iterate) {
        if ($rdr->rx == 0) {
            # first root/branch definition: one value per branch entry
            my @rv = $rdr->value;
            $out0 .= sprintf " Cust: Name = %-7s Street = %-12s City = %s\n", $rv[0], $rv[1], $rv[2];
        }
        elsif ($rdr->rx == 1) {
            # second root/branch definition: the whole <p> subtree as text
            $out1 .= " P: ".$rdr->value."\n";
        }
    }

    print "output0:\n$out0\n";
    print "output1:\n$out1\n";

Given the following XML structure as input:

    my $line2 = q{
    <data>
      <supplier>ggg</supplier>
      <customer name="o'rob" id="444">
        <street>pod alley</street>
        <city>no city</city>
      </customer>
      <customer1 name="troy" id="333">
        <street>one way</street>
        <city>any city</city>
      </customer1>
      <tcustomer name="nbc" id="777">
        <street>away</street>
        <city>acity</city>
      </tcustomer>
      <supplier>hhh</supplier>
      <zzz>
        <customer name='"sue"' id="111">
          <street>baker street</street>
          <city>sidney</city>
        </customer>
      </zzz>
      <order>
        <database>
          <customer name="&lt;smith&gt;" id="652">
            <street>high street</street>
            <city>boston</city>
          </customer>
          <customer name="&amp;jones" id="184">
            <street>maple street</street>
            <city>new york</city>
          </customer>
          <customer name="stewart" id="520">
            <street> ring road </street>
            <city> "'&amp;&lt;A&gt;'" </city>
          </customer>
        </database>
      </order>
      <dummy value="ttt">test</dummy>
      <supplier>iii</supplier>
      <supplier>jjj</supplier>
      <p>
        <p>b1</p>
        <p>b2</p>
      </p>
      <p>
        b3
      </p>
    </data>
    };

This is the output:

    output0:
     Cust: Name = o'rob   Street = pod alley    City = no city
     Cust: Name = "sue"   Street = baker street City = sidney
     Cust: Name = <smith> Street = high street  City = boston
     Cust: Name = &jones  Street = maple street City = new york
     Cust: Name = stewart Street = ring road    City = "'&<A>'"

    output1:
     P: <p><p>b1</p><p>b2</p></p>
     P: <p>b3</p>
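A side note on the layout of output0: the columns come from the sprintf format in the example, where `%-7s` and `%-12s` left-justify the name and the street and pad them with spaces to 7 and 12 characters. The following is only a small sketch of that formatting step in plain Perl, without XML::Reader; `format_customer` is a hypothetical helper name used for illustration and is not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Reader): formats one customer
# line with the same sprintf format string as the example above.
sub format_customer {
    my ($name, $street, $city) = @_;

    # %-7s  : name, left-justified, padded to at least 7 characters
    # %-12s : street, left-justified, padded to at least 12 characters
    # %s    : city, printed as is
    return sprintf " Cust: Name = %-7s Street = %-12s City = %s\n",
        $name, $street, $city;
}

print format_customer("o'rob",  'pod alley',    'no city');
print format_customer('&jones', 'maple street', 'new york');
```

Values longer than the given width are not truncated, so a longer name would simply push the following columns to the right.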