All - Figured out the problem. Sterling Hanenkamp got me going in the right direction.
Anyway... I was using an abstract example to ask my question, so here is an
explanation and my actual code.

I am working with the Qualys API and I wanted to pull all scan data back from
Qualys so that I can store it and mash it up against other data sources.

The DTD for the Qualys XML is: https://qualysapi.qualys.com/scan-1.dtd
(This will give you the structure of the XML file.)

Here is the basic code that I ended up with. It works on the XML file after
it has been retrieved from Qualys.

*************************************************
#!/usr/bin/perl -w
# Indentation style: 1 tab = 4 spaces

require XML::Twig;

# Handler called by XML::Twig once for each element listed in TwigHandlers.
sub info {
    my ($xml, $info) = @_;
    my $elt = $info;

    if ($elt->is_elt =~ m/(VULN|SERVICE|INFO|PRACTICE)/) {
        printf "VALUE: %s \n", $elt->parent->parent->parent->att("value");
        printf "ENT: %s \n", $elt->is_elt;
    }
    if ($elt->is_elt =~ m/(OS|NETBIOS_HOSTNAME)/) {
        printf "VALUE: %s \n", $elt->parent->att("value");
        printf "ENT: %s \n", $elt->is_elt;
        printf "%s\n", $elt->text;
    }

    # Walk every descendant of the handler element, skipping the text nodes.
    while ($elt= $elt->next_elt($info) ) {
        my $localname = $elt->local_name;
        if ($localname ne '#CDATA' && $localname ne '#PCDATA') {
            printf "%s: ", $localname;
            printf "%s\n", $elt->text;
        }
    }
    printf "\n\n";
}

#===================================================
#Main program section
$xml = new XML::Twig(
    TwigHandlers => {
        SERVICE          => \&info,
        VULN             => \&info,
        OS               => \&info,
        NETBIOS_HOSTNAME => \&info,
        INFO             => \&info,
        PRACTICE         => \&info,
        HEADER           => \&info,
        #_all_ => \&info, # not using _all_ to ignore the toplevel SCAN tag
    },
    error_context => 1,
);

# Parse the XML
$xml->parsefile('sample.xml');
******************************************************************

On Fri, Jun 25, 2010 at 7:31 PM, Daryl Fallin <[email protected]> wrote:
> Hi All ....
>
> I have been trying to work with XML::Twig lately to parse an xml file.
>
> I just want to dump every element/Tag of the xml file.
> But my while loop seems to be doing something weird, or it's the way that
> XML::Twig is working, not sure, but I get duplicate information from the
> original XML file. It's like it is running part of the while loop twice.
>
> I know there are other modules that I could use but I am using XML::Twig
> for other parts of what will be a larger program and I want the chunking
> that XML::Twig allows.
>
> Any help would be greatly appreciated.
>
> Here is my sample code:
>
> #!/usr/bin/perl -w
>
> require XML::Twig;
>
> sub info {
>     my ($xml, $info) = @_;
>     my $elt = $info;
>     while ($elt= $elt->next_elt($info) )
>     {
>         $elt->set_remove_cdata(1);
>         $elt->set_pretty_print("record"); # print one field per line
>         printf "%s\n", $elt->sprint;
>     }
> }
>
> $xml = new XML::Twig(
>     TwigHandlers => {
>         XML_DIZ_INFO => \&info,
>     }
> );
>
> # Parse the XML
> $xml->parsefile('sample.xml');
>
> ************************
>
> sample.xml
> -----------------
> <?xml version="1.0" ?>
> <XML_DIZ_INFO>
>   <MASTER_PAD_VERSION_INFO>
>     <MASTER_PAD_VERSION>1.0</MASTER_PAD_VERSION>
>     <MASTER_PAD_EDITOR>Master Editor here</MASTER_PAD_EDITOR>
>     <MASTER_PAD_INFO>information would go here
> </MASTER_PAD_INFO>
>   </MASTER_PAD_VERSION_INFO>
>   <Company_Info>
>     <Company_Name>Moyea Software Co., Ltd.</Company_Name>
>     <Country>China</Country>
>     <Company_WebSite_URL>http://www.whatever.com
> </Company_WebSite_URL>
>     <Contact_Info>
>       <Author_First_Name>Bob</Author_First_Name>
>       <Author_Last_Name>King</Author_Last_Name>
>       <Author_Email>[email protected]</Author_Email>
>     </Contact_Info>
>   </Company_Info>
> </XML_DIZ_INFO>
>
> ============================================
> The following is the output I get. After the closing </Company_Info> it
> should stop.
> ============================================
>
> <MASTER_PAD_VERSION_INFO>
> <MASTER_PAD_VERSION>1.0</MASTER_PAD_VERSION>
> <MASTER_PAD_EDITOR>Master Editor here</MASTER_PAD_EDITOR>
> <MASTER_PAD_INFO>information would go here </MASTER_PAD_INFO>
> </MASTER_PAD_VERSION_INFO>
>
> <MASTER_PAD_VERSION>1.0</MASTER_PAD_VERSION>
> 1.0
>
> <MASTER_PAD_EDITOR>Master Editor here</MASTER_PAD_EDITOR>
> Master Editor here
>
> <MASTER_PAD_INFO>information would go here </MASTER_PAD_INFO>
> information would go here
>
> <Company_Info>
> <Company_Name>Moyea Software Co., Ltd.</Company_Name>
> <Country>China</Country>
> <Company_WebSite_URL>http://www.whatever.com</Company_WebSite_URL>
> <Contact_Info>
> <Author_First_Name>Bob</Author_First_Name>
> <Author_Last_Name>King</Author_Last_Name>
> <Author_Email>[email protected]</Author_Email>
> </Contact_Info>
> </Company_Info>
>
> <Company_Name>Moyea Software Co., Ltd.</Company_Name>
> Moyea Software Co., Ltd.
>
> <Country>China</Country>
> China
>
> <Company_WebSite_URL>http://www.whatever.com</Company_WebSite_URL>
> http://www.whatever.com
>
> <Contact_Info>
> <Author_First_Name>Bob</Author_First_Name>
> <Author_Last_Name>King</Author_Last_Name>
> <Author_Email>[email protected]</Author_Email>
> </Contact_Info>
>
> <Author_First_Name>Bob</Author_First_Name>
> Bob
>
> <Author_Last_Name>King</Author_Last_Name>
> King
>
> <Author_Email>[email protected]</Author_Email>
> [email protected]
>
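
A note on the duplicate output in the quoted message, offered as a sketch rather than a definitive diagnosis: next_elt($info) visits every descendant of the handler element, text nodes included, and sprint prints each visited node together with its whole subtree. A nested element is therefore printed once inside its parent and again when the loop reaches it, and its text shows up a third time as a bare #PCDATA node. One way to get each value only once is to print just the leaf elements. In the sketch below the inline $sample string is a cut-down stand-in for sample.xml, and dump_leaves and @seen are hypothetical names used only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Assumption: a shortened, inline stand-in for sample.xml from the question.
my $sample = <<'XML';
<XML_DIZ_INFO>
  <Company_Info>
    <Company_Name>Moyea Software Co., Ltd.</Company_Name>
    <Contact_Info>
      <Author_First_Name>Bob</Author_First_Name>
    </Contact_Info>
  </Company_Info>
</XML_DIZ_INFO>
XML

my @seen;    # collected "tag: text" lines, one per leaf element

# Hypothetical handler: record only elements that have no element children.
sub dump_leaves {
    my ($twig, $root) = @_;
    # descendants() returns every node below $root, text nodes included
    for my $elt ($root->descendants) {
        next unless $elt->is_elt;                      # skip #PCDATA / #CDATA nodes
        next if grep { $_->is_elt } $elt->children;    # skip container elements
        push @seen, sprintf "%s: %s", $elt->tag, $elt->text;
    }
}

XML::Twig->new(twig_handlers => { XML_DIZ_INFO => \&dump_leaves })
         ->parse($sample);

print "$_\n" for @seen;
```

Container elements such as Company_Info and Contact_Info are skipped, so only the elements that directly hold text are reported. If the containers themselves need to be shown, printing $elt->tag for them without calling sprint avoids repeating their contents.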
_______________________________________________
kc mailing list
[email protected]
http://mail.pm.org/mailman/listinfo/kc
