XML parsing with python
Hi all,

I am new to XML and need to parse an XML file. After reading and browsing the web, I found plenty of material, and my guess is that SAX is best suited to my requirement. Could someone provide some sample Python code that I can run to see how the parsing actually happens?

Let's say this is my XML file:

I,Robot 100 Isaac Asimov
Blade Runner 400 Philip K. Dick
Lord Of The Rings 2 Tolkien
XML-Schema Specification 5000 W3C
Aladin 150 Don't know

I need the output to be the contents of the 'title' elements:

I,Robot
Blade Runner
Lord Of The Rings
XML-Schema Specification
Aladin

Your responses are greatly appreciated. Thanks in advance.

--
http://mail.python.org/mailman/listinfo/python-list
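For reference, here is a minimal SAX sketch of what is being asked for. The 'title' element name comes from the post; the enclosing <library> and <book> tags and the helper names (TitleHandler, extract_titles) are assumptions made up for illustration, since the real markup is not shown.

```python
import io
import xml.sax


class TitleHandler(xml.sax.ContentHandler):
    """Collect the text of every <title> element."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.titles = []
        self._in_title = False
        self._chunks = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._chunks = []

    def characters(self, content):
        # SAX may deliver the text of one element in several pieces.
        if self._in_title:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._chunks).strip())
            self._in_title = False


def extract_titles(source):
    """source is a file name or a file-like object."""
    handler = TitleHandler()
    xml.sax.parse(source, handler)
    return handler.titles


# Hypothetical document layout; only <title> is known from the post.
SAMPLE = b"""<library>
  <book><title>I,Robot</title></book>
  <book><title>Blade Runner</title></book>
</library>"""

if __name__ == "__main__":
    for title in extract_titles(io.BytesIO(SAMPLE)):
        print(title)
```

Passing a file name such as "myfile.xml" to extract_titles() instead of the in-memory sample should work the same way.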
Re: XML parsing with python
On Aug 17, 8:31 pm, John Posner wrote:
> > Use the iterparse() function of the xml.etree.ElementTree package.
> >
> > http://effbot.org/zone/element-iterparse.htm
> > http://codespeak.net/lxml/parsing.html#iterparse-and-iterwalk
> >
> > Stefan
>
> iterparse() is too big a hammer for this purpose, IMO. How about this:
>
>     from xml.etree.ElementTree import ElementTree
>     tree = ElementTree(None, "myfile.xml")
>     for elem in tree.findall('//book/title'):
>         print elem.text
>
> -John

Thanks for the prompt reply.

I think I will try iterparse(). Will it be slower than SAX parsing? Ultimately I will have a huge XML file to parse.

Another question: I also need to validate my XML against an XSD, and I would like to do this validation through the parsing tool itself. Is there a Unix utility that does this, or a Python library call? Either would be fine.

Thanks in advance
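A rough sketch of the iterparse() approach for a large file follows. It assumes, as the findall() path above suggests, that each <title> sits inside a <book>; the root tag name and the iter_titles helper are hypothetical. Clearing each <book> once it has been read is what keeps memory use down, although the emptied elements stay attached to the root in this simple version.

```python
import io
import xml.etree.ElementTree as ET


def iter_titles(source):
    """Yield the text of each <title> while the file is still being read."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "title":
            yield elem.text
        elif elem.tag == "book":
            # The book has been fully processed; drop its children and text.
            elem.clear()


# Hypothetical document layout, used only to exercise the function.
SAMPLE = b"""<library>
  <book><title>I,Robot</title></book>
  <book><title>Blade Runner</title></book>
</library>"""

if __name__ == "__main__":
    for title in iter_titles(io.BytesIO(SAMPLE)):
        print(title)
```

iter_titles() also accepts a file name, so iter_titles("myfile.xml") would stream the real file.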
Re: XML parsing with python
On Aug 18, 11:24 am, Stefan Behnel wrote:
> inder wrote:
> > On Aug 17, 8:31 pm, John Posner wrote:
> >>> Use the iterparse() function of the xml.etree.ElementTree package.
> >>> http://effbot.org/zone/element-iterparse.htm
> >>> http://codespeak.net/lxml/parsing.html#iterparse-and-iterwalk
> >>> Stefan
> >> iterparse() is too big a hammer for this purpose, IMO. How about this:
> >>
> >>     from xml.etree.ElementTree import ElementTree
> >>     tree = ElementTree(None, "myfile.xml")
> >>     for elem in tree.findall('//book/title'):
> >>         print elem.text
> >>
> >> -John
> >
> > Thanks for the prompt reply.
> >
> > I think I will try iterparse(). Will it be slower than SAX
> > parsing? Ultimately I will have a huge XML file to parse.
>
> If you use the cElementTree module, it may even be faster.
>
> > Another question: I also need to validate my XML against an XSD. I
> > would like to do this validation through the parsing tool itself.
>
> In that case, you can use lxml instead of ElementTree.
>
> http://codespeak.net/lxml/
>
> Stefan

Hi,

Is lxml part of the standard Python distribution? I have Python 2.5, and I might not be able to use any package beyond the standard library. Could you please suggest something that is part of the standard library?

Thanks
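For context: lxml is a separate third-party package, not part of the standard library, and the standard library itself has no XSD validator. One common pattern, sketched below, is to use lxml when it is installed and fall back to the standard ElementTree otherwise. The document layout and the book_titles helper are assumptions for illustration only.

```python
try:
    # Third-party; only available if it has been installed separately.
    from lxml import etree
    HAVE_LXML = True
except ImportError:
    # Standard library fallback (on Python 2.5, xml.etree.cElementTree
    # is the faster C implementation of the same API).
    import xml.etree.ElementTree as etree
    HAVE_LXML = False


def book_titles(xml_bytes):
    """Return the text of every book/title element under the root."""
    root = etree.fromstring(xml_bytes)
    return [title.text for title in root.findall("book/title")]


# Hypothetical document layout, used only to exercise the function.
SAMPLE = b"""<library>
  <book><title>I,Robot</title></book>
  <book><title>Blade Runner</title></book>
</library>"""

if __name__ == "__main__":
    for title in book_titles(SAMPLE):
        print(title)
```

With this arrangement the parsing code runs on a standard-library-only installation, while schema validation would still need lxml or an external tool such as xmllint with its --schema option.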