> For example:
>
> <biological_processess>
> <biological_process>
> Signal transduction
> </biological_process>
> <biological_process>
> Energy process
> </biological_process>
> </biological_processess>
>
> I looked at some tutorials (eg. Ogbuji). Those
> examples described to extract all text of nodes and
> child nodes.

Hi Mdan,

The following might help:

    http://article.gmane.org/gmane.comp.python.tutor/24986
    http://mail.python.org/pipermail/tutor/2005-December/043817.html

The second post shows how we can use the findtext() method from an ElementTree. Here's another example that demonstrates how we can treat elements as sequences of their subelements:

##################################################################
from elementtree import ElementTree
from StringIO import StringIO

text = """
<people>
  <person>
    <lastName>skywalker</lastName>
    <firstName>luke</firstName>
  </person>
  <person>
    <lastName>valentine</lastName>
    <firstName>faye</firstName>
  </person>
  <person>
    <lastName>reynolds</lastName>
    <firstName>mal</firstName>
  </person>
</people>
"""

people = ElementTree.fromstring(text)
for person in people:
    print "here's a person:",
    print person.findtext("firstName"), person.findtext('lastName')
##################################################################

Does this make sense? The API allows us to treat an element as a sequence that we can march across, and the loop above marches across every person subelement in people.

Another way we could have written the loop above would be:

###########################################
>>> for person in people.findall('person'):
...     print person.find('firstName').text,
...     print person.find('lastName').text
...
luke skywalker
faye valentine
mal reynolds
###########################################

Or we might go a little funkier, and just get the first names anywhere in people:

###########################################
>>> for firstName in people.findall('.//firstName'):
...     print firstName.text
...
luke
faye
mal
###########################################

where the subelement "tag" that we're giving findall is really an XPath query. ".//firstName" is a query in XPath format that says "Give me all the firstName elements anywhere within the current element."

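Applied to the XML in your question, the same approach might look like the sketch below. It is only an illustration under a few assumptions: it uses Python 3 and the standard library's xml.etree.ElementTree module rather than the separate elementtree package imported above, it keeps the tag names exactly as you posted them, and the variable names (root, processes) are made up for the example.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question (tag spelling kept as posted).
text = """
<biological_processess>
  <biological_process>
    Signal transduction
  </biological_process>
  <biological_process>
    Energy process
  </biological_process>
</biological_processess>
"""

root = ET.fromstring(text)

# findall() with a plain tag name matches direct children of root only.
# strip() removes the newlines and indentation surrounding each text value.
processes = [node.text.strip() for node in root.findall("biological_process")]

for name in processes:
    print(name)
```

Because each biological_process is handled as its own element, you get one string per process instead of all the text of the document run together.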
The documentation in:

    http://effbot.org/zone/element.htm#searching-for-subelements

should also be helpful. If you have more questions, please feel free to ask.