Hello, I've spent the morning trying to parse a simple XML file and have the
following:
import sys
from xml.dom import minidom

doc = minidom.parse('topstories.xml')
items = doc.getElementsByTagName("item")
text = ''
for i in items:
    # Look only at the first child node of each <item>.
    t = i.firstChild
    print(t.nodeName)
    if t.nodeType == t.TEXT_NODE:
        print("TEXT_NODE")
        print(t.nodeValue)
        text += t.data
print(text)
I can't figure out how to print the text value of a text node. There
must be something obvious I'm missing. Any suggestions?
Thanks.
XML is as follows:
<?xml version="1.0"?>
<rss version="2.0">
<channel>
<title>Stuff.co.nz - Top Stories</title>
<link>http://www.stuff.co.nz</link>
<description>Top Stories from Stuff.co.nz. New Zealand, world, sport,
business &amp; entertainment news on Stuff.co.nz. </description>
<language>en-nz</language>
<copyright>Fairfax New Zealand Ltd.</copyright>
<ttl>30</ttl>
<image>
<url>/static/images/logo.gif</url>
<title>Stuff News</title>
<link>http://www.stuff.co.nz</link>
</image>
<item id="4423924" count="1">
<title>Prince Harry 'wants to live in Africa'</title>
<link>http://www.stuff.co.nz/4423924a10.html?source=RSStopstories_20080303
</link>
<description>For Prince Harry it must be the ultimate dark irony: to be in
such a privileged position and have so much opportunity, and yet be unable
to fulfil a dream of fighting for the motherland.</description>
<author>EDMUND TADROS</author>
<guid isPermaLink="false">stuff.co.nz/4423924</guid>
<pubDate>Mon, 03 Mar 2008 00:44:00 GMT</pubDate>
</item>
</channel>
</rss>
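
A possible explanation, offered as a sketch rather than a confirmed fix: minidom keeps the whitespace between tags as text nodes, so the first child of each <item> is most likely the line break between <item ...> and <title>. That would make the loop above report a TEXT_NODE whose value prints as a blank line. The visible text sits one level deeper, in the text nodes under <title>, <author> and so on. The example below walks the child elements instead of firstChild. The names SAMPLE, element_text and item_fields are hypothetical, and SAMPLE is a trimmed copy of the feed above, inlined so the sketch runs on its own.

```python
from xml.dom import minidom

# Trimmed copy of the feed above (assumption: the real file has the same shape).
SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
<channel>
<item id="4423924" count="1">
<title>Prince Harry 'wants to live in Africa'</title>
<author>EDMUND TADROS</author>
</item>
</channel>
</rss>"""


def element_text(element):
    """Join the data of every text node directly under `element`."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


def item_fields(item):
    """Map each child element's tag name to its text (hypothetical helper)."""
    fields = {}
    for child in item.childNodes:
        # Skip the whitespace-only text nodes that sit between the tags.
        if child.nodeType == child.ELEMENT_NODE:
            fields[child.tagName] = element_text(child).strip()
    return fields


doc = minidom.parseString(SAMPLE)
for item in doc.getElementsByTagName("item"):
    fields = item_fields(item)
    print(fields["title"])
    print(fields["author"])
```

With the real file, minidom.parse('topstories.xml') could stand in for parseString(SAMPLE), and the same helpers should apply to every <item> in the feed.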