jakecjacobson, 29.01.2010 18:25:
> I need to take a XML web resource and split it up into smaller XML
> files. I am able to retrieve the web resource but I can't find any
> good XML examples. I am just learning Python so forgive me if this
> question has been answered many times in the past.
>
> My resource is like:
>
> <document>
> ...
> ...
> </document>
> <document>
> ...
> ...
> </document>

Is this what you get as a document or is this just /contained/ in the
document? Note that XML does not allow more than one root element, so the
above is not XML. Each of the two <document>...</document> parts forms an
XML document by itself, though.

> So in this example, I would need to output 2 files with the contents
> of each file what is between the open and close document tag.

Are the two documents laid out as you show above, with each tag on its own
line? In that case, you can simply iterate over the lines and cut the
document whenever you see "<document>". Or, if you are sure that
"<document>" only appears as a top-level element and never inside the
documents, you can search for "<document>" in the content (a string, I
guess) and split it there.

As was pointed out before, once you have these two documents, use the
xml.etree package to work with them.

Something like this might work:

    import urllib2   # Python 2; on Python 3 use urllib.request instead
    import xml.etree.ElementTree as ET

    data = urllib2.urlopen(url).read()

    for part in data.split('<document>'):
        if not part.strip():
            # split() yields an empty piece before the first "<document>"
            continue
        document = ET.fromstring('<document>' + part)
        print(document.tag)
        # ... do other stuff

Stefan
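
An alternative to splitting the string by hand, sketched here under some assumptions: wrap the whole response in a made-up root element so that it parses as a single XML document, then serialize each <document> child separately. The names split_documents and sample, the <root> wrapper and the <title> elements are illustrative only and do not come from the thread. The sketch assumes Python 3 and that the data carries no XML declaration or DOCTYPE of its own.

```python
import xml.etree.ElementTree as ET


def split_documents(data):
    """Return each top-level <document> element as its own XML string."""
    # Wrap the fragments in a hypothetical root so the parser sees one document.
    # Assumption: data has no XML declaration or DOCTYPE, which would break this.
    root = ET.fromstring('<root>' + data + '</root>')
    # tostring() also emits the whitespace that follows each element, so strip it.
    return [ET.tostring(child, encoding='unicode').strip()
            for child in root.findall('document')]


# Invented sample input shaped like the resource in the question.
sample = """<document>
  <title>first</title>
</document>
<document>
  <title>second</title>
</document>"""

parts = split_documents(sample)
for number, xml_text in enumerate(parts, start=1):
    # Each part parses on its own, so it can be handled as a separate document.
    print(number, ET.fromstring(xml_text).find('title').text)
```

Each string in parts is a well-formed document by itself and could be written to its own file. Unlike the plain string split, this does not cut in the wrong place if a nested <document> element shows up inside one of the documents, since only direct children of the wrapper are collected.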