Hal Styli, 01.03.2010 00:15:
> Stefan, I was happy to see such concise code.
> Your python worked with only very minor modifications.
>
> Hai's test xml data *without* the first and last line is close enough
> to the data I am using:
>
> <order customer="john" product="eggs" quantity="12" />
> <order customer="cindy" product="bread" quantity="1" />
> <order customer="larry" product="tea bags" quantity="100" />
> <order customer="john" product="butter" quantity="1" />
> <order product="chicken" quantity="2" customer="derek" />
>
> ... quirky.
>
> I get a large file given to me in this format. I believe it is
> created by something like:
> grep 'customer=' *.xml, where there are a large number of xml files.

Try to get this fixed at the source. Exporting non-XML that looks like XML
is not a good idea in general. It means that everyone who wants to read the
data has to adapt, when the source could be fixed once and for all.

> I had to edit the data to include the first and last lines, <orders>
> and </orders>,
> to get the python code to work. It's not an arduous task(!), but can
> you recommend a way to get it to work without
> manually editing the data?

Iff this cannot be fixed at the source, you can write a file-like wrapper
around the file that simply returns the opening tag before, and the closing
tag after, the data it reads from the file itself. All you need is a
.read(n) method, see the documentation of the file type.

> One other thing, what's the Roland Mueller post above about (I'm
> viewing this in google groups)? What would the test.xsl file look
> like?

This is the XSLT script he posted:

============================
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:fo="http://www.w3.org/1999/XSL/Format"
                version="1.0">

  <!-- text output because we want to have a CSV file -->
  <xsl:output method="text"/>

  <!-- remove all whitespace coming with input XML -->
  <xsl:strip-space elements="*"/>

  <!-- matches any <order> element and extracts the
       customer, product and quantity attributes -->
  <xsl:template match="order">
    <xsl:value-of select="@customer"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of select="@product"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of select="@quantity"/>
    <xsl:text>
</xsl:text>
  </xsl:template>

</xsl:stylesheet>
============================

Stefan
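
For reference, here is a minimal sketch of the file-like wrapper described above. It is only one way to do it, and it makes a few assumptions: Python 3, a file opened in text mode, and <orders> as the enclosing tag. The names WrappedFile and orders_to_rows are made up for this example and do not come from any library.

```python
import io
import xml.etree.ElementTree as ET


class WrappedFile:
    """Hypothetical file-like wrapper: yields `head`, then the data of
    `fileobj`, then `tail`. Only read(n) is provided, which is all that
    ElementTree needs from a source object."""

    def __init__(self, fileobj, head="<orders>", tail="</orders>"):
        # Assumes fileobj returns str (text mode), like head and tail.
        self._parts = iter([io.StringIO(head), fileobj, io.StringIO(tail)])
        self._current = next(self._parts)

    def read(self, n=-1):
        if n == 0:
            return ""
        if n is None or n < 0:
            # Read everything that is left, across all remaining parts.
            chunks = []
            while self._current is not None:
                chunks.append(self._current.read())
                self._current = next(self._parts, None)
            return "".join(chunks)
        while self._current is not None:
            data = self._current.read(n)
            if data:
                return data
            # Current part is exhausted, move on to the next one.
            self._current = next(self._parts, None)
        return ""  # all parts exhausted


def orders_to_rows(fileobj):
    """Yield (customer, product, quantity) for each <order> element."""
    for _event, elem in ET.iterparse(WrappedFile(fileobj)):
        if elem.tag == "order":
            yield elem.get("customer"), elem.get("product"), elem.get("quantity")
            elem.clear()  # free the element once its attributes are read
```

Passing the rows to csv.writer(...).writerows() should then give roughly the same comma-separated output as the XSLT script, without editing the data file by hand. Note that the csv module quotes fields containing commas, which the stylesheet does not.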