Stefan Behnel, 09.12.2011 09:02:
I think Py3.3 would be a good milestone for cleaning up the stdlib support
for XML.
[...]

I still think it is, so let me sum up the current discussion here.


What should change?

a) The stdlib documentation should help users to choose the right tool
right from the start.

It looks like there's agreement on this part.


Instead of using the totally misleading wording that
it uses now, it should be honest about the performance characteristics of
MiniDOM and should actively suggest that those who don't know what to
choose (or even *that* they can choose) should not use MiniDOM in the first
place.

There was some disagreement on whether MiniDOM should publicly disclose its performance characteristics in the documentation, and whether its use should be discouraged, even just for new users.

However, it seemed that there was enough consensus to settle on Nick Coghlan's proposal for a compromise to move ElementTree up to the top of the list, and to add a visible note to the top of each of the XML modules like this:

"Note: The <whatever> module is a <yada, yada, DOM based, whatever>. If all you
are trying to do is read and write XML files, consider using the
xml.etree.ElementTree module instead"

That template could (with a bit of peeking into the getopt documentation) be expanded into the following.

"""
[[Note: The xml.dom.minidom module provides an implementation of the W3C-DOM whose API is similar to that in other programming languages. Users who are unfamiliar with the W3C-DOM interface or who would like to write less code for processing XML files should consider using the xml.etree.ElementTree module instead.]]
"""

I think this should go on the xml.dom.minidom page as well as the xml.dom package page. Hand-wavingly, users who are new to the DOM are more likely to hit the package page first, whereas those who know it already will likely find the MiniDOM page directly.
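To make the "write less code" claim in the proposed note concrete, here is a minimal sketch (not part of the proposed doc text). The sample document and the helper names titles_minidom and titles_etree are made up for this illustration; both functions extract the same list of titles.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for this illustration.
XML = ("<library>"
       "<book><title>Dune</title></book>"
       "<book><title>Emma</title></book>"
       "</library>")


def titles_minidom(text):
    """Collect all <title> texts using the W3C DOM style API."""
    doc = minidom.parseString(text)
    titles = []
    for node in doc.getElementsByTagName("title"):
        # In the DOM, text lives in separate child nodes,
        # so it has to be collected by hand.
        titles.append("".join(
            child.data for child in node.childNodes
            if child.nodeType == child.TEXT_NODE))
    return titles


def titles_etree(text):
    """Collect all <title> texts using ElementTree."""
    root = ET.fromstring(text)
    return [title.text for title in root.iter("title")]
```

The ElementTree version gets by without the explicit walk over text nodes, which is the kind of difference the note is meant to hint at.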

Note that I'd still encourage the removal of the misleading word "lightweight" until it makes sense to put it back in a meaningful way. I therefore propose the following minimalistic changes to the first paragraph on the minidom page:

"""
xml.dom.minidom is a [-XXX: light-weight] implementation of the Document Object Model interface. It is intended to be simpler than the full DOM and also [+XXX: provide a] significantly smaller [+XXX: API].
"""

@Martin: note how the original paragraph does not refer to "4DOM" or "PyXML". It only generically mentions "the DOM interface". It is certainly not true that MiniDOM is more "light-weight" and "significantly smaller" than (most) other DOM interface implementations outside of the Python world, for example. So the current wording actually makes no sense at all.

Additionally, the documentation on the xml.sax page would benefit from the following paragraph:

"""
[[Note: The xml.sax package provides an implementation of the SAX interface whose API is similar to that in other programming languages. Users who are unfamiliar with the SAX interface or who would like to write less code for efficient stream processing of XML files should consider using the iterparse() function in the xml.etree.ElementTree module instead.]]
"""
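As an illustration only (again, not part of the proposed doc text), the following sketch counts elements in a stream once with a SAX handler and once with iterparse(). The sample data and the names ErrorCounter, count_errors_sax and count_errors_iterparse are assumptions made up for this example.

```python
import io
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample data, made up for this illustration.
XML = (b"<log>"
       b"<entry level='error'>disk full</entry>"
       b"<entry level='info'>ok</entry>"
       b"<entry level='error'>timeout</entry>"
       b"</log>")


class ErrorCounter(xml.sax.ContentHandler):
    """SAX handler that counts <entry level="error"> elements."""

    def __init__(self):
        super().__init__()
        self.errors = 0

    def startElement(self, name, attrs):
        if name == "entry" and attrs.get("level") == "error":
            self.errors += 1


def count_errors_sax(data):
    handler = ErrorCounter()
    xml.sax.parseString(data, handler)
    return handler.errors


def count_errors_iterparse(data):
    errors = 0
    # "end" events fire once an element has been parsed completely.
    for event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "entry":
            if elem.get("level") == "error":
                errors += 1
            # Drop the element's content to keep memory usage low.
            elem.clear()
    return errors
```

With iterparse() the state stays in a plain loop instead of a handler class, while processing is still incremental.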

If these changes are considered acceptable, I'll copy the above over to the documentation bug I opened at

http://bugs.python.org/issue11379

Can these doc changes go into both 2.7 and 3.3? Given that there is no important difference between the implementations, I don't see why the documentation should differ in Py2.


b) cElementTree should finally lose its "special" status as a separate
library and disappear as an accelerator module behind ElementTree.

There was no opposition to this in the thread and general agreement on it, except for the caveat that Fredrik Lundh should have a say in this. I wrote him an e-mail and haven't received a response so far. We can wait a little longer, I guess; there's still time before 3.3beta.
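For context, this is the import idiom that user code commonly carries today to get the accelerated implementation where it is available. Under proposal b), a plain import of xml.etree.ElementTree would be expected to be enough. The sample document is made up for illustration.

```python
# Commonly used fallback idiom: prefer the C accelerator if it can be
# imported, otherwise use the pure Python implementation.
try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

# Both modules expose the same API, so the rest of the code is identical.
root = ET.fromstring("<root><child/></root>")
child_tags = [child.tag for child in root]
```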

Stefan

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev