rustom wrote:
On May 10, 9:49 pm, Steve Howell <showel...@yahoo.com> wrote:
On May 10, 9:10 am, Rustom Mody <rustompm...@gmail.com> wrote:
I am trying to write a recursive filter to remove whitespace-only
nodes for minidom.
The code is below.
Strangely, it deletes some whitespace nodes and leaves others.
If I keep calling it, like so: fws(fws(fws(doc))), then at some
stage all the whitespace nodes disappear.
Does anybody have a clue?
from xml.dom.minidom import parse

#The input to fws is the output of parse("something.xml")
def fws(ele):
    """ filter white space (recursive)"""
    for c in ele.childNodes:
        if isWsNode(c):
            ele.removeChild(c)
            #c.unlink() Makes no diff whether this is there or not
        elif c.nodeType == ele.ELEMENT_NODE:
            fws(c)

def isWsNode(ele):
    return (ele.nodeType == ele.TEXT_NODE and not ele.data.strip())
I would avoid doing things like delete/remove in a loop. Instead
build a list of things to delete.
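To see why removing inside the loop misbehaves, here is a small sketch. It assumes (as far as I can tell from minidom) that childNodes is a list-like object that removeChild mutates in place, so the loop index slides past the next sibling after every removal. The names visit_and_strip, is_ws and SAMPLE are made up for the illustration.

```python
from xml.dom.minidom import parseString

def is_ws(node):
    """True for text nodes that contain only whitespace."""
    return node.nodeType == node.TEXT_NODE and not node.data.strip()

def visit_and_strip(xml_text, snapshot):
    """Remove whitespace children of the root; return the node names the loop saw."""
    root = parseString(xml_text).documentElement
    # snapshot=True iterates over a copy, so removals cannot shift the loop
    children = list(root.childNodes) if snapshot else root.childNodes
    visited = []
    for c in children:
        visited.append(c.nodeName)
        if is_ws(c):
            root.removeChild(c)
    return visited

SAMPLE = "<r>\n<a>\n<x/>\n</a>\n</r>"
print(visit_and_strip(SAMPLE, snapshot=False))  # the <a> element is never visited
print(visit_and_strip(SAMPLE, snapshot=True))   # every child is visited
```

In the live-list case the element <a> is skipped, so a recursive filter would never descend into it. That would explain why each extra pass cleans one more level of the tree.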
Yeah, I know. I would write the whole damn thing functionally if I knew
how, but I can't figure out the API.
I actually started out to write a (Haskell-style) function that copies out the whole
tree minus the unwanted nodes, but could not figure it out.
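For what it's worth, a copy-out version might look roughly like the sketch below. It is only one possible shape, not anything posted in the thread: copy_without_ws and is_ws_node are hypothetical names, and it relies on cloneNode(False) giving a shallow copy with no children. It starts from an element rather than the Document, since a shallow clone of a minidom Document does not appear to be supported.

```python
from xml.dom.minidom import parseString

def is_ws_node(node):
    """True for text nodes that contain only whitespace."""
    return node.nodeType == node.TEXT_NODE and not node.data.strip()

def copy_without_ws(node):
    """Return a new tree equal to node minus whitespace-only text nodes."""
    clone = node.cloneNode(False)  # shallow copy: attributes kept, no children
    for child in node.childNodes:
        if not is_ws_node(child):
            clone.appendChild(copy_without_ws(child))
    return clone

root = parseString("<r>\n<a>\n<x/>\n</a>\n</r>").documentElement
clean = copy_without_ws(root)
print(clean.toxml())
```

The original tree is left untouched, so there is no mutation during iteration to worry about.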
def fws(ele):
    """ filter white space (recursive)"""
    empty_nodes = []
    for c in ele.childNodes:
        if isWsNode(c):
            empty_nodes.append(c)
        elif c.nodeType == ele.ELEMENT_NODE:
            fws(c)
    for c in empty_nodes:
        ele.removeChild(c)
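A quick way to check that a single pass is enough with the collect-then-remove approach is sketched below. The sample document and the count_ws helper are invented for the check, and the filter is restated (with is_ws_node standing in for isWsNode) only so the snippet runs on its own.

```python
from xml.dom.minidom import parseString

def is_ws_node(node):
    """True for text nodes that contain only whitespace."""
    return node.nodeType == node.TEXT_NODE and not node.data.strip()

def fws(ele):
    """Filter whitespace-only text nodes (recursive), removing after the loop."""
    empty_nodes = []
    for c in ele.childNodes:
        if is_ws_node(c):
            empty_nodes.append(c)
        elif c.nodeType == c.ELEMENT_NODE:
            fws(c)
    for c in empty_nodes:
        ele.removeChild(c)

def count_ws(node):
    """Count whitespace-only text nodes anywhere below node."""
    return sum(1 if is_ws_node(c) else count_ws(c) for c in node.childNodes)

doc = parseString("<r>\n<a>\n<x/>\n</a>\n<b> </b>\n</r>")
before = count_ws(doc)
fws(doc)  # one pass only
after = count_ws(doc)
print(before, after)
```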