[lxml] Re: [newbie] Way to get tree from root?

Stefan Behnel Thu, 02 Sep 2021 09:26:56 -0700

codecompl...@free.fr wrote on 02.09.21 at 16:53:

I'm still learning about lxml, and was wondering if there's a way to get the 
tree from the root to avoid writing the file to disk before re-reading it just 
for that:


import re
from lxml import etree as et

INPUTFILE = "input.kml"

#get rid of NS
with open(INPUTFILE) as reader:
    content = reader.read()
content = re.sub('<kml.*?>', '<kml>', content, count=0, flags=re.DOTALL)

If you really want the namespace declarations stripped out, I'd rather do it after parsing, not before. (In fact, I would not do it at all, but you seem to be inclined to do it, for some reason.) Here, you are relying on specific syntax being used for them, which may or may not be the case in a given document.
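
For illustration, stripping namespaces after parsing could look roughly like the sketch below. The helper name strip_namespaces and the sample document are made up for this example; it only rewrites element tags and leaves namespaced attributes alone.

```python
from lxml import etree as et


def strip_namespaces(root):
    """Hypothetical helper: drop the namespace part of every element tag."""
    for el in root.iter():
        # Comments and processing instructions have a non-string tag; skip them.
        if isinstance(el.tag, str):
            el.tag = et.QName(el).localname
    # Remove namespace declarations that are no longer used by any tag.
    et.cleanup_namespaces(root)
    return root


# Made-up sample input standing in for the real KML file.
content = (
    '<kml xmlns="http://www.opengis.net/kml/2.2">'
    "<Document><name>Example</name></Document>"
    "</kml>"
)
root = strip_namespaces(et.fromstring(content.encode("utf8")))
name = root.find("Document/name")
print(root.tag, name.text)
```

Because this works on the parsed tree, it does not depend on how the namespace declaration happens to be written in the file.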

#Read from memory to avoid writing cleaned file to disk and re-read
parser = et.XMLParser(remove_blank_text=True)
root = et.fromstring(bytes(content, encoding='utf8'), parser)
#NameError: name 'tree' is not defined
r = tree.xpath('/Document/name')
print(r[0].tag)


You do not need an ElementTree instance for this. Just use the root element.
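
A minimal sketch of that, assuming a document whose namespace is already gone (the sample content is made up). Note that an absolute XPath expression starts at the document root, so the path has to include the kml element, while a relative path starts at the element you call it on. If an ElementTree is really needed, getroottree() on the element returns one without a round trip through a file.

```python
from lxml import etree as et

# Made-up sample input without namespaces.
content = "<kml><Document><name>Example</name></Document></kml>"

parser = et.XMLParser(remove_blank_text=True)
root = et.fromstring(content.encode("utf8"), parser)

# Absolute path: evaluated from the document root, so it starts with /kml.
r = root.xpath("/kml/Document/name")
print(r[0].tag)

# Relative path: evaluated from the root element itself.
rel = root.xpath("Document/name")

# Only if an ElementTree object is actually required:
tree = root.getroottree()
from_tree = tree.xpath("/kml/Document/name")
```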

I recommend using the find/findall/iterfind() methods over using xpath(), though. They are faster and support incremental searches. And they simplify namespace usage.
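
As a rough sketch of what that looks like with the namespace left in place: the prefix "k", the Placemark elements and the sample content below are assumptions for the example, not taken from the original file.

```python
from lxml import etree as et

KML_NS = "http://www.opengis.net/kml/2.2"

# Made-up sample input that keeps its default namespace.
content = (
    f'<kml xmlns="{KML_NS}"><Document><name>Example</name>'
    "<Placemark><name>A</name></Placemark>"
    "<Placemark><name>B</name></Placemark>"
    "</Document></kml>"
)
root = et.fromstring(content.encode("utf8"))

# Map a prefix of your own choosing to the namespace URI.
ns = {"k": KML_NS}

# find() returns the first match or None.
doc_name = root.find("k:Document/k:name", namespaces=ns)

# The same lookup in Clark notation, without a prefix mapping.
same = root.find(f"{{{KML_NS}}}Document/{{{KML_NS}}}name")

# iterfind() yields matches one by one instead of building a full list first.
placemark_names = [
    pm.findtext("k:name", namespaces=ns)
    for pm in root.iterfind(".//k:Placemark", namespaces=ns)
]
print(doc_name.text, placemark_names)
```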


Stefan
