On 10/08/2022 16:34, Stefan Behnel wrote:
Gilles wrote on 10.08.22 at 15:20:
> for row in tree.iter("wpt"):
>     lat,lon = row.attrib.values()
Note that this assignment depends on the order of the two attributes
in the XML document, i.e. in data that you may not control yourself.
It will break if the provider of your input documents ever decides to
change the order.
I'd probably just use:
    lat, lon = row.get('lat'), row.get('lon')
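
To make the ordering problem concrete, here is a minimal sketch with two made-up waypoints that differ only in attribute order. It uses the standard library's xml.etree.ElementTree so it runs without extra dependencies; the iter(), get() and attrib calls shown here are also available on lxml.etree elements.

```python
import xml.etree.ElementTree as ET

# Two hypothetical waypoints carrying the same data in a different attribute order.
a = ET.fromstring('<wpt lat="48.85" lon="2.35"/>')
b = ET.fromstring('<wpt lon="2.35" lat="48.85"/>')

# Positional unpacking follows the order in the document,
# so for b the names end up swapped.
lat_a, lon_a = a.attrib.values()
lat_b, lon_b = b.attrib.values()  # lat_b now holds the longitude

# Lookup by name does not depend on the order.
by_name_a = (a.get("lat"), a.get("lon"))
by_name_b = (b.get("lat"), b.get("lon"))

print(lat_a, lat_b)            # differ, although both elements describe the same point
print(by_name_a == by_name_b)  # name-based lookup agrees for both
```

Note that get() returns None for a missing attribute, so a waypoint without lat or lon would need its own check.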
Also:
> #remove dups
> no_dups = []
> for row in tree.iter("wpt"):
>     lat,lon = row.attrib.values()
>     if lat not in no_dups:
>         no_dups.append(lat)
>     else:
>         row.getparent().remove(row)
You're using a list here instead of a set. A list might be faster for
very small amounts of data, but I'd expect a set to win quite quickly,
since "lat not in no_dups" scans the whole list while a set does a hash
lookup. Regardless of my guessing, you shouldn't use a list here unless
benchmarking tells you that it's faster. And if you do, you'd better
add a comment explaining why. It's just too surprising to see this
implemented with a list, so readers will end up wasting their time
looking for a reason that isn't there.
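
For anyone who wants to measure rather than guess, a rough benchmark along these lines could settle it. This is only a sketch: the function names and the synthetic input (5000 strings, 1000 of them distinct) are made up for illustration, and real waypoint data may behave differently.

```python
import timeit

def dedup_with_list(values):
    # Membership test scans the list, so the cost grows with its length.
    seen = []
    for v in values:
        if v not in seen:
            seen.append(v)
    return seen

def dedup_with_set(values):
    # Membership test is a hash lookup; a separate list keeps first-seen order.
    seen = set()
    kept = []
    for v in values:
        if v not in seen:
            seen.add(v)
            kept.append(v)
    return kept

# Synthetic stand-in for latitude strings (an assumption, not real data).
values = [str(i % 1000) for i in range(5000)]

t_list = timeit.timeit(lambda: dedup_with_list(values), number=5)
t_set = timeit.timeit(lambda: dedup_with_set(values), number=5)
print(f"list: {t_list:.4f}s  set: {t_set:.4f}s")
```

Changing the size of values shows where, if anywhere, the list stops being competitive on a given machine.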
Thanks for the tips.
#remove dups
no_dups = set()
for row in tree.iter("wpt"):
    lat, lon = row.get('lat'), row.get('lon')
    if lat not in no_dups:
        no_dups.add(lat)
    else:
        row.getparent().remove(row)
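
One thing the loop above leaves open: lon is read, but only lat is used as the key, so two waypoints that share a latitude but not a longitude would be treated as duplicates. If that is not intended, a sketch keyed on the (lat, lon) pair could look like the following. It assumes the wpt elements are direct children of the root, uses made-up sample data, and relies on the standard library's xml.etree.ElementTree so that it runs without lxml; with lxml, row.getparent().remove(row) would take the place of root.remove(row).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second waypoint shares only the latitude with the
# first one, the third is a true duplicate of the first.
GPX = """<gpx>
  <wpt lat="48.85" lon="2.35"/>
  <wpt lat="48.85" lon="2.29"/>
  <wpt lat="48.85" lon="2.35"/>
</gpx>"""

root = ET.fromstring(GPX)

seen = set()
# Take a snapshot of the matching children first, so that removing
# elements does not interfere with the iteration.
for row in list(root.findall("wpt")):
    key = (row.get("lat"), row.get("lon"))
    if key in seen:
        root.remove(row)
    else:
        seen.add(key)

remaining = [(w.get("lat"), w.get("lon")) for w in root.iter("wpt")]
print(remaining)
```

Here the waypoint with the same latitude but a different longitude is kept, and only the exact repeat is dropped.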
_______________________________________________
lxml - The Python XML Toolkit mailing list -- lxml@python.org
https://mail.python.org/mailman3/lists/lxml.python.org/