Thanks for the tip.
On 09/08/2022 17:49, Majewski, Steven Dennis (sdm7g) wrote:
You can also do this, maybe more simply, in XQuery.
In that case, you may want to remove any whitespace differences on ingest (or
else use normalize-space() in comparisons). [In BaseX, there is an option to
strip whitespace on parse.]
Exactly how depends on how you define "equal" (i.e. comparing the @lon and
@lat values alone, or including text in the comparison) and on what you want to
do in a conflict (use the first, use the last, or maybe drop both and report).
let $doc := <data>
<entries>
<wpt lat="46.98520" lon="6.8831">
<name>London</name>
</wpt>
<wpt lat="46.98520" lon="2.8831">
<name>Paris</name>
</wpt>
<wpt lat="46.98520" lon="-4.8831">
<name>Manhattan</name>
</wpt>
<wpt lat="46.98520" lon="6.8831">
<name>London 2</name>
</wpt>
<wpt lat="46.98520" lon="-4.8831">
<name>New York</name>
</wpt>
</entries>
</data>
for $x in $doc/entries/wpt
(: the predicate makes sure the SAME following sibling matches on both @lon and @lat :)
where not( $x/following-sibling::wpt[ @lon = $x/@lon and @lat = $x/@lat ] )
return $x
The above compares @lon & @lat and only includes the last in a conflict:
<wpt lat="46.98520" lon="2.8831"><name>Paris</name></wpt>
<wpt lat="46.98520" lon="6.8831"><name>London 2</name></wpt>
<wpt lat="46.98520" lon="-4.8831"><name>New York</name></wpt>
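To keep the first occurrence instead, change the comparison to
preceding-sibling::. The variant below compares the string value of each wpt
element (its text content, not its attributes); use deep-equal() if attributes
and content should be compared together.
where not( $x = $x/preceding-sibling::wpt )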
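
Since this is the lxml list, here is a rough Python sketch of the same "keep the last" / "keep the first" logic. It is only an illustration under a few assumptions: the helper names (coord_key, dedupe, names) are made up for this example, equality is defined as matching @lat and @lon strings after trimming whitespace, and it uses the standard library's xml.etree.ElementTree so it runs without extra packages (lxml.etree offers the same fromstring/find/findall/findtext/get calls).

```python
import xml.etree.ElementTree as ET

DOC = """<data>
<entries>
<wpt lat="46.98520" lon="6.8831"><name>London</name></wpt>
<wpt lat="46.98520" lon="2.8831"><name>Paris</name></wpt>
<wpt lat="46.98520" lon="-4.8831"><name>Manhattan</name></wpt>
<wpt lat="46.98520" lon="6.8831"><name>London 2</name></wpt>
<wpt lat="46.98520" lon="-4.8831"><name>New York</name></wpt>
</entries>
</data>"""


def coord_key(wpt):
    # Assumption: two waypoints are "equal" when their lat/lon attribute
    # strings match after trimming whitespace (no numeric comparison).
    return (wpt.get("lat", "").strip(), wpt.get("lon", "").strip())


def dedupe(wpts, key=coord_key, keep="last"):
    """Return the waypoints without duplicates, in document order."""
    items = list(wpts)
    if keep == "last":
        # Walk backwards so the last occurrence of each key is seen first.
        items.reverse()
    seen = set()
    kept = []
    for wpt in items:
        k = key(wpt)
        if k not in seen:
            seen.add(k)
            kept.append(wpt)
    if keep == "last":
        # Restore document order.
        kept.reverse()
    return kept


def names(wpts):
    # Collapse whitespace in <name>, similar in spirit to normalize-space().
    return [" ".join(w.findtext("name", "").split()) for w in wpts]


wpts = ET.fromstring(DOC).find("entries").findall("wpt")
last_names = names(dedupe(wpts, keep="last"))
first_names = names(dedupe(wpts, keep="first"))
print(last_names)   # ['Paris', 'London 2', 'New York']
print(first_names)  # ['London', 'Paris', 'Manhattan']
```

Passing a different key function (for example one that also includes the name text) changes what counts as a duplicate without touching the loop.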
_______________________________________________
lxml - The Python XML Toolkit mailing list -- lxml@python.org