Hello all, I'm using BaseX to cluster a set of millions of small XML fragments which look something like this:

    <affiliation>
      <organization>Institut für Organische Chemie der Universität Heidelberg</organization>
      <country iso-code="DEU"/>
    </affiliation>

I need to cluster based on fragment similarity - so taking into account elements, attributes and text nodes. If I use the entire XML fragment as a grouping key, something like this:

    for $a at $c in db:open('DB')/item/*/affiliation
    group by $val := $a
    ...

then will the grouping be equivalent to the functionality of the deep-equal function? First results seem to suggest this, but I want to make sure that grouping is not done on text node value alone or anything like that.

Incidentally, BaseX is simply unbelievably fast at executing this - a million fragments clustered and written out to another DB in 16 seconds on a laptop. My congratulations on an amazing product.

Regards,
Constantine
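
P.S. A minimal check along these lines might settle the question. It is only a sketch: the two fragments, the "Org" text and the variable name $items are made up for the test, and using serialize($a) as an alternative grouping key is my own assumption, not a confirmed recommendation. The fragments have identical text content and differ only in the iso-code attribute, so deep-equal should tell them apart.

```xquery
(: Two hypothetical fragments: same text nodes, different iso-code attribute :)
let $items := (
  <affiliation><organization>Org</organization><country iso-code="DEU"/></affiliation>,
  <affiliation><organization>Org</organization><country iso-code="FRA"/></affiliation>
)
return (
  (: 1st result: what deep-equal says about the two fragments :)
  deep-equal($items[1], $items[2]),

  (: 2nd result: number of groups when the element itself is the grouping key :)
  count(
    for $a in $items
    group by $val := $a
    return $val
  ),

  (: 3rd result: number of groups when the serialized fragment is the key
     (assumption: a string key that also reflects element names and attributes) :)
  count(
    for $a in $items
    group by $val := serialize($a)
    return $val
  )
)
```

If the second result is 1, the grouping is being done on the string value of the fragment alone. If it is 2, the attribute is being taken into account. Note that a serialized key is only an approximation of deep-equal, since it can also be sensitive to things like attribute order and whitespace.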