This might get you started:

(: number of documents to sample when collecting element names :)
let $size := 1000
(: distinct element QNames found in the sampled documents :)
let $distinct-element-qnames := distinct-values(
   for $i in doc()[1 to $size]//*
   return node-name($i) )
for $qn in $distinct-element-qnames
(: index-based estimate of fragments that contain this element :)
let $frequency := xdmp:estimate(
   cts:search(doc(), cts:element-query($qn, cts:and-query(()) ) ) )
order by $frequency descending
return element element {
   attribute local-name { local-name-from-QName($qn) },
   attribute namespace { namespace-uri-from-QName($qn) },
   attribute frequency { $frequency } }

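The frequencies come from xdmp:estimate, so they are index-based counts
of matching fragments across the entire database, not just the sample.
The list of QNames, however, is built only from the first $size
documents, so you may need to increase $size until you are confident
that you have coverage of all QNames. doc()[1 to $size] returns
documents in a stable but effectively arbitrary order, which makes it a
reasonable sample, though not a truly random one.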
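
One rough way to judge whether $size is large enough is to double it
and see whether any new QNames turn up. This is only a sketch:
local:qnames is a hypothetical helper name, and an empty result
suggests, but does not prove, that the smaller sample already covers
every element name.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: distinct element QNames in the first $n documents. :)
declare function local:qnames($n as xs:integer) as xs:anyAtomicType*
{
  distinct-values(
    for $e in doc()[1 to $n]//*
    return node-name($e) )
};

let $size := 1000
let $small := local:qnames($size)
let $large := local:qnames($size * 2)
(: QNames that appear only in the larger sample :)
return $large[not(. = $small)]
```

If this returns anything, raise $size and try again.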

-- Mike

On 2010-09-16 08:49, Alf Eaton wrote:
> I'm hoping to be able to inspect a fairly large collection of
> documents and list the distinct element names, attribute names and
> their usage frequency. Is it possible to do this using built-in
> MarkLogic functions (perhaps by inspecting the indexes)?
>
> alf
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
