Hi!

I want to find out which document(s) have the highest number of
occurrences of a particular element attribute, regardless of its value. In
this case, //mets:file/@SIZE.

I have a range index on this attribute, and operations like getting the
count, sum, avg, min and max work fine and are very performant. But they
operate on the whole database or on a subset constrained by another cts:query.
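
A minimal sketch of the kind of index-backed calls described above, assuming the same attribute range index as in the code further down. The directory "/example/" is a hypothetical constraint used only for illustration:

```xquery
xquery version "1.0-ml";
declare namespace mets = "http://www.loc.gov/METS/";

declare variable $sizeRef :=
  cts:element-attribute-reference(xs:QName("mets:file"), xs:QName("SIZE"));

(: Hypothetical constraint; any cts:query could be used here. :)
let $query := cts:directory-query("/example/", "infinity")
return (
  (: Aggregates over the whole database, resolved from the range index. :)
  cts:count-aggregate($sizeRef),
  cts:sum-aggregate($sizeRef),
  cts:avg-aggregate($sizeRef),
  cts:min($sizeRef),
  cts:max($sizeRef),
  (: The same count, restricted to the documents matching $query. :)
  cts:count-aggregate($sizeRef, (), $query)
)
```

None of these calls says which individual document contributes the most values, which is the gap the question is about.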

The included code gives me what I want, but it is not very performant.

Running this query on a database with 300k documents takes 1 minute
and 25 seconds.

Is there a better way to solve this apart from including the count of
mets:file/@SIZE as a separate element in each document?

--- code --
xquery version "1.0-ml";
declare namespace mets="http://www.loc.gov/METS/";
declare variable $sizeRef :=
  cts:element-attribute-reference(xs:QName("mets:file"), xs:QName("SIZE"));

(: Track the highest count seen so far and the URIs that share it. :)
declare function local:updateMax($theMap, $size, $uri) {
  let $currentMax := (map:get($theMap, "max"), 0)[1]
  let $currentUris := map:get($theMap, "uris")
  return
    if ($size > $currentMax) then (
      (: new maximum: reset the URI list :)
      map:put($theMap, "max", $size),
      map:put($theMap, "uris", $uri)
    ) else if ($size = $currentMax) then (
      (: tie: append this URI :)
      map:put($theMap, "uris", ($currentUris, $uri))
    ) else ()
};

let $map := map:new((
    map:entry("max",0),
    map:entry("uris",())
  ))

let $puts := for $uri in cts:uris()
  return local:updateMax($map,
    cts:count-aggregate($sizeRef, (), cts:document-query($uri)), $uri)

return $map

--- end code ---

Regards,
Johan Mörén
National Library of Sweden