Re: [MarkLogic Dev General] Re: Fwd: performance question

James A. Robinson Fri, 06 Jul 2007 09:26:11 -0700

While I'd certainly take the advice of the MarkLogic folks as a first
course to test, I'd be curious whether or not a rewrite of the XQuery
code might net any improvement.


If I'm reading the code correctly, you are building a Cartesian product
of the tuple (surname, fname, middlename) for all authors, and then
executing a series of joins to get the actual unique names? And your
goal is to generate a unique list of author names across the entire
volume?

If my reading is correct, I wonder if it might not be faster to build
a key from the names and then process those keys to get back the
distinct values?

I don't have a dataset matching yours, so I can't be sure of this
code, but I wonder if this sort of logic might be any faster?

<result>{
  let $authors :=
    for $author in cts:search(/article,
      cts:directory-query("/journal/coden/vol/", "infinity"))/front/authgrp/author
    let $surname := $author/surname
    let $fname   := $author/fname
    let $midname := string-join($author/middlename, " ")
    let $key     := string-join(($surname,$fname,$midname), "|")
    where exists($author/surname)
    return
      <author key="{$key}">{
        $surname,
        $fname,
        if ($midname ne '') then <midname>{$midname}</midname> else ()
      }</author>
  let $keys :=
    for $key in distinct-values($authors/@key)
    order by $key
    return $key
  return
    for $key in $keys
    return <author>{$authors[@key eq $key][1]/*}</author>
}</result>

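I imagine the last statement, extracting the unique author names, would
still be pokey, of course: the "@key eq $key" predicate runs against
constructed in-memory nodes, so it can't use an index and has to scan
$authors once for every key. So it's probably not really a solution to
your problem.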
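
One possible way to avoid the per-key scan in that last step is to sort
the keyed authors once and keep an author only when its key differs from
the one before it. The following is a minimal sketch in plain XQuery 1.0,
not a tested MarkLogic solution: the inline sample authors are invented
stand-ins for the $authors sequence built above, and $sorted is a name
introduced here for illustration. Whether it is actually any faster would
need measuring, since the positional lookup may itself be expensive in
some implementations.

```xquery
xquery version "1.0";

(: Hypothetical sample data standing in for the keyed $authors sequence
   built in the query above; the names are invented for illustration. :)
let $authors := (
  <author key="Smith|Jane|"><surname>Smith</surname><fname>Jane</fname></author>,
  <author key="Doe|John|Q"><surname>Doe</surname><fname>John</fname><midname>Q</midname></author>,
  <author key="Smith|Jane|"><surname>Smith</surname><fname>Jane</fname></author>
)
(: Sort once so that authors with the same key become adjacent. :)
let $sorted :=
  for $a in $authors
  order by string($a/@key)
  return $a
return
  <result>{
    (: Keep an author only when its key differs from the previous one. :)
    for $a at $i in $sorted
    where $i eq 1 or string($a/@key) ne string($sorted[$i - 1]/@key)
    return <author>{$a/*}</author>
  }</result>
```

With the sample data this should return each distinct author once, in key
order, without the intermediate distinct-values() pass.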


Jim

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
James A. Robinson                       [EMAIL PROTECTED]
Stanford University HighWire Press      http://highwire.stanford.edu/
+1 650 7237294 (Work)                   +1 650 7259335 (Fax)
