Hi folks,

I have a question which touches on a thread from a few days ago, on
'sorting efficiency question'.  We're exploring using XML files to store
metadata in a URI hierarchy.  One thing we will need to be able to do
is take potentially large result sets (e.g., 'find me all the children
under this branch in the hierarchy') and filter the results.

It seemed obvious that we could program the actual selection of data
directly using MarkLogic-specific functions.  However, we'd also like
to be able to use this hierarchy outside of MarkLogic (e.g., exporting
portions of the tree to a filesystem with a Saxon-based engine serving
up results).

We've tried to tackle this in two phases, and I was wondering if anyone
on the list has done anything similar, and might have comments regarding
what did (or did not) work well. First, an outline of what we are doing:

Right now we've defined a simple WXS (W3C XML Schema) for a small XML
language that declares which kinds of items are of interest. For example,
calling a series of 'selector compose' functions we've written:

  sc:and((
    sc:at("1997"),
    sc:lang("fr")
  ))

might result in the following 'filter' document:

<sel:and xmlns:sel="http://schema.highwire.org/Publishing/Selector"
         xmlns:pub="http://schema.highwire.org/Publishing">
   <sel:and>
      <sel:period scale="1" test="ge"/>
      <sel:after  inclusive="true"
                  pub:time="1997-01-01T00:00:00"/>
      <sel:before inclusive="false"
                  pub:time="1998-01-01T00:00:00"/>
   </sel:and>
   <sel:lang xml:lang="fr"/>
</sel:and>

telling us we are interested in documents published in the year 1997
and which specify that they are in French.

I've written a basic set of evaluator functions to walk the above tree,
testing each terminal node (sel:period, sel:after, sel:before, and
sel:lang) for a match, returning true() or false() as necessary,
short circuiting as appropriate for AND, OR, and NOT, etc.
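
To give a rough idea of the shape, here is a minimal sketch rather than
our actual module.  It is written as plain XQuery 1.0 and assumes each
item is an element carrying pub:time and xml:lang attributes (the real
metadata layout may differ).  The function name local:matches, the
<item> elements, and the error code are made up for illustration, and
sel:period is left out:

```xquery
xquery version "1.0";

declare namespace sel = "http://schema.highwire.org/Publishing/Selector";
declare namespace pub = "http://schema.highwire.org/Publishing";

(: Sketch only: $item is ASSUMED to carry pub:time and xml:lang
   attributes.  local:matches is a hypothetical name. :)
declare function local:matches($sel as element(), $item as element())
  as xs:boolean
{
  typeswitch ($sel)
    (: Quantified expressions allow the processor to stop at the
       first child that decides the answer. :)
    case element(sel:and) return
      every $c in $sel/* satisfies local:matches($c, $item)
    case element(sel:or) return
      some $c in $sel/* satisfies local:matches($c, $item)
    case element(sel:not) return
      not(some $c in $sel/* satisfies local:matches($c, $item))
    (: Terminal tests.  The general comparisons yield false() when
       the item has no pub:time attribute. :)
    case element(sel:after) return
      let $t := xs:dateTime($item/@pub:time)
      let $bound := xs:dateTime($sel/@pub:time)
      return
        if ($sel/@inclusive = "true") then $t >= $bound else $t > $bound
    case element(sel:before) return
      let $t := xs:dateTime($item/@pub:time)
      let $bound := xs:dateTime($sel/@pub:time)
      return
        if ($sel/@inclusive = "true") then $t <= $bound else $t < $bound
    case element(sel:lang) return
      fn:lang($sel/@xml:lang, $item)
    (: sel:period and anything else unknown is rejected outright;
       the error code below is hypothetical. :)
    default return
      fn:error(
        fn:QName("http://schema.highwire.org/Publishing/Selector",
                 "sel:unsupported"),
        fn:concat("no evaluator for ", fn:name($sel)))
};

let $filter :=
  <sel:and>
    <sel:after  inclusive="true"  pub:time="1997-01-01T00:00:00"/>
    <sel:before inclusive="false" pub:time="1998-01-01T00:00:00"/>
    <sel:lang xml:lang="fr"/>
  </sel:and>
let $items := (
  <item pub:time="1997-06-15T00:00:00" xml:lang="fr"/>,
  <item pub:time="1998-01-01T00:00:00" xml:lang="fr"/>,
  <item pub:time="1997-03-01T00:00:00" xml:lang="en"/>
)
return $items[local:matches($filter, .)]
```

With these made-up items, only the first one should survive the filter:
the second falls on the exclusive upper bound and the third is in the
wrong language.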

But it seems like it'd be much better if, when executing within MarkLogic
Server, I could take advantage of cts:search.  I think I could quickly
write a function which takes the above selector document and turns it
into a cts query.  I was thinking this might then be used as a 'set'
with which to handle the filtering.

The kicker is, I'd like very much to be able to use either evaluator
module, the 'basic' one or the MarkLogic one, based on which environment
I'm in.  The only way I can see to do this is to have main modules
which statically declare an import for one or the other.  Is there a
dynamic way to resolve which file I want to import?

If XQuery were in XML like XSLT I'd just write a stylesheet to process the
import directives as needed for export into the non-MarkLogic environment!


Jim

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
James A. Robinson                       [EMAIL PROTECTED]
Stanford University HighWire Press      http://highwire.stanford.edu/
+1 650 7237294 (Work)                   +1 650 7259335 (Fax)
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
