I'm looking for advice about how best to solve a querying problem for 
one of our customers using MarkLogic.  We need to be able to search 
documents that are stored in various versions, and we only want to 
search the newest version. What I am interested in here is whether it is 
possible to do this if the "newest version" property has to be computed, 
not stored.  So something like:

for $doc in cts:search(doc(), $random-query)
where $doc/doc/@version = max(doc()/doc[@uri = $doc/doc/@uri]/@version)
return $doc

Our documents store the identifier that ties the different versions of a 
document together in /doc/@uri.

Is that it? Maybe a range index on /doc/@uri would help there?
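
To make the range-index idea concrete, here is a rough query-time sketch. It assumes range indexes exist on both /doc/@uri (string) and /doc/@version (a numeric type), and it uses a placeholder word query in place of $random-query; neither assumption comes from our actual setup. It reads the candidate uris and the newest version from the indexes instead of walking doc():

```xquery
xquery version "1.0-ml";

(: Placeholder standing in for the real $random-query (assumption). :)
declare variable $random-query as cts:query := cts:word-query("example");

declare variable $doc-qn     := xs:QName("doc");
declare variable $uri-qn     := xs:QName("uri");
declare variable $version-qn := xs:QName("version");

(: Distinct /doc/@uri values among matching documents, read from the
   assumed range index on doc/@uri. :)
for $uri in cts:element-attribute-values($doc-qn, $uri-qn, (), (), $random-query)
(: Every stored version of this logical document. :)
let $same-doc := cts:element-attribute-value-query($doc-qn, $uri-qn, $uri, "exact")
(: Highest @version among those, read from the assumed range index
   on doc/@version. :)
let $newest := cts:element-attribute-values(
                 $doc-qn, $version-qn, (), ("descending", "limit=1"), $same-doc)
(: Return the document only if its newest version matches the query. :)
return cts:search(doc(), cts:and-query((
  $random-query,
  $same-doc,
  cts:element-attribute-range-query($doc-qn, $version-qn, "=", $newest))))
```

This does one index lookup and one small search per matching uri, so it avoids scanning every document, but it is probably still too slow when many uris match.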

The real kicker is that we also need to apply an additional 
constraint in that some users may have access only to certain 
document-versions, so in those cases we need to search only the newest 
accessible version of each doc.
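
As a sketch of what "newest accessible version" could look like at query time: if the access rule for a user can itself be expressed as a cts:query, it can be ANDed into the version lookup. The function name local:newest-accessible, the collection "group-a", and the word query are all hypothetical, and the range indexes on /doc/@uri and /doc/@version are assumed, not something we have configured yet:

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: newest version of each document that the
   access query allows, provided that version also matches the search. :)
declare function local:newest-accessible(
  $random-query as cts:query,
  $access-query as cts:query
) as document-node()*
{
  (: uris of documents with at least one accessible, matching version :)
  for $uri in cts:element-attribute-values(
                xs:QName("doc"), xs:QName("uri"), (), (),
                cts:and-query(($random-query, $access-query)))
  (: all versions of this document that the user may see :)
  let $visible := cts:and-query((
    cts:element-attribute-value-query(
      xs:QName("doc"), xs:QName("uri"), $uri, "exact"),
    $access-query))
  (: highest accessible @version, from the assumed range index :)
  let $newest := cts:element-attribute-values(
                   xs:QName("doc"), xs:QName("version"), (),
                   ("descending", "limit=1"), $visible)
  return cts:search(doc(), cts:and-query((
    $random-query,
    $visible,
    cts:element-attribute-range-query(
      xs:QName("doc"), xs:QName("version"), "=", $newest))))
};

(: Hypothetical access rule: readable versions live in one collection. :)
local:newest-accessible(
  cts:word-query("example"),
  cts:collection-query("group-a"))
```

Note that the newest accessible version is computed before the search constraint is applied, so a document whose newest accessible version does not match is dropped rather than replaced by an older matching version.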

Of course there can be lots of docs matching the $random-query, and we 
want to be able to apply sorting criteria and get the first page of 
results efficiently.

Currently I am planning to generate tags at load-time that should make 
the querying efficient (basically I will mark, for every possible 
accessibility condition, the most current version - this is possible, if 
irritating, due to the structure of the access control rules), but this 
will introduce pain during document ingestion, and relies on 
restrictions of the kind of access control rules we can have. So I'm 
wondering if there is a passable query-time implementation I could use.
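
For comparison, this is roughly what the query would look like with the load-time tags. It is only a sketch: the marker element name current-for, the group value, the sort attribute /doc/@date with a dateTime range index, and the page size are all invented for illustration, and cts:index-order is only available on MarkLogic versions that provide it:

```xquery
xquery version "1.0-ml";

(: Placeholders (assumptions): the real query, the user's access
   condition, and the page size. :)
declare variable $random-query as cts:query := cts:word-query("example");
declare variable $group as xs:string := "group-a";
declare variable $page-size as xs:integer := 10;

(: Hypothetical load-time marker: each version that is the most current
   one for an access condition carries <current-for>CONDITION</current-for>. :)
let $current := cts:element-value-query(
                  xs:QName("current-for"), $group, "exact")
return cts:search(
  doc(),
  cts:and-query(($random-query, $current)),
  (: Sort on an assumed dateTime range index on doc/@date. :)
  cts:index-order(
    cts:element-attribute-reference(
      xs:QName("doc"), xs:QName("date"), "type=dateTime"),
    "descending")
)[1 to $page-size]
```

Here the version selection, the access constraint, the sort, and the paging all resolve in a single search, which is the efficiency I would be giving up with a purely query-time approach.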

-Mike
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general