I'm wondering how to write well-factored and optimized XQuery. Here's an 
excerpt from the code for RunDMC (on which developer.marklogic.com runs):

declare variable $Announcements := /Announcement[draft:listed(.)]; (: "News"   :)
declare variable $Events        := /Event       [draft:listed(.)]; (: "Events" :)
declare variable $Articles      := /Article     [draft:listed(.)]; (: "Learn"  :)
declare variable $Posts         := /Post        [draft:listed(.)]; (: "Blog"   :)
declare variable $Projects      := /Project     [draft:listed(.)]; (: "Code"   :)

The problem with the above code is that it ignores the indexes. In fact, each 
one of these expressions filters every fragment in the database (some 5000, 
many of them not even XML). I made one addition to each line that helped things 
quite a bit:

declare variable $Announcements := /Announcement/self::*[draft:listed(.)]; (: "News" :)

This makes the first step searchable. Searchability is decided per step, not 
per node test, so the bare node test ("Announcement") only becomes searchable 
once the unsearchable predicate no longer sits in the same step. By moving that 
predicate into a separate (subsequent) step, I've reduced the number of 
fragments to filter to, say, 30 <Announcement> docs instead of all ~5000 
fragments. The extra "/self::*" is not particularly pretty, but it's not 
terrible either, and it was a small change with a huge positive impact (at 
least according to xdmp:query-trace()).
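
For anyone who wants to reproduce the comparison, here is a minimal sketch of 
how the two forms can be traced side by side. local:listed() is a hypothetical 
stand-in for draft:listed() (the real function lives in the RunDMC code and 
may differ); the trace messages should show up in the server error log.

```xquery
xquery version "1.0-ml";

(: Hypothetical stand-in for draft:listed(); the real RunDMC function may
   differ. Calling a user-defined function inside a predicate is the kind
   of constraint that cannot be resolved from the indexes. :)
declare function local:listed($doc as element()) as xs:boolean {
  fn:not($doc/@preview-only) and $doc/@status eq 'Published'
};

(: Enable query tracing for the remainder of this request. :)
xdmp:query-trace(fn:true()),

(: Original form: the predicate shares a step with the node test. :)
fn:count(/Announcement[local:listed(.)]),

(: Rewritten form: the predicate sits in its own, later step. :)
fn:count(/Announcement/self::*[local:listed(.)])
```

Both counts should be identical; only the number of fragments reported as 
filtered in the trace output is expected to differ.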

But I'd like to do better. To make the paths fully searchable, I'll need to 
pull my constraint out of my function call. Now I have:

declare variable $Announcements :=
  if ($draft:public-docs-only)
  then /Announcement[fn:not(@preview-only)][@status eq 'Published']
  else /Announcement[fn:not(@preview-only)];
declare variable $Events :=
  if ($draft:public-docs-only)
  then /Event[fn:not(@preview-only)][@status eq 'Published']
  else /Event[fn:not(@preview-only)];
declare variable $Articles :=
  if ($draft:public-docs-only)
  then /Article[fn:not(@preview-only)][@status eq 'Published']
  else /Article[fn:not(@preview-only)];
declare variable $Posts :=
  if ($draft:public-docs-only)
  then /Post[fn:not(@preview-only)][@status eq 'Published']
  else /Post[fn:not(@preview-only)];
declare variable $Projects :=
  if ($draft:public-docs-only)
  then /Project[fn:not(@preview-only)][@status eq 'Published']
  else /Project[fn:not(@preview-only)];

Now, all of my XPath expressions are fully searchable, but things have gotten 
messy and a bunch of code is duplicated. There are also cases where I'll want 
to use range indexes to further constrain the results (such as "get the latest 
two Announcements"), so this will likely only get worse, because as far as I 
can tell, a path that references a variable is unsearchable, e.g., 
$Announcements[date gt …]
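
To make the "latest two Announcements" case concrete, here is a rough sketch 
that inlines the path instead of going through the variable. It assumes each 
Announcement has a single child <date> element whose value casts to xs:date, 
which may not match the real content model. Whether the ordering is actually 
resolved from a range index is not something this sketch guarantees; the trace 
output is the way to check.

```xquery
xquery version "1.0-ml";

(: Trace this request so index usage can be inspected in the error log. :)
xdmp:query-trace(fn:true()),

(: Assumption: one <date> child per Announcement, castable to xs:date. :)
fn:subsequence(
  for $a in /Announcement[fn:not(@preview-only)][@status eq 'Published']
  order by xs:date($a/date) descending
  return $a,
  1, 2)
```

The path is repeated here rather than reusing $Announcements, which is exactly 
the duplication I'd like to avoid.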

Am I pushing index usage too far? Is there another way that I'm not seeing? I 
assume I'm just not approaching things the correct "MarkLogic way". Any ideas 
would be greatly appreciated.

Thanks,
Evan


Evan Lenz
Software Developer, Community
MarkLogic Corporation

Phone +1 360 297 0087
email  [email protected]
web    http://developer.marklogic.com/
