Mike,

Thanks for the insights and the article link. I liked your idea of using 
xdmp:value(). In XSLT, I sometimes declare entities in the internal DTD subset 
and use that as a macro mechanism for defining reusable match patterns. I know 
xdmp:value() does dynamic evaluation rather than static macro substitution, but 
it will do for my purposes.

Here's what I came up with using xdmp:value():

declare variable $Announcements := docs('Announcement');
declare variable $Events        := docs('Event');
declare variable $Articles      := docs('Article');
declare variable $Posts         := docs('Post');
declare variable $Projects      := docs('Project');

declare function docs($element-name) {
  let $expr := fn:concat("if ($draft:public-docs-only)",
                         "then /",$element-name,"[fn:not(@preview-only)][@status eq 'Published']",
                         "else /",$element-name,"[fn:not(@preview-only)]"
                        )
  return
    xdmp:value($expr)
};

Since I want to extend this further (making use of range indexes, for example), 
I'll probably pull xdmp:value() out of the docs() function so that docs() just 
returns the expression part (a string). This is a different model than I'm used 
to and I'm sure there are more things to consider, but I like knowing that I 
have this option. Basically: building up expressions rather than values opens 
up one possible way to exploit indexes without duplicating code.
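
For illustration only, that split might look something like the sketch below. The name docs-expr() is hypothetical, as are the date child element (assumed here to be typed as xs:date) and the cutoff value; $draft:public-docs-only is assumed to come from the same draft module import as the code above. Unlike the version above, the if/then/else here runs while the string is being built rather than inside the evaluated expression.

```xquery
(: Hypothetical helper: returns the path expression as a string, not nodes.
   Assumes the draft module that declares $draft:public-docs-only is imported. :)
declare function docs-expr($element-name as xs:string) as xs:string {
  fn:concat("/", $element-name, "[fn:not(@preview-only)]",
            if ($draft:public-docs-only) then "[@status eq 'Published']" else "")
};

(: The caller appends further predicates before evaluating, so the whole
   path reaches xdmp:value() as one expression. The date element and the
   cutoff are placeholders; date is assumed to be typed as xs:date. :)
let $expr := fn:concat(docs-expr("Announcement"),
                       "[date gt xs:date('2011-01-01')]")
return xdmp:value($expr)
```

Whether the extra predicate actually resolves from a range index would still need checking with query-trace().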

Evan Lenz
Software Developer, Community
MarkLogic Corporation


On 3/3/11 4:05 PM, "Michael Blakeley" wrote:

Evan,

Sometimes you have to make a choice between tidy code and fast code. In this 
case you can't easily avoid repeating code. You could probably build an XPath 
string and evaluate it using xdmp:value(), but that might add more complexity 
than it would be worth.

I touched on this point in the XQuery code review post at 
http://blakeley.com/wordpress/archives/518

Review any function calls within XPath predicates
Function calls inside an XPath predicate can be horrible for performance, since 
the function must be called for every item in the predicate’s input sequence. 
If the result of the function call is static, simply bind the result to a 
variable. This is also true of operators: even simple math operations like:
$list[$start to $start + $size]
should be rewritten as
$list[$start to $stop]
If you have trouble seeing why this might be a problem, consider a list with 
100 items. Now consider this expression:
$list[ xdmp:sleep(100) ]
Evaluation will cost 100-ms per item, or 10 seconds total. Every expression 
takes a finite amount of time to evaluate, and performance optimization is 
sometimes a matter of reducing the expression count.
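
A minimal sketch of that rewrite, with placeholder values for $list, $start and $size (none of them come from the post), and relying on the same MarkLogic range-predicate form the post uses:

```xquery
(: Placeholder inputs for illustration. :)
let $list  := (1 to 100)
let $start := 10
let $size  := 5
(: Compute the upper bound once, outside the predicate, instead of
   leaving the addition inside the predicate expression. :)
let $stop  := $start + $size
return $list[$start to $stop]
```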


Those fn:not() calls aren't good for performance either. Consider inverting the 
test, using an attribute "public" instead of "private".

-- Mike

On 3 Mar 2011, at 15:16 , Evan Lenz wrote:

I'm wondering how to write well-factored and optimized XQuery. Here's an 
excerpt from the code for RunDMC (on which developer.marklogic.com runs):
declare variable $Announcements := /Announcement[draft:listed(.)]; (: "News"   :)
declare variable $Events        := /Event       [draft:listed(.)]; (: "Events" :)
declare variable $Articles      := /Article     [draft:listed(.)]; (: "Learn"  :)
declare variable $Posts         := /Post        [draft:listed(.)]; (: "Blog"   :)
declare variable $Projects      := /Project     [draft:listed(.)]; (: "Code"   :)
The problem with the above code is that it ignores the indexes. In fact, each 
one of these expressions filters all the (some 5000, many of them not even XML) 
fragments in the database. I made one addition to each line that helped things 
quite a bit:
declare variable $Announcements := /Announcement/self::*[draft:listed(.)]; (: "News"   :)
This makes the first step searchable, since node tests by themselves 
("Announcement") aren't allowed to be searchable, only full steps. By moving 
the unsearchable predicate into a separate (subsequent) step, I've now reduced 
the number of fragments to filter down to, say, 30 <Announcement> docs instead 
of all ~5000 fragments. The addition of the extra "/self::*" is not 
particularly pretty, but it's not terrible either, and it was a small change 
with a huge positive impact (at least according to query-trace()).
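One way to check this kind of change is to switch on query tracing for the request and compare the logged output for the two forms of the path. This is only a sketch: it assumes the draft module that declares draft:listed() is already imported, and that the trace output is read from the server's error log.

```xquery
(: Enable query tracing for the rest of this request, then evaluate the
   path whose index resolution is in question. Assumes the draft module
   (which declares draft:listed) is imported in the prolog. :)
xdmp:query-trace(fn:true()),
/Announcement/self::*[draft:listed(.)]
```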
But I'd like to do better. To make the paths fully searchable, I'll need to 
pull my constraint out of my function call. Now I have:
declare variable $Announcements :=
  if ($draft:public-docs-only) then /Announcement[fn:not(@preview-only)][@status eq 'Published']
                               else /Announcement[fn:not(@preview-only)];
declare variable $Events        :=
  if ($draft:public-docs-only) then /Event       [fn:not(@preview-only)][@status eq 'Published']
                               else /Event       [fn:not(@preview-only)];
declare variable $Articles      :=
  if ($draft:public-docs-only) then /Article     [fn:not(@preview-only)][@status eq 'Published']
                               else /Article     [fn:not(@preview-only)];
declare variable $Posts         :=
  if ($draft:public-docs-only) then /Post        [fn:not(@preview-only)][@status eq 'Published']
                               else /Post        [fn:not(@preview-only)];
declare variable $Projects      :=
  if ($draft:public-docs-only) then /Project     [fn:not(@preview-only)][@status eq 'Published']
                               else /Project     [fn:not(@preview-only)];
Now, all of my XPath expressions are fully searchable, but things have gotten 
messy and a bunch of code is duplicated. There are also cases where I'll want 
to use range indexes to further constrain the results (such as "get the latest 
two Announcements"), so this will likely only get worse, because as far as I 
can tell, a path that references a variable is unsearchable, e.g., 
$Announcements[date gt …]
Am I pushing index usage too far? Is there another way that I'm not seeing? I 
assume I'm just not approaching things the correct "MarkLogic way". Any ideas 
would be greatly appreciated.
Thanks,
Evan
Evan Lenz
Software Developer, Community
MarkLogic Corporation
Phone +1 360 297 0087
email  [email protected]
web    developer.marklogic.com
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
