Evan,

Sometimes you have to choose between tidy code and fast code. In this case you can't easily avoid repeating code. You could probably build an XPath string and evaluate it using xdmp:value(), but that might add more complexity than it is worth.
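
For illustration only, the xdmp:value() idea might look something like the untested sketch below. The helper local:listed-path() and the local $public-docs-only variable are hypothetical stand-ins (the real flag is $draft:public-docs-only), and the element names are assumed to be trusted constants, since they are spliced directly into the evaluated expression. Whether the resulting paths are treated as fully searchable would still need to be confirmed with query-trace().

```xquery
xquery version "1.0-ml";

(: Stand-in for $draft:public-docs-only; assumed to be an xs:boolean. :)
declare variable $public-docs-only as xs:boolean := fn:true();

(: Hypothetical helper, not part of RunDMC: builds the path expression for
   one document element name as a string. Pass only trusted, hard-coded
   names, because the string is later evaluated as code. :)
declare function local:listed-path($name as xs:string) as xs:string
{
  fn:concat(
    "/", $name, "[fn:not(@preview-only)]",
    if ($public-docs-only) then "[@status eq 'Published']" else ""
  )
};

(: xdmp:value() evaluates each generated string as an XQuery expression,
   so the predicate logic is written once instead of five times. :)
declare variable $Announcements := xdmp:value(local:listed-path("Announcement"));
declare variable $Events        := xdmp:value(local:listed-path("Event"));
declare variable $Articles      := xdmp:value(local:listed-path("Article"));
declare variable $Posts         := xdmp:value(local:listed-path("Post"));
declare variable $Projects      := xdmp:value(local:listed-path("Project"));

(: Example use: count the listed announcements. :)
fn:count($Announcements)
```

The trade-off is the one noted above: the duplication goes away, but the paths are now hidden inside strings, which makes them harder to read and debug.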

I touched on this point in the XQuery code review post at http://blakeley.com/wordpress/archives/518

> Review any function calls within XPath predicates
>
> Function calls inside an XPath predicate can be horrible for performance,
> since the function must be called for every item in the predicate’s input
> sequence. If the result of the function call is static, simply bind the
> result to a variable. This is also true of operators: even simple math
> operations like:
>
>     $list[$start to $start + $size]
>
> should be rewritten as
>
>     $list[$start to $stop]
>
> If you have trouble seeing why this might be a problem, consider a list with
> 100 items. Now consider this expression:
>
>     $list[ xdmp:sleep(100) ]
>
> Evaluation will cost 100-ms per item, or 10 seconds total. Every expression
> takes a finite amount of time to evaluate, and performance optimization is
> sometimes a matter of reducing the expression count.

Those fn:not() calls aren't good for performance either. Consider inverting the test, using an attribute "public" instead of "private".

-- Mike

On 3 Mar 2011, at 15:16 , Evan Lenz wrote:

> I'm wondering how to write well-factored and optimized XQuery. Here's an
> excerpt from the code for RunDMC (on which developer.marklogic.com runs):
>
> declare variable $Announcements := /Announcement[draft:listed(.)]; (: "News"   :)
> declare variable $Events        := /Event       [draft:listed(.)]; (: "Events" :)
> declare variable $Articles      := /Article     [draft:listed(.)]; (: "Learn"  :)
> declare variable $Posts         := /Post        [draft:listed(.)]; (: "Blog"   :)
> declare variable $Projects      := /Project     [draft:listed(.)]; (: "Code"   :)
>
> The problem with the above code is that it ignores the indexes. In fact, each
> one of these expressions filters all the (some 5000, many of them not even
> XML) fragments in the database.
> I made one addition to each line that helped
> things quite a bit:
>
> declare variable $Announcements := /Announcement/self::*[draft:listed(.)]; (: "News" :)
>
> This makes the first step searchable, since node tests by themselves
> ("Announcement") aren't allowed to be searchable, only full steps. By moving
> the unsearchable predicate into a separate (subsequent) step, I've now
> reduced the number of fragments to filter down to, say, 30 <Announcement>
> docs instead of all ~5000 fragments. The addition of the extra "/self::*" is
> not particularly pretty, but it's not terrible either, and it was a small
> change with a huge positive impact (at least according to query-trace()).
>
> But I'd like to do better. To make the paths fully searchable, I'll need to
> pull my constraint out of my function call. Now I have:
>
> declare variable $Announcements :=
>   if ($draft:public-docs-only) then /Announcement[fn:not(@preview-only)][@status eq 'Published']
>   else                              /Announcement[fn:not(@preview-only)];
>
> declare variable $Events :=
>   if ($draft:public-docs-only) then /Event       [fn:not(@preview-only)][@status eq 'Published']
>   else                              /Event       [fn:not(@preview-only)];
>
> declare variable $Articles :=
>   if ($draft:public-docs-only) then /Article     [fn:not(@preview-only)][@status eq 'Published']
>   else                              /Article     [fn:not(@preview-only)];
>
> declare variable $Posts :=
>   if ($draft:public-docs-only) then /Post        [fn:not(@preview-only)][@status eq 'Published']
>   else                              /Post        [fn:not(@preview-only)];
>
> declare variable $Projects :=
>   if ($draft:public-docs-only) then /Project     [fn:not(@preview-only)][@status eq 'Published']
>   else                              /Project     [fn:not(@preview-only)];
>
> Now, all of my XPath expressions are fully searchable, but things have gotten
> messy and a bunch of code is duplicated.
> There are also cases where I'll want
> to use range indexes to further constrain the results (such as "get the
> latest two Announcements"), so this will likely only get worse, because as
> far as I can tell, a path that references a variable is unsearchable, e.g.,
> $Announcements[date gt …]
>
> Am I pushing index usage too far? Is there another way that I'm not seeing? I
> assume I'm just not approaching things the correct "MarkLogic way". Any ideas
> would be greatly appreciated.
>
> Thanks,
> Evan
>
> Evan Lenz
> Software Developer, Community
> MarkLogic Corporation
>
> Phone +1 360 297 0087
> email [email protected]
> web developer.marklogic.com
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
