Hi Michael,

thanks for your reply. It made my day! From > 4 secs to < 0.4 secs in less than two hours! :-)

On Sat, May 7, 2011 at 18:16, Michael Blakeley <[email protected]> wrote:
> As far as I am aware there is no optimization of XPath expressions across
> variable bindings. So the existing query isn't using indexed lookups for much
> of anything, but is evaluating many many in-memory expressions. The evaluator
> is traversing the entire document structure of everything in
> $these_agreements for each code value, looking for matching nodes.
>
> There are three basic approaches to query optimization, which can be traded
> off for specific use cases. We can reduce the expression count; we can
> improve the use of indexes; we can reduce the number of database round-trips.
> Let's start by trying to reduce the expression count. You could optimize this
> a little bit by telling the evaluator that there will be only one match per
> code, allowing it to stop as soon as it finds the first match.
>
>   for $one_jurisdiction_code in $related_jurisdictions
>   (: all agreements between the two jurisdictions :)
>   let $these_agreements := (
>     $the_agreements/eoi:agreement[
>       eoi:jurisdictions/eoi:jurisdiction eq $one_jurisdiction_code] )[1]

I understand what you mean but I have to get all agreement elements. My
return clause is probably misleading as it suggests I'm interested in only
one value which is not the case. I have to inspect all agreements and find
out whether at least one of them has been "signed", "ratified" or "enforced".

> I'd expect a 50% improvement from that change. You could also eliminate one
> node-traversal step per code-agreement pair by doing that work up front. I'm
> not sure how much that will save, but easy so it's worth a try.
>
>   let $the_agreements :=
>     collection('http://www.eoi-portal.org/agreements')/eoi:agreement
>   ...
>   for $one_jurisdiction_code in $related_jurisdictions
>   (: all agreements between the two jurisdictions :)
>   let $these_agreements := (
>     $the_agreements[
>       eoi:jurisdictions/eoi:jurisdiction eq $one_jurisdiction_code] )[1]
>
> But I suspect it will be more efficient to repeat the collection call
> instead. This adds to the number of database round-trips, but should greatly
> reduce the expression count.
>
>   let $collection-name := 'http://www.eoi-portal.org/agreements'
>   for $one_jurisdiction_code in $related_jurisdictions
>   (: all agreements between the two jurisdictions :)
>   let $these_agreements := collection($collection-name)/eoi:agreement[
>     eoi:jurisdictions/eoi:jurisdiction eq $one_jurisdiction_code ]
>   ...
>
> This will result in one call to collection() per code, but each call will use
> an indexed lookup on the code. So I suspect it will be more efficient than
> filtering all the agreements in memory for every code. If not, you could try
> using single collection call to pre-calculate a map, using the codes as map
> keys.

That did the trick! Using only this made the query ten times faster!
Actually, analysing it a bit closer, I also replaced the * with
eoi:agreement which also helped.

> Note that you don't need those .../text() steps. See
> http://blakeley.com/wordpress/archives/518 for some discussion of that idiom.

Thanks for reminding me of your article which I read a couple of months ago
and which was very instructive.

  boolean($these_agreements/eoi:enforced/text())

In this case, if all eoi:enforced elements are empty this will return false,
if there is at least one which contains a date it will return true. I guess,
the following expression would be equivalent but easier to understand?

  boolean($these_agreements/eoi:enforced[text()])

> -- Mike

Thank you very much.

Jakob.
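
The comparison of the two boolean() expressions can be checked with a minimal sketch like the one below. It assumes a placeholder namespace URI for the eoi prefix and uses made-up sample elements; neither comes from the real query or data. Both forms should give the same answer here because each one yields an empty sequence when no eoi:enforced element has a text node, and a non-empty sequence of nodes otherwise.

```xquery
xquery version "1.0";

(: Hypothetical namespace URI; the real eoi namespace is not given in this thread. :)
declare namespace eoi = "http://example.org/eoi";

(: Made-up sample data: every eoi:enforced element is empty. :)
let $all_empty := (
  <eoi:agreement><eoi:enforced/></eoi:agreement>,
  <eoi:agreement><eoi:enforced/></eoi:agreement>
)
(: Made-up sample data: one eoi:enforced element contains a date. :)
let $one_dated := (
  <eoi:agreement><eoi:enforced/></eoi:agreement>,
  <eoi:agreement><eoi:enforced>2011-05-07</eoi:enforced></eoi:agreement>
)
return (
  boolean($all_empty/eoi:enforced/text()),    (: false: no text nodes selected :)
  boolean($all_empty/eoi:enforced[text()]),   (: false: no element passes the predicate :)
  boolean($one_dated/eoi:enforced/text()),    (: true: one text node selected :)
  boolean($one_dated/eoi:enforced[text()])    (: true: one element passes the predicate :)
)
```

The first form selects the text nodes themselves, the second selects the eoi:enforced elements that have one, so the second is the one to keep if the elements are needed afterwards.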
