Will, I suspect your query or xpath was faster because it was loading fewer fragments. 90% of the time, one doesn't use fragment roots (or children), so the fragments are then full documents. But by defining fragment roots for large documents the server can retrieve just part of a document to compute a query or xpath expression, which speeds things up in special cases.
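For reference, here is a minimal sketch of how fragment roots like the ones discussed above could be declared through the Admin API rather than the Admin UI. The database name "Documents" and the assumption that chapter, subchapter, and section are in no namespace are placeholders for illustration, not details from this thread. Existing documents are only re-fragmented once they are reindexed or reloaded, so this is best tried on a test database first.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Assumption: the target database is named "Documents". :)
let $db-id := xdmp:database("Documents")
let $config := admin:get-configuration()

(: Assumption: the three elements are in no namespace. :)
let $roots :=
  for $name in ("chapter", "subchapter", "section")
  return admin:database-fragment-root("", $name)

(: Add the fragment roots to the in-memory configuration, then save it. :)
let $config := admin:database-add-fragment-root($config, $db-id, $roots)
return admin:save-configuration($config)
```

Whether finer fragmentation helps depends on document size and query shape; for small documents it can just as easily hurt.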
Normally, xdmp:query-meters() is a good tool to use here. It will show the number of fully retrieved, "expanded" documents. Add together the expanded-tree-cache-hits and expanded-tree-cache-misses to get the total number of documents loaded off disk or from cache and "expanded" into their full internal XML form. If the total number of expanded documents is larger than the number of documents that contain the items actually returned, that suggests the indexes didn't do enough "index resolution" and "filtering" was used to remove many candidate matches.

Yours,
Damon

________________________________
From: [email protected] [[email protected]] On Behalf Of Will Thompson [[email protected]]
Sent: Tuesday, August 02, 2011 4:10 PM
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] xpath to cts query question

Which query tuning/analysis tool should be used to determine if descendant::some-node is hitting an index? I eventually realized that I didn’t have all of the descendant elements I was searching for set as fragment roots, which once configured sped up the query almost 100x in some cases. I was using xdmp:query-trace(), which indicated that my paths were all fully searchable; however, I don’t know how I would have determined which additional elements needed to be set as fragment roots.

Thanks,
Will

From: [email protected] [mailto:[email protected]] On Behalf Of Will Thompson
Sent: Thursday, July 28, 2011 11:57 AM
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] xpath to cts query question

Thanks Darin.

I did some more testing, and it looks like an optimized XPath approach (more along the lines of Mike’s second suggestion) might actually be the fastest.
(I’m not sure how to use a cts:uris query like you suggested, so I didn’t get a chance to test it.) This is what I landed on:

    xdmp:directory($coll, "infinity")
      /(descendant-or-self::chapter
        | descendant-or-self::subchapter
        | descendant-or-self::section)[@enum eq $enum]

The fastest cts:query approach was a naïve union of cts:searches, but it was still much slower than the XPath:

    (cts:search(xdmp:directory($coll, "infinity")//chapter,
       cts:element-attribute-value-query(
         xs:QName("chapter"), xs:QName("enum"), $enum))
     | cts:search(xdmp:directory($coll, "infinity")//subchapter,
       cts:element-attribute-value-query(
         xs:QName("subchapter"), xs:QName("enum"), $enum))
     | cts:search(xdmp:directory($coll, "infinity")//section,
       cts:element-attribute-value-query(
         xs:QName("section"), xs:QName("enum"), $enum)))

Based on CQ’s profiler, the cts:searches had far fewer expressions to evaluate (17), but still took 0.147s to execute, while the XPath evaluated 1370 expressions and executed in only 0.01s. I don’t know the explanation for this other than the ML XPath evaluator must be pretty good; a query-trace() confirmed that it was also hitting the range index. And it appears that there is some overhead to a cts:search that XPath doesn’t have.

Thanks for all of your suggestions.

-Will

From: [email protected] [mailto:[email protected]] On Behalf Of Darin McBeath
Sent: Thursday, July 28, 2011 8:21 AM
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] xpath to cts query question

One other idea you might want to try (not sure if it will help or not, but it would be easy enough to experiment with). Assuming you have the URI lexicon enabled, you could use a cts:uris query to identify those documents containing the chapter, subchapter, or section with the attribute value. You could then iterate over these URIs with a doc($uri) and a simple XPath expression to retrieve only the chapter, subchapter, or section with the attribute value.
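Since the cts:uris approach was never tested in this thread, here is a rough sketch of what Darin's suggestion might look like. It assumes the URI lexicon is enabled on the database; the values bound to $coll and $enum are made-up placeholders, and the element and attribute names are the ones used elsewhere in the thread. It has not been benchmarked against the other variants.

```xquery
xquery version "1.0-ml";

(: Placeholder values for illustration only. :)
let $coll := "/books/"
let $enum := "123"

let $qnames :=
  for $i in ("chapter", "subchapter", "section")
  return xs:QName($i)

(: Step 1: resolve candidate document URIs from the indexes alone.
   cts:uris requires the URI lexicon to be enabled. :)
let $uris :=
  cts:uris((), (),
    cts:and-query((
      cts:directory-query($coll, "infinity"),
      cts:element-attribute-value-query($qnames, xs:QName("enum"), $enum))))

(: Step 2: open each candidate document and let XPath pick out only the
   matching element, not its ancestors. :)
for $uri in $uris
return fn:doc($uri)//(chapter | subchapter | section)[@enum eq $enum]
```

Because cts:uris is resolved from the indexes without filtering, the final XPath predicate is still needed to discard false positives and to return the matching elements rather than whole documents.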
If you don't know the QNames in advance, you could use xdmp:unpath. Logically, I believe this is what Mike is suggesting below ... just doing it a bit differently. Like I said, I don't know if it would be slower or faster.

Darin.

________________________________
From: Michael Blakeley <[email protected]>
To: General MarkLogic Developer Discussion <[email protected]>
Sent: Wednesday, July 27, 2011 7:59 PM
Subject: Re: [MarkLogic Dev General] xpath to cts query question

You might get a little faster by using doc() for the cts:search arg1, and relying on XPath to walk the trees. And since you are doing that, don't bother with filtering in cts:search.

    let $qnames :=
      for $i in ("chapter", "subchapter", "section")
      return xs:QName($i)
    return
      cts:search(
        doc(),
        cts:and-query((
          cts:directory-query($coll, "infinity"),
          cts:element-attribute-value-query(
            $qnames, xs:QName("enum"), $enum))),
        "unfiltered")//(chapter | subchapter | section)[@enum eq $enum]

I don't know if that will be faster or slower, but it's worth a try. Another variation is to go back to the XPath, and enumerate all the possible path expressions.

    /(a[@v eq $v] | a/b[@v eq $v] | a/b/c[@v eq $v])

-- Mike

On 27 Jul 2011, at 16:50 , Will Thompson wrote:

> This is the best I could come up with:
>
> let $qnames := for $i in ("chapter","subchapter","section") return
>   xs:QName($i)
> return cts:search(//(chapter|subchapter|section),
>   cts:and-query((
>     cts:directory-query($coll,"infinity"),
>     cts:element-attribute-value-query($qnames,
>       xs:QName("enum"),$enum))))[@enum eq $enum]
>
> This is still about twice as fast as the xpath, even if I can't easily work
> around the predicate at the end.
>
> What's interesting is that the value query is slightly faster than the
> element range query. I assume they're both using the range index, and the
> value query is just doing it in fewer steps.
>
> Thanks for your help.
>
> -Will
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Will Thompson
> Sent: Wednesday, July 27, 2011 6:07 PM
> To: General MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] xpath to cts query question
>
> The data unfortunately won't allow for a more specific path. I was trying to
> do something along the lines of what Mike suggested to utilize an attribute
> range index, but the problem is that the cts:search will return multiple
> documents because it includes ancestors, while the XPath does not.
>
> Here's a less vague example:
>
> //(chapter|subchapter|section)[@enum="123"] will only return, say, the
> section that matches,
>
> but this:
>
> cts:search(
>   doc(),
>   cts:element-attribute-range-query(
>     (xs:QName("chapter"), xs:QName("subchapter"), xs:QName("section")),
>     xs:QName("enum"),
>     "=",
>     $enum)
> )
>
> will return the section and its ancestor chapter and subchapter, since they
> are included in the searchable expression. The only way I could think to work
> around this is separate queries, each with a searchable expression that
> corresponds to the range query.
>
> -Will
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Danny Sokolsky
> Sent: Wednesday, July 27, 2011 5:51 PM
> To: General MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] xpath to cts query question
>
> Another thing that might help is if you know the full path to the nodes.
> With //, it will have to look for the nodes anywhere in the documents.
>
> -Danny
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Michael Blakeley
> Sent: Wednesday, July 27, 2011 3:49 PM
> To: General MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] xpath to cts query question
>
> xdmp:plan can help with that:
>
> ...
> <qry:info-trace>Analyzing path: fn:collection()/descendant-or-self::node()/(a|b|c|d|e|f|g)[@foo = "bar"]</qry:info-trace>
> <qry:info-trace>Step 1 is searchable: fn:collection()</qry:info-trace>
> <qry:info-trace>Step 2 does not use indexes: descendant-or-self::node()</qry:info-trace>
> <qry:info-trace>Step 3 is searchable: (a|b|c|d|e|f|g)[@foo = "bar"]</qry:info-trace>
> <qry:info-trace>Path is fully searchable.</qry:info-trace>
> <qry:info-trace>Gathering constraints.</qry:info-trace>
> <qry:info-trace>Step 3 predicate 1 contributed 1 constraint: @foo = "bar"</qry:info-trace>
> ...
>
> The cts:query would be something like:
>
> xdmp:plan(
>   cts:search(doc(),
>     cts:element-attribute-value-query(
>       for $i in ('a', 'b', 'c', 'd', 'e', 'f', 'g')
>       return xs:QName($i),
>       xs:QName('foo'), 'bar')))
>
> If that isn't fast enough, the next step might be an element-attribute range
> index on every element-attribute combination, and switching to
> cts:element-attribute-range-query with operator '='.
>
> -- Mike
>
> On 27 Jul 2011, at 15:36 , Will Thompson wrote:
>
>> Thanks Danny. What I'm mainly trying to do is speed up some slow xpath. I've
>> optimized a lot of this module, but this xpath seems to be one of the
>> remaining bottlenecks: //(a|b|c|d|e|f|g)[@foo = "bar"]. I thought that by
>> converting it to a cts:query it would be faster. Or is this XPath already
>> going to be optimized by MLS?
>>
>> -Will
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]]
>> On Behalf Of Danny Sokolsky
>> Sent: Wednesday, July 27, 2011 5:20 PM
>> To: General MarkLogic Developer Discussion
>> Subject: Re: [MarkLogic Dev General] xpath to cts query question
>>
>> Hi Will,
>>
>> I might not be understanding what you are doing here, but here are a few
>> ideas.
>>
>> I think you can use that XPath in the first arg of cts:search, as long as
>> you do not put any variables in it. Something like this:
>>
>> cts:search(//(a|b|c|d|e|f|g)[@foo = "bar"], "hello")
>>
>> Also, in cts:query, you can do a cts:element-query with an
>> element-attribute query such as cts:element-attribute-word-query as its
>> second arg. Something like:
>>
>> cts:element-query((xs:QName("a"), xs:QName("b")),
>>   cts:element-attribute-word-query((xs:QName("a"),
>>     xs:QName("b")), xs:QName("foo"), "bar"))
>>
>> -Danny
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]]
>> On Behalf Of Will Thompson
>> Sent: Wednesday, July 27, 2011 2:51 PM
>> To: General MarkLogic Developer Discussion
>> Subject: [MarkLogic Dev General] xpath to cts query question
>>
>> I'm trying to create the cts equivalent of essentially this:
>>
>> //(a|b|c|d|e|f|g)[@attr = $val]
>>
>> But it seems like I would have to join multiple cts:search()s, one for each
>> element, since I only want the matching element, and not its parent (so I
>> can't do something like cts:search(//(a|b|c|d|e|f|g),
>> cts:element-attribute-value-query((xs:QName("a"),...,xs:QName("g")),xs:QName("attr"),$val))).
>>
>> cts:search(//a,
>>   cts:element-attribute-value-query(xs:QName("a"),xs:QName("attr"),$val))
>> | cts:search(//b,
>>   cts:element-attribute-value-query(xs:QName("b"),xs:QName("attr"),$val))
>> | cts:search(//c,
>>   cts:element-attribute-value-query(xs:QName("c"),xs:QName("attr"),$val))
>> ...
>> | cts:search(//g,
>>   cts:element-attribute-value-query(xs:QName("g"),xs:QName("attr"),$val))
>>
>> Is there a better way to do this?
>>
>> Thank you!
>>
>> -Will
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general
