Aha, secret sauce! Thanks for the explanation, Mary! Ellis.
On 11 Dec 2013, at 15:39, "Mary Holstege" <[email protected]> wrote:

> On Wed, 11 Dec 2013 07:35:28 -0800, Ellis Pritchard <[email protected]>
> wrote:
>
>> Great, that's 10x faster than either; care to explain why? :)
>>
>> Ellis.
>
> The optimizer recognizes this pattern specifically and executes it
> as a single call across the cluster. cts:search and xdmp:exists
> are not really functions; they are just written with functional syntax.
> cts:contains, on the other hand, is a function.
>
> You can observe the difference when you stop talking to the
> database at all:
>
> xdmp:exists(<node>stuff</node>) => error!
> cts:contains(<node>stuff</node>,"stuff") => true()
>
> //Mary
>
>>
>> On 11 Dec 2013, at 15:29, Jason Hunter <[email protected]> wrote:
>>
>>> Try this pattern:
>>>
>>> xdmp:exists(cts:search(doc(), $query))
>>>
>>> -jh-
>>>
>>> On Dec 11, 2013, at 10:18 AM, Ellis Pritchard <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I want to find if *any* document in the database matches a cts:query.
>>>>
>>>> I had assumed that cts:contains() [1] would be the appropriate
>>>> function to use, because it doesn't return any documents, and running
>>>> this under the profiler, versus an xdmp:estimate(cts:search())
>>>> equivalent [2] shows cts:contains() to be slightly faster, but just want
>>>> to check that's the right assumption, because:
>>>>
>>>> 1) xdmp:query-trace() seems to indicate that everything in my query is
>>>> searchable when using cts:contains(), but gives me a "Selected n
>>>> fragments to filter" message, which the xdmp:estimate(cts:search())
>>>> version doesn't. It also doesn't mention that it's using the
>>>> element-range-index, whereas the cts:search() version does.
>>>>
>>>> 2) xdmp:query-meters() shows that URIs are being returned from the
>>>> cluster for cts:contains() but the xdmp:estimate() version doesn't
>>>> show this.
>>>>
>>>> 3) The profiler shows a fn:collection()/ad:Audit step taking 90% of
>>>> the time for cts:contains() but has no such step for cts:search().
>>>>
>>>>
>>>> That said, on 1.2 million documents, both are within 0.01s of each
>>>> other, with cts:contains() the faster, but can someone confirm one or
>>>> the other is best and explain why?
>>>>
>>>> Cheers!
>>>>
>>>> Ellis.
>>>>
>>>> [1]
>>>> cts:contains(/ad:Audit,cts:element-range-query(xs:QName('ad:TimeStamp'),'<',xs:dateTime('2013-12-01T00:00:00Z')))
>>>>
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: xdmp:eval("xquery version "1.0-ml"; declare namespace ad...", (), <options xmlns="xdmp:eval"><database>4116264424646523266</database><modules>494024936077796...</options>)
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Analyzing path: fn:collection()/ad:Audit
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Step 1 is searchable: fn:collection()
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Step 2 is searchable: ad:Audit
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Path is fully searchable.
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Gathering constraints.
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Step 2 contributed 1 constraint: ad:Audit
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Executing search.
>>>> 2013-12-11 15:10:29.003 Info: App-Services: at 7:15: Selected 1047 fragments to filter
>>>>
>>>>
>>>> [2]
>>>> xdmp:estimate(cts:search(/ad:Audit,cts:element-range-query(xs:QName('ad:TimeStamp'),'<',xs:dateTime('2013-12-01T00:00:00Z'))))
>>>>
>>>> 2013-12-11 15:11:36.241 Info: App-Services: at 7:26: xdmp:eval("xquery version "1.0-ml"; declare namespace ad...", (), <options xmlns="xdmp:eval"><database>4116264424646523266</database><modules>494024936077796...</options>)
>>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Analyzing path for search: fn:collection()/ad:Audit
>>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Step 1 is searchable: fn:collection()
>>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Step 2 is searchable: ad:Audit
>>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Path is fully searchable.
>>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Gathering constraints.
>>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Step 2 contributed 1 constraint: ad:Audit
>>>> 2013-12-11 15:11:36.276 Info: App-Services: at 7:26: Comparison contributed dateTime range value constraint: ad:TimeStamp < xs:dateTime("2013-12-01T00:00:00Z")
>>>> 2013-12-11 15:11:36.276 Info: App-Services: at 7:26: Search query contributed 1 constraint: cts:element-range-query(xs:QName("ad:TimeStamp"), "<", xs:dateTime("2013-12-01T00:00:00Z"), (), 1)
>>>> 2013-12-11 15:11:36.276 Info: App-Services: at 7:26: Executing search.
>>>>
>>>> _______________________________________________
>>>> General mailing list
>>>> [email protected]
>>>> http://developer.marklogic.com/mailman/listinfo/general
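
For reference, the pattern Jason suggested can be written out as a complete module using the range query from [1] and [2]. This is only a sketch under assumptions: the namespace URI bound to `ad` is hypothetical (the logs above elide the real one), and a dateTime element range index on ad:TimeStamp is assumed to exist, as the query trace in [2] implies.

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace URI; the thread elides the real one. :)
declare namespace ad = "http://example.com/audit";

(: Same range query as in [1] and [2]. Assumes a dateTime element
   range index is configured on ad:TimeStamp. :)
let $query :=
  cts:element-range-query(
    xs:QName("ad:TimeStamp"), "<",
    xs:dateTime("2013-12-01T00:00:00Z"))

(: Existence test over a search, the pattern Mary says the optimizer
   recognizes. Returns an xs:boolean rather than any documents. :)
return xdmp:exists(cts:search(fn:doc(), $query))
```

The result is true() if at least one document matches the query and false() otherwise. To scope the test to ad:Audit roots as in the original examples, the first argument to cts:search could be /ad:Audit instead of fn:doc().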
