Aha, secret sauce! Thanks for the explanation, Mary!

Ellis.
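
For anyone finding this thread later, here is a rough side-by-side sketch of the three variants discussed below. The namespace URI is a placeholder (the real binding for ad: never appears in the thread), and the snippet assumes a dateTime element range index on ad:TimeStamp, which the query trace in the original post suggests is configured. Treat it as an illustration, not a benchmark.

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace URI; the thread does not show the real one. :)
declare namespace ad = "http://example.com/audit";

let $query := cts:element-range-query(
  xs:QName("ad:TimeStamp"), "<", xs:dateTime("2013-12-01T00:00:00Z"))
return (
  (: [1] cts:contains is an ordinary function applied to the nodes that
     /ad:Audit selects; the trace for this form reports fragments to filter. :)
  cts:contains(/ad:Audit, $query),

  (: [2] xdmp:estimate returns a count, so compare it against zero
     to get a yes/no answer. :)
  xdmp:estimate(cts:search(/ad:Audit, $query)) gt 0,

  (: Jason's pattern, which Mary says the optimizer recognizes and
     executes as a single call across the cluster. :)
  xdmp:exists(cts:search(doc(), $query))
)
```

Each expression yields an xs:boolean; only their relative cost differs, and the roughly 10x figure is what was observed on this one dataset.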

On 11 Dec 2013, at 15:39, "Mary Holstege" <[email protected]> wrote:

> On Wed, 11 Dec 2013 07:35:28 -0800, Ellis Pritchard <[email protected]>  
> wrote:
> 
>> Great, that's 10x faster than either; care to explain why? :)
>> 
>> Ellis.
> 
> The optimizer recognizes this pattern specifically and executes it
> as a single call across the cluster.  cts:search and xdmp:exists
> are not really functions; they are just written with functional syntax.
> cts:contains, on the other hand, is a function.
> 
> You can observe the difference when you stop talking to the
> database at all:
> 
> xdmp:exists(<node>stuff</node>)  => error!
> cts:contains(<node>stuff</node>,"stuff") => true()
> 
> //Mary
> 
>> 
>> On 11 Dec 2013, at 15:29, Jason Hunter <[email protected]> wrote:
>> 
>>> Try this pattern:
>>> 
>>> xdmp:exists(cts:search(doc(), $query))
>>> 
>>> -jh-
>>> 
>>> On Dec 11, 2013, at 10:18 AM, Ellis Pritchard <[email protected]> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I want to find if *any* document in the database matches a cts:query.
>>>> 
>>>> I had assumed that cts:contains() [1] would be the appropriate  
>>>> function to use, because it doesn't return any documents, and running  
>>>> this under the profiler, versus an xdmp:estimate(cts:search())  
>>>> equivalent [2], shows cts:contains() to be slightly faster, but I just want  
>>>> to check that's the right assumption, because:
>>>> 
>>>> 1) xdmp:query-trace() seems to indicate that everything in my query is  
>>>> searchable when using cts:contains(), but gives me a "Selected n  
>>>> fragments to filter" message, which the xdmp:estimate(cts:search())  
>>>> version doesn't. It also doesn't mention that it's using the  
>>>> element-range-index, whereas the cts:search() version does.
>>>> 
>>>> 2) xdmp:query-meters() shows that URIs are being returned from the  
>>>> cluster for cts:contains() but the xdmp:estimate() version doesn't  
>>>> show this.
>>>> 
>>>> 3) The profiler shows a fn:collection()/ad:Audit step taking 90% of  
>>>> the time for cts:contains() but has no such step for cts:search().
>>>> 
>>>> 
>>>> That said, on 1.2 million documents, both are within 0.01 s of each  
>>>> other, with cts:contains() the faster, but can someone confirm one or  
>>>> the other is best and explain why?
>>>> 
>>>> Cheers!
>>>> 
>>>> Ellis.
>>>> 
>>>> [1]  
>>>> cts:contains(/ad:Audit,cts:element-range-query(xs:QName('ad:TimeStamp'),'<',xs:dateTime('2013-12-01T00:00:00Z')))
>>>> 
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: xdmp:eval("xquery  
>>>> version &quot;1.0-ml&quot;;&#10;&#10;declare namespace ad...", (),  
>>>> <options  
>>>> xmlns="xdmp:eval"><database>4116264424646523266</database><modules>494024936077796...</options>)
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Analyzing path:  
>>>> fn:collection()/ad:Audit
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Step 1 is  
>>>> searchable: fn:collection()
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Step 2 is  
>>>> searchable: ad:Audit
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Path is fully  
>>>> searchable.
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Gathering  
>>>> constraints.
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Step 2  
>>>> contributed 1 constraint: ad:Audit
>>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Executing search.
>>>> 2013-12-11 15:10:29.003 Info: App-Services: at 7:15: Selected 1047  
>>>> fragments to filter
>>>> 
>>>> 
>>>> [2]  
>>>> xdmp:estimate(cts:search(/ad:Audit,cts:element-range-query(xs:QName('ad:TimeStamp'),'<',xs:dateTime('2013-12-01T00:00:00Z'))))
>>>> 
>>>> 2013-12-11 15:11:36.241 Info: App-Services: at 7:26: xdmp:eval("xquery  
>>>> version &quot;1.0-ml&quot;;&#10;&#10;declare namespace ad...", (),  
>>>> <options  
>>>> xmlns="xdmp:eval"><database>4116264424646523266</database><modules>494024936077796...</options>)
>>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Analyzing path  
>>>> for search: fn:collection()/ad:Audit
>>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Step 1 is  
>>>> searchable: fn:collection()
>>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Step 2 is  
>>>> searchable: ad:Audit
>>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Path is fully  
>>>> searchable.
>>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Gathering  
>>>> constraints.
>>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Step 2  
>>>> contributed 1 constraint: ad:Audit
>>>> 2013-12-11 15:11:36.276 Info: App-Services: at 7:26: Comparison  
>>>> contributed dateTime range value constraint: ad:TimeStamp <  
>>>> xs:dateTime("2013-12-01T00:00:00Z")
>>>> 2013-12-11 15:11:36.276 Info: App-Services: at 7:26: Search query  
>>>> contributed 1 constraint:  
>>>> cts:element-range-query(xs:QName("ad:TimeStamp"), "<",  
>>>> xs:dateTime("2013-12-01T00:00:00Z"), (), 1)
>>>> 2013-12-11 15:11:36.276 Info: App-Services: at 7:26: Executing search.
>>>> 
>>>> _______________________________________________
>>>> General mailing list
>>>> [email protected]
>>>> http://developer.marklogic.com/mailman/listinfo/general
