On Wed, 11 Dec 2013 07:35:28 -0800, Ellis Pritchard <[email protected]>  
wrote:

> Great, that's 10x faster than either; care to explain why? :)
>
> Ellis.

The optimizer recognizes this pattern specifically and executes it
as a single call across the cluster.  cts:search and xdmp:exists
are not really functions; they are just written with functional syntax.
cts:contains, on the other hand, is a function.

You can observe the difference when you stop talking to the
database at all:

xdmp:exists(<node>stuff</node>)  => error!
cts:contains(<node>stuff</node>,"stuff") => true()

//Mary
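
For reference, the three variants discussed in this thread can be put side
by side. This is only a sketch: the ad namespace URI below is a placeholder
(the thread never shows the real declaration), and it assumes a dateTime
element range index on ad:TimeStamp, as in the original post. The first two
forms are resolved from the indexes, while cts:contains is evaluated as an
ordinary function over the nodes selected by its first argument.

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace URI; the real one is not shown in the thread. :)
declare namespace ad = "http://example.com/audit";

(: Range query taken from the original post; assumes a dateTime
   element range index on ad:TimeStamp. :)
let $query :=
  cts:element-range-query(
    xs:QName("ad:TimeStamp"), "<",
    xs:dateTime("2013-12-01T00:00:00Z"))
return (
  (: Existence test in the form suggested earlier in the thread. :)
  xdmp:exists(cts:search(fn:doc(), $query)),

  (: Index-based count, compared against zero. :)
  xdmp:estimate(cts:search(fn:doc(), $query)) gt 0,

  (: Function applied to the nodes selected by the path. :)
  cts:contains(/ad:Audit, $query)
)
```

Running each expression separately under xdmp:query-trace() and
xdmp:query-meters() should show the differences Ellis describes below.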

>
> On 11 Dec 2013, at 15:29, Jason Hunter <[email protected]> wrote:
>
>> Try this pattern:
>>
>> xdmp:exists(cts:search(doc(), $query))
>>
>> -jh-
>>
>> On Dec 11, 2013, at 10:18 AM, Ellis Pritchard <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I want to find if *any* document in the database matches a cts:query.
>>>
>>> I had assumed that cts:contains() [1] would be the appropriate  
>>> function to use, because it doesn't return any documents, and running  
>>> this under the profiler, versus an xdmp:estimate(cts:search())  
>>> equivalent [2] shows cts:contains() to be slightly faster, but just want  
>>> to check that's the right assumption, because:
>>>
>>> 1) xdmp:query-trace() seems to indicate that everything in my query is  
>>> searchable when using cts:contains(), but gives me a "Selected n  
>>> fragments to filter" message, which the xdmp:estimate(cts:search())  
>>> version doesn't. It also doesn't mention that it's using the  
>>> element-range-index, whereas the cts:search() version does.
>>>
>>> 2) xdmp:query-meters() shows that URIs are being returned from the  
>>> cluster for cts:contains() but the xdmp:estimate() version doesn't  
>>> show this.
>>>
>>> 3) The profiler shows a fn:collection()/ad:Audit step taking 90% of  
>>> the time for cts:contains() but has no such step for cts:search().
>>>
>>>
>>> That said, on 1.2 million documents, both are within 0.01 s of each  
>>> other, with cts:contains() the faster, but can someone confirm one or  
>>> the other is best and explain why?
>>>
>>> Cheers!
>>>
>>> Ellis.
>>>
>>> [1]  
>>> cts:contains(/ad:Audit,cts:element-range-query(xs:QName('ad:TimeStamp'),'<',xs:dateTime('2013-12-01T00:00:00Z')))
>>>
>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: xdmp:eval("xquery  
>>> version &quot;1.0-ml&quot;;&#10;&#10;declare namespace ad...", (),  
>>> <options  
>>> xmlns="xdmp:eval"><database>4116264424646523266</database><modules>494024936077796...</options>)
>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Analyzing path:  
>>> fn:collection()/ad:Audit
>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Step 1 is  
>>> searchable: fn:collection()
>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Step 2 is  
>>> searchable: ad:Audit
>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Path is fully  
>>> searchable.
>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Gathering  
>>> constraints.
>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Step 2  
>>> contributed 1 constraint: ad:Audit
>>> 2013-12-11 15:10:29.001 Info: App-Services: at 7:15: Executing search.
>>> 2013-12-11 15:10:29.003 Info: App-Services: at 7:15: Selected 1047  
>>> fragments to filter
>>>
>>>
>>> [2]  
>>> xdmp:estimate(cts:search(/ad:Audit,cts:element-range-query(xs:QName('ad:TimeStamp'),'<',xs:dateTime('2013-12-01T00:00:00Z'))))
>>>
>>> 2013-12-11 15:11:36.241 Info: App-Services: at 7:26: xdmp:eval("xquery  
>>> version &quot;1.0-ml&quot;;&#10;&#10;declare namespace ad...", (),  
>>> <options  
>>> xmlns="xdmp:eval"><database>4116264424646523266</database><modules>494024936077796...</options>)
>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Analyzing path  
>>> for search: fn:collection()/ad:Audit
>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Step 1 is  
>>> searchable: fn:collection()
>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Step 2 is  
>>> searchable: ad:Audit
>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Path is fully  
>>> searchable.
>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Gathering  
>>> constraints.
>>> 2013-12-11 15:11:36.253 Info: App-Services: at 7:26: Step 2  
>>> contributed 1 constraint: ad:Audit
>>> 2013-12-11 15:11:36.276 Info: App-Services: at 7:26: Comparison  
>>> contributed dateTime range value constraint: ad:TimeStamp <  
>>> xs:dateTime("2013-12-01T00:00:00Z")
>>> 2013-12-11 15:11:36.276 Info: App-Services: at 7:26: Search query  
>>> contributed 1 constraint:  
>>> cts:element-range-query(xs:QName("ad:TimeStamp"), "<",  
>>> xs:dateTime("2013-12-01T00:00:00Z"), (), 1)
>>> 2013-12-11 15:11:36.276 Info: App-Services: at 7:26: Executing search.
>>>
>>> _______________________________________________
>>> General mailing list
>>> [email protected]
>>> http://developer.marklogic.com/mailman/listinfo/general

