Some follow-up info.
Here's the full XQuery I profiled:

xquery version '1.0-ml';

for $m in collection("logsummary")
let $b := base-uri($m)
return ($b , exists(cts:collection-match($b)))


Profiled 39071 expressions in PT1M3.792322S

location  expression                             count  shallow-%  shallow-µs  deep-%   deep-µs
.main: 5  cts:collection-match($b)               13023         51    32399272      51  32399272
.main: 5  fn:exists(cts:collection-match($b))    13023         48    30818399      99  63217671
.main: 3  fn:collection("logsummary")                1       0.69      437268     100  63786656
.main: 4  fn:base-uri($m)                        13023       0.21      131717    0.21    131717
.main: 3  for $m in fn:collection("logsummary")      1    0.00019         119     100  63786775
          let $b := fn:base-uri($m)
          return ($b, fn:exists(cts:collection-match($b)))



Compare that to:

xquery version '1.0-ml';

for $m in collection("logsummary")
let $b := base-uri($m)
return ($b , exists(collection($b)))

Profiled 39071 expressions in PT1.274159S

location  expression                             count  shallow-%  shallow-µs  deep-%   deep-µs
.main: 5  fn:collection($b)                      13023         81     1027239      81   1027239
.main: 3  fn:collection("logsummary")                1         13      168090     100   1268566
.main: 4  fn:base-uri($m)                        13023        3.4       43440     3.4     43440
.main: 5  fn:exists(fn:collection($b))           13023        2.3       29797      83   1057036
.main: 3  for $m in fn:collection("logsummary")      1     0.0023          29     100   1268595
          let $b := fn:base-uri($m)
          return ($b, fn:exists(fn:collection($b)))
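
Per call, the two profiles work out to roughly 2.5 ms for
cts:collection-match($b) against about 79 µs for fn:collection($b)
(deep-µs divided by the 13023 calls). cts:collection-match() treats its
argument as a wildcard pattern; whether that accounts for the gap is only a
guess. One variation that might be worth profiling is to read the collection
lexicon once with cts:collections() and test membership in memory. The
sketch below is untested and was not part of the runs above; the variable
name $known is illustrative, and it assumes the collection lexicon is
enabled and the list of collection URIs is small enough to hold in memory.

```xquery
xquery version '1.0-ml';

(: Sketch only, not profiled: read the collection lexicon once,
   then check each summary URI against the in-memory list.
   Assumes the collection lexicon is enabled. :)
let $known := cts:collections()
for $m in collection("logsummary")
let $b := base-uri($m)
(: General comparison: true if $b equals any value in $known. :)
return ($b, string($b) = $known)
```

Unlike cts:collection-match(), this compares the URI literally, so a URI
containing wildcard characters such as * or ? would not be interpreted as a
pattern.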



----------------------------------------
David A. Lee
Senior Principal Software Engineer
Epocrates, Inc.
[email protected]
812-482-5224

From: [email protected] 
[mailto:[email protected]] On Behalf Of Lee, David
Sent: Monday, August 29, 2011 3:45 PM
To: General Mark Logic Developer Discussion ([email protected])
Subject: [MarkLogic Dev General] Interesting performance difference between 
collection() and cts:collection-match()

I have a set of "summary" documents and want to use a collection from each 
document to group a set of detail documents (which may or may not be loaded).
As an experiment, I'm listing all the summary documents with some stats and 
printing true/false depending on whether the collection exists (presumably 
meaning that documents have been loaded for that collection).
For simplicity, I'm using the URI of each summary document as the collection 
name for its details.

My first attempt was:
for $m in ...
return ....
                 exists( cts:collection-match($m/base-uri()) )



With a result set of about 100 documents ($m), this took more than 20 seconds!

I changed it to:

                 exists( collection($m/base-uri()) )

and now it takes 3 seconds.

Those 3 seconds are really spent on all the other work, so changing that one 
line alone makes a 17-second difference.

Any ideas why there is such a performance difference? I would have thought 
cts:collection-match() would be more efficient, as it should use the 
collection lexicon.

(I checked, and the collection lexicon is turned on.)



