Hi Rachel,

Can you pass a cts:query into your cts:uri-match call?
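
Something along these lines is what I have in mind. It is only a sketch: 
$projectId is a placeholder declared as an external variable, and I am assuming 
your job documents all live under a /project/{id}/jobs/ directory, so that a 
cts:directory-query can constrain the match. The second argument is the 
(empty) options list and the third is the constraining query. Whether it helps 
in your case is something you would have to profile.

```xquery
xquery version "1.0-ml";

(: Placeholder: supply the real project id from the caller. :)
declare variable $projectId as xs:string external;

(: Assumed layout: job documents sit under /project/{id}/jobs/. :)
let $jobsDir := "/project/" || $projectId || "/jobs/"
return
  cts:uri-match(
    $jobsDir || "*",                           (: same wildcard pattern as before :)
    (),                                        (: no extra options :)
    cts:directory-query($jobsDir, "infinity")  (: constrain to that directory :)
  )
```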

How many forests do you have?  More forests might help depending upon what you 
are doing.

But if all of the URIs in your db follow this pattern, ultimately it is going 
to have to search through a lot of URIs.  You could make your URI space a 
little more selective, which might speed it up.  Maybe the strings in your URIs 
are all very similar (the URI match is essentially a string compare)?

What kind of hardware are you running on?  The speed of your memory and cpu can 
be a factor here too.

-Danny

From: [email protected] 
[mailto:[email protected]] On Behalf Of Rachel Wilson
Sent: Wednesday, October 22, 2014 9:05 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Surprising slowness of cts:uri-match

Hi,

I was wondering if anyone had a reply to this.

We're digging even deeper into improving our performance for an API and in 
several places (because we use it liberally) cts:uri-match ends up being the 
bottleneck.  We are happy to redesign our data and queries where we can to 
avoid it, but it continues to surprise us that this is the case because we 
thought the uris are indexed and the function is designed to use wildcards 
because it's a matcher.

A typical call would be

   let $uris := cts:uri-match("/project/" || $projectId ||"/jobs/*",

But we're most surprised by this one, which we used as a test, because there 
aren't even any wildcards.

   let $thereShouldBeOnlyOne :=
     cts:uri-match("/project/" || $projectId || "/content/" || $contentId)

Some insight into the inner workings of that function would be great.


From: Rachel Wilson <[email protected]<mailto:[email protected]>>
Date: Thursday, 16 October 2014 17:25
To: MarkLogic Developer Discussion 
<[email protected]<mailto:[email protected]>>
Subject: Surprising slowness of cts:uri-match

In our experience cts:uri-match is surprisingly slow.  For example when 
profiling a pretty complicated query taking 0.7 seconds, the single 
cts:uri-match() call takes 70-80% of the total time.  (Shallow% and Deep% being 
the same)

But we thought it should be reading the URI lexicon, so in a database with 
only 483,475 docs it should be lightning fast.  We've had to stop using 
cts:uri-match calls in loops for this reason.

Are there any match patterns to be avoided perhaps?  Wildcards in the middle of 
the pattern, rather than trailing wildcards for example?
