It seems like those use cases could be implemented more efficiently without cts:uri-match. The first one could be done with cts:uris and cts:directory-query. The second could use exists(doc($uri)), or cts:uris with cts:directory-query.
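As a rough sketch of those two alternatives (not tested against your data): the $projectId and $contentId values below are placeholders, the URI lexicon is assumed to be enabled, and the "infinity" depth assumes the trailing wildcard was meant to include nested paths. Use "1" instead if only direct children of the directory are wanted.

```xquery
xquery version "1.0-ml";

(: Placeholder ids, for illustration only. :)
let $projectId := "p1"
let $contentId := "c1"

(: Case 1: URIs under /project/{id}/jobs/ via a directory query
   instead of a wildcard match. Directory URIs end with "/". :)
let $jobUris :=
  cts:uris((), (),
    cts:directory-query("/project/" || $projectId || "/jobs/", "infinity"))

(: Case 2: existence check for one exact URI, no lexicon match needed. :)
let $contentUri := "/project/" || $projectId || "/content/" || $contentId
let $found := fn:exists(fn:doc($contentUri))

return ($jobUris, $found)
```

Both forms let the server resolve the request from the directory and URI indexes directly, rather than pattern-matching against lexicon entries.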
Depending on the work cts:uri-match decides to do, it may need to scan the entire URI lexicon for matches. That's O(n) in the number of URIs, probably something like 1M URIs/sec. At that rate a full scan of the 483,475 URIs you mention would take roughly half a second, which is in line with the timings you report.

-- Mike

> On Oct 23, 2014, at 01:05, Rachel Wilson <[email protected]> wrote:
>
> Hi,
>
> I was wondering if anyone had a reply to this.
>
> We're digging even deeper into improving our performance for an API and in
> several places (because we use it liberally) cts:uri-match ends up being the
> bottleneck. We are happy to redesign our data and queries where we can to
> avoid it, but it continues to surprise us that this is the case because we
> thought the uris are indexed and the function is designed to use wildcards
> because it's a matcher.
>
> A typical call would be
>
> let $uris := cts:uri-match("/project/" || $projectId ||"/jobs/*",
>
> But we're most surprised by this one, which we used as a test, because there
> aren't even any wildcards.
>
> let $thereShouldBeOnlyOne := cts:uri-match("/project/" || $projectId ||
> "/content/" || $contentId)
>
> Some insight into the inner workings of that function would be great.
>
>
> From: Rachel Wilson <[email protected]>
> Date: Thursday, 16 October 2014 17:25
> To: MarkLogic Developer Discussion <[email protected]>
> Subject: Surprising slowness of cts:uri-match
>
> In our experience cts:uri-match is surprisingly slow. For example when
> profiling a pretty complicated query taking 0.7 seconds, the single
> cts:uri-match() call takes 70-80% of the total time. (Shallow% and Deep%
> being the same)
>
> But we thought it should be reading the URI lexicon and so in a database with
> only 483,475 docs should be lightning fast. We've had to stop using
> cts:uri-match calls in loops for this reason.
>
> Are there any match patterns to be avoided perhaps? Wildcards in the middle
> of the pattern, rather than trailing wildcards for example?
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
