I didn't think it was a problem as such, and I wasn't trying to optimise prematurely, I promise. I was curious about the workings under the hood because we use these functions a lot, including in our slower-running queries; investigating those is how this question came up. Think of this as settling a bet ;)
So, I'm still curious: what is dereferencing? Is that indeed what happens? Say we have a database node returned from a query, which isn't the document node, and we call base-uri on it. Would the whole document itself necessarily have been put in the expanded tree cache in order to resolve the query? I'm still learning about the roles of the different caches, and it's turning out to be very helpful to know.

PS. We don't have subfragments.

-----Original Message-----
From: Michael Blakeley <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Monday, 21 October 2013 18:39
To: MarkLogic Developer Discussion <[email protected]>
Subject: Re: [MarkLogic Dev General] derferencing documents with document-uri and base-uri?

I wouldn't worry about it unless it's clearly a problem: avoid premature optimization.

If you have a database node in memory, then it's in the expanded tree cache. So repeated accessor calls for its URI can drive cache lookups and CPU cycles, but should never result in cache misses. Check the xdmp:query-meters output to see this for yourself: you should be able to correlate the number of URI accesses to the expanded-tree-cache-hit count.

Things might get a little more expensive if you have subfragments, because crossing fragment boundaries can be expensive. A call to base-uri inside a subfragment might have to traverse to the parent fragment - or maybe not, I'd have to design a test to say for certain. But the time to worry is when you have a performance problem, and your test case shows the URI accessor in the profiler output. Then you could think about ways to minimize URI lookups.

Switching to functionality, I almost always use xdmp:node-uri rather than document-uri or base-uri. I avoid document-uri simply because I don't want to worry about traversing to root for document-uri, and base-uri because I don't want the behavior where an ancestor element specifies its own base-uri value.
That's rare in most XML, but base-uri checks for it and honors it. Checking for that probably slows things down a bit, and honoring it generally doesn't do what I want. So I always use xdmp:node-uri instead.

-- Mike

On 21 Oct 2013, at 09:54, Rachel Wilson <[email protected]> wrote:

> I have heard on the grapevine that using the document-uri() or base-uri()
> functions is bad for performance, although I can't seem to find anything
> about that in MarkLogic's docs or elsewhere on the internet. One of the
> reasons given was that using those functions "dereferences the document",
> or that MarkLogic Server has to go to disk to resolve the URI. I'm not
> sure what is really meant by "dereference", though.
>
> Could someone clear this up? Has the grapevine got the wrong end of the
> stick, or is it perhaps how the function is used, perhaps in loops, that
> is the reason behind this thinking? We use those two functions so much,
> particularly base-uri(), in our code that we would consider some rewrites
> if it really is something to minimise.
>
> Many thanks,
> Rachel
>
> ----------------------------
>
> http://www.bbc.co.uk
> This e-mail (and any attachments) is confidential and may contain
> personal views which are not the views of the BBC unless specifically
> stated.
> If you have received it in error, please delete it from your system.
> Do not use, copy or disclose the information in any way nor act in
> reliance on it and notify the sender immediately.
> Please note that the BBC monitors e-mails sent or received.
> Further communication will signify your consent to this.
> ---------------------
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
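
For anyone wanting to try the check Mike describes (correlating URI accesses with expanded-tree-cache hits in the xdmp:query-meters output), a minimal sketch might look like the following. The node selection and the sample size of 100 are hypothetical and chosen only for illustration, and the qm: element names are assumed to match the query-meters output on your server version, so treat this as a starting point rather than a definitive test.

```xquery
xquery version "1.0-ml";

declare namespace qm = "http://marklogic.com/xdmp/query-meters";

(: Hypothetical sample: up to 100 element nodes that are not document nodes. :)
let $nodes := fn:subsequence(fn:collection()/*/*, 1, 100)

(: Call the URI accessor once per node; swap in xdmp:node-uri to compare. :)
let $uris := for $n in $nodes return fn:base-uri($n)

return (
  (: Counting the results forces the accessor calls to be evaluated. :)
  fn:count($uris),
  (: Cache counters for this request so far; these also include the
     hits incurred while fetching the nodes themselves. :)
  xdmp:query-meters()/qm:expanded-tree-cache-hits,
  xdmp:query-meters()/qm:expanded-tree-cache-misses
)
```

Running this in Query Console with the accessor loop present and then with it removed should show how much of the hit count is attributable to the URI calls. The absolute numbers depend on the database contents and on what is already cached.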
