So here's a little more color on this, if anyone is still interested. When I profile this code, where $query is a fairly complex serialized query that was previously computed and stored in a database:
    declare variable $q1 := cts:registered-query (cts:register (cts:query ($query)), "unfiltered");
    cts:search (fn:doc(), $q1)[1 to 5]

The top two items on the profile output are:

    Shallow%  Shallow usecs  Deep%  Deep usecs  Expression
    80        125000         90     140000      cts:query($query)
    10        16000          100    156000      cts:registered-query (cts:register (cts:query ($query)), "unfiltered")

Time spent on the actual search is so small it rounded to zero. Doing this repeatedly yields similar timing, so it's not a cold cache situation or anything like that.

Profiling this:

    declare variable $q2 := cts:registered-query (9156609332438599120, "unfiltered");
    cts:search (fn:doc(), $q2)[1 to 5]

yields times too fast to measure (all rounded to zero).

So, the potentially expensive to create query is being built every time and possibly being re-registered as well, given that cts:registered-query is taking a non-trivial amount of time.

On Jul 31, 2013, at 8:38 AM, Ron Hitchens <[email protected]> wrote:

>
> The overall entitlement query on each request is composed
> of many sub-queries, some of which are static and registered,
> some of which are dependent on the current time. But even the
> static ones are not finite, new ones can be created at any time
> as part of a new entitlement definition.
>
> I'm working on a scheme to catch and re-register all the
> static queries in a given query tree when a search fails due
> to a missing registration. That should lazily re-register
> on first use after a server restart as well.
>
> ---
> Ron Hitchens {mailto:[email protected]}   Ronsoft Technologies
>     +44 7879 358 212 (voice)          http://www.ronsoft.com
>     +1 707 924 3878 (fax)              Bit Twiddling At Its Finest
> "No amount of belief establishes any fact." -Unknown
>
>
> On Jul 30, 2013, at 8:30 PM, Geert Josten <[email protected]> wrote:
>
>> Hi Ron,
>>
>> Are your queries such that you would have a finite number of sub-queries,
>> if you would break them into smaller subparts? Perhaps you can combine
>> multiple registered queries..
>>
>> Cheers,
>> Geert
>>
>>> -----Original message-----
>>> From: [email protected] [mailto:general-[email protected]] On behalf of Ron Hitchens
>>> Sent: Tuesday, 30 July 2013 2:29
>>> To: MarkLogic Developer Discussion
>>> Subject: Re: [MarkLogic Dev General] Registered Query Best Practices
>>>
>>>
>>> Hi Geert,
>>>
>>> I've done something before where we stored reg ids in a map for
>>> easy re-use. In that case, there was a 1:1 correspondence between
>>> the reg id and a meaningful business domain number. On this project
>>> that's not the case.
>>>
>>> Also, there is not a finite set of queries that need to be registered
>>> so it's not feasible to pre-register everything once. New ones can be
>>> created dynamically. And the complicated queries are persisted in another
>>> database and can be referenced later. This means the queries which should be
>>> registered will persist across server restarts. Which means there must be a
>>> way to register the queries on first use, then make use of those registered
>>> queries on subsequent requests.
>>>
>>> The re-register-before-each-use pattern solves that nicely, but not if
>>> the query construction cost must be re-paid each time. It looks like the
>>> robust solution is going to have to be catching exceptions for unregistered
>>> queries and reconstructing the registrations. It's a shame because that is
>>> going to add unnecessary complexity to the code.
>>>
>>>
>>> On Jul 29, 2013, at 8:15 PM, Geert Josten <[email protected]> wrote:
>>>
>>>> Hi Ron,
>>>>
>>>> I recently saw a strategy where they deliberately took a different
>>>> approach.
>>>> In their case the calculation of the queries was not
>>>> straightforward and could run into 30k search terms. Additionally,
>>>> registering the query, and warming up the cache by doing one initial search
>>>> after registering each query, took the most time. They were searching roughly
>>>> 40 million docs. The searches themselves were sub-second.
>>>>
>>>> Their approach was to store all registered query IDs somewhere, and have
>>>> them readily available at actual search time. They also used a try/catch
>>>> to catch unregistered queries, though in their case they shouldn't
>>>> actually occur, and these dramatically pulled down the average on
>>>> performance tests.
>>>>
>>>> How much chance is there that a query is unregistered, if you would
>>>> prepare all queries beforehand?
>>>>
>>>> Cheers,
>>>> Geert
>>>>
>>>>> -----Original message-----
>>>>> From: [email protected] [mailto:general-[email protected]] On behalf of Michael Blakeley
>>>>> Sent: Monday, 29 July 2013 21:08
>>>>> To: MarkLogic Developer Discussion
>>>>> Subject: Re: [MarkLogic Dev General] Registered Query Best Practices
>>>>>
>>>>> I think you're using registered query as intended. That behavior sounds odd
>>>>> to me. I would expect (2) to be cheap, just a hash operation on the query
>>>>> terms, and I would expect (3) to be the expensive step.
>>>>>
>>>>> So I would contact support and see what they think.
>>>>>
>>>>> -- Mike
>>>>>
>>>>> On 29 Jul 2013, at 11:03 , Ron Hitchens <[email protected]> wrote:
>>>>>
>>>>>>
>>>>>> What is the best practice these days for using registered
>>>>>> queries?
>>>>>> I was under the impression that the pattern should be:
>>>>>>
>>>>>> 1) Create your query:
>>>>>>      $query := cts:and-query ((blah blah blah))
>>>>>> 2) Register it and make a registered query from it in one step:
>>>>>>      $reg-query := cts:registered-query (cts:register ($query), "unfiltered")
>>>>>> 3) Use it in a search:
>>>>>>      cts:search (fn:doc(), $reg-query)
>>>>>>
>>>>>> The theory being that if the cts:query described by $query is
>>>>>> already registered, then the registration is essentially a no-op
>>>>>> and you'll get back the same ID. And doing this every time ensures
>>>>>> that if the registered query has been evicted for some reason then
>>>>>> it's re-registered and all is well.
>>>>>>
>>>>>> It's a nice theory but seems to be based on the assumption that
>>>>>> creating a cts:query object is very cheap. Unfortunately, I'm finding
>>>>>> that this is often not the case, especially when there are lots of
>>>>>> documents in the database. I have a test case where performing Step 2
>>>>>> above on a moderately complicated query takes roughly 200ms every time.
>>>>>> Others take even longer and all seem to be proportional to database size.
>>>>>> But running Step 3 with cts:registered-query(<regid>) is very, very
>>>>>> fast (~0ms). Re-creating the query for re-registering every time is
>>>>>> destroying the benefit of using a registered query.
>>>>>>
>>>>>> I can obviously save the registration ID obtained from calling
>>>>>> cts:register and then make a cts:registered-query each time, but then
>>>>>> I'm not protected from the query becoming unregistered. And there is
>>>>>> no lightweight way to test if an ID is still registered. The only way
>>>>>> I know to make this robust is to put a loop and try/catch around the
>>>>>> code that does the search. But that requires passing along enough
>>>>>> context to re-construct and re-register the queries (there can be
>>>>>> dozens of them in this case).
>>>>>> This is obviously a lot harder than
>>>>>> building the complex query in one module and then passing it along
>>>>>> to the search code somewhere else.
>>>>>>
>>>>>> What's the generally accepted best usage pattern for registered
>>>>>> queries? And is it my imagination or has the cost of running queries
>>>>>> been moving from query evaluation into query construction?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> _______________________________________________
>>>>>> General mailing list
>>>>>> [email protected]
>>>>>> http://developer.marklogic.com/mailman/listinfo/general
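
For anyone reading this thread later, the store-the-ID and catch-and-re-register idea discussed in the messages above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested implementation: the function names local:register and local:search, the "entitlement-42" key, and the choice of a server field as the ID cache are all hypothetical, and it assumes that searching with an ID that is no longer registered raises an error whose code is XDMP-UNREGISTERED. The expensive cts:query construction is then paid only on a cache miss or after a registration has been lost.

```xquery
xquery version "1.0-ml";

declare namespace error = "http://marklogic.com/xdmp/error";

(: Stand-in for the stored, serialized query from the thread. :)
declare variable $query as element() external;

(: Hypothetical helper: pay the construction cost once, register the
   query, and remember the ID in a server field (an assumed cache). :)
declare function local:register(
  $key as xs:string,
  $serialized as element()
) as xs:unsignedLong
{
  xdmp:set-server-field($key, cts:register(cts:query($serialized)))
};

(: Hypothetical helper: search with the cached ID, and re-register
   only if the registration turns out to be missing. :)
declare function local:search(
  $key as xs:string,
  $serialized as element()
) as node()*
{
  let $cached := xdmp:get-server-field($key)
  let $id :=
    if (fn:exists($cached)) then $cached
    else local:register($key, $serialized)
  return
    try {
      (: xdmp:eager forces evaluation inside the try block, so a lazily
         evaluated search cannot raise its error outside the catch. :)
      xdmp:eager(
        cts:search(fn:doc(), cts:registered-query($id, "unfiltered"))[1 to 5])
    } catch ($e) {
      (: Assumption: a lost registration reports this error code. :)
      if ($e/error:code eq "XDMP-UNREGISTERED") then
        cts:search(fn:doc(),
          cts:registered-query(
            local:register($key, $serialized), "unfiltered"))[1 to 5]
      else
        xdmp:rethrow()
    }
};

local:search("entitlement-42", $query)
```

Server fields are per host and do not survive a restart, so the first request afterwards would take the cache-miss path and re-register lazily, which is the behavior described earlier in the thread. A query tree with dozens of registered sub-queries would need the same treatment per sub-query, and that part is not shown here.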
