Hi Ron,

Are your queries such that you would end up with a finite number of sub-queries if you broke them into smaller subparts? Perhaps you could combine multiple registered queries.

Cheers,
Geert

> -----Original message-----
> From: [email protected] [mailto:general-[email protected]] On behalf of Ron Hitchens
> Sent: Tuesday 30 July 2013 2:29
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Registered Query Best Practices
>
> Hi Geert,
>
> I've done something before where we stored reg ids in a map for
> easy re-use. In that case, there was a 1:1 correspondence between
> the reg id and a meaningful business domain number. On this project
> that's not the case.
>
> Also, there is not a finite set of queries that need to be registered,
> so it's not feasible to pre-register everything once. New ones can be
> created dynamically. And the complicated queries are persisted in another
> database and can be referenced later. This means the queries which should
> be registered will persist across server restarts. Which means there must
> be a way to register the queries on first use, then make use of those
> registered queries on subsequent requests.
>
> The re-register-before-each-use pattern solves that nicely, but not if
> the query construction cost must be re-paid each time. It looks like the
> robust solution is going to have to be catching exceptions for unregistered
> queries and reconstructing the registrations. It's a shame because that is
> going to add unnecessary complexity to the code.
>
> ---
> Ron Hitchens {mailto:[email protected]}   Ronsoft Technologies
>      +44 7879 358 212 (voice)          http://www.ronsoft.com
>      +1 707 924 3878 (fax)              Bit Twiddling At Its Finest
> "No amount of belief establishes any fact." -Unknown
>
> On Jul 29, 2013, at 8:15 PM, Geert Josten <[email protected]> wrote:
>
> > Hi Ron,
> >
> > I recently saw a strategy where they deliberately took a different
> > approach. In their case the calculation of the queries was not
> > straightforward and could run into 30k search terms.
> > Additionally, registering the query and warming up the cache by doing
> > one initial search after registering each query took the most time.
> > They were searching roughly 40 million docs. The searches themselves
> > were sub-second.
> >
> > Their approach was to store all registered query IDs somewhere, and have
> > them readily available at actual search time. They also used a try/catch
> > to catch unregistered queries, though in their case these shouldn't
> > actually occur, and they dramatically pulled down the average on
> > performance tests.
> >
> > How much chance is there that a query is unregistered, if you
> > prepare all queries beforehand?
> >
> > Cheers,
> > Geert
> >
> >> -----Original message-----
> >> From: [email protected] [mailto:general-[email protected]] On behalf of Michael Blakeley
> >> Sent: Monday 29 July 2013 21:08
> >> To: MarkLogic Developer Discussion
> >> Subject: Re: [MarkLogic Dev General] Registered Query Best Practices
> >>
> >> I think you're using registered query as intended. That behavior sounds
> >> odd to me. I would expect (2) to be cheap, just a hash operation on the
> >> query terms, and I would expect (3) to be the expensive step.
> >>
> >> So I would contact support and see what they think.
> >>
> >> -- Mike
> >>
> >> On 29 Jul 2013, at 11:03, Ron Hitchens <[email protected]> wrote:
> >>
> >>>
> >>> What is the best practice these days for using registered
> >>> queries? I was under the impression that the pattern should be:
> >>>
> >>> 1) Create your query:
> >>>      $query := cts:and-query ((blah blah blah))
> >>> 2) Register it and make a registered query from it in one step:
> >>>      $reg-query := cts:registered-query (cts:register ($query), "unfiltered")
> >>> 3) Use it in a search:
> >>>      cts:search (fn:doc(), $reg-query)
> >>>
> >>> The theory being that if the cts:query described by $query is
> >>> already registered, then the registration is essentially a no-op
> >>> and you'll get back the same ID.
> >>> And doing this every time ensures
> >>> that if the registered query has been evicted for some reason then
> >>> it's re-registered and all is well.
> >>>
> >>> It's a nice theory but seems to be based on the assumption that
> >>> creating a cts:query object is very cheap. Unfortunately, I'm finding
> >>> that this is often not the case, especially when there are lots of
> >>> documents in the database. I have a test case where performing Step 2
> >>> above on a moderately complicated query takes roughly 200ms every time.
> >>> Others take even longer and all seem to be proportional to database size.
> >>> But running Step 3 with cts:registered-query(<regid>) is very, very
> >>> fast (~0ms). Re-creating the query for re-registering every time is
> >>> destroying the benefit of using a registered query.
> >>>
> >>> I can obviously save the registration ID obtained from calling
> >>> cts:register and then make a cts:registered-query each time, but then
> >>> I'm not protected from the query becoming unregistered. And there is
> >>> no lightweight way to test if an ID is still registered. The only way
> >>> I know to make this robust is to put a loop and try/catch around the
> >>> code that does the search. But that requires passing along enough
> >>> context to re-construct and re-register the queries (there can be
> >>> dozens of them in this case). This is obviously a lot harder than
> >>> building the complex query in one module and then passing it along
> >>> to the search code somewhere else.
> >>>
> >>> What's the generally accepted best usage pattern for registered
> >>> queries? And is it my imagination or has the cost of running queries
> >>> been moving from query evaluation into query construction?
> >>>
> >>> Thanks.
> >>>
> >>> ---
> >>> Ron Hitchens {mailto:[email protected]}   Ronsoft Technologies
> >>>      +44 7879 358 212 (voice)          http://www.ronsoft.com
> >>>      +1 707 924 3878 (fax)              Bit Twiddling At Its Finest
> >>> "No amount of belief establishes any fact." -Unknown
> >>>
> >>> _______________________________________________
> >>> General mailing list
> >>> [email protected]
> >>> http://developer.marklogic.com/mailman/listinfo/general
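
The fallback the thread converges on (keep the registration ID, search with it, and rebuild the registration only when the ID turns out to be unregistered) might be sketched roughly as below. This is a minimal illustration, not code from any of the posters. The server-field storage, the key name, the local: functions and the placeholder query are all assumptions made for the example. It relies on MarkLogic raising XDMP-UNREGISTERED when a search uses an ID that is no longer registered.

```xquery
xquery version "1.0-ml";

declare namespace error = "http://marklogic.com/xdmp/error";

(: Placeholder for the expensive construction step. In a real application
   the query would be rebuilt from whatever definition is persisted. :)
declare function local:build-query() as cts:query
{
  cts:and-query((
    cts:word-query("example"),
    cts:collection-query("hypothetical-collection")
  ))
};

(: Build and register the query, then remember the ID in a server field.
   Server fields are an assumed storage choice: they are per host and
   are lost on restart, so this path also covers "register on first use". :)
declare function local:register($key as xs:string) as xs:unsignedLong
{
  xdmp:set-server-field($key, cts:register(local:build-query()))
};

(: Search with the cached ID; pay the construction cost again only if
   the ID is missing or the server reports it as unregistered. :)
declare function local:search($key as xs:string) as document-node()*
{
  let $cached := xdmp:get-server-field($key)
  let $id := if (fn:exists($cached)) then $cached else local:register($key)
  return
    try {
      (: xdmp:eager forces evaluation inside the try, so a deferred
         error cannot escape the catch clause. :)
      xdmp:eager(
        cts:search(fn:doc(), cts:registered-query($id, "unfiltered")))
    } catch ($e) {
      if ($e/error:code = "XDMP-UNREGISTERED") then
        cts:search(fn:doc(),
          cts:registered-query(local:register($key), "unfiltered"))
      else
        xdmp:rethrow()
    }
};

(: "my-app:reg-id" is a hypothetical key name. :)
local:search("my-app:reg-id")
```

With dozens of queries, the same shape would need one key per persisted query definition, which is the extra context-passing Ron describes. The common path stays a single cts:registered-query call, and the try/catch only costs anything when a registration has actually been lost.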
