I think there is more to it. count() forces the actual data to be retrieved from the database nodes, while xdmp:estimate is resolved from memory-based indexes, so it can save a lot of latency as well.
Kind regards,
Geert

-----Original Message-----
From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Paul M
Sent: Friday, 2 December 2011 17:28
To: general@developer.marklogic.com
Subject: Re: [MarkLogic Dev General] Issue with Mark Logic Query (Michael Blakeley)

So if count is O(n), is xdmp:estimate O(log n) or some such? Just curious.

----- Original Message -----
From: "general-requ...@developer.marklogic.com" <general-requ...@developer.marklogic.com>
To: general@developer.marklogic.com
Cc:
Sent: Thursday, December 1, 2011 3:00 PM
Subject: General Digest, Vol 90, Issue 3

Today's Topics:

   1. Re: Issue with Mark Logic Query (Michael Blakeley)
   2. large (?) number of range indexes (Mike Sokolov)

----------------------------------------------------------------------

Message: 1
Date: Thu, 1 Dec 2011 09:05:50 -0800
From: Michael Blakeley <m...@blakeley.com>
Subject: Re: [MarkLogic Dev General] Issue with Mark Logic Query
To: General MarkLogic Developer Discussion <general@developer.marklogic.com>
Cc: rakesh.yadav12...@gmail.com
Message-ID: <efb6938d-b47f-46d3-9a5b-0c7a35cc9...@blakeley.com>
Content-Type: text/plain; charset=us-ascii

To query the value of an element, use an element-value-query term like this:

    cts:element-value-query(xs:QName('meta:DateLoaded'), '2011*')

But since that uses a wildcard glob, it won't resolve from indexes unless you also have the appropriate wildcard indexes enabled.
If you have an element range index on meta:DateLoaded with type=date, it would probably be better to specify a range instead of a wildcard:

    cts:element-range-query(xs:QName('meta:DateLoaded'), '>=', xs:date('2011-01-01')),
    cts:element-range-query(xs:QName('meta:DateLoaded'), '<', xs:date('2012-01-01'))

Finally, it may be faster to evaluate the entire cts:query using xdmp:estimate(cts:search(doc(), $query)) rather than count(cts:uris((), (), $query)). Using count() will be O(n) with the number of results. Note that both count and estimate support an optional limit argument, which might be useful for your '1 to 1000000' limit.

-- Mike

On 1 Dec 2011, at 01:46, amit gope wrote:

> Hi All,
>
> I have a database where the element range index is on the element date, and now I am executing a query where I have used an element value query on one of the elements, but the results fetched do not adhere to the query. Please suggest the changes that I need to make.
>
> let $uri :=(cts:uris('', ('document','limit=1000000'),
>   (cts:and-query((cts:directory-query('/content/', 'infinity'),
>     cts:element-query((xs:QName('meta:DateLoaded')),'2011*'),
>     cts:element-query((xs:QName('meta:PubName')),'Springer'),
>     cts:element-query(xs:QName('Affiliation'), cts:and-query((), ())),
>     cts:element-query(xs:QName('meta:Institution'),cts:and-query((),())),
>     cts:not-query(cts:element-query(xs:QName("meta:GeoOrg"), cts:and-query((), ())))
>   ), ())), (), ()))[1 to 1000000]
> return (count($uri),$uri)
>
> The above query is fetching the URIs of articles where meta:DateLoaded is 2010. Please suggest.
>
> --
> Regards
> Amit
>
> _______________________________________________
> General mailing list
> General@developer.marklogic.com
> http://developer.marklogic.com/mailman/listinfo/general

------------------------------

Message: 2
Date: Thu, 01 Dec 2011 14:23:07 -0500
From: Mike Sokolov <soko...@ifactory.com>
Subject: [MarkLogic Dev General] large (?) number of range indexes
To: general@developer.marklogic.com
Message-ID: <4ed7d41b.9000...@ifactory.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

I've found that cts:element-values() is *much* faster when you don't use a query to filter. For example,

    cts:element-values (xs:QName("foo"), "a")

is 25x faster than

    cts:element-values (xs:QName("foo"), "a", cts:element-value-query(xs:QName("bar"), "baz"))

when every document indexed by foo in fact has bar=baz, i.e. when the query is essentially a no-op.

Consequently, we're taking what used to be a bunch of large range indexes and breaking them up into a lot of smaller range indexes, each of which we can query independently (and faster). What I'm wondering is whether anybody would care to speculate on whether having a large number of small(er) indexes will pose some other performance problem. Presumably at least some of the keys will be shared across these indexes, but the values (the fragment/document references) should not, so overall storage should be only slightly larger?

--
Michael Sokolov
Engineering Director
www.ifactory.com

------------------------------

End of General Digest, Vol 90, Issue 3
**************************************