You are on the right track: you'll want a dateTime element range index on SubmitDate. However, I suspect the way the query is written is causing problems. You could check this using xdmp:plan.
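As a rough sketch of that check, the expression from the question can be wrapped in xdmp:plan in place of fn:count or xdmp:estimate. The query is copied from the original post; the comments on what to look for are only a guide, since the exact shape of the returned plan element varies between MarkLogic versions.

```xquery
xquery version "1.0-ml";

(: Sketch only: the expression from the question, wrapped in xdmp:plan.
   xdmp:plan returns an element describing how the expression would be
   resolved from the indexes, without running the full query. :)
xdmp:plan(
  fn:collection()/example[
    State = "TX" and
    xs:dateTime(SubmitDate) >
      (fn:current-dateTime() - xs:dayTimeDuration("P14D"))
  ]
)

(: What to look for (an assumption about typical output, not a literal
   description): if the plan shows a lookup for State but nothing that
   resembles a range constraint on SubmitDate, the date test is not
   being resolved from the range index, which would explain why
   xdmp:estimate appears to ignore it. :)
```

Running the same check again after rewriting the query or changing the index configuration shows whether the change made any difference to how the date test is resolved.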
The problem may be that you're doing too much work in an XPath predicate. It's easy to pretend that an XPath predicate acts like an array index, but its performance characteristics are different. Most importantly, the predicate expression can be evaluated many times: once for every item in the input. You can read more about this at http://blakeley.com/blogofile/archives/518/

To get around this, bind as much of the predicate as possible to a variable.

    let $fortnight := current-dateTime() - xs:dayTimeDuration("P14D")
    return xdmp:estimate(
      collection()/example[State="TX"][SubmitDate gt $fortnight])

Use xdmp:plan to verify that the right indexes are used.

Alternatively you could use a cts:search and cts:element-range-query instead of an XPath predicate.

-- Mike

> On 11 Dec 2014, at 14:52 , Alexei Betin <[email protected]> wrote:
>
> Hi,
>
> I am very new to both MarkLogic and xQuery and this is my first post here. My
> question is as follows:
>
> I am trying to count documents that meet certain criteria and also fall into
> particular date range (such as within 14 days window from today). I am
> experimenting with fn:count and xdmp:estimate, e.g.:
>
> let $count3D := fn:count( fn:collection()/example[State="TX" and (
> xs:dateTime( SubmitDate ) > ( fn:current-dateTime() - xs:dayTimeDuration(
> "P14D") ) )])
>
> or
>
> let $count3D := xdmp:estimate( fn:collection()/example[State="TX" and (
> xs:dateTime( SubmitDate ) > ( fn:current-dateTime() - xs:dayTimeDuration(
> "P14D") ) )])
>
> Sure enough, the fn:count gives the correct answer but is rather slow,
> whereas xdmp:estimate() is very fast but it appears to be only filtering the
> count by state and completely ignores the dateTime-based criteria so it’s
> grossly incorrect.
>
> Any advice on where I go from here – for either making fn:count() faster or
> making xdmp:estimate() more accurate? Either creating some kind of index or
> improving the query syntax or both?
>
> I tried creating a range path index on example/SubmitDate path but it did not
> seem to help anything so I am not sure I am on the right track – I’d
> appreciate any clues or pointers on how to approach this correctly.
>
> Thanks,
>
> Alexei Betin
> Principal Architect; Big Data
> P: (817) 928-1643 | Elevate.com
> 4150 International Plaza, Suite 300
> Fort Worth, TX 76109
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
