Betty,

If you have any flexibility in your data (you certainly don't need precision down to the millisecond), maybe you could create a separate date-only element. I'd think that your queries against a range index that has at most 365*[years] distinct values would be a lot more efficient than running against a range index with as many 'rows' (quote unquote) as you have records.
Matthew

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Betty Harvey
Sent: Tuesday, March 20, 2012 2:35 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Struggling with Query Time Out (I'm satisfied)

The raw command is now down to 31 minutes and brought back 74,447 results. I can live with that since this is going to be a monthly report. It will be interesting to see if it increases more when I add more variables and include the Excel vocabulary.

This is the final query:

for $ACE in cts:search(
    xdmp:directory('/opt/MOR/ACE/')/descendant::ns1:ACE/ns1:ModifiedDate,
    cts:and-query((
        cts:element-range-query(xs:QName('ns1:ModifiedDate'), '>',
            xs:dateTime('2011-04-01T16:00:00.00')),
        cts:element-range-query(xs:QName('ns1:ModifiedDate'), '<',
            xs:dateTime('2011-05-01T16:00:00.00'))
    ))
)

Thanks everyone for your help and advice!!!

Betty

> It may be frustrating, but I'd say you are making progress. The old query
> might have taken 8 hours and this one might take 90 minutes, for example.
> Both might time out, but the new query is searchable and that's an
> improvement.
>
> How many documents will this query return, and what are you trying to do
> with them? You can get the match count via xdmp:estimate(cts:search(...))
> around your cts:search below.
>
> If you want to prove out your query and see how long it takes for a subset
> of your inputs, you could add a positional predicate outside the
> cts:search call:
>
> for $ACE in cts:search(
>   xdmp:directory('/opt/MOR/ACE/')/descendant::ns1:ACE/ns1:ModifiedDate,
>   cts:element-range-query(
>     xs:QName('ns1:ModifiedDate'), '<=',
>     xs:dateTime('2011-04-01T16:00:00.00') ) )[1 to 10]
> ...
>
> But it may be that the '...' is the important part. Are you simply trying
> to get all the values?
> If so, you can read them directly from the range index:
>
> cts:element-values(
>   xs:QName('ns1:ModifiedDate'),
>   (),
>   (),
>   cts:and-query(
>     (cts:directory-query('/opt/MOR/ACE/', 'infinity'),
>      cts:element-query(
>        xs:QName('ns1:ACE'),
>        cts:element-range-query(
>          xs:QName('ns1:ModifiedDate'),
>          '<=', xs:dateTime('2011-04-01T16:00:00.00'))))))
>
> -- Mike
>
> On 20 Mar 2012, at 08:32 , Betty Harvey wrote:
>
>> Hi Evan:
>>
>> This is a great tool. I ran the command and the predicate doesn't work.
>> I decided to try another approach and use another element that is indexed
>> but there is only 1 in each object. There can be up to 20 Events in an
>> object. I have tried running both in CQ and http application and both
>> time out.
>>
>> The xdmp:plan command says it is fully searchable. I am obviously doing
>> something wrong.
>>
>> for $ACE in
>> cts:search(xdmp:directory('/opt/MOR/ACE/')/descendant::ns1:ACE/ns1:ModifiedDate,
>>   cts:element-range-query (xs:QName('ns1:ModifiedDate'), '<=',
>>     xs:dateTime('2011-04-01T16:00:00.00') ) )
>>
>> Thanks again!
>>
>> Betty
>>
>>> As a quick tip, Betty, you can easily check whether a given expression is
>>> searchable or not by using Query Console. I just ran this:
>>>
>>> declare namespace ns1="whatever";
>>> xdmp:plan(
>>>   collection()/descendant::ns1:ACE/ns1:EventSet/ns1:GeneralEvent[1]
>>> )
>>>
>>> Whose output included this:
>>> <qry:info-trace>Step 4 is unsearchable: ns1:GeneralEvent[1]</qry:info-trace>
>>>
>>> This tells me where the problem is, and verifies Mike's suspicion.
>>>
>>> Evan
>>>
>>>
>>> On 3/19/12 4:16 PM, "Michael Blakeley" <[email protected]> wrote:
>>>
>>> Betty, I think it's the '[1]' that makes that expression unsearchable.
>>> Normally the XPath indexes simply record the presence of elements, not
>>> their position.
>>>
>>> -- Mike
>>>
>>> On 16 Mar 2012, at 15:03 , Betty Harvey wrote:
>>>
>>> Thanks!!!
>>>
>>> I set an element range index on the main database and have apparently run
>>> out of disk space - I will deal with that issue later. It is running on a
>>> VM machine.
>>>
>>> I also set a range index on EventDate in the 'documents' database for test
>>> purposes. I rewrote the query to use cts:search and it comes back on the
>>> 'documents' database with "Expression is unsearchable". I am not sure what
>>> this error message means but I think it might not be recognizing the range
>>> index.
>>>
>>> Am I missing something significant? The documents have 3 namespaces.
>>> The EventDate is in the 'ns1' namespace. I only used one
>>> cts:element-range-query as a test.
>>>
>>> Revised test code:
>>>
>>> for $ACE in
>>> cts:search(collection()/descendant::ns1:ACE/ns1:EventSet/ns1:GeneralEvent[1],
>>>   cts:element-range-query (xs:QName('EventDate'), '<',
>>>     xs:dateTime('2011-03-01T00:00:00') ) )
>>> let $ACEId := $ACE/ancestor::ns1:ACE/ns1:ACEId
>>> let $EventDate := $ACE/ns1:EventDate
>>> return
>>> <a>
>>> {$ACEId}
>>> {$EventDate}
>>> <time>{xdmp:elapsed-time()}</time>
>>> </a>
>>>
>>> Hi Betty,
>>>
>>> Using a cts:search like David suggests could indeed speed things up
>>> considerably. You can use xdmp:directory as the searchable expression, I
>>> think, but you can also add it to the query part using cts:directory-query.
>>>
>>> Note though that if you rewrite the date predicates as
>>> cts:element-range-query calls, it may make a lot of difference whether
>>> ACE is a fragment root or not. If you include /descendant::ACE in your
>>> searchable path, then the end result is filtered to make sure each ACE
>>> matches the query, but there could be a lot of false positives (and hence
>>> xdmp:estimate could return too high a value).
>>>
>>> Kind regards,
>>> Geert
>>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]] On Behalf Of David Lee
>>> Sent: Friday, March 16, 2012 19:54
>>> To: MarkLogic Developer Discussion
>>> Subject: Re: [MarkLogic Dev General] Struggling with Query Time Out
>>>
>>> First off, cts:search is exactly what you want for this.
>>>
>>> Second, you are doing string compares against datetime values. To help
>>> with this you may need to create a range index on EventDate and compare
>>> against xs:dateTime('xxxxxx').
>>>
>>> Third, you're doing a directory search, which you might not actually need
>>> if these documents are in known namespaces. But hold off on that until you
>>> get the first two worked out.
>>>
>>> cts:search() is really your friend in this case, but you do want to make a
>>> range index so that the system knows the values are dates; otherwise "gt"
>>> will do string, not date, comparisons.
>>>
>>> Once you get both those working your searches should be nearly instant.
>>>
>>> -----------------------------------------------------------------------------
>>> David Lee
>>> Lead Engineer
>>> MarkLogic Corporation
>>> [email protected]
>>> Phone: +1 650-287-2531
>>> Cell: +1 812-630-7622
>>> www.marklogic.com
>>>
>>> This e-mail and any accompanying attachments are confidential. The
>>> information is intended solely for the use of the individual to whom it is
>>> addressed. Any review, disclosure, copying, distribution, or use of this
>>> e-mail communication by others is strictly prohibited. If you are not the
>>> intended recipient, please notify us immediately by returning this message
>>> to the sender and delete all copies. Thank you for your cooperation.
>>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]] On Behalf Of Betty Harvey
>>> Sent: Friday, March 16, 2012 3:17 PM
>>> To: MarkLogic Developer Discussion
>>> Subject: [MarkLogic Dev General] Struggling with Query Time Out
>>>
>>> I have been unable to get this query to run successfully without timing
>>> out. To make sure my logic was correct I placed 100 documents in the
>>> 'documents' database and the query runs successfully and very quickly. In
>>> the large database (1.7 million objects) the query always times out.
>>>
>>> I am not sure cts:search will help. I played around with it without
>>> success. The goal of the query is to gather information for a particular
>>> month based on when the document was created. Below is the code:
>>>
>>> for $ACE in xdmp:directory('opt/MOR/ACE/')/descendant::ACE
>>>   [EventSet/GeneralEvent[1]/EventDate gt '2011-03-01T00:00:00']
>>>   [EventSet/era:GeneralEvent[1]/EventDate lt '2011-04-01T00:00:00']
>>> let $ACEId := $ACE/ACEId
>>> let $EventDate := $ACE/EventSet/era:GeneralEvent[1]/era:EventDate
>>> return
>>> <a>
>>> {$ACEId}
>>> {$EventDate}
>>> </a>
>>>
>>> Any ideas are appreciated!
>>>
>>> Betty
>>>
>>> _______________________________________________
>>> General mailing list
>>> [email protected]
>>> http://developer.marklogic.com/mailman/listinfo/general
>>>
>>> /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
>>> Betty Harvey                         | Phone: 410-787-9200 FAX: 9830
>>> Electronic Commerce Connection, Inc. |
>>> [email protected]                   | Washington,DC XML Users Grp
>>> URL: http://www.eccnet.com           | http://www.eccnet.com/xmlug
>>> /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\/\/
>>> Member of XML Guild (www.xmlguild.org)

This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately.
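Matthew's suggestion at the top of this thread (index a date-only value instead of the full dateTime) might look roughly like the sketch below. Everything about the extra element is an assumption made for illustration: the thread never shows one, so ns1:ModifiedDay is a hypothetical element holding only the date part of ns1:ModifiedDate, with an element range index of scalar type date configured on it. The namespace URI is a placeholder as well, since the real one does not appear in the thread.

```xquery
xquery version "1.0-ml";

(: Placeholder URI: the thread does not show what ns1 is bound to. :)
declare namespace ns1 = "http://example.com/ns1";

(: Hypothetical setup: every ns1:ACE carries an ns1:ModifiedDay element
   (xs:date), and a "date" element range index exists on it. :)
let $month-query :=
  cts:and-query((
    cts:directory-query('/opt/MOR/ACE/', 'infinity'),
    cts:element-range-query(
      xs:QName('ns1:ModifiedDay'), '>=', xs:date('2011-04-01')),
    cts:element-range-query(
      xs:QName('ns1:ModifiedDay'), '<', xs:date('2011-05-01'))
  ))
return (
  (: Unfiltered count of matching fragments, resolved from the indexes. :)
  xdmp:estimate(cts:search(fn:doc(), $month-query)),

  (: Distinct days that occur in the month, read from the range index. :)
  cts:element-values(xs:QName('ns1:ModifiedDay'), (), (), $month-query)
)
```

A date-only index cannot reproduce the 16:00:00 boundaries used in Betty's final query; the month here runs from midnight to midnight, so this only fits if whole-day granularity is acceptable for the report.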
