The raw query now runs in 31 minutes and returns 74,447 results.
I can live with that, since this is going to be a monthly report.
It will be interesting to see whether the run time increases when I add
more variables and include the Excel vocabulary.
This is the final query:
for $ACE in
  cts:search(
    xdmp:directory('/opt/MOR/ACE/')/descendant::ns1:ACE/ns1:ModifiedDate,
    cts:and-query((
      cts:element-range-query(xs:QName('ns1:ModifiedDate'), '>',
        xs:dateTime('2011-04-01T16:00:00.00')),
      cts:element-range-query(xs:QName('ns1:ModifiedDate'), '<',
        xs:dateTime('2011-05-01T16:00:00.00'))
    )))
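
For anyone following along, here is a minimal sketch of checking the match
count for this date window with xdmp:estimate, as Mike suggested. It assumes
the dateTime range index on ns1:ModifiedDate is in place, and the namespace
URI is a placeholder. The estimate is resolved from the indexes without
filtering, so it can differ from the number of results the full query returns.

```xquery
declare namespace ns1 = "whatever"; (: placeholder URI, not the real one :)

(: count the matches from the indexes without fetching the documents :)
xdmp:estimate(
  cts:search(
    xdmp:directory('/opt/MOR/ACE/')/descendant::ns1:ACE/ns1:ModifiedDate,
    cts:and-query((
      cts:element-range-query(xs:QName('ns1:ModifiedDate'), '>',
        xs:dateTime('2011-04-01T16:00:00.00')),
      cts:element-range-query(xs:QName('ns1:ModifiedDate'), '<',
        xs:dateTime('2011-05-01T16:00:00.00'))
    ))))
```
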
Thanks everyone for your help and advice!!!
Betty
> It may be frustrating, but I'd say you are making progress. The old query
> might have taken 8 hours and this one might take 90 minutes, for example.
> Both might time out, but the new query is searchable and that's an
> improvement.
>
> How many documents will this query return, and what are you trying to do
> with them? You can get the match count via xdmp:estimate(cts:search(...))
> around your cts:search below.
>
> If you want to prove out your query and see how long it takes for a subset
> of your inputs, you could add a positional predicate outside the
> cts:search call:
>
> for $ACE in cts:search(
> xdmp:directory('/opt/MOR/ACE/')/descendant::ns1:ACE/ns1:ModifiedDate,
> cts:element-range-query(
> xs:QName('ns1:ModifiedDate'), '<=',
> xs:dateTime('2011-04-01T16:00:00.00') ) )[1 to 10]
> ...
>
> But it may be that the '...' is the important part. Are you simply trying
> to get all the values? If so, you can read them directly from the range
> index:
>
> cts:element-values(
> xs:QName('ns1:ModifiedDate'),
> (),
> (),
> cts:and-query(
> (cts:directory-query('/opt/MOR/ACE/', 'infinity'),
> cts:element-query(
> xs:QName('ns1:ACE'),
> cts:element-range-query(
> xs:QName('ns1:ModifiedDate'),
> '<=', xs:dateTime('2011-04-01T16:00:00.00'))))))
>
> -- Mike
>
> On 20 Mar 2012, at 08:32 , Betty Harvey wrote:
>
>> Hi Evan:
>>
>> This is a great tool. I ran the command and the predicate doesn't work.
>> I decided to try another approach and use another element that is indexed
>> but occurs only once in each object. There can be up to 20 Events in an
>> object. I have tried running the query both in CQ and in an HTTP
>> application, and both time out.
>>
>> The xdmp:plan command says it is fully searchable. I am obviously doing
>> something wrong.
>>
>> for $ACE in
>> cts:search(xdmp:directory('/opt/MOR/ACE/')/descendant::ns1:ACE/ns1:ModifiedDate,
>> cts:element-range-query (xs:QName('ns1:ModifiedDate'), '<=',
>> xs:dateTime('2011-04-01T16:00:00.00') ) )
>>
>>
>> Thanks again!
>>
>> Betty
>>
>>> As a quick tip, Betty, you can easily check whether a given expression is
>>> searchable or not by using Query Console. I just ran this:
>>>
>>> declare namespace ns1="whatever";
>>> xdmp:plan(
>>> collection()/descendant::ns1:ACE/ns1:EventSet/ns1:GeneralEvent[1]
>>> )
>>>
>>> Its output included this:
>>> <qry:info-trace>Step 4 is unsearchable: ns1:GeneralEvent[1]</qry:info-trace>
>>>
>>> This tells me where the problem is, and verifies Mike's suspicion.
>>>
>>> Evan
>>>
>>>
>>> On 3/19/12 4:16 PM, "Michael Blakeley"
>>> <[email protected]<mailto:[email protected]>> wrote:
>>>
>>> Betty, I think it's the '[1]' that makes that expression unsearchable.
>>> Normally the XPath indexes simply record the presence of elements, not
>>> their position.
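>>>
>>> A minimal sketch of one way around this, assuming the range index on
>>> ns1:EventDate is in place and using the same placeholder namespace URI:
>>> keep the positional step out of the searchable path and apply it to each
>>> result afterwards. The range query then matches an ACE when any of its
>>> events falls before the date, so the first event is re-checked in the
>>> where clause.
>>>
>>> ```xquery
>>> declare namespace ns1 = "whatever"; (: placeholder URI :)
>>>
>>> for $ACE in cts:search(
>>>   collection()/descendant::ns1:ACE,
>>>   cts:element-range-query(xs:QName('ns1:EventDate'), '<',
>>>     xs:dateTime('2011-03-01T00:00:00')))
>>> (: positional predicate applied after the search, outside the path :)
>>> let $first := ($ACE/ns1:EventSet/ns1:GeneralEvent)[1]
>>> where xs:dateTime($first/ns1:EventDate) lt xs:dateTime('2011-03-01T00:00:00')
>>> return $first/ns1:EventDate
>>> ```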
>>>
>>> -- Mike
>>>
>>> On 16 Mar 2012, at 15:03 , Betty Harvey wrote:
>>>
>>> Thanks!!!
>>> I set an element range index on the main database and have apparently run
>>> out of disk space. I will deal with that issue later. It is running on a
>>> VM.
>>> I also set a range index on EventDate in the 'documents' database for test
>>> purposes. I rewrote the query to use cts:search, and on the 'documents'
>>> database it comes back with "Expression is unsearchable". I am not sure
>>> what this error message means, but I think it might not be recognizing the
>>> range index.
>>> Am I missing something significant? The documents have 3 namespaces.
>>> The EventDate is in the 'ns1' namespace. I only used one
>>> cts:element-range-query as a test.
>>> Revised test code:
>>> for $ACE in
>>> cts:search(collection()/descendant::ns1:ACE/ns1:EventSet/ns1:GeneralEvent[1],
>>> cts:element-range-query (xs:QName('EventDate'), '<',
>>> xs:dateTime('2011-03-01T00:00:00') ) )
>>> let $ACEId := $ACE/ancestor::ns1:ACE/ns1:ACEId
>>> let $EventDate := $ACE/ns1:EventDate
>>> return
>>> <a>
>>> {$ACEId}
>>> {$EventDate}
>>> <time>{xdmp:elapsed-time()}</time>
>>> </a>
>>> Hi Betty,
>>> Using a cts:search as David suggests could indeed speed things up
>>> considerably. I believe you can use xdmp:directory as the searchable
>>> expression, but you can also add it to the query part using
>>> cts:directory-query.
>>> Note, though, that if you rewrite the date predicates as
>>> cts:element-range-query's, it may make a lot of difference whether
>>> ACE is a fragment root or not. If you include /descendant::ACE in your
>>> searchable path, then the end result is filtered to make sure each ACE
>>> matches the query, but there could be a lot of false positives (and hence
>>> xdmp:estimate could return too high a value).
>>> Kind regards,
>>> Geert
>>> -----Original Message-----
>>> From:
>>> [email protected]<mailto:[email protected]>
>>> [mailto:general-
>>> [email protected]<mailto:[email protected]>]
>>> On Behalf Of David Lee
>>> Sent: Friday, 16 March 2012 19:54
>>> To: MarkLogic Developer Discussion
>>> Subject: Re: [MarkLogic Dev General] Struggling with Query Time Out
>>> First off, cts:search is exactly what you want for this.
>>> Second, you are doing string compares against dateTime values. To help
>>> with this you may need to create a range index on EventDate and compare
>>> against xs:dateTime('xxxxxx').
>>> Third, you're doing a directory search, which you might not actually need
>>> if these documents are in known namespaces.
>>> But hold off on that until you get the first two worked out.
>>> cts:search() is really your friend in this case, but you do want to make
>>> a range index so that the system knows the values are dates; otherwise
>>> "gt" will do string comparisons, not date comparisons.
>>> Once you get both of those working, your searches should be nearly instant.
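>>>
>>> To illustrate the difference, a small sketch with made-up values: the two
>>> literals below denote the same instant, yet they compare differently as
>>> strings.
>>>
>>> ```xquery
>>> let $a := '2011-03-01T00:00:00Z'
>>> let $b := '2011-02-28T19:00:00-05:00'
>>> return (
>>>   $a gt $b,                           (: string comparison: true :)
>>>   xs:dateTime($a) gt xs:dateTime($b)  (: same instant, so: false :)
>>> )
>>> ```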
>>> --------------------------------------------------------------------------
>>> ---
>>> David Lee
>>> Lead Engineer
>>> MarkLogic Corporation
>>> [email protected]<mailto:[email protected]>
>>> Phone: +1 650-287-2531
>>> Cell: +1 812-630-7622
>>> www.marklogic.com
>>> -----Original Message-----
>>> From:
>>> [email protected]<mailto:[email protected]>
>>> [mailto:general-
>>> [email protected]<mailto:[email protected]>]
>>> On Behalf Of Betty Harvey
>>> Sent: Friday, March 16, 2012 3:17 PM
>>> To: MarkLogic Developer Discussion
>>> Subject: [MarkLogic Dev General] Struggling with Query Time Out
>>> I have been unable to get this query to run successfully without timing
>>> out. To make sure my logic was correct I placed 100 documents in the
>>> 'documents' database, and the query runs successfully and very quickly.
>>> In the large database (1.7 million objects) the query always times out.
>>> I am not sure cts:search will help. I played around with it without
>>> success. The goal of the query is to gather information for a particular
>>> month based on when the document was created. Below is the code:
>>> for $ACE in xdmp:directory('opt/MOR/ACE/')/descendant::ACE
>>> [EventSet/GeneralEvent[1]/EventDate gt '2011-03-01T00:00:00']
>>> [EventSet/era:GeneralEvent[1]/EventDate lt '2011-04-01T00:00:00']
>>> let $ACEId := $ACE/ACEId
>>> let $EventDate := $ACE/EventSet/era:GeneralEvent[1]/era:EventDate
>>> return
>>> <a>
>>> {$ACEId}
>>> {$EventDate}
>>> </a>
>>> Any ideas are appreciated!
>>> Betty
>>> _______________________________________________
>>> General mailing list
>>> [email protected]<mailto:[email protected]>
>>> http://developer.marklogic.com/mailman/listinfo/general
>>> /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
>>> Betty Harvey | Phone: 410-787-9200 FAX: 9830
>>> Electronic Commerce Connection, Inc. |
>>> [email protected]<mailto:[email protected]> |
>>> Washington,DC XML Users Grp
>>> URL: http://www.eccnet.com | http://www.eccnet.com/xmlug
>>> /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\/\/
>>> Member of XML Guild (www.xmlguild.org)
>>>
>>
>>
>>
>
>