By the way, I've just noticed in your example that you put all the information in one single document. CTS queries and indexes are of little help in that case, because they resolve to whole documents (fragments), not to individual elements inside one big document. A good practice is to put each "record" (I like to use the word "entity"; here it is each of your "Document" elements) in its own document, instead of putting them all in one single document within an artificial *-List element (here DocumentList).
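A minimal sketch of such a split, in case it helps. The "/doc-meta/" URI prefix and the "doc-meta" collection name are made up for illustration, and I am assuming every Document has a Number child. With as many entries as you mention, this would probably have to run in batches rather than in one transaction:

```xquery
xquery version "1.0-ml";

(: Sketch only: write each Document element of the big list as its own
   document. The "/doc-meta/" prefix and the "doc-meta" collection are
   hypothetical names, not something from your setup. :)
for $doc at $i in fn:doc("/misc/DocList.xml")/DocumentList/Document
(: The position $i keeps the URIs unique when several entries share a Number. :)
let $uri := fn:concat("/doc-meta/", $doc/Number, "-", $i, ".xml")
return
  xdmp:document-insert($uri, $doc, xdmp:default-permissions(), "doc-meta")
```

Once each entity is its own document, a search can select just the entries for one Number. The text of your Date elements ("2011 Jun 08") does not sort chronologically as a string, so this sketch builds an xs:date from the Year, Month and Day attributes instead, assuming they are always present and zero-padded as in your sample:

```xquery
xquery version "1.0-ml";

(: Sketch only: latest Document entry for one Number. :)
(
  for $d in cts:search(/Document,
    cts:and-query((
      cts:collection-query("doc-meta"),
      cts:element-value-query(xs:QName("Number"), "0000340"))))
  order by
    xs:date(fn:string-join(
      ($d/Date/@Year, $d/Date/@Month, $d/Date/@Day), "-")) descending
  return $d
)[1]
```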
Regards,

-- 
Florent Georges
http://fgeorges.org/
http://h2oconsulting.be/

On 24 August 2015 at 11:21, Florent Georges wrote:

> Hi,
>
> It looks to me that what you really want is a list of "active"
> documents (each document with the same number being considered the same,
> with only one active at any time), so that you can easily constrain any
> search to the active documents only.
>
> If this is the case, I would simply maintain them all in the same
> collection (the collection for active documents). Every time you ingest a
> new document, you have to check whether it must be added to the active
> collection (and if so, whether there was already an active document with
> the same number, in which case that one has to be taken out of the
> collection).
>
> Hope that helps, regards,
>
> --
> Florent Georges
> http://fgeorges.org/
> http://h2oconsulting.be/
>
> On 24 August 2015 at 11:04, Kapoor, Pragya wrote:
>
>> Hi Geert,
>>
>> I have a docList which has metadata info for each document. So I need to
>> first find the distinct Number nodes, which should be ordered by Date
>> element (descending), as in docList there could be more than one entry
>> for a single Number, and then return the Document node satisfying the
>> above criteria.
>>
>> For example:
>>
>> Number = 0000004
>>
>> For this, let's assume there are 3 document entries which have
>> Number = 0000340.
>>
>> So I need to pick only the document node with the latest date.
>>
>> docList:
>>
>> <DocumentList>
>>   <Document>
>>     <DocumentType>VM</DocumentType>
>>     <ID>/docs/0000002-0000000-0000340-2011-06-08_18-51-29-589.xml</ID>
>>     <Number>0000340</Number>
>>     <Date Year="2011" Month="06" Day="08">2011 Jun 08</Date>
>>     <Hidden/>
>>   </Document>
>>   <Document>
>>     <DocumentType>MA</DocumentType>
>>     <ID>/docs/0000002-0000000-0000340-2011-06-08_18-51-29-256.xml</ID>
>>     <Number>0000340</Number>
>>     <Date Year="2011" Month="07" Day="10">2011 July 10</Date>
>>     <Hidden/>
>>   </Document>
>>   <Document>
>>     <DocumentType>AM</DocumentType>
>>     <ID>/docs/0000002-0000000-0000340-2011-06-08_18-51-29-592.xml</ID>
>>     <Number>0000340</Number>
>>     <Date Year="2015" Month="06" Day="15">2015 Jun 15</Date>
>>     <Hidden/>
>>   </Document>
>> </DocumentList>
>>
>> Thanks
>> Pragya
>>
>> ------------------------------
>> *From:* [email protected] <[email protected]> on behalf of
>> Geert Josten <[email protected]>
>> *Sent:* Monday, August 24, 2015 2:14 PM
>> *To:* MarkLogic Developer Discussion
>> *Subject:* Re: [MarkLogic Dev General] distinct values on huge data
>>
>> Hi Pragya,
>>
>> Could you first tell us in a bit more detail what question you are
>> trying to answer?
>>
>> Cheers,
>> Geert
>>
>> From: <[email protected]> on behalf of "Kapoor, Pragya"
>> <[email protected]>
>> Reply-To: MarkLogic Developer Discussion <[email protected]>
>> Date: Monday, August 24, 2015 at 9:07 AM
>> To: MarkLogic Developer Discussion <[email protected]>
>> Subject: [MarkLogic Dev General] distinct values on huge data
>>
>> Hi,
>>
>> I want to run the code below on 50 lacs entries in DocList.xml:
>>
>> let $docList :=
>>   functx:distinct-deep(
>>     cts:search(fn:doc("/misc/DocList.xml")/DocumentList/Document/Number,
>>       cts:and-query(()))
>>   )
>> for $each in $docList
>> order by $each/../Date descending
>> return $each/..
>>
>> This code is giving an error on huge data sets. I have already created a
>> range index on the Date element.
>>
>> Please suggest.
>>
>> Thanks
>> Pragya
>>
>> "This e-mail and any attachments transmitted with it are for the sole use
>> of the intended recipient(s) and may contain confidential, proprietary or
>> privileged information. If you are not the intended recipient, please
>> contact the sender by reply e-mail and destroy all copies of the original
>> message. Any unauthorized review, use, disclosure, dissemination,
>> forwarding, printing or copying of this e-mail or any action taken in
>> reliance on this e-mail is strictly prohibited and may be unlawful."
>>
>> _______________________________________________
>> General mailing list
>> [email protected]
>> Manage your subscription at:
>> http://developer.marklogic.com/mailman/listinfo/general
