This is something of a FAQ. Try searching MarkMail: http://markmail.org/search/list:com.marklogic.developer.general%20distinct-values
The second hit is fairly relevant: http://markmail.org/message/7zg4v67k3n5tp3rm
You might also be interested in cts:frequency(): http://developer.marklogic.com/pubs/3.2/apidocs/SearchBuiltins.html#frequency

-- Mike

Schouten, Edgar J. (RB-NL) wrote:
Hi,

I have keywords in my documents: 750K documents, of which 54K have keywords. There are 20K unique (distinct) keywords. I wanted to make a list of the top 200 (most frequent) keywords.

So I'm looping over the distinct keywords and, for each one, doing an

    xdmp:estimate(cts:search(//keyword, cts:element-value-query(xs:QName("keyword"), $keyword)))

That takes too long (over 10 minutes). So I spawn the task, put it on the Task Server, and have an

    xdmp:node-insert-child(fn:doc("/report.xml")/table, <tr><td>{$keyword}</td><td>{$number}</td></tr>)

insert a row for each keyword.

In the beginning, it would handle 2 searches a second, meaning 2.7h for all 20K keywords. Fair enough. But 12h later, the average has dropped to 2 searches a minute, with still 2K keywords to do, meaning 16 more hours.

Was the xdmp:node-insert-child a bad idea? Is there a better way to get the number of documents containing a specific keyword? Or is there a better way to incrementally store the results of very long queries? Does anyone have experience with doing statistics of this kind?

Regards
EdgarS

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dominic Mitchell
Sent: Friday, 30 November 2007 10:51
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] Needing to wait for document update

Neil Bradley wrote:
In any case, the page redirected to has the following in the <head> tag to prevent such problems:
<meta http-equiv="Expires" content="-1" />
<meta http-equiv="Pragma" content="no-cache"/>
<meta http-equiv="Cache-Control" content="no-cache"/>

I would not tend to rely on these. It's better to explicitly set the HTTP headers.

-Dom

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
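For reference, here is a rough sketch of how the lexicon approach hinted at in the reply (cts:frequency()) might be applied to the top-200 keyword report. It is not taken from the thread. It assumes an element range index of type xs:string has been configured on the <keyword> element (no namespace), and that the "frequency-order" option of cts:element-values() is available in the server version in use. Both are assumptions to check against your own setup.

```xquery
(: Sketch only. Assumes an element range index on <keyword> (no namespace). :)
(: cts:element-values() reads distinct values from the index, so no per-keyword search is run. :)
<table>{
  for $keyword in fn:subsequence(
      cts:element-values(
        xs:QName("keyword"),                   (: element whose values are listed :)
        (),                                    (: no starting value: begin at the first :)
        ("frequency-order", "descending")),    (: most frequent values first :)
      1, 200)                                  (: keep only the top 200 :)
  return
    <tr>
      <td>{$keyword}</td>
      <td>{cts:frequency($keyword)}</td>
    </tr>
}</table>
```

With the default fragment-frequency behaviour, cts:frequency() reports how many fragments contain each value, which corresponds to the per-keyword document count as long as each document is a single fragment. Because the whole table is built in one query, there is no need to update /report.xml once per keyword.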
