My approach was similar, but tried to sum all frequencies per URI. Unfortunately, that approach gets slower with more documents and with more distinct file sizes. Adding a simple count attribute or element somewhere in the file would greatly simplify the run-time calculation, and that is what I would normally recommend. For the sake of completeness I'll give it some more thought to see if there are ways to improve on the 3 minutes. A UDF might be useful, but I would have to try that.
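To make the count-attribute suggestion concrete, here is a rough, untested sketch. The attribute name file-count, the sample sizes, the URI /example.xml, and an int element-attribute range index on doc/@file-count are all assumptions for illustration, not something from the thread. It also relies on the URI lexicon being enabled, which the cts:uri-reference() calls in this thread already require.

```xquery
xquery version "1.0-ml";

(: Hypothetical: record the number of file elements on the root element
   at insert time. The attribute name "file-count" is an assumption. :)
let $files := (<file size="10"/>, <file size="20"/>, <file size="10"/>)
let $doc := <doc file-count="{fn:count($files)}">{$files}</doc>
return xdmp:document-insert('/example.xml', $doc)
;

xquery version "1.0-ml";

(: Assumes an int element-attribute range index on doc/@file-count.
   First take the highest indexed count, then look up the URIs of the
   documents that carry that count. :)
let $max := cts:element-attribute-values(
  xs:QName('doc'), xs:QName('file-count'), (), ('descending', 'limit=1'))
return cts:uris((), (),
  cts:element-attribute-range-query(
    xs:QName('doc'), xs:QName('file-count'), '=', $max))
```

With the count stored up front, the run-time work should be a single lexicon lookup plus one range query, rather than summing frequencies per URI across every distinct size value. The count does have to be kept in sync whenever file elements are added or removed.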
Cheers,
Geert

From: Johan Mörén <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Saturday, June 27, 2015 at 1:23 AM
To: MarkLogic Developer Discussion <[email protected]>
Subject: Re: [MarkLogic Dev General] Find the document(s) with max occurrences of an element-attribute reference

Hi Christopher

I tried your approach but still without success. I think the case might be that your example is using a fixed value for size ("yes"). And since frequency is based on the value, you get the right results.

Regards,
Johan

On Sat, Jun 27, 2015 at 12:34 AM Christopher Hamlin <[email protected]> wrote:

Hi Johan,

Maybe I'm not clear on what you want. I just tried something.

I created documents in a database using

    xquery version "1.0-ml";
    for $i in 1 to 100
    let $doc := <doc>{(1 to $i)!<file size='yes'/>}</doc>
    let $uri := '/'||$i||'.xml'
    return xdmp:document-insert ($uri, $doc)

so for example

    /1.xml =>
    <doc>
      <file size="yes"/>
    </doc>

and

    /2.xml =>
    <doc>
      <file size="yes"/>
      <file size="yes"/>
    </doc>

and so on.

With a file/@size element-attribute range index, the query

    xquery version '1.0-ml';
    let $uris := cts:uri-reference()
    let $ea := cts:element-attribute-reference (xs:QName ('file'), xs:QName ('size'),
                 'collation=http://marklogic.com/collation/codepoint')
    return
      for $tuple in cts:value-tuples(($uris, $ea),
                      ('item-frequency','frequency-order','descending','limit=3'))
      return fn:concat ($tuple[1], ' -> ', cts:frequency ($tuple))

returns

    /100.xml -> 100
    /99.xml -> 99
    /98.xml -> 98
    /97.xml -> 97
    /96.xml -> 96
    /95.xml -> 95
    /94.xml -> 94
    /93.xml -> 93
    /92.xml -> 92
    /91.xml -> 91

Is this close to what you want?

Regards,
Chris

On Fri, Jun 26, 2015 at 12:41 PM, Johan Mörén <[email protected]> wrote:
> Hi Christopher!
>
> I'm not sure where you want me to use these options. But I tried to add
> them to the cts:value-tuples() but that did not return the expected result.
>
> like this
>
> ...
> for $tuple in
>   cts:value-tuples(
>     (
>       cts:uri-reference(),
>       $sizeRef
>     ),
>     ("frequency-order","descending","limit=10")
>   )
> ...
>
> Regards,
> Johan
>
> On Fri, Jun 26, 2015 at 5:58 PM Christopher Hamlin <[email protected]> wrote:
>>
>> If you just want something like top ten, I think it's more direct
>> possibly.
>>
>> Can you try returning frequency-order, descending, limit=10? Are those
>> options you can use?
_______________________________________________ General mailing list [email protected] Manage your subscription at: http://developer.marklogic.com/mailman/listinfo/general
