Using both cts:frequency and cts:count-aggregate I get numbers that are closer
to the correct count but are still short by 168 (47371 against 47539). What
would account for the difference?

Queries:

(: All profile elements in the collection, read from the documents themselves :)
let $profiles :=
  collection($collection)/enrprof:profiling-instance/enrprof:enrichment/enrprof:evalResult/prof:*
let $histograms := $profiles/prof:histogram
let $overall-elapsed := $profiles/prof:metadata/prof:overall-elapsed

(: Distinct values from the element range index :)
let $durations :=
  cts:element-values(xs:QName("prof:overall-elapsed"), (), "descending",
                     cts:collection-query($collection))
let $overall-elapsed-ref :=
  cts:element-reference(fn:QName("http://marklogic.com/xdmp/profile", "overall-elapsed"),
                        ("type=dayTimeDuration"))

(: Two index-only counts: summed frequencies and the count aggregate :)
let $count-frequency := sum(for $dur in $durations return cts:frequency($dur))
let $count-aggregate :=
  cts:count-aggregate($overall-elapsed-ref, (), cts:collection-query($collection))

Results:

<count-profiles>47539</count-profiles>
<count-histograms>47539</count-histograms>
<count-overall-elapsed>47539</count-overall-elapsed>
<count-frequency>47371</count-frequency>
<count-aggregate>47371</count-aggregate>
<count-durations>21219</count-durations>
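
One possible explanation, offered as an assumption and not confirmed anywhere
in this thread: cts:element-values defaults to the "fragment-frequency"
option, so cts:frequency counts each distinct value once per fragment. If a
single document holds several prof:overall-elapsed elements with the same
value, they are counted once, which would leave the total short of the element
count. A minimal sketch comparing the two frequency options, with "profiles"
as a placeholder collection name:

```xquery
xquery version "1.0-ml";
declare namespace prof = "http://marklogic.com/xdmp/profile";

(: Assumption: "profiles" stands in for the real collection name. :)
let $query := cts:collection-query("profiles")
let $name := xs:QName("prof:overall-elapsed")

(: Frequency = number of fragments containing each value (the default) :)
let $by-fragment := cts:element-values($name, (), "fragment-frequency", $query)
(: Frequency = number of occurrences of each value :)
let $by-item := cts:element-values($name, (), "item-frequency", $query)
return (
  <count-durations>{ count($by-fragment) }</count-durations>,
  <count-fragment-frequency>{
    sum(for $dur in $by-fragment return cts:frequency($dur))
  }</count-fragment-frequency>,
  <count-item-frequency>{
    sum(for $dur in $by-item return cts:frequency($dur))
  }</count-item-frequency>
)
```

If the item-frequency total equals the element count, repeated values inside
single fragments account for the gap. cts:count-aggregate should accept the
same option, though that is worth checking against the documentation.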

Cheers,

E.
--
Eliot Kimber
http://contrext.com
 



On 8/14/17, 1:53 PM, "[email protected] on behalf of Mary 
Holstege" <[email protected] on behalf of 
[email protected]> wrote:

    
    That is overkill.  The results you get out of cts:element-values have a  
    frequency (accessible via cts:frequency). The cts: aggregates (e.g.  
    cts:count, cts:sum) take the frequency into account.
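
    As a sketch of that point, assuming the range index from the original
    question and a placeholder collection name, each distinct value can be
    repeated cts:frequency times to rebuild the full list for bucketing:

    ```xquery
    xquery version "1.0-ml";
    declare namespace prof = "http://marklogic.com/xdmp/profile";

    (: Assumption: "profiles" stands in for the real collection name. :)
    let $query := cts:collection-query("profiles")
    let $all-durations :=
      for $dur in cts:element-values(xs:QName("prof:overall-elapsed"), (),
                                     "descending", $query)
      (: cts:frequency is applied to the item as returned by the lexicon call :)
      for $i in 1 to cts:frequency($dur)
      return $dur
    return count($all-durations)
    ```

    The count equals the sum of the frequencies, so it follows whichever
    frequency option the cts:element-values call uses.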
    
    //Mary
    
    On Mon, 14 Aug 2017 11:42:07 -0700, Oleksii Segeda  
    <[email protected]> wrote:
    
    > Eliot,
    >
    > You can do something like this:
    >   cts:element-value-co-occurrences(xs:QName("prof:overall-elapsed"), xs:QName("xdmp:document"))
    > if you have only one element per document.
    >
    > Best,
    >
    > Oleksii Segeda
    > IT Analyst
    > Information and Technology Solutions
    > www.worldbank.org
    >
    >
    > -----Original Message-----
    > From: [email protected]  
    > [mailto:[email protected]] On Behalf Of Eliot  
    > Kimber
    > Sent: Monday, August 14, 2017 2:31 PM
    > To: MarkLogic Developer Discussion <[email protected]>
    > Subject: [MarkLogic Dev General] Count of cts:element-values() not equal  
    > to number of element instances--what's going on?
    >
    > I have this query:
    >
    > let $durations := cts:element-values(xs:QName("prof:overall-elapsed"),  
    > (), "descending",
    >                      cts:collection-query($collection))
    >
    > And this query:
    >
    > let $overall-elapsed := $profiles/prof:metadata/prof:overall-elapsed
    >
    > Where there is an element range index for prof:overall-elapsed.
    >
    > Comparing the two results I get very different numbers when I expected  
    > them to be equal:
    >
    > <count-overall-elapsed>47539</count-overall-elapsed>
    > <count-durations>21219</count-durations>
    >
    > Doing this:
    >
    > count(distinct-values($overall-elapsed ! xs:dayTimeDuration(.)))
    >
    > Returns 21219, making it clear that the range index is returning  
    > distinct values, not all values. It makes sense in terms of how I would  
    > expect a range index to be structured (a one-to-many mapping for values  
    > to elements) but doesn’t make sense as the return for a function named  
    > “element-values” (and not element-distinct-values).
    >
    > I didn’t see this behavior mentioned in the docs (although the  
    > introduction to the Lexicon reference section does describe lexicons as  
    > sets of unique values).
    >
    > My requirement is to *quickly* get a list of the durations for all  
    > prof:expression elements (which I use for both counting and for  
    > bucketing, so I need all values, not just all distinct values).
    >
    > Is there a way to do what I want using only indexes?
    >
    > Thanks,
    >
    > E.
    > --
    > Eliot Kimber
    > http://contrext.com
    >
    >
    >
    > _______________________________________________
    > General mailing list
    > [email protected]
    > Manage your subscription at:
    > http://developer.marklogic.com/mailman/listinfo/general
    
    
    

