Re: [MarkLogic Dev General] Another cts:uri question ... and sample, vs. truncate vs. limit

Michael Blakeley Wed, 27 Jul 2011 09:54:59 -0700

Probably limit=N is fastest because it simply stops looking after it finds N 
values. Naturally that doesn't gather enough information to score-order all the 
fragments.


From
http://developer.marklogic.com/pubs/4.2/apidocs/Lexicons.html#cts:element-value-match
it seems like the difference between the other two options is related to
analytics. As best I understand it, this text would also apply to cts:uris and
all the other lexicon functions that support these options.

> "limit=N"
> Return no more than N values.
> "sample=N"
> Return only values occurring in the first N fragments selected by the 
> cts:query; only values in fragments satisfying the cts:query are returned, but 
> any analytics calculations (using cts:frequency, for example) use all the 
> lexicon values, not just the ones constrained by the cts:query. Only applies 
> when a $query parameter is specified.
> "truncate=N"
> Include only values from the first N fragments selected by the cts:query; 
> only values in fragments satisfying the cts:query are returned, and only those 
> values are used in calculating any analytics (using cts:frequency, for 
> example). Only applies when a $query parameter is specified.


I have a hard time imagining how analytics would apply to document URIs, so I 
think you would be safe with truncate=N.
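
For comparison, here is a rough sketch of the three calls side by side. It
assumes the URI lexicon is enabled on the database, and the word query is only
a hypothetical placeholder for your real cts:query. I have not timed this
myself, so treat it as a starting point rather than a benchmark.

```xquery
xquery version "1.0-ml";

(: Hypothetical placeholder: substitute the cts:query you are actually running. :)
let $query := cts:word-query("example")
return (
  (: Stops after the first value is found. :)
  cts:uris((), ("limit=1"), $query),

  (: Values from the first fragment selected by the query;
     analytics are restricted to those values. :)
  cts:uris((), ("truncate=1"), $query),

  (: Values from the first fragment selected by the query;
     analytics still use all the lexicon values. :)
  cts:uris((), ("sample=1"), $query)
)
```

Running each call separately and checking the elapsed time should show
whether the gap Darin measured also appears with your own query.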

-- Mike

On 27 Jul 2011, at 08:21, McBeath, Darin W (ELS-STL) wrote:

> Could someone explain to me the difference between 'sample' and 'truncate'?
> 
> Also, what I'm finding interesting is the significant difference in 
> performance between these 3 options.
> 
> In my case, I only want 1 value returned, so I either say "limit=1", 
> "sample=1", or "truncate=1".
> 
> But, the performance difference is profound.
> 
> Limit=1 is by far the fastest for my set of queries (about 6s).  
> Unfortunately, I can't use this option because the results are not ordered by 
> the 'score'.
> 
> truncate=1 is next with an avg of around 8s.  I can use this option because 
> the results are ordered by the 'score'.
> 
> sample=1 is the slowest with an avg of around 11s.  Returns same results as 
> truncate, but quite a bit slower.
> 
> So, for the exact same cts:query the performance varies quite a bit 
> (something that I wouldn't have expected).  Could anyone from the MarkLogic 
> side provide some insight here?
> 
> Thanks.
> 
> Darin.
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
> 
