Re: [MarkLogic Dev General] Speeding up xquery returning aggregates

Mary Holstege Fri, 23 Sep 2016 10:12:10 -0700

On Fri, 23 Sep 2016 09:36:16 -0700, Mark Shanks  
<markshanks...@hotmail.com> wrote:
...
>
> I'm still unclear about what is going on under the hood in MarkLogic. The  
> following link (https://docs.marklogic.com/guide/search-dev/lexicon)  
> talks about value co-occurrence lexicons. If this is built, then 2  
> facets could just refer to this and would result in the extremely fast  
> performance observed. On the other hand, 3 or more facets would not have  
> a pre-prepared lexicon to quiz. The documentation isn't clear whether a  
> co-occurrence lexicon is built whenever an index is built, or whether it  
> needs to be specifically configured. The documentation about creating  
> lexicons points you to the " 'Text Indexing' and 'Element/Attribute  
> Range Indexes and Lexicons' chapters of the Administrator's Guide", but  
> these then don't mention co-occurrence lexicons at all. So it isn't  
> clear how you actually get a co-occurrence lexicon built.


There is no such thing as a co-occurrence lexicon, so one is never built:
there are only co-occurrence lexicon calls. Co-occurrences are computed
over the lexicons at the time you ask. The more lexicons involved in a
call, the more work it has to do. The other big driver of performance in
cts:value-tuples calls is how many instances there are of each value. To
find co-occurrences of A, B, and C: for each value of A, for each document
that contains that value, for each value of B, for each document that
contains that value, get all values of C. It isn't quite exponential,
because a certain amount of internal caching happens to avoid rework, but
every additional lexicon added to the call makes it more expensive. We
don't cache the complete set of co-occurrences anywhere right now.
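
To make the cost argument concrete, here is a toy Python model of
computing co-occurrences over lexicons at query time. It is only a sketch
under stated assumptions, not MarkLogic's implementation: each lexicon is
modeled as a plain dict from value to the set of document ids containing
it, the names build_lexicon and co_occurrences are hypothetical, and the
internal caching mentioned above is not modeled.

```python
from collections import defaultdict


def build_lexicon(docs, field):
    """Toy lexicon: map each value of `field` to the ids of documents containing it."""
    lexicon = defaultdict(set)
    for doc_id, doc in docs.items():
        for value in doc.get(field, []):
            lexicon[value].add(doc_id)
    return dict(lexicon)


def co_occurrences(lexicons):
    """Return {tuple of values: number of documents containing all of them}.

    One nesting level per lexicon: every lexicon added to the call adds
    another loop over its values for each surviving combination so far.
    """
    counts = {}
    if not lexicons:
        return counts

    def walk(depth, prefix, candidate_docs):
        if depth == len(lexicons):
            counts[prefix] = len(candidate_docs)
            return
        for value, doc_ids in sorted(lexicons[depth].items()):
            # Keep only documents that also contain this value.
            shared = doc_ids if candidate_docs is None else candidate_docs & doc_ids
            if shared:
                walk(depth + 1, prefix + (value,), shared)

    walk(0, (), None)
    return counts


# Hypothetical sample data: three documents, three fields A, B, C.
docs = {
    1: {"A": ["x"], "B": ["p"], "C": ["m"]},
    2: {"A": ["x"], "B": ["q"], "C": ["m"]},
    3: {"A": ["y"], "B": ["p"], "C": ["n"]},
}
lex_a = build_lexicon(docs, "A")
lex_b = build_lexicon(docs, "B")
lex_c = build_lexicon(docs, "C")

pairs = co_occurrences([lex_a, lex_b])
triples = co_occurrences([lex_a, lex_b, lex_c])
```

In this model the work grows with both the number of lexicons (one more
nesting level each) and the number of documents per value (larger sets to
intersect), which are the two cost drivers described above.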

//Mary
_______________________________________________
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general
