On Fri, 23 Sep 2016 09:36:16 -0700, Mark Shanks <markshanks...@hotmail.com> wrote:

...
>
> I'm still unclear of what is going on under the hood in Marklogic. The
> following link (https://docs.marklogic.com/guide/search-dev/lexicon)
> talks about value co-occurrence lexicons. If this is built, then 2
> facets could just refer to this and would result in the extremely fast
> performance observed. On the other hand, 3 or more facets would not have
> a pre-prepared lexicon to quiz. The documentation isn't clear whether a
> co-occurrence lexicon is built whenever an index is built, or whether it
> needs to be specifically configured. The documentation about creating
> lexicons points you to the " 'Text Indexing' and 'Element/Attribute
> Range Indexes and Lexicons' chapters of the Administrator's Guide", but
> these then don't mention co-occurrence lexicons at all. So it isn't
> clear how you actually get a co-occurrence lexicon built.
There is no such thing as a co-occurrence lexicon, so it is never built: there are co-occurrence lexicon calls. Co-occurrences are computed over lexicons when you ask. The more lexicons involved in that call, the more work it needs to do.

The other big driver for performance in cts:value-tuples calls is how many instances of each value there are. To find co-occurrences of A, B, and C: for each value of A, for each document that contains that value, for each value of B, for each document that contains that value, get all values of C. It isn't quite exponential, because there is a certain amount of internal caching that happens to avoid rework, but every additional lexicon added to the call makes it harder.

We don't cache the complete set of co-occurrences anywhere right now.

//Mary
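
To make the nested walk described above concrete, here is a toy sketch in Python. It is not MarkLogic code and does not reflect the server's actual implementation. It assumes each lexicon is a plain in-memory mapping from value to the set of document ids containing that value, it intersects document sets instead of visiting documents one by one, and it leaves out the internal caching mentioned above. The names `co_occurrences`, `color`, `size`, and `shape` are made up for illustration.

```python
def co_occurrences(lexicons):
    """Return (tuples, steps) for a list of toy lexicons.

    Each lexicon is a dict mapping a value to the set of document ids
    that contain it (an assumed in-memory stand-in, not a real index).
    tuples maps each co-occurring value tuple to its document count;
    steps counts how many lexicon values were visited along the way.
    """
    results = {}
    steps = 0
    if not lexicons:
        return results, steps

    def walk(depth, prefix, docs):
        nonlocal steps
        if depth == len(lexicons):
            # Every lexicon contributed a value found in the same documents.
            results[prefix] = len(docs)
            return
        for value, value_docs in lexicons[depth].items():
            steps += 1
            # Keep only documents that also contain this value.
            shared = value_docs if docs is None else docs & value_docs
            if shared:
                walk(depth + 1, prefix + (value,), shared)

    walk(0, (), None)
    return results, steps


# Hypothetical sample data: three lexicons over documents 1, 2 and 3.
color = {"red": {1, 2}, "blue": {3}}
size = {"S": {1, 3}, "L": {2}}
shape = {"round": {1, 2, 3}}

pairs, pair_steps = co_occurrences([color, size])
triples, triple_steps = co_occurrences([color, size, shape])
print(pairs, pair_steps)
print(triples, triple_steps)
```

Adding the third lexicon forces another pass over its values for every surviving (color, size) pair, which is the sense in which each extra lexicon makes the call harder.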