I narrowed the problem down to phrases of three or more words. On that hunch, I 
enabled word positions, and after reindexing the estimates are now correct.

I had assumed, incorrectly, that estimates would still be accurate with only 
fast phrase searches (and not word positions) enabled. Looking back at how 
fast phrase searches work, it’s clear that would only be true of 2-word phrases.
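
To illustrate the reasoning, here is a toy Python model. It is not MarkLogic 
code: it assumes the fast phrase index records only adjacent word pairs, and 
all of the function names are made up for this sketch. Under that assumption, 
an index-only match on a 3-word phrase can succeed in a document that has 
both word pairs but never the whole phrase, which would inflate an estimate.

```python
# Toy model of an index that stores only adjacent word pairs.
# Assumption for illustration only; this is not how MarkLogic is implemented.

def adjacent_pairs(text):
    """Return the set of adjacent (word, next_word) pairs in the text."""
    words = text.lower().split()
    return set(zip(words, words[1:]))

def pair_index_matches(doc, phrase):
    """Index-only check: every word pair of the phrase occurs somewhere in doc."""
    return adjacent_pairs(phrase) <= adjacent_pairs(doc)

def contains_phrase(doc, phrase):
    """Exact check: the phrase's words occur consecutively, in order, in doc."""
    d, p = doc.lower().split(), phrase.lower().split()
    return any(d[i:i + len(p)] == p for i in range(len(d) - len(p) + 1))

docs = [
    "the quick brown fox",            # really contains "quick brown fox"
    "quick brown dog and brown fox",  # has both pairs, but not the phrase
]

def estimate(phrase):
    """Count documents that match using the pair index alone."""
    return sum(pair_index_matches(d, phrase) for d in docs)

def actual(phrase):
    """Count documents that really contain the phrase."""
    return sum(contains_phrase(d, phrase) for d in docs)

for phrase in ("brown fox", "quick brown fox"):
    print(phrase, "estimate:", estimate(phrase), "actual:", actual(phrase))
```

In this model the 2-word phrase is counted exactly, while the 3-word phrase is 
overcounted by the second document. Positions are what rule that document out.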

-Will


On Nov 19, 2013, at 3:23 PM, Michael Blakeley <[email protected]> wrote:

> Which release is this? Is the problem limited to a particular word? If so, 
> what words?
> 
> Have you tried a query trace or xdmp:plan yet? If you can run that with ML7 
> that is even more useful.
> 
> -- Mike
> 
> On 19 Nov 2013, at 12:43 , Will Thompson <[email protected]> wrote:
> 
>> I’m trying to determine why some search result estimates are overcounted. 
>> Documents generally look like:
>> 
>> <chapter>
>>   <subchapter>
>>     <doc>
>>       <section>
>> 
>> Fragment root is set on <doc> (and on no ancestors or descendants of <doc>). 
>> count(//doc) = xdmp:estimate(//doc) => true. The searchable expression is 
>> xdmp:directory(('dir1', 'dir2', …), 'infinity')//doc. The word query 
>> specification explicitly includes <doc> and excludes the document root. 
>> 
>> The documentation suggests that, to prevent overcounting, we only need to 
>> ensure that 1) searchable expressions always select a fragment, and 2) no 
>> predicates are applied to the searchable expression. Are there any other 
>> conditions that may cause overcounting of a simple word query?
>> 
>> -Will
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general
>> 

