I narrowed the problem down to phrases of three or more words. On that hunch, I enabled word positions, and after reindexing the estimates are now correct.
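
To illustrate why only the longer phrases were affected, here is a minimal toy model in Python. It is a sketch under assumptions, not MarkLogic's actual index layout: it supposes an index that records only adjacent word pairs per fragment and has no positions. All names (`build_pair_index`, `estimate_from_pairs`, `exact_count`) and the sample fragments are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical sample data: fragment id -> tokenized text.
FRAGMENTS: Dict[str, List[str]] = {
    "doc1": "the quick brown fox".split(),
    "doc2": "quick brown paint and a brown fox".split(),
    "doc3": "a slow brown fox".split(),
}


def build_pair_index(fragments: Dict[str, List[str]]) -> Dict[Tuple[str, str], Set[str]]:
    """Map each adjacent word pair to the set of fragment ids that contain it."""
    index: Dict[Tuple[str, str], Set[str]] = {}
    for frag_id, words in fragments.items():
        for pair in zip(words, words[1:]):
            index.setdefault(pair, set()).add(frag_id)
    return index


def estimate_from_pairs(index: Dict[Tuple[str, str], Set[str]], phrase: List[str]) -> int:
    """Index-only estimate: intersect the fragment sets of every adjacent pair
    in the phrase. Without positions, nothing checks that the pairs line up."""
    pairs = list(zip(phrase, phrase[1:]))
    if not pairs:
        raise ValueError("phrase must contain at least two words")
    return len(set.intersection(*(index.get(p, set()) for p in pairs)))


def exact_count(fragments: Dict[str, List[str]], phrase: List[str]) -> int:
    """Ground truth: count fragments that contain the phrase contiguously."""
    n = len(phrase)
    return sum(
        1
        for words in fragments.values()
        if any(words[i:i + n] == phrase for i in range(len(words) - n + 1))
    )


if __name__ == "__main__":
    idx = build_pair_index(FRAGMENTS)
    for phrase in (["brown", "fox"], ["quick", "brown", "fox"]):
        print(phrase, estimate_from_pairs(idx, phrase), exact_count(FRAGMENTS, phrase))
```

In this model a 2-word phrase is a single index key, so the estimate matches the real count. A 3-word phrase is resolved as two independent pair lookups, so a fragment like `doc2`, which contains both pairs but never the full phrase, is counted anyway.
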
I was thinking, incorrectly, that estimates would still be accurate with only fast phrase searches (and not word positions) enabled. But now that I look back at how that works, it’s clear that would only be true of 2-word phrases.

-Will

On Nov 19, 2013, at 3:23 PM, Michael Blakeley <[email protected]> wrote:

> Which release is this? Is the problem limited to a particular word? If so,
> what words?
>
> Have you tried a query trace or xdmp:plan yet? If you can run that with ML7
> that is even more useful.
>
> -- Mike
>
> On 19 Nov 2013, at 12:43, Will Thompson <[email protected]> wrote:
>
>> I’m trying to determine why some search result estimates are overcounted.
>> Documents generally look like:
>>
>> <chapter>
>>   <subchapter>
>>     <doc>
>>       <section>
>>
>> Fragment root is set on <doc> (and no ancestors or descendants of <doc>).
>> count(//doc) = xdmp:estimate(//doc) => true. The searchable expression is
>> xdmp:directory(('dir1', 'dir2', …), 'infinity')//doc. The word query
>> specification explicitly includes <doc> and excludes document root.
>>
>> The documentation suggests to prevent overcounting we just ensure that 1)
>> searchable expressions always select a fragment, and 2) there are no
>> predicates applied to the searchable expression. Are there any other
>> conditions that may cause overcounting of a simple word query?
>>
>> -Will
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general
