We're attempting to get explains for the top 1000 documents we fetch from Lucene and running into performance problems. Is there a good way to speed these up?
In our profiling, we see the advance() methods in each of the subqueries dominating the CPU time. The explain API takes one document at a time, and each subquery scorer does an advance() per document, so the cost seems to scale as docs * queryTerms. I can't see an easy way to speed this up. Maybe build a temporary in-memory index containing just the 1000 docs, to minimize the cost of advance()? Or maybe wrap an AtomicReader that filters out everything but the docs we care about? Thanks. --Gregg
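To make the first idea concrete, here's a rough sketch of copying the top hits into a small RAMDirectory index and running explain() there. This is an untested sketch against the Lucene 4.x API; `mainSearcher`, `topDocs`, `analyzer`, and `query` are assumed to exist in your code, and it only works if every indexed field is also stored (otherwise the re-added documents won't be re-analyzed the way the originals were).

```java
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.Explanation;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

// Copy the top N hits into a tiny in-memory index so that each
// subquery scorer's advance() only has to skip over 1000 docs.
RAMDirectory tmpDir = new RAMDirectory();
IndexWriterConfig cfg = new IndexWriterConfig(Version.LUCENE_47, analyzer);
try (IndexWriter w = new IndexWriter(tmpDir, cfg)) {
    for (ScoreDoc sd : topDocs.scoreDocs) {
        // Relies on all indexed fields being stored; otherwise the
        // copied docs won't reproduce the original token streams.
        w.addDocument(mainSearcher.doc(sd.doc));
    }
}

try (DirectoryReader tmpReader = DirectoryReader.open(tmpDir)) {
    IndexSearcher tmpSearcher = new IndexSearcher(tmpReader);
    for (int docId = 0; docId < tmpReader.maxDoc(); docId++) {
        Explanation e = tmpSearcher.explain(query, docId);
        // ... collect or log the explanation ...
    }
}
```

One caveat: the scores (and therefore the explanation values) from the small index will differ from the main index, because term statistics like docFreq and maxDoc come from the tiny index rather than the full one. If you need the explanations to match the original scores exactly, you'd have to override the stats (e.g. via a custom Similarity or by wrapping the reader), which may eat the savings.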