Paul Elschot wrote:
For a large number of indexes, it may be necessary to do this over
multiple indexes by first getting the doc numbers for all indexes,
then sorting these per index, then retrieving them
from all indexes, and repeating the whole thing using terms determined
from the retrieved docs.
Well this was a BIG win. Just benchmarking it out shows a 10x -> 50x
performance increase.
Times in milliseconds:
Before:
duration: 1127
duration: 449
duration: 394
duration: 564
After:
duration: 182
duration: 39
duration: 12
duration: 11
The values of 2-4 I'm sure are due to the filesystem buffer cache but I
can't imagine why they'd be faster in the second round. It might be
that Linux is deciding not to buffer the document blocks.
Kevin
--
Use Rojo (RSS/Atom aggregator)! - visit http://rojo.com.
See irc.freenode.net #rojo if you want to chat.
Rojo is Hiring! - http://www.rojonetworks.com/JobsAtRojo.html
Kevin A. Burton, Location - San Francisco, CA
AIM/YIM - sfburtonator, Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]