I've repeated Sebastiano's experiments (and so has he). A few quotes from our correspondence:

> The index appears to be larger now--43.1GB. Probably they have better
> skipping structures that take more space.
>
> From what I can see the format is the same as before--the .frq file contains
> document pointers and positions. So my SearchFiles class still reads
> documents *and* counts.
>
> But the most interesting part I've read in a blog is that now Lucene has a
> pluggable index format. This means that someone can actually write a QS index
> for Lucene and test what happens in production. That's a most interesting
> change!

and:

> Well, they made a great job:
>
> trec-40-text unscored terms result: 5511 494901
> trec-40-text unscored and result: 2193 769110
> trec-40-text unscored phrase result: 6615 148663
> trec-40-text unscored spans result: 12407 545090
>
> So conjunction is still better, but by a really smaller margin. The worst
> part is term scanning--they are now significantly faster than QS indices.

Dawid

On Sun, Jun 24, 2012 at 9:31 AM, Dawid Weiss <[email protected]> wrote:
> Fyi. I contacted Sebastiano and will get hold of the data set and
> benchmarks he used to repeat his experiment with current trunk
> (curiosity). Any hints on which configuration should be used will be
> welcome.
>
> Dawid
>
> On Sat, Jun 23, 2012 at 12:38 PM, Li Li <[email protected]> wrote:
>> http://mg4j.di.unimi.it/
>> http://vigna.di.unimi.it/papers.php#VigQSI
>>
>> sounds very interesting and attractive.
