I've repeated Sebastiano's experiments (and so did he). A few quotes
from our correspondence:

> The index appears to be larger now--43.1GB. Probably they have better 
> skipping structures that take more space.
>
> From what I can see the format is the same as before--the .frq file contains 
> document pointers and positions. So my SearchFiles class still reads 
> documents *and* counts.
>
> But the most interesting part I've read in a blog is that now Lucene has a 
> pluggable index format. This means that someone can actually write a QS index 
> for Lucene and test what happens in production. That's a most interesting 
> change!

and:

> Well, they did a great job:
>
> trec-40-text    unscored        terms   result: 5511    494901
> trec-40-text    unscored        and     result: 2193    769110
> trec-40-text    unscored        phrase  result: 6615    148663
> trec-40-text    unscored        spans   result: 12407   545090
>
> So conjunction is still better, but by a much smaller margin. The worst
> part is term scanning--they are now significantly faster than QS indices.

Dawid



On Sun, Jun 24, 2012 at 9:31 AM, Dawid Weiss
<[email protected]> wrote:
> FYI. I contacted Sebastiano and will get hold of the data set and
> benchmarks he used to repeat his experiment with current trunk
> (curiosity). Any hints on which configuration should be used will be
> welcome.
>
> Dawid
>
> On Sat, Jun 23, 2012 at 12:38 PM, Li Li <[email protected]> wrote:
>> http://mg4j.di.unimi.it/
>> http://vigna.di.unimi.it/papers.php#VigQSI
>>
>> sounds very interesting and attractive.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>

