[ https://issues.apache.org/jira/browse/LUCENE-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-1410:
---------------------------------------

    Attachment: TermQueryTests.tgz

In order to understand how time is spent overall during searching, I took TermQuery and reimplemented it in several different ways. While each implementation is correct (the checksum of the final top N docs matches), they are very much prototypes and nowhere near committable.

I then took the 100 most frequent terms (23.3 million hits in total) from my Wikipedia index and ran them each in sequence. Each result is the best of 25 runs (all results are from the OS's IO cache):

||Test||Time (msec)||Hits/msec||Speedup||
|Baseline|674|34496|1.00X|
|+ Code Speedups|591|39340|1.14X|
|+ Code Speedups + PFOR|295|78814|2.28X|
|+ Code Speedups + BITS|247|94130|2.73X|
|+ Code Speedups + BITS (native)|230|101088|2.93X|

Here's what the test names mean:

* Baseline is the normal TermQuery, searching with TopDocsCollector for the top 10 docs.
* Code Speedups means some basic optimizations, e.g. my own specialized priority queue, unrolled loops, etc.
* PFOR means switching to PFOR for storing docs & freqs as separate streams. Each term's posting starts a new PFOR block.
* BITS just means using packed n-bit ints for each block (i.e. it has no exceptions, so it sets N such that all ints fit). The resulting frq file was 18% bigger and the doc file 10% bigger -- but this is just for the 100 most frequent terms.
* BITS (native) is BITS, but running as JNI (C++) code.

Next, I ran the same tests as above but with collection of hits turned off, so this really just measures decode time:

||Test||Time (msec)||Hits/msec||Speedup||
|+ Code Speedups|384|60547|1.76X|
|+ Code Speedups + PFOR|91|255497|7.41X|
|+ Code Speedups + BITS|49|474496|13.76X|
|+ Code Speedups + BITS (native)|32|726572|21.06X|

Some observations:

* PFOR really does speed up TermQuery overall, so I think we should pursue it and get it committed.
* BITS is a good speedup beyond PFOR, but we haven't optimized PFOR yet.
Also, BITS would be very seekable. We could get the same thing from PFOR by increasing the bit size of its blocks.
* Once we swap in PFOR and/or BITS (or another interesting int-block compression), accumulating hits becomes the slowest part of TermQuery.
* Native BITS code is quite a bit faster at decoding, but it's probably not worth pursuing for now, since with PFOR decode time becomes a small part of the overall time.
* TermQuery is the simplest query; other queries will spend more CPU time coordinating, or time handling positions, so we can't yet conclude how much of an impact PFOR will have on them.

> PFOR implementation
> -------------------
>
>                 Key: LUCENE-1410
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1410
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Other
>            Reporter: Paul Elschot
>            Priority: Minor
>         Attachments: autogen.tgz, LUCENE-1410b.patch, LUCENE-1410c.patch, TermQueryTests.tgz, TestPFor2.java, TestPFor2.java, TestPFor2.java
>
>   Original Estimate: 21840h
>  Remaining Estimate: 21840h
>
> Implementation of Patched Frame of Reference.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
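For readers unfamiliar with the encoding being benchmarked: the "packed n-bit ints" idea behind the BITS tests can be sketched roughly as below. This is an illustrative sketch, not code from the attached patch; the `BitsBlock` class and its method names are made up. It picks the smallest bit width N that fits every value in a block, then packs the values back to back into a `long[]` (with no exceptions, unlike PFOR, which packs to a smaller common width and patches the outliers separately).

```java
// Hypothetical sketch of fixed-width "BITS" packing: choose the smallest
// bit width that holds every value in the block, then pack values back to
// back into a long[]. Assumes non-negative ints (docs/freqs deltas).
public class BitsBlock {

    // Smallest number of bits that can represent every value in the block.
    static int bitsRequired(int[] values) {
        int bits = 0;
        for (int v : values) bits |= v;            // OR together all set bits
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(bits));
    }

    // Pack values at numBits each, back to back, into 64-bit words.
    static long[] pack(int[] values, int numBits) {
        long[] packed = new long[(values.length * numBits + 63) / 64];
        for (int i = 0; i < values.length; i++) {
            int bitPos = i * numBits;
            int word = bitPos >>> 6, shift = bitPos & 63;
            packed[word] |= (long) values[i] << shift;
            if (shift + numBits > 64) {            // value straddles a word boundary
                packed[word + 1] |= (long) values[i] >>> (64 - shift);
            }
        }
        return packed;
    }

    // Decode count values of numBits each. Because every value has the same
    // width, any value's bit position is computable, which is what makes
    // this format seekable.
    static int[] unpack(long[] packed, int count, int numBits) {
        int[] out = new int[count];
        long mask = (1L << numBits) - 1;
        for (int i = 0; i < count; i++) {
            int bitPos = i * numBits;
            int word = bitPos >>> 6, shift = bitPos & 63;
            long v = packed[word] >>> shift;
            if (shift + numBits > 64) {            // pick up the spilled high bits
                v |= packed[word + 1] << (64 - shift);
            }
            out[i] = (int) (v & mask);
        }
        return out;
    }
}
```

The seekability noted above falls out of the fixed width: value i always lives at bit i * numBits, so decoding can jump into the middle of a block without scanning from the start, which variable-width codes (and PFOR's exception patching) make harder.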