[ 
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-1483:
---------------------------------------

    Attachment: sortCollate.py
                sortBench.py


OK I ran a bunch of sort perf tests, on trunk & with the patch.
(I attached the two Python sources for doing this... though they
require some small local mods to run properly.)

Each alg is run with "java -Xms1024M -Xmx1024M -Xbatch -server" on OS
X 10.5.5, java 1.6.0_07-b06-153.

I use two indexes, each with 2M docs.  One is docs from Wikipedia
(labeled "wiki"); the other is SortableSimpleDocMaker docs augmented
to include a random country field (labeled "simple").  For each I
created 1-segment, 10-segment and 100-segment indexes.  I sort by
score, doc, int and string (the meth column: ord = true ord+subord,
ordval = ord + value fallback).  Queue size is 10.
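
To make the warm/qps columns below concrete, each run boils down to a
loop like the sketch here.  This is only a rough Java equivalent of
what the attached Python harness drives; the index path, field names
and iteration count are made up:

{code}
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TermQuery;

public class SortBenchSketch {
  public static void main(String[] args) throws Exception {
    IndexReader reader = IndexReader.open("/path/to/index");     // made-up path
    IndexSearcher searcher = new IndexSearcher(reader);
    Query query = new TermQuery(new Term("body", "text"));       // single-term query, as above

    // The kinds of sorts exercised above: score, docID, and a string field.
    Sort[] sorts = new Sort[] {
      new Sort(SortField.FIELD_SCORE),
      new Sort(SortField.FIELD_DOC),
      new Sort(new SortField("title", SortField.STRING)),
    };

    for (int s = 0; s < sorts.length; s++) {
      // First search pays the FieldCache population cost ("warm" column, seconds).
      long t0 = System.currentTimeMillis();
      searcher.search(query, null, 10, sorts[s]);
      double warm = (System.currentTimeMillis() - t0) / 1000.0;

      // Repeated searches measure steady-state throughput ("qps" column).
      final int ITERS = 200;
      long t1 = System.currentTimeMillis();
      for (int i = 0; i < ITERS; i++) {
        searcher.search(query, null, 10, sorts[s]);
      }
      double qps = ITERS / ((System.currentTimeMillis() - t1) / 1000.0);
      System.out.println(sorts[s] + ": warm=" + warm + " sec, qps=" + qps);
    }
    searcher.close();
    reader.close();
  }
}
{code}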

I ran various queries... query "147" hits ~5K docs, query "text" hits
~97K docs, query "1" hits 386K docs, and the alldocs query hits all 2M
docs.  qps is queries per second; warm is the time (in seconds) for
the first warmup query, on trunk.  qpsnew & warmnew are the same with
the patch.  pctg shows the % gain in qps:

||numSeg||index||sortBy||query||topN||meth||hits||warm||qps||warmnew||qpsnew||pctg||
|1|wiki|score|147|10| |   4984|   0.2|5717.6|   0.2|5627.5| -1.6%|
|1|wiki|score|text|10| |  97191|   0.3| 340.9|   0.3| 348.8|  2.3%|
|1|wiki|score|1|10| | 386435|   0.3|  86.7|   0.3|  89.3|  3.0%|
|1|wiki|doc|147|10| |   4984|   0.3|4071.7|   0.3|4649.0| 14.2%|
|1|wiki|doc|text|10| |  97191|   0.3| 225.4|   0.3| 253.7| 12.6%|
|1|wiki|doc|1|10| | 386435|   0.3|  56.9|   0.3|  65.8| 15.6%|
|1|wiki|doc|<all>|10| |2000000|   0.1|  23.0|   0.1|  38.6| 67.8%|
|1|simple|int|text|10| |2000000|   0.7|  10.7|   0.7|  13.5| 26.2%|
|1|simple|int|<all>|10| |2000000|   0.6|  21.1|   0.6|  34.7| 64.5%|
|1|simple|country|text|10|ord|2000000|   0.6|  10.7|   0.6|  13.2| 23.4%|
|1|simple|country|text|10|ordval|2000000|   0.6|  10.7|   0.6|  13.3| 24.3%|
|1|simple|country|<all>|10|ord|2000000|   0.5|  20.7|   0.6|  32.5| 57.0%|
|1|simple|country|<all>|10|ordval|2000000|   0.5|  20.7|   0.6|  34.6| 67.1%|
|1|wiki|title|147|10|ord|   4984|   2.1|3743.8|   2.0|4210.5| 12.5%|
|1|wiki|title|147|10|ordval|   4984|   2.1|3743.8|   2.0|4288.2| 14.5%|
|1|wiki|title|text|10|ord|  97191|   2.1| 144.2|   2.1| 160.3| 11.2%|
|1|wiki|title|text|10|ordval|  97191|   2.1| 144.2|   2.1| 163.5| 13.4%|
|1|wiki|title|1|10|ord| 386435|   2.1|  51.2|   2.1|  63.2| 23.4%|
|1|wiki|title|1|10|ordval| 386435|   2.1|  51.2|   2.1|  64.6| 26.2%|
|1|wiki|title|<all>|10|ord|2000000|   2.1|  21.1|   2.1|  33.2| 57.3%|
|1|wiki|title|<all>|10|ordval|2000000|   2.1|  21.1|   2.1|  35.4| 67.8%|
||numSeg||index||sortBy||query||topN||meth||hits||warm||qps||warmnew||qpsnew||pctg||
|10|wiki|score|147|10| |   4984|   0.3|4228.3|   0.3|4510.6|  6.7%|
|10|wiki|score|text|10| |  97191|   0.3| 294.7|   0.3| 341.5| 15.9%|
|10|wiki|score|1|10| | 386435|   0.4|  75.0|   0.4|  87.0| 16.0%|
|10|wiki|doc|147|10| |   4984|   0.3|3332.2|   0.3|4033.9| 21.1%|
|10|wiki|doc|text|10| |  97191|   0.4| 217.0|   0.4| 277.0| 27.6%|
|10|wiki|doc|1|10| | 386435|   0.4|  54.6|   0.4|  70.5| 29.1%|
|10|wiki|doc|<all>|10| |2000000|   0.1|  12.7|   0.1|  38.6|203.9%|
|10|simple|int|text|10| |2000000|   1.2|  10.3|   0.6|  13.5| 31.1%|
|10|simple|int|<all>|10| |2000000|   1.1|  11.8|   0.8|  34.6|193.2%|
|10|simple|country|text|10|ord|2000000|   0.7|  10.4|   0.5|  13.2| 26.9%|
|10|simple|country|text|10|ordval|2000000|   0.7|  10.4|   0.5|  13.3| 27.9%|
|10|simple|country|<all>|10|ord|2000000|   0.7|  11.5|   0.5|  32.5|182.6%|
|10|simple|country|<all>|10|ordval|2000000|   0.7|  11.5|   0.5|  34.1|196.5%|
|10|wiki|title|147|10|ord|   4984|  12.5|3004.5|   2.1|3124.0|  4.0%|
|10|wiki|title|147|10|ordval|   4984|  12.5|3004.5|   2.1|3353.5| 11.6%|
|10|wiki|title|text|10|ord|  97191|  12.7| 139.4|   2.1| 156.7| 12.4%|
|10|wiki|title|text|10|ordval|  97191|  12.7| 139.4|   2.1| 160.9| 15.4%|
|10|wiki|title|1|10|ord| 386435|  12.7|  50.3|   2.1|  62.3| 23.9%|
|10|wiki|title|1|10|ordval| 386435|  12.7|  50.3|   2.1|  64.1| 27.4%|
|10|wiki|title|<all>|10|ord|2000000|  12.7|  11.4|   2.1|  33.1|190.4%|
|10|wiki|title|<all>|10|ordval|2000000|  12.7|  11.4|   2.1|  35.3|209.6%|
||numSeg||index||sortBy||query||topN||meth||hits||warm||qps||warmnew||qpsnew||pctg||
|100|wiki|score|147|10| |   4984|   0.3|1282.2|   1.7|1162.3| -9.4%|
|100|wiki|score|text|10| |  97191|   0.4| 232.4|   1.3| 275.6| 18.6%|
|100|wiki|score|1|10| | 386435|   0.4|  65.1|   1.4|  80.4| 23.5%|
|100|wiki|doc|147|10| |   4984|   0.4|1170.0|   0.4|1132.0| -3.2%|
|100|wiki|doc|text|10| |  97191|   0.4| 171.7|   0.4| 230.1| 34.0%|
|100|wiki|doc|1|10| | 386435|   0.4|  46.7|   0.4|  67.9| 45.4%|
|100|wiki|doc|<all>|10| |2000000|   0.2|   7.8|   0.1|  41.6|433.3%|
|100|simple|int|text|10| |2000000|   3.3|   8.9|   4.0|  11.0| 23.6%|
|100|simple|int|<all>|10| |2000000|   3.4|   7.7|   1.1|  36.5|374.0%|
|100|simple|country|text|10|ord|2000000|   1.0|   8.8|   0.6|  10.8| 22.7%|
|100|simple|country|text|10|ordval|2000000|   1.0|   8.8|   0.6|  11.0| 25.0%|
|100|simple|country|<all>|10|ord|2000000|   1.0|   7.6|   0.5|  35.0|360.5%|
|100|simple|country|<all>|10|ordval|2000000|   1.0|   7.6|   0.5|  36.3|377.6%|
|100|wiki|title|147|10|ord|   4984|  94.6|1066.9|   2.1| 583.7|-45.3%|
|100|wiki|title|147|10|ordval|   4984|  94.6|1066.9|   2.1| 750.1|-29.7%|
|100|wiki|title|text|10|ord|  97191|  94.9| 110.2|   2.1| 122.7| 11.3%|
|100|wiki|title|text|10|ordval|  97191|  94.9| 110.2|   2.1| 128.4| 16.5%|
|100|wiki|title|1|10|ord| 386435|  94.3|  47.9|   2.1|  58.2| 21.5%|
|100|wiki|title|1|10|ordval| 386435|  94.3|  47.9|   2.1|  60.1| 25.5%|
|100|wiki|title|<all>|10|ord|2000000|  94.6|   7.8|   2.5|  35.6|356.4%|
|100|wiki|title|<all>|10|ordval|2000000|  94.6|   7.8|   2.4|  37.0|374.4%|

It's a ridiculous amount of data to digest... but here are some
initial thoughts:

  * These are only single-term queries; I'd expect the gain to be
    smaller for multi-term queries since, net/net, a smaller %tg of
    the time is spent collecting.

  * Ord + val fallback (ordval) is generally faster than pure
    ord/subord.  I think for now we should run with ord + val
    fallback?  (We can leave ord + subord commented out?)  There's a
    rough sketch of the ordval comparison after these notes.

  * It's great that we see decent speedups for "sort by score" which
    is presumably the most common sort used.

  * We do get slower in certain cases (neg pctg in the rightmost
    column): all not-in-the-noise slowdowns were with query "147" on
    the 100 segment index.  This query hits relatively few docs
    (~5K), so the slowdown is expected: the new approach spends some
    time updating its queue for each subreader, and if the time spent
    searching is relatively tiny then this queue update time becomes
    relatively big.  I think with a larger queue size the slowdown
    will be worse.  However, I think this is an acceptable tradeoff.

  * The gains for field sorting on a single segment (optimized) index
    are impressive.  Generally, the more hits encountered the better
    the gains.  It's amazing that we see a ~67.8% gain sorting by
    docID, country, and title for the alldocs query.  My only guess
    for this is a better cache hit rate (because we gain locality by
    copying values to local compact arrays).

  * Across the board, the alldocs query shows very sizable
    improvements (5X faster for the 100 segment index; 3X faster for
    the 10 segment index).

  * I didn't break out the %tg difference, but warming time with the
    patch is waaaay faster than trunk when the index has > 1 segment.
    Reopen time should also be fantastically faster (we are
    sidestepping something silly happening w/ FieldCache on a
    Multi*Reader); the per-segment FieldCache sketch after these
    notes illustrates this.  Warming on trunk takes ~95 seconds with
    the 100 segment index!
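
As promised above, here's a rough illustration of the ord + value
fallback ("ordval") comparison.  This is only a sketch of the idea,
not the patch code; the class, method and parameter names are made up:

{code}
/** Sketch of the "ordval" idea: per-segment ord compare, value fallback. */
class OrdValCompareSketch {
  /**
   * Compare a candidate hit from the current segment against the queue's
   * bottom entry.  Within one segment the ords from FieldCache.StringIndex
   * order the same way as the values, so a cheap int compare is enough.
   * An entry queued from an earlier segment carries an ord from a
   * different StringIndex, which is not comparable, so we fall back to
   * comparing the String values themselves.
   */
  static int compareBottom(int candidateOrd, String candidateValue,
                           int bottomOrd, String bottomValue,
                           boolean bottomFromThisSegment) {
    if (bottomFromThisSegment) {
      return candidateOrd - bottomOrd;              // cheap int compare on ords
    }
    return candidateValue.compareTo(bottomValue);   // value fallback across segments
  }
}
{code}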

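On the warming point: with the patch the FieldCache entries are keyed
by the individual segment readers instead of the top-level
Multi*Reader, so after a reopen only the new segments need to be
populated.  A minimal sketch of per-segment warming (assuming a way to
get at the subreaders, e.g. getSequentialSubReaders; field name made
up):

{code}
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.FieldCache;

class WarmSketch {
  /** Populate the FieldCache per segment, not on the top-level reader. */
  static void warm(IndexReader topReader, String field) throws IOException {
    IndexReader[] subs = topReader.getSequentialSubReaders();   // assumed accessor
    if (subs == null) {
      // non-composite (single segment) reader
      FieldCache.DEFAULT.getStringIndex(topReader, field);
    } else {
      for (int i = 0; i < subs.length; i++) {
        // cached per segment: after reopen, unchanged segments are already warm
        FieldCache.DEFAULT.getStringIndex(subs[i], field);
      }
    }
  }
}
{code}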

> Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-1483
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1483
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> sortBench.py, sortCollate.py
>
>
> FieldCache and Filters are forced down to a single segment reader, allowing 
> for individual segment reloading on reopen.

