[ 
https://issues.apache.org/jira/browse/LUCENE-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-2886:
--------------------------------

    Attachment: LUCENE-2886.patch

I spent some more time on Simple64, took Mike's previous patch and added some 
minor improvements:

# Switched the decoding logic to the "Simple-8-4b" variant described in the paper. 
The encoding is the same, but the decoder processes ints instead of longs.
# Because our buffers are so tiny (for example 32 bytes), the overhead of NIO 
hurts rather than helps, so I switched to native arrays.
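
To illustrate the two points above, here is a minimal sketch, not the code from the patch. It assumes a hypothetical word layout (a 4-bit selector in the low bits, followed by fifteen 4-bit values), and the class name, method names, and selector id are made up for illustration. It shows that a 64-bit word can be decoded as two 32-bit halves with the same result as the long-based loop, and that an int can be read from a plain byte[] without going through NIO.

```java
public final class SimpleWordSketch {
    // Hypothetical selector id meaning "15 values of 4 bits each".
    static final int SELECTOR_15X4 = 5;

    // Pack 15 values (each 0..15) into one 64-bit word:
    // bits 0-3 hold the selector, value i sits at bits 4+4i .. 7+4i.
    static long pack15x4(int[] values) {
        long word = SELECTOR_15X4;
        for (int i = 0; i < 15; i++) {
            word |= ((long) (values[i] & 0xF)) << (4 + 4 * i);
        }
        return word;
    }

    // Reference decoder: shifts and masks on a single long.
    static void unpackWithLong(long word, int[] out) {
        long data = word >>> 4; // drop the selector
        for (int i = 0; i < 15; i++) {
            out[i] = (int) (data & 0xF);
            data >>>= 4;
        }
    }

    // Same decoding using only 32-bit arithmetic. With this layout the
    // low int holds the selector plus 7 values and the high int holds
    // the remaining 8, so no value straddles the int boundary.
    static void unpackWithInts(int lo, int hi, int[] out) {
        lo >>>= 4; // drop the selector
        for (int i = 0; i < 7; i++) {
            out[i] = lo & 0xF;
            lo >>>= 4;
        }
        for (int i = 7; i < 15; i++) {
            out[i] = hi & 0xF;
            hi >>>= 4;
        }
    }

    // Read a big-endian int straight from a byte[] (no ByteBuffer/IntBuffer).
    static int readInt(byte[] buf, int pos) {
        return ((buf[pos] & 0xFF) << 24)
             | ((buf[pos + 1] & 0xFF) << 16)
             | ((buf[pos + 2] & 0xFF) << 8)
             |  (buf[pos + 3] & 0xFF);
    }
}
```

A caller would split the word with {{(int) word}} and {{(int) (word >>> 32)}} before calling the int-based decoder. Bit widths that do not line up with the 32-bit boundary would need extra handling that this sketch leaves out.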

The performance is looking much more reasonable. Here are my tests on Windows; 
maybe I can convince Mike to sanity-check them on Linux.

64-bit SimpleFS
||Query||QPS BulkVInt||QPS Simple64VarInt4||Pct diff||
|"united states"~3|3.79|3.67|{color:red}-3.3%{color}|
|doctitle:.*[Uu]nited.*|2.45|2.46|{color:green}0.3%{color}|
|spanNear([unit, state], 10, true)|22.13|22.64|{color:green}2.3%{color}|
|uni*|14.04|14.43|{color:green}2.7%{color}|
|united~0.75|6.83|7.04|{color:green}3.2%{color}|
|unit*|25.39|26.21|{color:green}3.2%{color}|
|doctimesecnum:[10000 TO 60000]|8.83|9.16|{color:green}3.6%{color}|
|united~0.6|4.29|4.47|{color:green}4.2%{color}|
|united states|9.35|9.74|{color:green}4.2%{color}|
|un*d|12.88|13.50|{color:green}4.8%{color}|
|"united states"|6.86|7.21|{color:green}5.1%{color}|
|unit~0.7|14.11|14.85|{color:green}5.3%{color}|
|unit~0.5|8.17|8.60|{color:green}5.3%{color}|
|u*d|5.70|6.05|{color:green}6.1%{color}|
|states|30.02|31.90|{color:green}6.3%{color}|
|spanFirst(unit, 5)|86.56|94.15|{color:green}8.8%{color}|
|+united +states|11.10|12.55|{color:green}13.1%{color}|
|+nebraska +states|46.72|57.90|{color:green}23.9%{color}|

32-bit SimpleFS
||Query||QPS BulkVInt||QPS Simple64VarInt4||Pct diff||
|spanFirst(unit, 5)|95.67|91.02|{color:red}-4.9%{color}|
|"united states"|5.47|5.25|{color:red}-4.1%{color}|
|"united states"~3|3.37|3.32|{color:red}-1.6%{color}|
|unit*|20.45|20.33|{color:red}-0.6%{color}|
|uni*|11.10|11.06|{color:red}-0.3%{color}|
|doctimesecnum:[10000 TO 60000]|7.15|7.16|{color:green}0.0%{color}|
|doctitle:.*[Uu]nited.*|2.26|2.27|{color:green}0.4%{color}|
|unit~0.5|7.73|7.77|{color:green}0.5%{color}|
|un*d|10.80|10.87|{color:green}0.6%{color}|
|united~0.75|6.77|6.97|{color:green}2.8%{color}|
|unit~0.7|12.97|13.41|{color:green}3.4%{color}|
|united~0.6|4.10|4.26|{color:green}3.7%{color}|
|u*d|4.91|5.10|{color:green}4.0%{color}|
|spanNear([unit, state], 10, true)|20.50|21.72|{color:green}5.9%{color}|
|states|30.00|33.15|{color:green}10.5%{color}|
|+united +states|9.71|10.78|{color:green}11.1%{color}|
|united states|9.65|10.96|{color:green}13.6%{color}|
|+nebraska +states|43.93|54.38|{color:green}23.8%{color}|

64-bit MMap
||Query||QPS BulkVInt||QPS Simple64VarInt4||Pct diff||
|"united states"|8.99|8.41|{color:red}-6.4%{color}|
|states|38.21|36.16|{color:red}-5.4%{color}|
|spanFirst(unit, 5)|118.11|112.19|{color:red}-5.0%{color}|
|doctimesecnum:[10000 TO 60000]|10.78|10.35|{color:red}-4.0%{color}|
|spanNear([unit, state], 10, true)|33.78|32.51|{color:red}-3.7%{color}|
|"united states"~3|4.68|4.54|{color:red}-3.0%{color}|
|unit*|30.00|29.26|{color:red}-2.4%{color}|
|uni*|17.48|17.06|{color:red}-2.4%{color}|
|united states|11.60|11.35|{color:red}-2.1%{color}|
|+united +states|13.95|14.08|{color:green}1.0%{color}|
|united~0.75|10.76|10.87|{color:green}1.1%{color}|
|united~0.6|7.75|7.88|{color:green}1.7%{color}|
|un*d|17.16|17.66|{color:green}2.9%{color}|
|doctitle:.*[Uu]nited.*|3.85|3.98|{color:green}3.3%{color}|
|unit~0.7|27.00|28.08|{color:green}4.0%{color}|
|unit~0.5|16.64|17.46|{color:green}4.9%{color}|
|u*d|8.68|9.31|{color:green}7.2%{color}|
|+nebraska +states|83.30|96.53|{color:green}15.9%{color}|



> Adaptive Frame Of Reference 
> ----------------------------
>
>                 Key: LUCENE-2886
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2886
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Codecs
>            Reporter: Renaud Delbru
>             Fix For: 4.0
>
>         Attachments: LUCENE-2886.patch, LUCENE-2886.patch, LUCENE-2886.patch, 
> LUCENE-2886_simple64.patch, LUCENE-2886_simple64_varint.patch, 
> lucene-afor.tar.gz
>
>
> We could test the implementation of the Adaptive Frame Of Reference [1] on 
> the lucene-4.0 branch.
> I am providing the source code of its implementation. Some work needs to be 
> done, as this implementation currently works against the old lucene-1458 branch. 
> I will attach a tarball containing a running version (with tests) of the AFOR 
> implementation, as well as the implementations of PFOR and of Simple64 
> (a Simple-family codec working on 64-bit words) that were used in the 
> experiments in [1].
> [1] http://www.deri.ie/fileadmin/documents/deri-tr-afor.pdf

