[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851372#action_12851372 ]
Michael McCandless commented on LUCENE-2111:
--------------------------------------------

Towards wrapping up flex, I ran a set of tests to benchmark flex's search performance vs trunk.

All tests are on a 5M doc Wikipedia index, best qps of 5 runs where each run runs the query for 5.0 seconds.

Env is:

{noformat}
JAVA:
java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01, mixed mode)

OS:
Linux centos 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST 2009 x86_64 x86_64 x86_64 GNU/Linux
{noformat}

First table compares trunk against "flex on flex", ie, a flex index (fully reindexed after upgrading to flex):

||Query||Tot hits||Sort||QPS trunk||QPS new||Pct change||
|1|591225| |68.36|80.64|{color:green}18.0%{color}|
| | |title|64.12|68.53|{color:green}6.9%{color}|
|1 OR 2|953081| |19.35|20.80|{color:green}7.5%{color}|
| | |title|16.50|17.48|{color:green}5.9%{color}|
|1 OR 2 OR 3|1131679| |14.37|15.50|{color:green}7.9%{color}|
| | |title|12.42|13.26|{color:green}6.8%{color}|
|1 OR 2 OR 3 OR 4|1266805| |10.94|12.76|{color:green}16.6%{color}|
| | |title|10.36|11.05|{color:green}6.7%{color}|
|1 AND 2|239303| |21.19|22.32|{color:green}5.3%{color}|
| | |title|22.77|24.25|{color:green}6.5%{color}|
|1 AND 2 AND 3|109513| |18.83|19.17|{color:green}1.8%{color}|
| | |title|19.30|20.06|{color:green}3.9%{color}|
|1 AND 2 AND 3 AND 4|60795| |16.21|17.51|{color:green}8.0%{color}|
| | |title|16.75|18.29|{color:green}9.2%{color}|
|"united states"|528845| |7.54|8.54|{color:green}13.3%{color}|
| | |title|7.36|8.14|{color:green}10.6%{color}|
|"united states of america"|12144| |20.64|21.48|{color:green}4.1%{color}|
| | |title|20.45|21.06|{color:green}3.0%{color}|
|un*|2250238| |9.31|11.54|{color:green}24.0%{color}|
| | |title|8.42|10.96|{color:green}30.2%{color}|
|*ent|2482701| |0.32|0.92|{color:green}187.5%{color}|
| | |title|0.32|0.91|{color:green}184.4%{color}|
|u*t|169192| |18.53|47.97|{color:green}158.9%{color}|
| | |title|17.26|40.10|{color:green}132.3%{color}|
|uni*|1308332| |18.54|23.49|{color:green}26.7%{color}|
| | |title|16.28|20.02|{color:green}23.0%{color}|
|un*t|124623| |62.13|105.23|{color:green}69.4%{color}|
| | |title|50.38|74.99|{color:green}48.8%{color}|
|?t|554722| |0.51|29.31|{color:green}5647.1%{color}|
| | |title|0.51|26.25|{color:green}5047.1%{color}|
|??t|1605437| |0.60|6.69|{color:green}1015.0%{color}|
| | |title|0.60|6.22|{color:green}936.7%{color}|
|???t|3100067| |0.54|1.92|{color:green}255.6%{color}|
| | |title|0.53|1.89|{color:green}256.6%{color}|
|????t|2973045| |0.51|0.71|{color:green}39.2%{color}|
| | |title|0.51|0.70|{color:green}37.3%{color}|
|?????t|2323871| |0.51|0.39|{color:red}-23.5%{color}|
| | |title|0.50|0.39|{color:red}-22.0%{color}|
|??????t|2459025| |0.49|0.31|{color:red}-36.7%{color}|
| | |title|0.48|0.15|{color:red}-68.7%{color}|
|un?t|86664| |92.45|241.46|{color:green}161.2%{color}|
| | |title|72.59|151.28|{color:green}108.4%{color}|
|un??t|2860| |222.11|408.52|{color:green}83.9%{color}|
| | |title|220.91|405.84|{color:green}83.7%{color}|
|un???t|5828| |117.38|99.64|{color:red}-15.1%{color}|
| | |title|111.47|98.64|{color:red}-11.5%{color}|
|un????t|1426| |207.03|100.60|{color:red}-51.4%{color}|
| | |title|207.23|101.36|{color:red}-51.1%{color}|
|united~0.5|872873| |0.35|0.31|{color:red}-11.4%{color}|
| | |title|0.35|0.31|{color:red}-11.4%{color}|
|united~0.6|764041| |0.46|5.22|{color:green}1034.8%{color}|
| | |title|0.45|5.00|{color:green}1011.1%{color}|
|united~0.7|695756| |0.59|21.19|{color:green}3491.5%{color}|
| | |title|0.60|19.10|{color:green}3083.3%{color}|
|united~0.8|693134| |0.59|21.44|{color:green}3533.9%{color}|
| | |title|0.58|19.55|{color:green}3270.7%{color}|
|united~0.9|692299| |57.06|67.80|{color:green}18.8%{color}|
| | |title|55.28|57.87|{color:green}4.7%{color}|

I also ran the same queries through, but this time using the trunk (pre-flex) index with flex, ie to perf test the "flex on pre-flex" emulation layer.
This is the initial experience users will see if they upgrade to flex but don't reindex:

||Query||Tot hits||Sort||QPS trunk||QPS new||Pct change||
|1|591225| |68.36|66.91|{color:red}-2.1%{color}|
| | |title|64.12|58.47|{color:red}-8.8%{color}|
|1 OR 2|953081| |19.35|19.06|{color:red}-1.5%{color}|
| | |title|16.50|16.03|{color:red}-2.8%{color}|
|1 OR 2 OR 3|1131679| |14.37|14.14|{color:red}-1.6%{color}|
| | |title|12.42|12.11|{color:red}-2.5%{color}|
|1 OR 2 OR 3 OR 4|1266805| |10.94|11.61|{color:green}6.1%{color}|
| | |title|10.36|10.04|{color:red}-3.1%{color}|
|1 AND 2|239303| |21.19|21.12|{color:red}-0.3%{color}|
| | |title|22.77|22.46|{color:red}-1.4%{color}|
|1 AND 2 AND 3|109513| |18.83|18.81|{color:red}-0.1%{color}|
| | |title|19.30|19.29|{color:red}-0.1%{color}|
|1 AND 2 AND 3 AND 4|60795| |16.21|17.18|{color:green}6.0%{color}|
| | |title|16.75|17.46|{color:green}4.2%{color}|
|"united states"|528845| |7.54|7.63|{color:green}1.2%{color}|
| | |title|7.36|7.12|{color:red}-3.3%{color}|
|"united states of america"|12144| |20.64|19.33|{color:red}-6.3%{color}|
| | |title|20.45|19.50|{color:red}-4.6%{color}|
|un*|2250238| |9.31|9.79|{color:green}5.2%{color}|
| | |title|8.42|9.65|{color:green}14.6%{color}|
|*ent|2482701| |0.32|0.45|{color:green}40.6%{color}|
| | |title|0.32|0.45|{color:green}40.6%{color}|
|u*t|169192| |18.53|24.75|{color:green}33.6%{color}|
| | |title|17.26|21.96|{color:green}27.2%{color}|
|uni*|1308332| |18.54|19.39|{color:green}4.6%{color}|
| | |title|16.28|15.86|{color:red}-2.6%{color}|
|un*t|124623| |62.13|59.73|{color:red}-3.9%{color}|
| | |title|50.38|48.51|{color:red}-3.7%{color}|
|?t|554722| |0.51|23.65|{color:green}4537.3%{color}|
| | |title|0.51|21.42|{color:green}4100.0%{color}|
|??t|1605437| |0.60|5.13|{color:green}755.0%{color}|
| | |title|0.60|4.61|{color:green}668.3%{color}|
|???t|3100067| |0.54|1.28|{color:green}137.0%{color}|
| | |title|0.53|1.24|{color:green}134.0%{color}|
|????t|2973045| |0.51|0.55|{color:green}7.8%{color}|
| | |title|0.51|0.54|{color:green}5.9%{color}|
|?????t|2323871| |0.51|0.29|{color:red}-43.1%{color}|
| | |title|0.50|0.29|{color:red}-42.0%{color}|
|??????t|2459025| |0.49|0.18|{color:red}-63.3%{color}|
| | |title|0.48|0.21|{color:red}-56.2%{color}|
|un?t|86664| |92.45|202.48|{color:green}119.0%{color}|
| | |title|72.59|134.55|{color:green}85.4%{color}|
|un??t|2860| |222.11|187.05|{color:red}-15.8%{color}|
| | |title|220.91|186.81|{color:red}-15.4%{color}|
|un???t|5828| |117.38|69.30|{color:red}-41.0%{color}|
| | |title|111.47|68.59|{color:red}-38.5%{color}|
|un????t|1426| |207.03|60.98|{color:red}-70.5%{color}|
| | |title|207.23|60.62|{color:red}-70.7%{color}|
|united~0.5|872873| |0.35|0.23|{color:red}-34.3%{color}|
| | |title|0.35|0.23|{color:red}-34.3%{color}|
|united~0.6|764041| |0.46|3.84|{color:green}734.8%{color}|
| | |title|0.45|3.76|{color:green}735.6%{color}|
|united~0.7|695756| |0.59|17.45|{color:green}2857.6%{color}|
| | |title|0.60|15.53|{color:green}2488.3%{color}|
|united~0.8|693134| |0.59|17.56|{color:green}2876.3%{color}|
| | |title|0.58|15.97|{color:green}2653.4%{color}|
|united~0.9|692299| |57.06|56.02|{color:red}-1.8%{color}|
| | |title|55.28|49.26|{color:red}-10.9%{color}|

There are a lot of numbers to absorb... but here's my take:

* Flex is generally faster.
* Fuzzy queries and certain wildcard queries (using AutomatonQuery) are insanely faster.
* There are certain specific wildcard corner cases where we are slower, but these are likely rarely used in practice (many ?'s followed by a suffix).
* Flex API on a trunk index does take a perf hit but it looks contained enough that we don't need to spend any time optimizing that emulation layer...

I also ran an indexing test (index first 10M docs of wikipedia) and flex and trunk had similar times.

I think net/net we are good to land flex!
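For anyone re-checking the tables: the "Pct change" column is consistent with the relative QPS difference against trunk, 100 * (new - trunk) / trunk. Below is a minimal sketch under that assumption; the function name `pct_change` is hypothetical and is not taken from benchUtil.py or flexBench.py. The sample rows are copied from the first ("flex on flex") table.

```python
def pct_change(qps_trunk, qps_new):
    """Relative change of qps_new versus qps_trunk, in percent."""
    return 100.0 * (qps_new - qps_trunk) / qps_trunk


# (query, QPS trunk, QPS new) taken from the "flex on flex" table
rows = [
    ("1", 68.36, 80.64),
    ("?t", 0.51, 29.31),
    ("un????t", 207.03, 100.60),
]

for query, trunk, new in rows:
    # Positive means flex is faster than trunk, negative means slower.
    print(f"{query}: {pct_change(trunk, new):+.1f}%")
```

Rounded to one decimal place, these three rows reproduce the 18.0%, 5647.1% and -51.4% entries in the table.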
> Wrapup flexible indexing
> ------------------------
>
> Key: LUCENE-2111
> URL: https://issues.apache.org/jira/browse/LUCENE-2111
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: Flex Branch
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 3.1
>
> Attachments: benchUtil.py, flex_backwards_merge_912395.patch,
> flex_merge_916543.patch, flexBench.py, LUCENE-2111-EmptyTermsEnum.patch,
> LUCENE-2111-EmptyTermsEnum.patch, LUCENE-2111.patch, LUCENE-2111.patch,
> LUCENE-2111.patch, LUCENE-2111.patch, LUCENE-2111.patch, LUCENE-2111.patch,
> LUCENE-2111.patch, LUCENE-2111.patch, LUCENE-2111.patch, LUCENE-2111.patch,
> LUCENE-2111.patch, LUCENE-2111.patch, LUCENE-2111.patch, LUCENE-2111.patch,
> LUCENE-2111.patch, LUCENE-2111.patch, LUCENE-2111.patch, LUCENE-2111.patch,
> LUCENE-2111.patch, LUCENE-2111.patch, LUCENE-2111.patch, LUCENE-2111.patch,
> LUCENE-2111.patch, LUCENE-2111_bytesRef.patch,
> LUCENE-2111_experimental.patch, LUCENE-2111_fuzzy.patch,
> LUCENE-2111_mtqNull.patch, LUCENE-2111_mtqTest.patch,
> LUCENE-2111_toString.patch
>
>
> Spinoff from LUCENE-1458.
> The flex branch is in fairly good shape -- all tests pass, initial search
> performance testing looks good, it survived several visits from the Unicode
> policeman ;)
> But it still has a number of nocommits, could use some more scrutiny
> especially on the "emulate old API on flex index" and vice/versa code paths,
> and still needs some more performance testing. I'll do these under this
> issue, and we should open separate issues for other self contained fixes.
> The end is in sight!