[ https://issues.apache.org/jira/browse/LUCENE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291884#comment-13291884 ]
Michael McCandless commented on LUCENE-4123:
--------------------------------------------

Results for 5M doc index:

{noformat}
            Task  QPS base  StdDev base  QPS cached  StdDev cached     Pct diff
         Respell    104.06         7.63      108.59           7.55   -9% -  20%
     TermGroup1M     57.94         1.59       60.70           0.30    1% -   8%
    TermBGroup1M    103.28         2.54      108.51           2.54    0% -  10%
          Fuzzy2     43.07         2.96       45.32           3.06   -8% -  20%
          Fuzzy1     72.64         4.73       76.92           4.38   -6% -  19%
  TermBGroup1M1P     90.14         3.03       95.95           3.81   -1% -  14%
          IntNRQ     16.01         0.95       17.17           0.33    0% -  16%
        PKLookup     86.21         2.51       92.55           2.59    1% -  13%
        Wildcard     65.51         3.13       71.00           1.45    1% -  16%
       OrHighMed     21.64         1.83       23.56           1.24   -4% -  25%
         Prefix3    105.33         4.94      114.75           2.46    1% -  16%
      OrHighHigh     17.39         1.45       18.97           0.95   -4% -  24%
     AndHighHigh     30.05         1.14       33.42           0.88    4% -  18%
            Term    243.13         9.03      273.92           8.26    5% -  20%
    SloppyPhrase     15.80         0.28       17.84           0.78    6% -  19%
        SpanNear     10.52         0.14       11.97           0.25    9% -  17%
      AndHighMed    117.60         3.54      135.91           2.49   10% -  21%
          Phrase     20.15         0.78       24.22           0.26   14% -  26%
{noformat}

> Add CachingRAMDirectory
> -----------------------
>
>                 Key: LUCENE-4123
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4123
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-4123.patch
>
>
> The directory is very simple and useful if you have an index that you
> know fully fits into available RAM. You could also use FileSwitchDir if
> you want to leave some files (eg stored fields or term vectors) on disk.
>
> It wraps any other Directory and delegates all writing (IndexOutput) to
> it, but for reading (IndexInput), it allocates a single byte[] and fully
> reads the file in and then serves requests off that single byte[]. It's
> more GC friendly than RAMDir since it only allocates a single array per
> file.
>
> It has a few nocommits still, but all tests pass if I wrap the delegate
> inside MockDirectoryWrapper using this.
>
> I tested with 1M Wikipedia english index (would like to test w/ 10M docs
> but I don't have enough RAM...); it seems to give a nice speedup:
>
> {noformat}
>             Task  QPS base  StdDev base  QPS cached  StdDev cached     Pct diff
>          Respell    197.00         7.27      203.19           8.17   -4% -  11%
>         PKLookup    121.12         2.80      125.46           3.20   -1% -   8%
>           Fuzzy2     66.62         2.62       69.91           2.85   -3% -  13%
>           Fuzzy1    206.20         6.47      222.21           6.52    1% -  14%
>    TermGroup100K    160.14         6.62      175.71           3.79    3% -  16%
>           Phrase     34.85         0.40       38.75           0.61    8% -  14%
>   TermBGroup100K    363.75        15.74      406.98          13.23    3% -  20%
>         SpanNear     53.08         1.11       59.53           2.94    4% -  20%
> TermBGroup100K1P    222.53         9.78      252.86           5.96    6% -  21%
>     SloppyPhrase     70.36         2.05       79.95           4.48    4% -  23%
>         Wildcard    238.10         4.29      272.78           4.97   10% -  18%
>        OrHighMed    123.49         4.85      149.32           4.66   12% -  29%
>          Prefix3    288.46         8.10      350.40           5.38   16% -  26%
>       OrHighHigh     76.46         3.27       93.13           2.96   13% -  31%
>           IntNRQ     92.25         2.12      113.47           5.74   14% -  32%
>             Term    757.12        39.03      958.62          22.68   17% -  36%
>      AndHighHigh    103.03         4.48      133.89           3.76   21% -  39%
>       AndHighMed    376.36        16.58      493.99          10.00   23% -  40%
> {noformat}
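
For readers who want a concrete picture of the read path described in the issue (one byte[] per file, filled once, then all reads served from it), here is a minimal sketch in plain Java. It is not the code from LUCENE-4123.patch and does not use Lucene's Directory or IndexInput classes; the class name CachedFileInput and its methods are hypothetical, chosen only to mirror the shape of an input that supports readByte, readBytes and seek. It also assumes the file fits in one Java array (under 2 GB), which Files.readAllBytes requires.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Hypothetical illustration only: loads a whole file into a single byte[]
 * and serves all subsequent reads from that array.
 */
final class CachedFileInput {
    private final byte[] bytes; // the one array allocated for this file
    private int pos;            // current read position within the array

    CachedFileInput(Path file) throws IOException {
        // One allocation per file; fails for files too large for a Java array.
        this.bytes = Files.readAllBytes(file);
    }

    long length() {
        return bytes.length;
    }

    long getFilePointer() {
        return pos;
    }

    void seek(long newPos) {
        if (newPos < 0 || newPos > bytes.length) {
            throw new IllegalArgumentException("seek out of bounds: " + newPos);
        }
        pos = (int) newPos;
    }

    byte readByte() {
        // Throws ArrayIndexOutOfBoundsException when reading past the end.
        return bytes[pos++];
    }

    void readBytes(byte[] dest, int offset, int len) {
        // Bulk read is a plain array copy; no disk access after construction.
        System.arraycopy(bytes, pos, dest, offset, len);
        pos += len;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("cached-input", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3, 4, 5});
            CachedFileInput in = new CachedFileInput(tmp);
            in.seek(3);
            System.out.println("length=" + in.length() + " byteAt3=" + in.readByte());
        } finally {
            Files.delete(tmp);
        }
    }
}
```

A wrapper in the spirit of the issue would hand out an object like this from its open-for-read call and pass every write straight through to the wrapped directory. Keeping one array per file, rather than many small buffers, is the GC-friendliness point made in the description.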