[ https://issues.apache.org/jira/browse/LUCENE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431486#comment-13431486 ]
Michael McCandless commented on LUCENE-4283:
--------------------------------------------

Thanks Billy, patch looks good...

I also see some improvements in the skip heavy queries:

{noformat}
            Task   QPS base  StdDev base    QPS for  StdDev for     Pct diff
    HighSpanNear       1.70         0.05       1.66        0.02   -6% -   2%
        PKLookup     192.84         3.29     190.09        2.97   -4% -   1%
 MedSloppyPhrase       6.86         0.09       6.79        0.13   -4% -   2%
HighSloppyPhrase       1.97         0.04       1.96        0.08   -6% -   5%
     MedSpanNear       4.88         0.12       4.85        0.06   -4% -   3%
       OrHighMed      23.40         0.74      23.31        0.73   -6% -   6%
 LowSloppyPhrase       7.58         0.12       7.56        0.18   -4% -   3%
       OrHighLow      27.00         0.92      26.93        0.86   -6% -   6%
        Wildcard      52.66         0.43      52.54        0.32   -1% -   1%
         Prefix3      82.44         0.90      82.36        0.87   -2% -   2%
          IntNRQ      11.61         0.02      11.60        0.02    0% -   0%
         LowTerm     513.72         0.95     513.40        2.77    0% -   0%
      OrHighHigh      11.27         0.35      11.27        0.35   -6% -   6%
        HighTerm      36.10         0.07      36.10        0.03    0% -   0%
         MedTerm     198.76         0.26     198.85        0.23    0% -   0%
         Respell      61.52         1.12      61.88        0.36   -1% -   3%
          Fuzzy1      74.60         1.37      75.07        0.58   -1% -   3%
          Fuzzy2      62.36         1.33      63.09        0.33   -1% -   3%
     AndHighHigh      23.62         0.08      24.07        0.21    0% -   3%
     LowSpanNear       9.65         0.22       9.88        0.06    0% -   5%
       LowPhrase      22.08         0.37      22.63        0.31    0% -   5%
      HighPhrase       1.77         0.10       1.83        0.09   -6% -  14%
       MedPhrase      13.09         0.29      13.54        0.25    0% -   7%
      AndHighLow     662.00         1.45     700.98       24.76    1% -   9%
      AndHighMed      69.58         0.18      75.15        1.28    5% -  10%
{noformat}

> Support more frequent skip with Block Postings Format
> -----------------------------------------------------
>
>                 Key: LUCENE-4283
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4283
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Han Jiang
>            Priority: Minor
>         Attachments: LUCENE-4283-buggy.patch, LUCENE-4283-buggy.patch,
> LUCENE-4283-codes-cleanup.patch, LUCENE-4283-record-next-skip.patch,
> LUCENE-4283-record-skip&inlining-scanning.patch, LUCENE-4283-slow.patch,
> LUCENE-4283-small-interval-fully.patch,
> LUCENE-4283-small-interval-partially.patch
>
>
> This change works on the new bulk branch.
> Currently, our BlockPostingsFormat only supports skipInterval==blockSize.
> Every time the skipper reaches the last level 0 skip point, we'll have to
> decode a whole block to read doc/freq data. Also, a higher level skip list
> will be created only for terms with df>blockSize^k, which means that for most
> terms, skipping will just be a linear scan. If we increase the current
> blockSize for better bulk i/o performance, the current skip setting will be a
> bottleneck.
> For ForPF, the encoded block can easily be split if we set
> skipInterval=32*k.
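
The skip-level condition in the description (a higher level exists only when df>blockSize^k) can be sketched as below. This is a minimal illustration, not Lucene's code: the function name `skip_levels` and the block size of 128 are assumptions made for the example, and it assumes each skip level is blockSize times coarser than the one below it, as the description implies.

```python
def skip_levels(df: int, block_size: int) -> int:
    """Count the skip levels a term gets under the rule df > block_size**k.

    Hypothetical helper: level k-1 is assumed to exist only when the term's
    document frequency df exceeds block_size**k (k = 1, 2, ...).
    """
    if df < 0 or block_size < 2:
        raise ValueError("df must be >= 0 and block_size must be >= 2")
    levels = 0
    threshold = block_size
    while df > threshold:
        levels += 1
        threshold *= block_size  # next level covers block_size times more docs
    return levels


BLOCK_SIZE = 128  # assumed value, for illustration only

for df in (100, 1_000, 20_000):
    print(df, skip_levels(df, BLOCK_SIZE))
```

Under these assumed numbers, a term with 128 or fewer documents gets no skip data at all and is read by linear scan, and a second level appears only past 128^2 = 16384 documents. A larger blockSize pushes both thresholds up, which is the bottleneck the description points to.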