[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Rutherglen updated LUCENE-1476:
-------------------------------------

    Attachment: searchdeletes.alg

searchdeletes.alg uses reuters, deletes many docs, then performs searches. If it's working properly, iteration rather than calling BitVector.get has some serious performance drawbacks.

Compare:
DocIdSet   SrchNewRdr_8: 32.0 rec/s
DelDoc.get SrchNewRdr_8: 2,959.5 rec/s

Next step is running JProfiler. Perhaps BitVector needs to be replaced by OpenBitSet for iterating, or there's some other issue.

BitVector.get:
     [java] Operation          round  mrg  buf  runCnt  recsPerRun      rec/s  elapsedSec  avgUsedMem  avgTotalMem
     [java] CreateIndex            0   10  100       1           1       17.2        0.06   3,953,984    9,072,640
     [java] CloseIndex             0   10  100       1           1    1,000.0        0.00   3,953,984    9,072,640
     [java] Populate               0   10  100       1      200003    6,539.7       30.58   8,665,528   10,420,224
     [java] Deletions              0   10  100       1        8002  533,466.7        0.01   8,665,528   10,420,224
     [java] OpenReader(false)      0   10  100       1           1    1,000.0        0.00   8,691,040   10,420,224
     [java] Seq_8000               0   10  100       1        8000  800,000.0        0.01   8,833,912   10,420,224
     [java] CloseReader            0   10  100       9           1    2,250.0        0.00   7,672,217   10,420,224
     [java] SrchNewRdr_8           0   10  100       1        4016    2,959.5        1.36   8,232,384   10,420,224
     [java] OpenReader             0   10  100       8           1    1,333.3        0.01   7,579,584   10,420,224
     [java] Seq_500                0   10  100       8         500    2,963.0        1.35   8,591,199   10,420,224

DocIdSet:
     [java] Operation          round  mrg  buf  runCnt  recsPerRun      rec/s  elapsedSec  avgUsedMem  avgTotalMem
     [java] CreateIndex            0   10  100       1           1       17.5        0.06   3,954,376    9,076,736
     [java] CloseIndex             0   10  100       1           1    1,000.0        0.00   3,954,376    9,076,736
     [java] Populate               0   10  100       1      200003    6,503.1       30.75   5,951,816   10,321,920
     [java] Deletions              0   10  100       1        8002  500,125.0        0.02   6,190,816   10,321,920
     [java] OpenReader(false)      0   10  100       1           1    1,000.0        0.00   5,976,960   10,321,920
     [java] Seq_8000               0   10  100       1        8000  727,272.8        0.01   6,122,904   10,321,920
     [java] CloseReader            0   10  100       9           1    3,000.0        0.00   7,727,980   10,321,920
     [java] SrchNewRdr_8           0   10  100       1        4016       32.0      125.67   7,960,824   10,321,920
     [java] OpenReader             0   10  100       8           1    1,333.3        0.01   7,742,855   10,321,920
     [java] Seq_500                0   10  100       8         500       31.8      125.66   8,744,057   10,321,920

> BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs
> -----------------------------------------------------------------------
>
> Key: LUCENE-1476
> URL: https://issues.apache.org/jira/browse/LUCENE-1476
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 2.4
> Reporter: Jason Rutherglen
> Priority: Trivial
> Attachments: LUCENE-1476.patch, LUCENE-1476.patch, LUCENE-1476.patch,
>              quasi_iterator_deletions.diff, quasi_iterator_deletions_r2.diff,
>              searchdeletes.alg
>
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> Update BitVector to implement DocIdSet. Expose deleted docs DocIdSet from
> IndexReader.
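
For readers comparing the two result sets, here is a minimal sketch of the two access patterns being contrasted: testing each document with a random-access get call versus walking the deleted docs in order with an iterator-style cursor. It uses java.util.BitSet as a stand-in and is not Lucene's BitVector, DocIdSet, or the code in the attached patches; the class and method names are hypothetical. It only illustrates the shape of the two loops and makes no claim about why the DocIdSet run was slower.

```java
import java.util.BitSet;

// Hypothetical illustration; java.util.BitSet stands in for a deleted-docs bit set.
final class DeletedDocsSketch {

    // Random access: one get(doc) call per candidate document.
    static int countLiveByGet(BitSet deleted, int maxDoc) {
        int live = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (!deleted.get(doc)) {
                live++;
            }
        }
        return live;
    }

    // Iteration: keep a cursor on the next deleted doc and advance it in step
    // with the candidate documents, instead of probing the bit set per doc.
    static int countLiveByIteration(BitSet deleted, int maxDoc) {
        int live = 0;
        int nextDeleted = deleted.nextSetBit(0); // -1 once no deleted docs remain
        for (int doc = 0; doc < maxDoc; doc++) {
            if (doc == nextDeleted) {
                nextDeleted = deleted.nextSetBit(doc + 1);
            } else {
                live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(1);
        deleted.set(3);
        deleted.set(4);
        // Docs 0..7 with three deletions leave five live docs under both strategies.
        System.out.println(countLiveByGet(deleted, 8));
        System.out.println(countLiveByIteration(deleted, 8));
    }
}
```

Both methods return the same count for the same input; only the way the deleted set is consulted differs, which is the variable the two benchmark runs are meant to isolate.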