[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-1476:
---------------------------------------
Attachment: sortCollate2.py
sortBench2.py
I'm attaching the Python code I use to run the benchmarks (it's adapted from LUCENE-1483).
You also need the following nocommit diff applied:
{code}
Index: contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ReadTask.java
===================================================================
--- contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ReadTask.java	(revision 738896)
+++ contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ReadTask.java	(working copy)
@@ -62,6 +62,9 @@
super(runData);
}
+ // nocommit
+ static boolean first = true;
+
public int doLogic() throws Exception {
int res = 0;
boolean closeReader = false;
@@ -102,6 +105,12 @@
}
//System.out.println("q=" + q + ":" + hits.totalHits + " total hits");
+ // nocommit
+ if (first) {
+ System.out.println("NUMHITS=" + hits.totalHits);
+ first = false;
+ }
+
if (withTraverse()) {
final ScoreDoc[] scoreDocs = hits.scoreDocs;
int traversalSize = Math.min(scoreDocs.length, traversalSize());
{code}
Just run sortBench2.py in contrib/benchmark of both the trunk and patch areas.
Then run sortCollate2.py to make the Jira table (-jira) or print human-readable
output (the default). You'll have to build your own Wikipedia indices with the
desired percentages of deletes, then edit sortBench2.py and sortCollate2.py to
fix the paths.
All they do is write an alg file, run the test, and parse the output file to
gather the best of 5 runs.
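For anyone who wants the gist without opening the attachments, here is a minimal sketch of the parsing step. It is not the code in sortBench2.py or sortCollate2.py. The function names are hypothetical, "best" is assumed to mean the smallest elapsed time, and the only format taken from this message is the NUMHITS= line printed by the nocommit diff above.

```python
import re

# Matches the line printed by the nocommit diff: NUMHITS=<integer>
NUMHITS_RE = re.compile(r"^NUMHITS=(\d+)", re.MULTILINE)


def parse_num_hits(output):
    """Return the hit count from the first NUMHITS= line, or None if absent."""
    match = NUMHITS_RE.search(output)
    return int(match.group(1)) if match else None


def best_of(timings):
    """Return the best run, assuming 'best' means the smallest elapsed time."""
    if not timings:
        raise ValueError("need at least one timing")
    return min(timings)


if __name__ == "__main__":
    # Made-up sample; only the NUMHITS= line format comes from the diff.
    sample_output = "some benchmark log line\nNUMHITS=1234\nanother log line\n"
    sample_timings = [2.31, 2.18, 2.25, 2.40, 2.19]  # hypothetical seconds, 5 runs
    print(parse_num_hits(sample_output), best_of(sample_timings))
```

Checking the NUMHITS value on each run is a cheap way to confirm that trunk and patch return the same hit count before comparing their timings.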
> BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs
> -----------------------------------------------------------------------
>
> Key: LUCENE-1476
> URL: https://issues.apache.org/jira/browse/LUCENE-1476
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 2.4
> Reporter: Jason Rutherglen
> Priority: Trivial
> Attachments: LUCENE-1476.patch, LUCENE-1476.patch, LUCENE-1476.patch,
> LUCENE-1476.patch, LUCENE-1476.patch, quasi_iterator_deletions.diff,
> quasi_iterator_deletions_r2.diff, searchdeletes.alg, sortBench2.py,
> sortCollate2.py, TestDeletesDocIdSet.java
>
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> Update BitVector to implement DocIdSet. Expose deleted docs DocIdSet from
> IndexReader.