[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1476:
---------------------------------------

    Attachment: sortCollate2.py
                sortBench2.py


I'm attaching the Python code I use to run the tests (it's adapted from
LUCENE-1483).  You also need the following nocommit diff applied:

{code}
Index: contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ReadTask.java
===================================================================
--- contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ReadTask.java	(revision 738896)
+++ contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ReadTask.java	(working copy)
@@ -62,6 +62,9 @@
     super(runData);
   }
 
+  // nocommit
+  static boolean first = true;
+
   public int doLogic() throws Exception {
     int res = 0;
     boolean closeReader = false;
@@ -102,6 +105,12 @@
         }
         //System.out.println("q=" + q + ":" + hits.totalHits + " total hits"); 
 
+        // nocommit
+        if (first) {
+          System.out.println("NUMHITS=" + hits.totalHits);
+          first = false;
+        }
+
         if (withTraverse()) {
           final ScoreDoc[] scoreDocs = hits.scoreDocs;
           int traversalSize = Math.min(scoreDocs.length, traversalSize());
{code}

Just run sortBench2.py in contrib/benchmark of both the trunk and patch areas.
Then run sortCollate2.py to make the Jira table (-jira) or print human-readable
output (the default).  You'll have to build your own Wikipedia indices with the
various percentages of deletes, then edit sortBench2.py & sortCollate2.py to
fix the paths.

All the scripts do is write an alg file, run the test, and parse the output
file to take the best of 5 runs.
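
For anyone who wants the gist without opening the attachments, here is a rough
sketch of the parsing step.  It is not the code in sortBench2.py; the helper
names (parse_num_hits, best_of) are made up for illustration, and it assumes
"best" means the smallest elapsed time.  The only thing taken from the patch
above is the NUMHITS=<n> line format.

```python
import re

# Matches the line printed by the nocommit patch: "NUMHITS=" + hits.totalHits
NUMHITS_RE = re.compile(r"^NUMHITS=(\d+)\s*$", re.MULTILINE)


def parse_num_hits(output):
    """Return the hit count from the first NUMHITS= line, or None if absent."""
    m = NUMHITS_RE.search(output)
    return int(m.group(1)) if m else None


def best_of(timings):
    """Return the best run, assumed here to be the smallest elapsed time."""
    if not timings:
        raise ValueError("no timings collected")
    return min(timings)


if __name__ == "__main__":
    # Hypothetical captured output; the real benchmark log looks different.
    sample = "starting task\nNUMHITS=1234\ndone\n"
    print(parse_num_hits(sample))
    print(best_of([3.2, 2.9, 3.0, 3.1, 2.95]))
```

Taking the minimum over repeated runs is a common way to damp noise from GC
and OS caching; how each run's time is pulled out of the benchmark report is
left out here because it depends on the report format.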

> BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-1476
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1476
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Priority: Trivial
>         Attachments: LUCENE-1476.patch, LUCENE-1476.patch, LUCENE-1476.patch, 
> LUCENE-1476.patch, LUCENE-1476.patch, quasi_iterator_deletions.diff, 
> quasi_iterator_deletions_r2.diff, searchdeletes.alg, sortBench2.py, 
> sortCollate2.py, TestDeletesDocIdSet.java
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> Update BitVector to implement DocIdSet.  Expose deleted docs DocIdSet from 
> IndexReader.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
