I applied your patch, but the result is the same error message...

Michael Nebel wrote:

Hi,

I fixed the problem with the following patch:

--- IndexOptimizer.java 2005-08-04 12:55:54.000000000 +0200
+++ IndexOptimizer.java.~1.6.~  2005-01-21 00:48:50.000000000 +0100
@@ -138,7 +138,7 @@

         if (score > minScore) {
           sdq.put(new ScoreDoc(doc, score));
-          if (sdq.size() >= count) {               // if sdq overfull
+          if (sdq.size() > count) {               // if sdq overfull
             sdq.pop();                            // remove lowest in sdq
             minScore = ((ScoreDoc)sdq.top()).score; // reset minScore
           }

My index shrank from 8.5 GB to 0.5 GB. I found no documentation on the background of this tool. Can anyone tell me what the idea behind it is?

Regards

    Michael



Andy Liu wrote:

I believe this tool is unfinished and unsupported.

On 7/22/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

I found an IndexOptimizer in Nutch.
When I run it, it throws an exception:
....
Optimizing url:http from 226957 to 22696
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 22697
       at org.apache.lucene.util.PriorityQueue.put(PriorityQueue.java:46)
       at org.apache.nutch.indexer.IndexOptimizer$OptimizingTermPositions.seek(IndexOptimizer.java:153)
       at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:325)
       at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:296)
       at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:270)
       at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:234)
       at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:96)
       at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:578)
       at org.apache.nutch.indexer.IndexOptimizer.optimize(IndexOptimizer.java:215)
       at org.apache.nutch.indexer.IndexOptimizer.main(IndexOptimizer.java:235)
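
The off-by-one behind this exception can be illustrated with a minimal sketch. This is not the Nutch or Lucene source: it assumes a fixed-capacity heap backed by an array of count + 1 slots (slot 0 unused), which is consistent with the reported index 22697 for a target of 22696. FixedMinHeap, collect, and the strict flag are hypothetical names made up for the illustration. With the "> count" check the queue is allowed to fill completely, so the next put writes one slot past the end of the array. With ">= count" the queue is trimmed before that can happen, and in this sketch it then retains at most count - 1 entries. The diff quoted in this thread lists the modified file first, so the line marked "-" (the ">=" check) appears to be the patched version.

```java
public class BoundedQueueSketch {

    /** Hypothetical fixed-capacity min-heap of scores; slot 0 is unused (1-based heap). */
    static final class FixedMinHeap {
        private final float[] heap;
        private int size = 0;

        FixedMinHeap(int maxSize) {
            heap = new float[maxSize + 1]; // valid slots are 1..maxSize
        }

        void put(float value) {
            size++;
            heap[size] = value; // ArrayIndexOutOfBoundsException once size exceeds maxSize
            int i = size;
            while (i > 1 && heap[i] < heap[i / 2]) { // sift up
                swap(i, i / 2);
                i /= 2;
            }
        }

        float top() {
            return heap[1];
        }

        float pop() {
            float result = heap[1];
            heap[1] = heap[size];
            size--;
            int i = 1;
            while (true) { // sift down
                int left = 2 * i, right = left + 1, smallest = i;
                if (left <= size && heap[left] < heap[smallest]) smallest = left;
                if (right <= size && heap[right] < heap[smallest]) smallest = right;
                if (smallest == i) break;
                swap(i, smallest);
                i = smallest;
            }
            return result;
        }

        int size() {
            return size;
        }

        private void swap(int a, int b) {
            float t = heap[a];
            heap[a] = heap[b];
            heap[b] = t;
        }
    }

    /**
     * Runs the put-then-trim loop over the scores and returns the final queue size.
     * strict == true uses "size > count"; strict == false uses "size >= count".
     */
    static int collect(float[] scores, int count, boolean strict) {
        FixedMinHeap sdq = new FixedMinHeap(count);
        float minScore = 0.0f;
        for (float score : scores) {
            if (score > minScore) {
                sdq.put(score);
                boolean overfull = strict ? sdq.size() > count : sdq.size() >= count;
                if (overfull) {
                    sdq.pop();              // remove lowest in sdq
                    minScore = sdq.top();   // reset minScore
                }
            }
        }
        return sdq.size();
    }

    public static void main(String[] args) {
        float[] scores = {1f, 2f, 3f, 4f, 5f};
        System.out.println("'>=' check keeps " + collect(scores, 3, false) + " entries");
        try {
            collect(scores, 3, true);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("'>' check overflowed the backing array");
        }
    }
}
```

If keeping exactly count entries matters, an alternative under the same assumptions would be to size the queue at count + 1 and keep the "> count" check. Whether that matches the intent of IndexOptimizer is not something this thread establishes.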






_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
