The problem I have is that when I try to execute an optimize on my Lucene
index I get the following error thrown (see below).

If anyone can help, and the answer requires some digging, then I have the
very index tarred and gzipped for anon FTP access at ftp.catalyst.net.nz (in
the "pub" sub-directory). This is 462Mb, and unpacks to roughly twice that
size. There is also a README file there.

Here is the error I get very quickly when optimize runs:

--- CUT ---
java.lang.ArrayIndexOutOfBoundsException: 111 >= 23
� � � � at java.util.Vector.elementAt(Vector.java(Compiled Code))
� � � � at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java(Compiled 
Code))
� � � � at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java(Compiled 
Code))
� � � � at 
org.apache.lucene.index.SegmentReader.document(SegmentReader.java(Compiled 
Code))
� � � � at 
org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java(Compiled 
Code))
� � � � at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:92)
� � � � at 
org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:473)
� � � � at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:354)
� � � � at nz.net.catalyst.lucene.server.Optimize.execute(Optimize.java:80)
� � � � at nz.net.catalyst.lucene.server.Control.optimize(Control.java:87)
� � � � at nz.net.catalyst.lucene.server.Control.execute(Control.java:49)
� � � � at nz.net.catalyst.lucene.server.Dialogue.process(Dialogue.java:111)
� � � � at nz.net.catalyst.lucene.server.Session.communicate(Session.java:125)
� � � � at nz.net.catalyst.SocketClient.run(SocketClient.java:70)
� � � � at java.lang.Thread.run(Thread.java:512)
--- CUT ---

This was actually thrown by Lucene v1.4-rc2, which I was testing to see if it
solved my problem. I am currently running v1.3-Final on my live site and this 
does the same thing. This is running on Debian Linux, Woody, and is using the
IBM Runtime Environment for Linux Java(TM) 2 Technology Edition, Version
1.3.1, JRE.

It should be noted that I have had this problem before, and I solved it by
completely re-indexing the article set from scratch (starting with no index at
all). After that process, the optimize worked fine. Then somewhere along the
line of many days indexing new articles, and doing an optimise every day at
about 3.30am, the problem has returned.

The articles being indexed are all homogeneous in terms of fields being 
indexed, details below:

FIELD DEFINITIONS
Field name � � �Field type � � �Stored? � � � � Indexed?
---------- � � �---------- � � �------- � � � � --------
Domain � � � � �Text � � � � � �STORED � � � � �INDEXED
Id � � � � � � �Id � � � � � � �STORED � � � � �INDEXED
date � � � � � �Date � � � � � �STORED � � � � �INDEXED
datetime � � � �Date � � � � � �STORED � � � � �INDEXED
added � � � � � Date � � � � � �STORED � � � � �INDEXED
category � � � �Text � � � � � �STORED � � � � �INDEXED
subcategory � � Text � � � � � �STORED � � � � �INDEXED
source � � � � �Text � � � � � �STORED � � � � �INDEXED
title � � � � � Text � � � � � �STORED � � � � �NOT INDEXED
slug � � � � � �Text � � � � � �STORED � � � � �NOT INDEXED
type � � � � � �Text � � � � � �STORED � � � � �NOT INDEXED
sourcetype � � �Text � � � � � �NOT STORED � � �INDEXED


Any help greatly appreciated.

Cheers,
Paul.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to