I posted this to the lucene-user list a few days ago, to a resounding silence, so I thought I'd try my luck here on the dev list. ;-)
Cheers,
Paul.

The problem I have is that when I try to execute an optimize on my Lucene index I get the following error thrown (see below).

If anyone can help, and the answer requires some digging, then I have the very index tarred and gzipped for anon FTP access at ftp.catalyst.net.nz (in the "pub" sub-directory). This is 462 MB, and unpacks to roughly twice that size. There is also a README file there.

Here is the error I get very quickly when optimize runs:

--- CUT ---
java.lang.ArrayIndexOutOfBoundsException: 111 >= 23
    at java.util.Vector.elementAt(Vector.java(Compiled Code))
    at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java(Compiled Code))
    at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java(Compiled Code))
    at org.apache.lucene.index.SegmentReader.document(SegmentReader.java(Compiled Code))
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java(Compiled Code))
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:92)
    at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:473)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:354)
    at nz.net.catalyst.lucene.server.Optimize.execute(Optimize.java:80)
    at nz.net.catalyst.lucene.server.Control.optimize(Control.java:87)
    at nz.net.catalyst.lucene.server.Control.execute(Control.java:49)
    at nz.net.catalyst.lucene.server.Dialogue.process(Dialogue.java:111)
    at nz.net.catalyst.lucene.server.Session.communicate(Session.java:125)
    at nz.net.catalyst.SocketClient.run(SocketClient.java:70)
    at java.lang.Thread.run(Thread.java:512)
--- CUT ---

This was actually thrown by Lucene v1.4-rc2, which I was testing to see if it solved my problem. I am currently running v1.3-Final on my live site and this does the same thing.

This is running on Debian Linux, Woody, and is using the IBM Runtime Environment for Linux Java(TM) 2 Technology Edition, Version 1.3.1, JRE.
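As a small aid to reading the trace: the text "111 >= 23" is the message format that java.util.Vector.elementAt uses when asked for an index at or beyond the vector's current size. The sketch below reproduces that message with plain JDK classes and no Lucene. The class name, method name, and the "field" strings are made up for illustration. The idea that the Vector holds the segment's field definitions, and that 111 is a field number read from a stored document, is only an assumption drawn from the FieldsReader.doc frame, not something confirmed here.

```java
import java.util.Vector;

public class VectorBoundsDemo {

    // Builds a Vector with `size` entries and asks for element `index`.
    // Returns the exception message if the lookup is out of range, else null.
    // (Hypothetical helper; the entries merely stand in for field definitions.)
    static String lookupMessage(int size, int index) {
        Vector<String> fields = new Vector<>();
        for (int i = 0; i < size; i++) {
            fields.add("field" + i);
        }
        try {
            fields.elementAt(index);
            return null; // index was in range, nothing thrown
        } catch (ArrayIndexOutOfBoundsException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Same numbers as in the trace: index 111 requested, 23 entries present.
        System.out.println(lookupMessage(23, 111)); // prints "111 >= 23"
    }
}
```

If that assumption holds, the merge is reading a stored document that refers to field number 111 in a segment that knows of only 23 fields, which would suggest the stored data and the field list in that segment disagree.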
It should be noted that I have had this problem before, and I solved it by completely re-indexing the article set from scratch (starting with no index at all). After that process, the optimize worked fine. Then, somewhere along the line of many days of indexing new articles and doing an optimize every day at about 3.30am, the problem returned.

The articles being indexed are all homogeneous in terms of fields being indexed, details below:

FIELD DEFINITIONS

Field name     Field type   Stored?      Indexed?
----------     ----------   -------      --------
Domain         Text         STORED       INDEXED
Id             Id           STORED       INDEXED
date           Date         STORED       INDEXED
datetime       Date         STORED       INDEXED
added          Date         STORED       INDEXED
category       Text         STORED       INDEXED
subcategory    Text         STORED       INDEXED
source         Text         STORED       INDEXED
title          Text         STORED       NOT INDEXED
slug           Text         STORED       NOT INDEXED
type           Text         STORED       NOT INDEXED
sourcetype     Text         NOT STORED   INDEXED

Any help greatly appreciated.

Cheers,
Paul.