I did see that bug, which made me suspect Lucene, but in my case I tracked the problem down to my own application. I was using Java's FileChannel.transferTo method to copy my index from one location to another. One of the files is bigger than 2^31-1 bytes, and because I was copying in a single pass, that file was corrupted during the copy (transferTo can transfer fewer bytes than requested). I now loop the copy until the entire file is transferred, and everything works fine.
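For anyone hitting the same thing, a minimal sketch of the looping copy (names like IndexCopy/copyFile are just illustrative, not from my actual code):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

public class IndexCopy {
    /**
     * Copies src to dst, looping because a single transferTo call may
     * move fewer bytes than requested (notably for files >= 2^31-1 bytes).
     */
    static void copyFile(File src, File dst) throws IOException {
        FileChannel in = new FileInputStream(src).getChannel();
        FileChannel out = new FileOutputStream(dst).getChannel();
        try {
            long size = in.size();
            long position = 0;
            // transferTo returns the number of bytes actually transferred;
            // keep going until the whole file has been copied.
            while (position < size) {
                position += in.transferTo(position, size - position, out);
            }
        } finally {
            in.close();
            out.close();
        }
    }
}
```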
DOH!

----- Original Message ----
From: Yonik Seeley <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Wednesday, January 16, 2008 4:57:08 PM
Subject: Re: IOException: read past EOF during optimize phase

This may be a Lucene bug... IIRC, I saw at least one other Lucene user
with a similar stack trace. I think the latest Lucene version (2.3 dev)
should fix it if that's the case.

-Yonik

On Jan 16, 2008 3:07 PM, Kevin Osborn <[EMAIL PROTECTED]> wrote:
> I am using the embedded Solr API for my indexing process. I created a
> brand new index with my application without any problem. I then ran my
> indexer in incremental mode. This process copies the working index to a
> temporary Solr location, adds/updates any records, optimizes the index,
> and then copies it back to the working location. There are currently not
> any instances of Solr reading this index. Also, I commit after every
> 100000 rows. The schema.xml and solrconfig.xml files have not changed.
>
> Here is my function call:
>
> protected void optimizeProducts() throws IOException {
>     UpdateHandler updateHandler = m_SolrCore.getUpdateHandler();
>     CommitUpdateCommand commitCmd = new CommitUpdateCommand(true);
>     commitCmd.optimize = true;
>
>     updateHandler.commit(commitCmd);
>
>     log.info("Optimized index");
> }
>
> So, during the optimize phase, I get the following stack trace:
>
> java.io.IOException: read past EOF
>         at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:89)
>         at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:34)
>         at org.apache.lucene.store.IndexInput.readChars(IndexInput.java:107)
>         at org.apache.lucene.store.IndexInput.readString(IndexInput.java:93)
>         at org.apache.lucene.index.FieldsReader.addFieldForMerge(FieldsReader.java:211)
>         at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:119)
>         at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:323)
>         at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:206)
>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:96)
>         at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:1835)
>         at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1195)
>         at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:508)
>         at ...
>
> There are no exceptions or anything else that appears to be incorrect
> during the adds or commits. After this, the index files are still
> non-optimized.
>
> I know there is not a whole lot to go on here. Anything in particular
> that I should look at?