"Andi Vajda" <[EMAIL PROTECTED]> wrote:

> I tried all morning to isolate the problem but I seem to be unable
> to reproduce it in a simple unit test. In my application, I've been
> able to get errors by doing even less: just creating an FSDirectory
> and adding documents with fields with term vectors fails when
> optimizing the index with the error below. I even tried to add the
> same documents, in the same order, in the unit test, but to no
> avail. It just works.
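[For reference, the failing scenario described above can be sketched roughly as below, assuming Lucene 2.3-era APIs; the index path, field name, and field text are placeholders, not Andi's actual code.]

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class TermVectorIndexing {
    public static void main(String[] args) throws Exception {
        // "/tmp/tvindex" is a placeholder path
        FSDirectory dir = FSDirectory.getDirectory("/tmp/tvindex");
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);

        Document doc = new Document();
        // Store term vectors (here with positions and offsets) for the field
        doc.add(new Field("contents", "some example text",
                          Field.Store.YES, Field.Index.TOKENIZED,
                          Field.TermVector.WITH_POSITIONS_OFFSETS));
        writer.addDocument(doc);

        writer.optimize();  // the reported error occurs during this merge
        writer.close();
    }
}
```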
Are you trying your unit test first in Python (using PyLucene)?

> What is different about my environment? Well, I'm running PyLucene,
> but the new one, the one using Apple's Java VM, the same VM I'm
> using to run the unit test. And I'm not doing anything special like
> calling back into Python or something, I'm just calling regular
> Lucene APIs, adding documents into an IndexWriter on an FSDirectory
> using a StandardAnalyzer. If I stop using term vectors, all is
> working fine. Spooky.

It's definitely possible something is broken (there is a lot of new
code in 2.3). Are your documents irregular with respect to term
vectors? (I.e., some docs have none, others store the terms but not
positions/offsets, etc.?) Any interesting changes to Lucene's
defaults (autoCommit=false, etc.)?

> I'd like to get to the bottom of this but could use some help. Does
> the stacktrace below ring a bell? Is there a way to run the whole
> indexing and optimizing in one single thread?

You can easily turn off the concurrent (background) merges by doing
this:

  writer.setMergeScheduler(new SerialMergeScheduler())

though that probably isn't punched through to Python in PyLucene. You
can also build a Lucene JAR with a small change to IndexWriter.java
to do the same thing.

That stacktrace is happening while merging term vectors during an
optimize. It's specifically occurring when loading the term vectors
for a given doc X: we read a position from the index stream (tvx)
just fine, but then when we try to read the first vInt from the
document stream (tvd) we hit the EOF exception. So either that
position was too large or the tvd file was somehow truncated. Weird.

Can you call "writer.setInfoStream(System.out)", get the error to
occur, and then post the resulting log? It may shed some light
here....

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
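[The two debugging suggestions in the reply above can be combined into one small helper; a minimal sketch assuming Lucene 2.3, where `writer` is an existing IndexWriter supplied by the caller.]

```java
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.SerialMergeScheduler;

public class DebugMerges {
    // Configure an existing writer for single-threaded, verbose merging.
    static void makeMergesSerialAndVerbose(IndexWriter writer) {
        // Run all merges in the calling thread instead of background threads.
        writer.setMergeScheduler(new SerialMergeScheduler());
        // Log segment and merge activity to stdout for diagnosis.
        writer.setInfoStream(System.out);
    }
}
```

With merges serialized this way, any merge exception will be thrown directly in the thread calling addDocument/optimize, which makes a truncated-file problem much easier to pin down.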