[ http://issues.apache.org/jira/browse/LUCENE-415?page=comments#action_12371385 ]
Andy Hind commented on LUCENE-415:
----------------------------------

The problem is that the output is going into a file that already exists. I assume it leaves the old contents in place, then finds old bits during random access and gets confused.

If a merge fails while it is writing its output segment file, you are left with a segment file that contains rubbish. This can occur if you are unlucky when you kill the JVM (to reproduce the problem, set a breakpoint and kill the JVM just before the segment write completes).

The next time a merge takes place, it writes to the segment file that already exists, because the same file name is generated for the new segment file. It always blows up with an error similar to the one reported for this bug.

The file.getChannel() solved some fairly odd but repeatable issues with stale/invalid file handles under Windows XP.

> Merge error during add to index (IndexOutOfBoundsException)
> -----------------------------------------------------------
>
>          Key: LUCENE-415
>          URL: http://issues.apache.org/jira/browse/LUCENE-415
>      Project: Lucene - Java
>         Type: Bug
>   Components: Index
>     Versions: 1.4
>  Environment: Operating System: Linux
> Platform: Other
>     Reporter: Daniel Quaroni
>     Assignee: Lucene Developers
>
> I've been batch-building indexes, and I've built a couple hundred indexes
> with a total of around 150 million records. This only happened once, so
> it's probably impossible to reproduce, but anyway...
> I was building an index with around 9.6 million records, and towards the
> end I got this:
>
> java.lang.IndexOutOfBoundsException: Index: 54, Size: 24
>         at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>         at java.util.ArrayList.get(ArrayList.java:322)
>         at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:155)
>         at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java:151)
>         at org.apache.lucene.index.SegmentTermEnum.readTerm(SegmentTermEnum.java:149)
>         at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:115)
>         at org.apache.lucene.index.SegmentMergeInfo.next(SegmentMergeInfo.java:52)
>         at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:294)
>         at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:254)
>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:93)
>         at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:487)
>         at org.apache.lucene.index.IndexWriter.maybeMergeSegments(IndexWriter.java:458)
>         at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:310)
>         at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:294)
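
The failure mode described in the comment (new output written into a leftover file of the same name) can be illustrated outside Lucene. The sketch below is not Lucene's code. It assumes the output file is opened with java.io.RandomAccessFile in "rw" mode, which does not truncate an existing file. The class and method names (StaleSegmentDemo, writeWithoutTruncate, writeWithTruncate) are hypothetical and exist only for this illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical demo class; not part of Lucene.
class StaleSegmentDemo {

    // Writes data at offset 0 of a file that may already exist.
    // "rw" mode keeps the old contents, so any old bytes beyond
    // data.length stay in the file.
    static long writeWithoutTruncate(File f, byte[] data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.seek(0);
            raf.write(data);
            return raf.length();
        }
    }

    // Same write, but the file is cut to zero length first, so no
    // stale bytes from an earlier, interrupted write can survive.
    static long writeWithTruncate(File f, byte[] data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.setLength(0);
            raf.write(data);
            return raf.length();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("segment", ".tmp");
        f.deleteOnExit();

        // Stand-in for the rubbish left behind by a killed merge.
        writeWithoutTruncate(f, new byte[100]);

        // A later, shorter write to the same file name.
        long stale = writeWithoutTruncate(f, new byte[40]);
        long clean = writeWithTruncate(f, new byte[40]);

        System.out.println("without truncate: " + stale + " bytes");
        System.out.println("with truncate: " + clean + " bytes");
    }
}
```

With the first method the file keeps its old 100-byte length after the 40-byte write, so a reader that seeks past the new data sees leftover bytes. Truncating first (or deleting the existing file before creating the new one) avoids that. Whether this matches the actual cause in Lucene 1.4 is an assumption based on the comment, not something verified here.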