[ https://issues.apache.org/jira/browse/LUCENE-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12463294 ]
Michael McCandless commented on LUCENE-140:
-------------------------------------------

Jed, one question: when you tested the fix, you fully rebuilt your index from scratch, right? Just want to verify that. You have to re-index because once the index is corrupted it will eventually hit the "docs out of order" exception even if you fix the original cause.

OK I've prepared a patch off 1.9.1 (just attached it). The patch passes all unit tests on 1.9.1. It has the changes I committed to the trunk yesterday, plus instrumentation (messages printed to a PrintStream) to catch places where doc numbers are not correct.

All messages I added print to a newly added infoStream static member of SegmentMerger. You can do SegmentMerger.setInfoStream(...) to change it (it defaults to System.err).

Jed if you could get the error to re-occur with this patch and then post the resulting messages, that would be great. Hopefully it gives us enough information to find the source here or at least to have another iteration with yet more instrumentation. Thanks!

> docs out of order
> -----------------
>
> Key: LUCENE-140
> URL: https://issues.apache.org/jira/browse/LUCENE-140
> Project: Lucene - Java
> Issue Type: Bug
> Components: Index
> Affects Versions: unspecified
> Environment: Operating System: Linux
>              Platform: PC
> Reporter: legez
> Assigned To: Michael McCandless
> Attachments: bug23650.txt, corrupted.part1.rar, corrupted.part2.rar, LUCENE-140-2007-01-09-instrumentation.patch
>
>
> Hello,
> I can not find out, why (and what) it is happening all the time.
> I got an exception:
> java.lang.IllegalStateException: docs out of order
>         at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:219)
>         at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:191)
>         at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:172)
>         at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:135)
>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:88)
>         at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:341)
>         at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:250)
>         at Optimize.main(Optimize.java:29)
> It happens either in 1.2 and 1.3rc1 (anyway what happened to it? I can not find it neither in download nor in version list in this form). Everything seems OK. I can search through index, but I can not optimize it. Even worse after this exception every time I add new documents and close IndexWriter new segments is created! I think it has all documents added before, because of its size.
> My index is quite big: 500.000 docs, about 5gb of index directory.
> It is _repeatable_. I drop index, reindex everything. Afterwards I add a few docs, try to optimize and receive above exception.
> My documents' structure is:
>   static Document indexIt(String id_strony, Reader reader, String data_wydania,
>                           String id_wydania, String id_gazety, String data_wstawienia)
>   {
>     Document doc = new Document();
>     doc.add(Field.Keyword("id", id_strony ));
>     doc.add(Field.Keyword("data_wydania", data_wydania));
>     doc.add(Field.Keyword("id_wydania", id_wydania));
>     doc.add(Field.Text("id_gazety", id_gazety));
>     doc.add(Field.Keyword("data_wstawienia", data_wstawienia));
>     doc.add(Field.Text("tresc", reader));
>     return doc;
>   }
> Sincerely,
> legez

--
This message is automatically generated by JIRA.
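For readers following the comment above, here is a rough sketch of the kind of instrumentation it describes: a static PrintStream that defaults to System.err, a setter to redirect it, and a check that reports doc numbers that are not in order. This is not the code from the attached patch. The class InfoStreamSketch and the method firstOutOfOrder are hypothetical names made up for illustration, and treating "in order" as "strictly increasing" is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical stand-in, NOT Lucene's SegmentMerger and NOT the attached patch.
public class InfoStreamSketch {
    // Mirrors the static infoStream member described in the comment;
    // like the one in the patch, it defaults to System.err.
    private static PrintStream infoStream = System.err;

    // Redirect the diagnostic messages, e.g. to a file or an in-memory buffer.
    public static void setInfoStream(PrintStream stream) {
        infoStream = stream;
    }

    // Returns the index of the first doc number that is not strictly greater
    // than its predecessor (assumed meaning of "out of order"), or -1 if none.
    public static int firstOutOfOrder(int[] docs) {
        int lastDoc = -1;
        for (int i = 0; i < docs.length; i++) {
            if (docs[i] <= lastDoc) {
                infoStream.println("doc " + docs[i] + " at position " + i
                        + " is not greater than previous doc " + lastDoc);
                return i;
            }
            lastDoc = docs[i];
        }
        return -1;
    }

    public static void main(String[] args) {
        // Capture the messages in memory so they can be posted back to the issue.
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        setInfoStream(new PrintStream(captured, true));
        firstOutOfOrder(new int[] {0, 3, 7, 5, 9});
        System.out.print(captured.toString());
    }
}
```

With the real patch, the equivalent step would be the SegmentMerger.setInfoStream(...) call mentioned in the comment; how that call is reached from application code depends on the patched class's visibility, which this sketch does not model.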