[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-1044:
---------------------------------------

    Attachment: LUCENE-1044.take5.patch

Initial patch attached:

* Created a new commit() method; deprecated the public flush() method.
* Changed IndexWriter to not write segments_N when flushing, only when syncing (added a new private sync() for this). The current "policy" is to sync only after merges are committed. When autoCommit=false we do not sync until close() or commit() is called.
* Added MockRAMDirectory.crash() to simulate a machine crash. It keeps track of un-synced files, and then in crash() it rather aggressively corrupts any unsynced files.
* Added a new unit test, TestCrash, to crash the MockRAMDirectory at various interesting times and make sure we can still load the resulting index.
* Added a new Directory.sync() method. In FSDirectory.sync, if I hit an IOException when opening or sync'ing, I retry (currently waiting 5 msec between attempts, and retrying up to 5 times). If it still fails after that, the original exception is thrown and the new segments_N will not be written (and the previous commit will also not be deleted).

All tests now pass, but there is still a lot to do, e.g. at least:

* Javadocs.
* Refactor the syncing code so DirectoryIndexReader.doCommit can use it as well.
* Change the format of segments_N to include a hash of its contents at the end. I think this is now necessary in case we crash after writing segments_N but before we can sync it, to ensure that whoever next opens the reader can detect corruption in this segments_N file.
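The retry logic described for FSDirectory.sync could be sketched roughly as below. This is only an illustration of the retry-then-rethrow idea (wait 5 msec between attempts, up to 5 attempts, and surface the original exception if all attempts fail); the names SyncRetry, FsyncOp, and syncWithRetry are hypothetical and are not the patch's actual API.

```java
import java.io.IOException;

public class SyncRetry {
    private static final int MAX_RETRIES = 5;
    private static final long RETRY_DELAY_MSEC = 5;

    // Stand-in for the underlying fsync call (hypothetical interface).
    interface FsyncOp {
        void run() throws IOException;
    }

    static void syncWithRetry(FsyncOp op) throws IOException {
        IOException original = null;
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            try {
                op.run();
                return; // sync succeeded
            } catch (IOException ioe) {
                if (original == null) original = ioe; // remember the first failure
                try {
                    Thread.sleep(RETRY_DELAY_MSEC); // brief pause before retrying
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        throw original; // all retries failed: rethrow the original exception
    }
}
```

On a persistent failure the caller sees the original IOException, so (as described above) the new segments_N is never published and the previous commit stays intact.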
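The MockRAMDirectory.crash() idea can be sketched as a directory that tracks which files have not yet been sync'd and, on crash(), zeros out their contents (mimicking the all-zeros files the original bug report observed). The Map-backed CrashSimDirectory below is purely illustrative and is not MockRAMDirectory's real implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class CrashSimDirectory {
    private final Map<String, byte[]> files = new HashMap<>();
    private final Set<String> unsynced = new HashSet<>();

    void writeFile(String name, byte[] contents) {
        files.put(name, contents.clone());
        unsynced.add(name); // dirty until sync() is called for this file
    }

    void sync(String name) {
        unsynced.remove(name); // durable: survives a simulated crash
    }

    // Simulate a hard power loss: every file not yet sync'd is corrupted
    // (here, overwritten with zeros).
    void crash() {
        for (String name : unsynced) {
            Arrays.fill(files.get(name), (byte) 0);
        }
        unsynced.clear();
    }

    byte[] readFile(String name) {
        return files.get(name);
    }
}
```

A test like TestCrash can then write, crash at interesting moments, and verify that only the sync'd state remains loadable.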
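The proposed hash at the end of segments_N could work along these lines: the writer appends a checksum of the file's contents, and a reader recomputes and compares it before trusting the commit, so a partially-written segments_N is detected rather than loaded. This sketch uses CRC32 as an assumed placeholder; the patch does not yet specify the hash function or format.

```java
import java.util.zip.CRC32;

public class ChecksummedSegments {
    // Checksum over the segments_N payload; the writer would append this
    // value at the end of the file.
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // The reader recomputes the checksum and compares it to the stored one;
    // a mismatch means the file was truncated or corrupted (e.g. by a crash
    // before the file was sync'd).
    static boolean isValid(byte[] data, long storedChecksum) {
        return checksum(data) == storedChecksum;
    }
}
```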
> Behavior on hard power shutdown
> -------------------------------
>
>                 Key: LUCENE-1044
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1044
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>         Environment: Windows Server 2003, Standard Edition, Sun Hotspot Java 1.5
>            Reporter: venkat rangan
>            Assignee: Michael McCandless
>             Fix For: 2.4
>
>         Attachments: FSyncPerfTest.java, LUCENE-1044.patch, LUCENE-1044.take2.patch, LUCENE-1044.take3.patch, LUCENE-1044.take4.patch, LUCENE-1044.take5.patch
>
>
> When indexing a large number of documents, upon a hard power failure (e.g. pulling the power cord), the index seems to get corrupted. We start a Java application as a Windows Service and feed it documents. In some cases (after an index size of 1.7GB, with 30-40 index segment .cfs files), the following is observed.
> The 'segments' file contains only zeros. Its size is 265 bytes - all bytes are zeros.
> The 'deleted' file also contains only zeros. Its size is 85 bytes - all bytes are zeros.
> Before corruption, the segments file and the deleted file appear to be correct. After this corruption, the index is corrupted and lost.
> This is a problem observed in Lucene 1.4.3. We are not able to upgrade our customer deployments to 1.9 or a later version, but would be happy to back-port a patch, if the patch is small enough and if this problem is already solved.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.