[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539787 ]
Hoss Man commented on LUCENE-1044:
----------------------------------

first off: there have been *numerous* changes to the way lucene writes to files (particularly relating to segment files, write locks, and fault tolerance) between 2.0 and 2.2 (not to mention differences between 1.4.3 and 2.0 that i may not be aware of) -- so you may see many differences in behavior if you upgrade.

second: to quote myself from a recent thread regarding lucene and "kill -9" ...

http://www.nabble.com/Help-with-Lucene-Indexer-crash-recovery-tf4572570.html#a13068939

{quote}
: That said, it should never in fact cause index corruption, as far as I
: know. Lucene is "semi-transactional": at any & all moments you should
: be able to destroy the JVM and the index will be unharmed. I would
: really like to get to the bottom of why this is not the case here.

At any point you can shut down the JVM and the index will be unharmed, but "destroying" it with "kill -9" goes a little further than that. Lucene can't make that claim because the JVM can't even guarantee that bytes are written to physical disk when we close() an OutputStream -- all it guarantees is that the bytes have been handed to the OS. When you "kill -9" a process the OS is free to make *EVERYTHING* about that process vanish without cleaning up after it ... i'm pretty sure even pending IO operations are fair game for disappearing.
{quote}

...what's true for "kill -9" is true for yanking the power cord ... if the JVM isn't shut down cleanly, there is nothing Lucene or the JVM can do to guarantee that your index is in a consistent state.

> Behavior on hard power shutdown
> -------------------------------
>
> Key: LUCENE-1044
> URL: https://issues.apache.org/jira/browse/LUCENE-1044
> Project: Lucene - Java
> Issue Type: Bug
> Components: Index
> Environment: Windows Server 2003, Standard Edition, Sun Hotspot Java 1.5
> Reporter: venkat rangan
>
> When indexing a large number of documents, upon a hard power failure (e.g.
> pull the power cord), the index seems to get corrupted. We start a Java
> application as a Windows Service, and feed it documents. In some cases
> (after an index size of 1.7GB, with 30-40 index segment .cfs files), the
> following is observed.
> The 'segments' file contains only zeros. Its size is 265 bytes - all bytes
> are zeros.
> The 'deleted' file also contains only zeros. Its size is 85 bytes - all bytes
> are zeros.
> Before corruption, the segments file and deleted file appear to be correct.
> After this corruption, the index is corrupted and lost.
> This is a problem observed in Lucene 1.4.3. We are not able to upgrade our
> customer deployments to 1.9 or later version, but would be happy to back-port
> a patch, if the patch is small enough and if this problem is already solved.
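
A footnote on the close() point in the comment above: closing an OutputStream only hands the bytes to the OS, so an application that wants more than that has to ask the OS explicitly to flush the file to the device. The following is a minimal sketch of that idea in plain Java, not Lucene's own code. The class name `DurableWrite`, the helper `writeDurably`, and the temp file name are hypothetical, and the syntax is newer than the Java 1.5 mentioned in the report. Even with the sync() call, durability still depends on the OS and the disk honoring the request (for example, a drive's write cache can still lose data on power failure).

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class DurableWrite {
    // Hypothetical helper: write the bytes, then ask the OS to push them
    // to the storage device before the stream is closed.
    static void writeDurably(Path path, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(path.toFile())) {
            out.write(data);
            // flush() only empties Java-side buffers; the bytes may still
            // sit in the OS cache after this call.
            out.flush();
            // sync() blocks until the OS reports the data as written to the
            // device, or throws SyncFailedException if it cannot do so.
            out.getFD().sync();
        }
    }

    public static void main(String[] args) throws IOException {
        // Illustrative file only; not a real Lucene index file.
        Path p = Files.createTempFile("segments-example", ".tmp");
        writeDurably(p, new byte[] {1, 2, 3});
        System.out.println("wrote " + Files.size(p) + " bytes to " + p);
    }
}
```

Without the sync() call, a "kill -9" or power loss shortly after close() can leave the file shorter than expected or, as in the report, filled with zeros.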